Points: 250
Tags: picoCTF 2019, Forensics
Author: JOHN HAMMOND
Description:
I stopped using YellowPages and moved onto WhitePages... but the page they gave me is all blank!
Hints:
(None)
Challenge link: https://play.picoctf.org/practice/challenge/51
Let's start by checking the file type and dumping its first bytes:
┌──(kali㉿kali)-[/mnt/…/picoCTF/picoCTF_2019/Forensics/WhitePages]
└─$ file whitepages.txt
whitepages.txt: Unicode text, UTF-8 text, with very long lines (1376), with no line terminators
┌──(kali㉿kali)-[/mnt/…/picoCTF/picoCTF_2019/Forensics/WhitePages]
└─$ xxd -g 1 -l 128 whitepages.txt
00000000: e2 80 83 e2 80 83 e2 80 83 e2 80 83 20 e2 80 83 ............ ...
00000010: 20 e2 80 83 e2 80 83 e2 80 83 e2 80 83 e2 80 83 ...............
00000020: 20 e2 80 83 e2 80 83 20 e2 80 83 e2 80 83 e2 80 ...... ........
00000030: 83 e2 80 83 20 e2 80 83 e2 80 83 20 e2 80 83 20 .... ...... ...
00000040: 20 20 e2 80 83 e2 80 83 e2 80 83 e2 80 83 e2 80 ..............
00000050: 83 20 20 e2 80 83 20 e2 80 83 e2 80 83 20 e2 80 . ... ...... ..
00000060: 83 20 20 e2 80 83 e2 80 83 e2 80 83 20 20 e2 80 . ......... ..
00000070: 83 20 20 e2 80 83 20 20 20 20 e2 80 83 20 e2 80 . ... ... ..
It is somewhat hard to see, but the file consists of two types of Unicode whitespace characters:
- Normal SPACE (U+0020, hex 20)
- EM SPACE (U+2003, hex e2 80 83)
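To confirm that only these two characters occur, the distinct characters can be counted. The following is a minimal sketch, not part of the original solution: the function name `summarize_characters` is made up for illustration, and the sample string is just the first 16 bytes from the hex dump above.

```python
from collections import Counter
import unicodedata


def summarize_characters(text):
    """Count each distinct character, keyed by its Unicode name."""
    counts = Counter(text)
    # Fall back to the U+XXXX notation for characters without a name.
    return {unicodedata.name(ch, f"U+{ord(ch):04X}"): n for ch, n in counts.items()}


# First 16 bytes of the hex dump above (assumption: used here as a small sample).
sample = bytes.fromhex("e28083e28083e28083e2808320e28083").decode("utf-8")
print(summarize_characters(sample))  # {'EM SPACE': 5, 'SPACE': 1}
```

To run it against the real file, read `whitepages.txt` with `encoding="utf-8"` and pass the contents to the function.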
The UTF-8 bytes can be converted to a Unicode code point with:
┌──(kali㉿kali)-[/mnt/…/picoCTF/picoCTF_2019/Forensics/WhitePages]
└─$ echo -ne '\xe2\x80\x83' | iconv -f 'utf-8' -t 'utf-16be' | xxd -p
2003 ...
And a code point can be converted back to its UTF-8 bytes with:
┌──(kali㉿kali)-[/mnt/…/picoCTF/picoCTF_2019/Forensics/WhitePages]
└─$ echo -n $'\u2003' | xxd -g 1
00000000: e2 80 83 ...
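The same two conversions can also be done in Python. This is a small sketch that assumes Python 3.8 or later (for the separator argument of `bytes.hex`); both function names are hypothetical helpers, not part of the original write-up.

```python
def utf8_hex_to_code_point(hex_bytes):
    """Turn UTF-8 bytes written as hex (e.g. 'e2 80 83') into U+XXXX notation."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return f"U+{ord(ch):04X}"


def code_point_to_utf8_hex(code_point):
    """Turn a code point (e.g. 0x2003) into its UTF-8 bytes written as hex."""
    return chr(code_point).encode("utf-8").hex(" ")


print(utf8_hex_to_code_point("e2 80 83"))  # U+2003
print(code_point_to_utf8_hex(0x2003))      # e2 80 83
```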
Let's write a Python script that assumes these spaces form a binary string of ASCII characters, with EM SPACE as 0 and SPACE as 1:
#!/usr/bin/env python3

# Convert to binary string
with open("whitepages.txt", mode="r", encoding="utf8") as f:
    result = ""
    text = f.read(1)
    while text:
        if text == '\u2003':    # EM SPACE
            result += '0'
        elif text == '\u0020':  # SPACE
            result += '1'
        text = f.read(1)

# Divide the binary string into array of 8-bit binary chunks
n = 8
split_result = [result[i:i+n] for i in range(0, len(result), n)]

# Convert to ascii text
flag = ""
for item in split_result:
    flag += chr(int(item, 2))

print(flag)
Then we run the script to get the flag:
┌──(kali㉿kali)-[/mnt/…/picoCTF/picoCTF_2019/Forensics/WhitePages]
└─$ ./decode.py
picoCTF
SEE PUBLIC RECORDS & BACKGROUND REPORT
5000 Forbes Ave, Pittsburgh, PA 15213
picoCTF{<REDACTED>}
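The decoding step can also be sketched as a reusable function, which makes it easy to try the opposite bit mapping if the first guess produces garbage. This is an illustrative variant, not the script used above: `decode_whitespace` and `encode_whitespace` are hypothetical names, the encoder exists only to build test input, and unlike the original script this version drops a trailing chunk shorter than 8 bits.

```python
EM_SPACE = "\u2003"
SPACE = "\u0020"


def decode_whitespace(text, zero=EM_SPACE, one=SPACE):
    """Map the two whitespace characters to bits and decode 8 bits per character."""
    # Characters other than the two expected ones are ignored.
    bits = "".join("0" if ch == zero else "1" for ch in text if ch in (zero, one))
    usable = len(bits) - len(bits) % 8  # drop an incomplete trailing chunk
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, usable, 8))


def encode_whitespace(message):
    """Inverse mapping, used here only to produce sample input."""
    bits = "".join(f"{ord(c):08b}" for c in message)
    return "".join(EM_SPACE if b == "0" else SPACE for b in bits)


sample = encode_whitespace("pico")
print(decode_whitespace(sample))  # pico
```

Swapping the arguments, as in `decode_whitespace(text, zero=SPACE, one=EM_SPACE)`, tests the inverted mapping.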
For additional information, please see the references below.