+++
title = "STM CTF 2021 Ear To Ear Coding Writeup"
date = "2021-10-25T09:49:22+02:00"
author = "Yigit Colakoglu"
authorTwitter = "theFr1nge"
cover = ""
tags = ["ctf", "coding", "numpy", "stmctf2021"]
keywords = ["ctf", "cybersecurity", "coding", "python"]
description = "The writeup to the Ear to Ear coding challenge on STM CTF 2021"
showFullContent = false
+++
This is my writeup for the Ear to Ear coding challenge on STM CTF 2021.

For the challenge, we are given a web server that serves the following JSON data:

```json
{
  "word": "hidden",
  "sequence": "938658411126141947251193886"
}
```
The server also has an endpoint where you can submit an answer, and the server
responds with the flag. We are also given a file called *vectors.txt* that
contains, on each line, one word followed by a list of signed decimal numbers.
You can download the file [here](https://yeetstore.s3.eu-central-1.amazonaws.com/vectors.txt).

OK, that is a cryptic question. Thankfully, the creators of the challenge
realized this and provided a hint:
> Each vector represents the corresponding word in a vector space where
> similar words are closer to each other. You can use the cosine distances
> between vectors to find the nth most similar words. You are given a sequence
> of numbers and a word as a starting point. To find the flag, you need to
> traverse the words using the nth most similar words, based on the given
> sequence.
>
> Dummy Example (similar words are chosen randomly, for demonstration):
>
> Word: Apple, Sequence: 213
>
> - The 2nd most similar word of 'apple' -> 'pie'
> - The 1st most similar word of 'pie' -> 'cake'
> - The 3rd most similar word of 'cake' -> 'chocolate'
>
> Send 'chocolate' to the API to retrieve the flag.
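The traversal rule from the hint can be sketched with made-up neighbour lists (the words and their orderings below are invented purely for illustration, not real similarity data):

```py
# Toy neighbour lists, ordered from most to least similar.
# These are invented for illustration, not real word-vector neighbours.
neighbours = {
    "apple": ["juice", "pie", "tree"],
    "pie": ["cake", "crust", "oven"],
    "cake": ["icing", "candle", "chocolate"],
}

def traverse(word, sequence):
    # For each digit n, hop to the nth most similar word (1-indexed)
    for n in sequence:
        word = neighbours[word][int(n) - 1]
    return word

print(traverse("apple", "213"))  # -> chocolate
```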
Now that we know what the question is about, it is actually pretty easy. All we
need to do is load each vector into a numpy array, compare the vector of the
word that the web server gives us against every other vector, and take the
(n+1)th vector with the smallest distance to the word (because the closest
neighbour of any word is itself). Lucky for us, we can calculate the cosine
distance between each pair of vectors in two matrices using
[scipy.spatial.distance](https://docs.scipy.org/doc/scipy/reference/spatial.distance.html)
and sort the results with numpy's
[argsort](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html).
Here is the function that returns the nth closest neighbour of a vector in the vector space:
```py
def getnclosest(v, n):
    distances = distance.cdist([v], vectors, 'cosine')
    p = np.argsort(distances)
    return p[0][n-1]
```
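To sanity-check the idea, here is a tiny self-contained example with made-up 2-dimensional vectors. Note that index 0 in the sorted order is the query vector itself (cosine distance 0), which is why the solver later asks for n+1:

```py
import numpy as np
from scipy.spatial import distance

# Three toy 2-D vectors; row 0 is our "query" word.
vectors = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
])

# cdist returns a (1, 3) matrix of cosine distances from the query row
d = distance.cdist([vectors[0]], vectors, 'cosine')
order = np.argsort(d)[0]  # indices sorted by increasing distance

print(order[0])  # the query itself (distance 0)
print(order[1])  # its true nearest neighbour
```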
Now that we have the basic function, we can easily write some wrapper code that
will allow us to find the resulting word. Here is the full code that does that:
```py
from scipy.spatial import distance
from tqdm import tqdm
import numpy as np

def vectortostr(vec):
    # Serialize a vector so it can be used as a dictionary key
    vs = ""
    for i in range(len(vec)):
        vs += str(vec[i])
        if i != len(vec) - 1:
            vs += " "
    return vs

f = open("vectors.txt", "r")
l1 = f.readline()
y = int(l1.split(" ")[0])  # number of words
x = int(l1.split(" ")[1])  # dimension of each vector

vectors = [None]*y
wordvectormap = {}
vectorwordmap = {}

print("[INFO]: Loading vectors from file.")
for i in tqdm(range(y)):
    newlist = [None]*x
    l = f.readline().split(" ")
    for j in range(1, x+1):
        newlist[j-1] = float(l[j])
    vectors[i] = newlist
    wordvectormap[l[0]] = newlist
    vectorwordmap[vectortostr(newlist)] = l[0]
vectors = np.array(vectors)
print("[INFO]: Done loading vectors. Dropping you into a shell")

def getnclosest(v, n):
    distances = distance.cdist([v], vectors, 'cosine')
    p = np.argsort(distances)
    return p[0][n-1]

cmd = 0
while cmd != "exit":
    try:
        cmd = input(">> ")
        cmdparts = cmd.split(" ")
        if cmdparts[0] == "search":
            # search <x floats> <n>: print the nth closest word to a raw vector
            vector = [None]*x
            for i in range(1, x+1):
                vector[i-1] = float(cmdparts[i])
            cindex = getnclosest(vector, int(cmdparts[-1]))
            print(vectorwordmap[vectortostr(vectors[cindex])])
        elif cmdparts[0] == "eartoear":
            # eartoear <word> <sequence>: traverse the vector space
            w = cmdparts[1]
            vector = np.array(wordvectormap[w])
            sequence = cmdparts[2]
            print("START: " + w)
            for i in sequence:
                # n+1 because the closest neighbour of a word is itself
                cindex = getnclosest(vector, int(i)+1)
                vector = vectors[cindex]
                w = vectorwordmap[vectortostr(vector)]
                print(w + ", " + i)
            print("END: " + w)
    except KeyboardInterrupt:
        break
    except Exception as e:
        print(str(e))
print("\nBye Bye")
```
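For reference, the loader above expects a word2vec-style text layout: a header line with the word count and vector dimension, then one word per line followed by its components. A minimal parsing sketch with made-up values (the words and numbers are invented, not from the real *vectors.txt*):

```py
# Toy stand-in for vectors.txt -- the contents here are invented.
sample = """3 2
king 0.5 0.7
queen 0.48 0.72
car -0.9 0.1"""

lines = sample.split("\n")
count, dim = (int(v) for v in lines[0].split(" "))
words, vecs = [], []
for line in lines[1:]:
    parts = line.split(" ")
    words.append(parts[0])
    vecs.append([float(v) for v in parts[1:]])

print(words)  # -> ['king', 'queen', 'car']
```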
When we run the code, here is the result:

```
[INFO]: Loading vectors from file.
100%|██████████████████████████████████████████████████████████████████| 1193514/1193514 [00:10<00:00, 112756.85it/s]
[INFO]: Done loading vectors. Dropping you into a shell
>> eartoear hidden 938658411126141947251193886
START: hidden
brings, 9
gives, 3
could, 8
n't, 6
if, 5
know, 8
n't, 4
think, 1
n't, 1
think, 1
know, 2
mean, 6
know, 1
n't, 4
think, 1
either, 9
unless, 4
because, 7
means, 2
everything, 5
something, 1
everything, 1
means, 9
unless, 3
matter, 8
difference, 8
reasons, 6
END: reasons
>>
```
And after we submit the final word, `reasons`, we get the flag: `STMCTF{Je0pardy_w@ts0n}`. Nice!