## Metadata

This challenge is from DownUnderCTF 2023. I played under NUS Greyhats instead of the usual SEE and we won 1st. However I didn’t really contribute much, I joined halfway through the CTF and had other commitments and accumulated a total of only ~1000/12000 of the eventual points the whole team made.

Regardless, I’m happy that we blooded Encrypted Mail 31 hours into the CTF (beating SEE which I betrayed, sorryyyy Neobeo and Warri). Encrypted Mail was the 2nd hardest crypto challenge, and the hardest was, unfortunately for the author Joseph, unsolved.

## Challenge

- Points: 449
- Solves: 3
- Author: joseph

Zero-knowledge authentication, end-to-end encryption, this new mail app has it all. For a limited time, admins may be sending flags to users. Sign up today to get yours!

`nc 2023.ductf.dev 30000`

### Objective

Players are given a minimal mail-server implementation to interact with. There are 4 options to choose from:

- Register
- Login
- Send Message
- View Inbox

When you register, the server asks for your username and your public key $y$ that will be used during Login. The intended usage of this mail server would require you to generate a private key $x$ and generate $y$ from it.

During Login, the server prompts you with a series of *challenges*, which should require that you know the secret private key $x$ for that account to answer the challenges correctly. This is the “zero-knowledge” part of this mail server.

Once you log in, you can send messages and view your inbox. To send a message, you would need to input the recipient’s username. The server then replies with the recipient’s public key. You are then required to encrypt your message with the recipient’s public key, sign the message with your private key and tell the server the resulting encrypted message and signature.

When you view your inbox, the server replies with all the messages you have. However, the server replies with the encrypted messages (encrypted with your public key) and the signatures (for you to verify the sender with the sender’s public key).

In addition to these functionalities, there are two bot accounts.

`admin`

`flag_haver`

The `admin`

bot checks for any new accounts and sends them an (encrypted) sweet welcome message `Welcome {username}!`

. The `flag_haver`

bot iterates all the messages they have, and if the message is from `admin`

(`flag_haver`

checks the signature) and has the following format: `Send flag to {username}`

, `flag_haver`

obliges and sends the flag as an (encrypted) message to the user.

`flag_haver`

is the only account that has access to the flag, so we must get `admin`

to send them `Send flag to {username}`

. However, there’s nothing that prompts `admin`

to do such a thing, so we probably have to log into `admin`

.

Once we log into `admin`

, it appears that we still need the `admin`

’s private key to sign the message, so that `flag_haver`

will happily read the message.

## Attack

### Logging into admin

We start with first logging into `admin`

. The code for this is implemented in `Authenticator`

```
g = 3
p = 1467036926602756933667493250084962071646332827366282684436836892199877831990586034135575089582195051935063743076951101438328248410785708278030691147763296367303874712247063207281890660681715036187155115101762255732327814001244715367
class Authenticator:
def __init__(self, pubkey):
self.y = pubkey
def generate_challenges(self, n=128):
challenges = []
answers = []
for _ in range(n):
b = round(random.random())
r = random.getrandbits(p.bit_length())
c = random.getrandbits(p.bit_length())
c1 = pow(g, r, p)
if b == 0:
c2 = pow(g, c, p)
elif b == 1:
c2 = pow(self.y, r, p)
answers.append(b)
challenges.append((int(c1), int(c2)))
self.answers = answers
return challenges
def verify_answers(self, answers):
return len(answers) == len(self.answers) and all(a == b for a, b in zip(answers, self.answers))
```

The authentication scheme is a zero-knowledge variant of Diffie-Hellman (I don’t know the actual name of this variant).

The user is expected to randomly generate a private key $x$ and compute their public key as $g^x = y ; \mathrm{mod} ; p$ where $p$ is a prime and $g = 3$. $p$ is a strong prime so the Discrete Logarithm Problem (DLP) is hard on $\mathbb{Z}/p\mathbb{Z}$ is computationally hard.

The server knows a user’s public key $y$ and wants the user to prove that they know their private key (without actually transmitting the private key). The server generates $128$ challenges, for each challenge, it generates a boolean $b$, and random numbers $r, c \in [0, p)$, and computes:

$$ \begin{aligned} c_1 &= g^r \text{ } \mathrm{mod} \text{ } p \\ c_2 &= \begin{cases} y^r = g^{x r} \text{ } \mathrm{mod} \text{ } p & \text{if b} \\ g^c \text{ } \mathrm{mod} \text{ } p & \text{otherwise} \end{cases} \end{aligned} $$

The server then returns $c_1$ and $c_2$ for each challenge, and the user is required to recover the value of $b$ for each challenge.

Suppose the user doesn’t know the private key $x$. Then $g^c$ and $y ^r = g^{xr}$ are utterly indistinguishable as, due to the difficulty of DLP, a user won’t be able to recover $x$ from $y$, and hence $x r$ might as well be a random number just like $c$. The user is forced to guess the value of $b$ for each challenge, and since there are $128$ challenges, the probability of passing them all is $2^{-128}$, which is unreasonably small.

However, if the user does know the private key $x$, the user can simply test if $c_1^x = c_2$ to get $b$.

We don’t know the admin’s private key, so we’re doomed right?

### $b$ isn’t random enough

Let’s look at how the challenges are generated:

```
import random
# ...
def generate_challenges(self, n=128):
challenges = []
answers = []
for _ in range(n):
b = round(random.random())
r = random.getrandbits(p.bit_length())
c = random.getrandbits(p.bit_length())
c1 = pow(g, r, p)
if b == 0:
c2 = pow(g, c, p)
elif b == 1:
c2 = pow(self.y, r, p)
answers.append(b)
challenges.append((int(c1), int(c2)))
self.answers = answers
return challenges
```

Recall that the user is to recover $b$ which should only be possible if they know their private key. However, $b$ is generated via Python’s `random`

module, which uses a Pseudo-random number generator (PRNG) known as MT19937.

MT19937 initialises an internal state $u$ of $624$ 32-bit integers, and at each step, it generates a “random” 32-bit integer. Note that `b = round(random.random())`

is just a fancy way of selecting a bit from one of the 32-bit outputs. Taking the $624$ 32-bit integer as a vector of $624 * 32 = 19938$ bits, i.e., $u \in F_2^{19938}$ where $F_2 = \mathbb{Z} / 2\mathbb{Z}$ is the field of boolean, each 32-bit output $x_m \in F_2^{32}$ is a linear transform of $u$. I.e., for each bit of output (say bit $b_n$ for the $n$-th challenge), we can associate a known vector $v_n$ such that $v_n^T u = b_n$.

This means that for $N$ challenges generated by the server, we can represent the vector $\textbf{b} = (b_1 ; b_2 ; \cdots ; b_N)$ with a known linear transformation $V u = \textbf{b}$, where $V = (v_1 ; v_2 ; \cdots ; v_N)$.

This is huge! If we know the value of vector $\textbf{b}$ for a large enough $N$, we can recover $u$, and compute all subsequent $b_i = v_i^T u$. I.e., we can answer all subsequent challenges and log into any user without knowing their private key!

I found this before the hint came up btw, but as I didn’t join the discord, I didn’t see the announcement calling for feedback :(.

How do we get $\textbf{b}$? Since we can answer the challenges if we log into an account we created (since we know the private key), we can simply log in as many times as needed and save the answers for the challenges.

Furthermore, $V$ is a transform $F_2^{19938} \rightarrow F_2^{N}$. So, to solve for $u$ given the equation $V u = \textbf{b}$, we need $N \ge 19938$. Since we obtain $128$ values of $b_i$ each time we log in, we would need to log in $\lceil 19938/128 \rceil = 156$ times.

Small caveat, turns out, for large enough $N$, $V$ will always have a 32-bit (or 31? I forgot) kernel, which makes recovering the exact $u$ impossible. However, since for large enough $N$ the kernel remains the same, whichever $u$ that satisfies $V u = \textbf{b}$ for $N \ge 19938$ allows us to compute subsequent $b_i$. This was found experimentally and if anybody knows why (I suspect it’s the twisting) do let me know.

We can compute $V$ in a black-box manner. We go through each of the $19938$ basis vectors of $F_2^{19938}$ by setting only one bit (say the $i$-th) of $u$ and computing $\textbf{b}$ given such $u$. $\textbf{b}$ will be the $i$-th column vector of $V$.

```
s = random.getstate()
def login_sym(bidx, n=128):
new_s = (s[0], (*[int(0) if i != bidx//32 else int(1)<<int(bidx%32) for i in range(624)], s[1][-1]), s[2])
random.setstate(new_s)
ret = []
for _ in range(156):
for _ in range(n):
b = round(random.random())
r = random.getrandbits(p.bit_length())
c = random.getrandbits(p.bit_length())
ret.append(b)
return ret
# prng is our transform V that transforms the internal state u
# into the challenge answers
prng = np.zeros((624*32, 624*32), dtype=bool)
for bidx in range(624*32):
print(bidx, end="\r")
prng[:,bidx] = login_sym(bidx)
np.save(open("prng_transform.npy", "wb"), prng)
```

Once we have $\textbf{b}$, we can recover $u$ and generate the answers for the next challenge to log into any account:

```
prng = np.load(open("prng_transform.npy", "rb"))
prng_sage = matrix(GF(2), prng.astype(int))
# bits is our b we've gotten from logging in 156 times
sol = prng_sage.solve_right(vector(GF(2), bits), check=False)
bvrec = list(map(int, sol))
rec_s = (s[0], (*[reduce(lambda x,y: (x<<1)+y, bvrec[i*32:i*32+32][::-1]) for i in range(624)], s[1][-1]), s[2])
# Assert that we recovered u correctly
random.setstate(rec_s)
rec_bits = [b for _ in range(156) for b in login_local()]
assert all(x == y for x,y in zip(bits, rec_bits))
# Locally generate the answers for the 128 challenges during the next login
bb = login_local()
myans = " ".join(map(str, map(int, bb))).encode()
```

This step was for me the most frustrating step. I recovered $V$ very quickly but spent at least 2 hours trying to coax `numpy`

to do $F_2$ arithmetic. And then I gave up and loaded the matrix into sagemath. However, due to my potato computer, it takes over 20 minutes to load the matrix and after that sagemath becomes extremely unstable and crashes all the time. I ended up re-loading the matrix into sagemath at least 7 times.

### Sending flag_haver a message from the admin account

Now that we can log into `admin`

, we have to send `flag_haver`

a message. Unfortunately, we would have to sign the message with `admin`

’s private key, which we still do not have.

```
def send(self, recipient_public_key, message):
key = secrets.randbelow(2**128)
key_enc = Encryptor(recipient_public_key).encrypt(key)
key_enc = key_enc[0].to_bytes(96, 'big') + key_enc[1].to_bytes(96, 'big')
ct = Cipher(key).encrypt(message)
sig = Signer(self.private_key).sign(ct)
return (key_enc + ct, sig)
def receive(self, sender_public_key, ciphertext, sig):
if len(ciphertext) < 192:
return False
key_enc, ct = ciphertext[:192], ciphertext[192:]
if not Verifier(sender_public_key).verify(ct, sig):
return False
key_enc0, key_enc1 = int.from_bytes(key_enc[:96], 'big'), int.from_bytes(key_enc[96:192], 'big')
key = Decryptor(self.private_key).decrypt((key_enc0, key_enc1))
message = Cipher(key).decrypt(ct)
return message
```

To send a message, we randomly generate a `key`

that gets encrypted with the recipient’s public key (we have this) into `key_enc`

. The `message`

gets encrypted with `ct`

and the `ct`

is signed with the sender’s private key (we don’t have this) to get `sig`

. `key_enc`

, `ct`

and `sig`

get sent.

Note that the recipient will require their private key to decrypt `key_enc`

to decrypt `ct`

into the message, and they will also verify that `ct`

is signed with the recipient’s private key. This should ensure that

- Only the recipient can read the message
- The recipient can verify that the message is indeed from the sender.

A key “weird” thing to note here is that the signature only verifies that `ct`

is *committed* to the sender’s private key. `key_enc`

isn’t. This means that while the recipient can verify that `ct`

came from the sender, they can’t guarantee that `key_enc`

is.

This means that we can:

- Take an existing encrypted and signed message sent by the
`admin`

to a user whose private key we know. In this case, it is the welcome message to the user`MeowMeowMeow`

. - Use the known private key to recover
`key`

. - Modify
`key`

which in turn will change the result of the`ct`

’s decryption. - Re-encrypt
`key`

with our new recipient’s public key to get`key_enc`

. - Re-use the
`ct`

and`sig`

from the message we got from`admin`

and only change the`key_enc`

. - Send the modified message to the new recipient.

The new recipient will successfully decrypt `key_enc`

into our modified key due to step 4, the new recipient will think the message is indeed from `admin`

since we did not change `ct`

, and finally, the decrypted message will look different from what the `admin`

has sent to `MeowMeowMeow`

because we’ve modified `key`

.

Now the question is whether we can change `key`

such that the decrypted message changes from `Welcome MeowMeowMeow!`

to `Send flag to {username}`

to be sent to `flag_haver`

, where `username`

can be anything matching `[a-zA-Z0-9]{8,16}`

as we can simply create an account with that username and receive the flag from `flag_haver`

. Note that the modified message has to be the same length, so `username`

should have length $8$. To do that we need to see how `Cipher(key).encrypt(message)`

works.

### Attacking the Cipher

```
class Cipher:
def __init__(self, key):
self.n = 4
self.idx = self.n
self.state = [(key >> (32 * i)) & 0xffffffff for i in range(self.n)]
def next(self):
if self.idx == self.n:
for i in range(self.n):
x = self.state[i]
v = x >> 1
if x >> 31:
v ^= 0xa9b91cc3
if x & 1:
v ^= 0x38ab48ef
self.state[i] = v ^ self.state[(i + 3) % self.n]
self.idx = 0
v = self.state[self.idx]
x0, x1, x2, x3, x4 = (v >> 31) & 1, (v >> 24) & 1, (v >> 18) & 1, (v >> 14) & 1, v & 1
y = x0 + x1 + x2 + x3 + x4
self.idx += 1
return y & 1
def next_byte(self):
return int(''.join([str(self.next()) for _ in range(8)]), 2)
def xor(self, A, B):
return bytes([a ^ b for a, b in zip(A, B)])
def encrypt(self, message):
return self.xor(message, [self.next_byte() for _ in message])
def decrypt(self, ciphertext):
return self.xor(ciphertext, [self.next_byte() for _ in ciphertext])
```

We can see that `Cipher`

generates a stream by bytes according to the `key`

and XORs it with the message. An interesting thing to note is that the `.next`

method, and hence the `.next_byte`

method, computes their output by taking a linear combination of the bits of `key`

. This is easily seen by noting that the only operations used to update `self.state`

and return the next byte only involve XOR and bitshifts.

This is extremely similar to how MT19937 works, in that for a given message of $l$ bytes, we have an associated XOR stream $x \in F_2^{8 l}$ such that $c = m + x$ where $c$ is the ciphertext and $m$ is the message. We also have a 128-bit key $k \in F_2^{128}$, and we can derive a fixed transform $V: F_2^{128}\rightarrow F_2^{8 l}$ mapping $V k = x$.

For our use-case, we want to keep $c = m + x$ constant and compute a new key $k’$ such that $V k = x’$ and $c = m’ + x’$, where $x’$ is the new byte stream and $m’$ is our target ciphertext. Note that

- We know $k$ as this $k$ is from the welcome message and we can get $k$ by decrypting
`key_enc`

with our private key. - We know $m$ as
`Welcome MeowMeowMeow!`

. - We can control $m’$ and hence know $\delta_m$ as well.

Then since:

$$ \begin{aligned} c &= m + x = m’ + x’ = (m + \delta_m) + x’ \\ x’ &= x - \delta_m \\ V k’ &= x’ = x - \delta_m \end{aligned} $$

Then since we can compute $x - \delta_m$, we can easily compute $k’$… provided $k’$ actually exists.

It turns out that while our desired $k’$ has $128$ dimensions, $V$ only has rank $121$. Practically, this means we can always find a $k’$ such that $x’$ matches $x - \delta_m$ for the first $121$ bits, or ~$15$ bytes. I.e., a $k’$ such that $m$ matches $m’$ by ~$15$-bytes.

Unfortunately, our target message `Send flag to {username}`

has a minimum length of $21$ bytes since `username`

has the constraints `[a-zA-Z0-9]{8,16}`

, so we are $6$ bytes short. Put it another way, if we determine the first $2$ bytes of `username`

, say `username=XX??????`

, we are guaranteed to find $k’$ such that $m’$ will be `Send flag to XX------`

, where `-`

are just random bytes, and so `XX------`

isn’t guaranteed to satisfy the `[a-zA-Z0-9]{8,16}`

constraint.

However, we can bruteforce `XX`

such that `XX------`

does indeed satisfy the constraint!

We can expect bruteforce to work. We bruteforce $2$ characters in `[a-zA-Z0-9]`

which amounts to $62^2 = 3844$ possibilities. Now, the probability that the remaining $6$ characters does appear in `[a-zA-Z0-9]`

is $(62/256)^{6}$ which is roughly one in $4955$, which is fairly close to $3844$. In the event that no such `XX`

exists, we can try again with a different initial username (in the welcome message, which currently is `MeowMeowMeow`

).

It turns out the username `elSvxrjZ`

does work. In fact, it works regardless of what $k$ is, since if $\exists k’$ such that $V k’ = x’ = x - \delta_m = V k - \delta_m$, then for a new encountered key $j$, we have

$$ V j - \delta_m = V k - \delta_m - V (j - k) = V k’ - V (j - k) = V (\delta_k + j) $$

So, we can set $j’ = \delta_k + j$ as our new key.

```
from itertools import product
import re
import numpy as np
import random
# Compute V
V = []
for bidx in range(128):
c = Cipher(1<<bidx)
V.append([c.next() for _ in range(8*21)])
V = matrix(GF(2), np.array(V).T)
msg = b'Welcome MeowMeowMeow!'
allowed = "1234567890qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM"
for x,y in product(allowed, repeat=2):
target = f'Send flag to {x+y}aaaaaa'.encode()
ct = Cipher(key).encrypt(msg)
bs = vector(GF(2), [*map(int, ''.join([format(m^^c, "08b") for m,c in zip(target, ct)]))])
kbits = V.solve_right(bs, check=False)
key_rec = int("".join(map(str, kbits))[::-1], 2)
ctt = Cipher(key_rec).encrypt(target)
tar = [x^^y for x,y in zip(ctt[15:], target[15:])]
nt = [x^^y for x,y in zip(tar, ct[15:])]
new_target = target[:15] + bytes(nt)
assert Cipher(key_rec).encrypt(new_target) == ct
try:
if re.fullmatch(r'[a-zA-Z0-9]{8,16}', new_target.decode()[13:]):
break
except: pass
print(new_target)
# b'Send flag to elSvxrjZ'
```

So, before we log into `admin`

, we just have to create the user `elSvxrjZ`

, log into `admin`

and send the spoofed message (`Send flag to elSvxrjZ`

) to `flag_haver`

. `flag_haver`

will send the flag to `elSvxrjZ`

and we can log into `elSvxrjZ`

and decrypt the flag with `elSvxrjZ`

’s private key.

## Summary of attack

- Create an account with the username
`elSvxrjZ`

and save the private key. - Create an account with the username
`MeowMeowMeow`

and save the private key. - Log into
`MeowMeowMeow`

156 times to get enough challenge answers to recover the random generator’s internal state. - Save the welcome message the admin sent
`MeowMeowMeow`

. - Locally generate the answers for the next login challenge to log into the
`admin`

without knowing the`admin`

’s private key. - Use the saved welcome message for
`MeowMeowMeow`

to fake a signed message from`admin`

that tells`flag_haver`

to send the flag to`elSvxrjZ`

.`flag_haver`

is prompted to send the flag to username`elSvcrjZ`

.

- Log into
`elSvxrjZ`

and use`elSvxrjZ`

’s private key to decrypt the message from`flag_haver`

, which contains the flag.

**Flag**: `DUCTF{wait_its_all_linear_algebra?...always_has_been}`