Metadata
This challenge is from DownUnderCTF 2023. I played under NUS Greyhats instead of the usual SEE and we won 1st. However I didn’t really contribute much, I joined halfway through the CTF and had other commitments and accumulated a total of only ~1000/12000 of the eventual points the whole team made.
Regardless, I’m happy that we blooded Encrypted Mail 31 hours into the CTF (beating SEE which I betrayed, sorryyyy Neobeo and Warri). Encrypted Mail was the 2nd hardest crypto challenge, and the hardest was, unfortunately for the author Joseph, unsolved.
Challenge
- Points: 449
- Solves: 3
- Author: joseph
Zero-knowledge authentication, end-to-end encryption, this new mail app has it all. For a limited time, admins may be sending flags to users. Sign up today to get yours!
nc 2023.ductf.dev 30000
Objective
Players are given a minimal mail-server implementation to interact with. There are 4 options to choose from:
- Register
- Login
- Send Message
- View Inbox
When you register, the server asks for your username and your public key $y$ that will be used during Login. The intended usage of this mail server would require you to generate a private key $x$ and generate $y$ from it.
During Login, the server prompts you with a series of challenges, which should require that you know the secret private key $x$ for that account to answer the challenges correctly. This is the “zero-knowledge” part of this mail server.
Once you log in, you can send messages and view your inbox. To send a message, you would need to input the recipient’s username. The server then replies with the recipient’s public key. You are then required to encrypt your message with the recipient’s public key, sign the message with your private key and tell the server the resulting encrypted message and signature.
When you view your inbox, the server replies with all the messages you have. However, the server replies with the encrypted messages (encrypted with your public key) and the signatures (for you to verify the sender with the sender’s public key).
In addition to these functionalities, there are two bot accounts.
admin
flag_haver
The admin
bot checks for any new accounts and sends them an (encrypted) sweet welcome message Welcome {username}!
. The flag_haver
bot iterates all the messages they have, and if the message is from admin
(flag_haver
checks the signature) and has the following format: Send flag to {username}
, flag_haver
obliges and sends the flag as an (encrypted) message to the user.
flag_haver
is the only account that has access to the flag, so we must get admin
to send them Send flag to {username}
. However, there’s nothing that prompts admin
to do such a thing, so we probably have to log into admin
.
Once we log into admin
, it appears that we still need the admin
’s private key to sign the message, so that flag_haver
will happily read the message.
Attack
Logging into admin
We start with first logging into admin
. The code for this is implemented in Authenticator
g = 3
p = 1467036926602756933667493250084962071646332827366282684436836892199877831990586034135575089582195051935063743076951101438328248410785708278030691147763296367303874712247063207281890660681715036187155115101762255732327814001244715367
class Authenticator:
def __init__(self, pubkey):
self.y = pubkey
def generate_challenges(self, n=128):
challenges = []
answers = []
for _ in range(n):
b = round(random.random())
r = random.getrandbits(p.bit_length())
c = random.getrandbits(p.bit_length())
c1 = pow(g, r, p)
if b == 0:
c2 = pow(g, c, p)
elif b == 1:
c2 = pow(self.y, r, p)
answers.append(b)
challenges.append((int(c1), int(c2)))
self.answers = answers
return challenges
def verify_answers(self, answers):
return len(answers) == len(self.answers) and all(a == b for a, b in zip(answers, self.answers))
The authentication scheme is a zero-knowledge variant of Diffie-Hellman (I don’t know the actual name of this variant).
The user is expected to randomly generate a private key $x$ and compute their public key as $g^x = y ; \mathrm{mod} ; p$ where $p$ is a prime and $g = 3$. $p$ is a strong prime so the Discrete Logarithm Problem (DLP) is hard on $\mathbb{Z}/p\mathbb{Z}$ is computationally hard.
The server knows a user’s public key $y$ and wants the user to prove that they know their private key (without actually transmitting the private key). The server generates $128$ challenges, for each challenge, it generates a boolean $b$, and random numbers $r, c \in [0, p)$, and computes:
$$ \begin{aligned} c_1 &= g^r \text{ } \mathrm{mod} \text{ } p \\ c_2 &= \begin{cases} y^r = g^{x r} \text{ } \mathrm{mod} \text{ } p & \text{if b} \\ g^c \text{ } \mathrm{mod} \text{ } p & \text{otherwise} \end{cases} \end{aligned} $$
The server then returns $c_1$ and $c_2$ for each challenge, and the user is required to recover the value of $b$ for each challenge.
Suppose the user doesn’t know the private key $x$. Then $g^c$ and $y ^r = g^{xr}$ are utterly indistinguishable as, due to the difficulty of DLP, a user won’t be able to recover $x$ from $y$, and hence $x r$ might as well be a random number just like $c$. The user is forced to guess the value of $b$ for each challenge, and since there are $128$ challenges, the probability of passing them all is $2^{-128}$, which is unreasonably small.
However, if the user does know the private key $x$, the user can simply test if $c_1^x = c_2$ to get $b$.
We don’t know the admin’s private key, so we’re doomed right?
$b$ isn’t random enough
Let’s look at how the challenges are generated:
import random
# ...
def generate_challenges(self, n=128):
challenges = []
answers = []
for _ in range(n):
b = round(random.random())
r = random.getrandbits(p.bit_length())
c = random.getrandbits(p.bit_length())
c1 = pow(g, r, p)
if b == 0:
c2 = pow(g, c, p)
elif b == 1:
c2 = pow(self.y, r, p)
answers.append(b)
challenges.append((int(c1), int(c2)))
self.answers = answers
return challenges
Recall that the user is to recover $b$ which should only be possible if they know their private key. However, $b$ is generated via Python’s random
module, which uses a Pseudo-random number generator (PRNG) known as MT19937.
MT19937 initialises an internal state $u$ of $624$ 32-bit integers, and at each step, it generates a “random” 32-bit integer. Note that b = round(random.random())
is just a fancy way of selecting a bit from one of the 32-bit outputs. Taking the $624$ 32-bit integer as a vector of $624 * 32 = 19938$ bits, i.e., $u \in F_2^{19938}$ where $F_2 = \mathbb{Z} / 2\mathbb{Z}$ is the field of boolean, each 32-bit output $x_m \in F_2^{32}$ is a linear transform of $u$. I.e., for each bit of output (say bit $b_n$ for the $n$-th challenge), we can associate a known vector $v_n$ such that $v_n^T u = b_n$.
This means that for $N$ challenges generated by the server, we can represent the vector $\textbf{b} = (b_1 ; b_2 ; \cdots ; b_N)$ with a known linear transformation $V u = \textbf{b}$, where $V = (v_1 ; v_2 ; \cdots ; v_N)$.
This is huge! If we know the value of vector $\textbf{b}$ for a large enough $N$, we can recover $u$, and compute all subsequent $b_i = v_i^T u$. I.e., we can answer all subsequent challenges and log into any user without knowing their private key!
I found this before the hint came up btw, but as I didn’t join the discord, I didn’t see the announcement calling for feedback :(.
How do we get $\textbf{b}$? Since we can answer the challenges if we log into an account we created (since we know the private key), we can simply log in as many times as needed and save the answers for the challenges.
Furthermore, $V$ is a transform $F_2^{19938} \rightarrow F_2^{N}$. So, to solve for $u$ given the equation $V u = \textbf{b}$, we need $N \ge 19938$. Since we obtain $128$ values of $b_i$ each time we log in, we would need to log in $\lceil 19938/128 \rceil = 156$ times.
Small caveat, turns out, for large enough $N$, $V$ will always have a 32-bit (or 31? I forgot) kernel, which makes recovering the exact $u$ impossible. However, since for large enough $N$ the kernel remains the same, whichever $u$ that satisfies $V u = \textbf{b}$ for $N \ge 19938$ allows us to compute subsequent $b_i$. This was found experimentally and if anybody knows why (I suspect it’s the twisting) do let me know.
We can compute $V$ in a black-box manner. We go through each of the $19938$ basis vectors of $F_2^{19938}$ by setting only one bit (say the $i$-th) of $u$ and computing $\textbf{b}$ given such $u$. $\textbf{b}$ will be the $i$-th column vector of $V$.
s = random.getstate()
def login_sym(bidx, n=128):
new_s = (s[0], (*[int(0) if i != bidx//32 else int(1)<<int(bidx%32) for i in range(624)], s[1][-1]), s[2])
random.setstate(new_s)
ret = []
for _ in range(156):
for _ in range(n):
b = round(random.random())
r = random.getrandbits(p.bit_length())
c = random.getrandbits(p.bit_length())
ret.append(b)
return ret
# prng is our transform V that transforms the internal state u
# into the challenge answers
prng = np.zeros((624*32, 624*32), dtype=bool)
for bidx in range(624*32):
print(bidx, end="\r")
prng[:,bidx] = login_sym(bidx)
np.save(open("prng_transform.npy", "wb"), prng)
Once we have $\textbf{b}$, we can recover $u$ and generate the answers for the next challenge to log into any account:
prng = np.load(open("prng_transform.npy", "rb"))
prng_sage = matrix(GF(2), prng.astype(int))
# bits is our b we've gotten from logging in 156 times
sol = prng_sage.solve_right(vector(GF(2), bits), check=False)
bvrec = list(map(int, sol))
rec_s = (s[0], (*[reduce(lambda x,y: (x<<1)+y, bvrec[i*32:i*32+32][::-1]) for i in range(624)], s[1][-1]), s[2])
# Assert that we recovered u correctly
random.setstate(rec_s)
rec_bits = [b for _ in range(156) for b in login_local()]
assert all(x == y for x,y in zip(bits, rec_bits))
# Locally generate the answers for the 128 challenges during the next login
bb = login_local()
myans = " ".join(map(str, map(int, bb))).encode()
This step was for me the most frustrating step. I recovered $V$ very quickly but spent at least 2 hours trying to coax numpy
to do $F_2$ arithmetic. And then I gave up and loaded the matrix into sagemath. However, due to my potato computer, it takes over 20 minutes to load the matrix and after that sagemath becomes extremely unstable and crashes all the time. I ended up re-loading the matrix into sagemath at least 7 times.
Sending flag_haver a message from the admin account
Now that we can log into admin
, we have to send flag_haver
a message. Unfortunately, we would have to sign the message with admin
’s private key, which we still do not have.
def send(self, recipient_public_key, message):
key = secrets.randbelow(2**128)
key_enc = Encryptor(recipient_public_key).encrypt(key)
key_enc = key_enc[0].to_bytes(96, 'big') + key_enc[1].to_bytes(96, 'big')
ct = Cipher(key).encrypt(message)
sig = Signer(self.private_key).sign(ct)
return (key_enc + ct, sig)
def receive(self, sender_public_key, ciphertext, sig):
if len(ciphertext) < 192:
return False
key_enc, ct = ciphertext[:192], ciphertext[192:]
if not Verifier(sender_public_key).verify(ct, sig):
return False
key_enc0, key_enc1 = int.from_bytes(key_enc[:96], 'big'), int.from_bytes(key_enc[96:192], 'big')
key = Decryptor(self.private_key).decrypt((key_enc0, key_enc1))
message = Cipher(key).decrypt(ct)
return message
To send a message, we randomly generate a key
that gets encrypted with the recipient’s public key (we have this) into key_enc
. The message
gets encrypted with ct
and the ct
is signed with the sender’s private key (we don’t have this) to get sig
. key_enc
, ct
and sig
get sent.
Note that the recipient will require their private key to decrypt key_enc
to decrypt ct
into the message, and they will also verify that ct
is signed with the recipient’s private key. This should ensure that
- Only the recipient can read the message
- The recipient can verify that the message is indeed from the sender.
A key “weird” thing to note here is that the signature only verifies that ct
is committed to the sender’s private key. key_enc
isn’t. This means that while the recipient can verify that ct
came from the sender, they can’t guarantee that key_enc
is.
This means that we can:
- Take an existing encrypted and signed message sent by the
admin
to a user whose private key we know. In this case, it is the welcome message to the userMeowMeowMeow
. - Use the known private key to recover
key
. - Modify
key
which in turn will change the result of thect
’s decryption. - Re-encrypt
key
with our new recipient’s public key to getkey_enc
. - Re-use the
ct
andsig
from the message we got fromadmin
and only change thekey_enc
. - Send the modified message to the new recipient.
The new recipient will successfully decrypt key_enc
into our modified key due to step 4, the new recipient will think the message is indeed from admin
since we did not change ct
, and finally, the decrypted message will look different from what the admin
has sent to MeowMeowMeow
because we’ve modified key
.
Now the question is whether we can change key
such that the decrypted message changes from Welcome MeowMeowMeow!
to Send flag to {username}
to be sent to flag_haver
, where username
can be anything matching [a-zA-Z0-9]{8,16}
as we can simply create an account with that username and receive the flag from flag_haver
. Note that the modified message has to be the same length, so username
should have length $8$. To do that we need to see how Cipher(key).encrypt(message)
works.
Attacking the Cipher
class Cipher:
def __init__(self, key):
self.n = 4
self.idx = self.n
self.state = [(key >> (32 * i)) & 0xffffffff for i in range(self.n)]
def next(self):
if self.idx == self.n:
for i in range(self.n):
x = self.state[i]
v = x >> 1
if x >> 31:
v ^= 0xa9b91cc3
if x & 1:
v ^= 0x38ab48ef
self.state[i] = v ^ self.state[(i + 3) % self.n]
self.idx = 0
v = self.state[self.idx]
x0, x1, x2, x3, x4 = (v >> 31) & 1, (v >> 24) & 1, (v >> 18) & 1, (v >> 14) & 1, v & 1
y = x0 + x1 + x2 + x3 + x4
self.idx += 1
return y & 1
def next_byte(self):
return int(''.join([str(self.next()) for _ in range(8)]), 2)
def xor(self, A, B):
return bytes([a ^ b for a, b in zip(A, B)])
def encrypt(self, message):
return self.xor(message, [self.next_byte() for _ in message])
def decrypt(self, ciphertext):
return self.xor(ciphertext, [self.next_byte() for _ in ciphertext])
We can see that Cipher
generates a stream by bytes according to the key
and XORs it with the message. An interesting thing to note is that the .next
method, and hence the .next_byte
method, computes their output by taking a linear combination of the bits of key
. This is easily seen by noting that the only operations used to update self.state
and return the next byte only involve XOR and bitshifts.
This is extremely similar to how MT19937 works, in that for a given message of $l$ bytes, we have an associated XOR stream $x \in F_2^{8 l}$ such that $c = m + x$ where $c$ is the ciphertext and $m$ is the message. We also have a 128-bit key $k \in F_2^{128}$, and we can derive a fixed transform $V: F_2^{128}\rightarrow F_2^{8 l}$ mapping $V k = x$.
For our use-case, we want to keep $c = m + x$ constant and compute a new key $k’$ such that $V k = x’$ and $c = m’ + x’$, where $x’$ is the new byte stream and $m’$ is our target ciphertext. Note that
- We know $k$ as this $k$ is from the welcome message and we can get $k$ by decrypting
key_enc
with our private key. - We know $m$ as
Welcome MeowMeowMeow!
. - We can control $m’$ and hence know $\delta_m$ as well.
Then since:
$$ \begin{aligned} c &= m + x = m’ + x’ = (m + \delta_m) + x’ \\ x’ &= x - \delta_m \\ V k’ &= x’ = x - \delta_m \end{aligned} $$
Then since we can compute $x - \delta_m$, we can easily compute $k’$… provided $k’$ actually exists.
It turns out that while our desired $k’$ has $128$ dimensions, $V$ only has rank $121$. Practically, this means we can always find a $k’$ such that $x’$ matches $x - \delta_m$ for the first $121$ bits, or ~$15$ bytes. I.e., a $k’$ such that $m$ matches $m’$ by ~$15$-bytes.
Unfortunately, our target message Send flag to {username}
has a minimum length of $21$ bytes since username
has the constraints [a-zA-Z0-9]{8,16}
, so we are $6$ bytes short. Put it another way, if we determine the first $2$ bytes of username
, say username=XX??????
, we are guaranteed to find $k’$ such that $m’$ will be Send flag to XX------
, where -
are just random bytes, and so XX------
isn’t guaranteed to satisfy the [a-zA-Z0-9]{8,16}
constraint.
However, we can bruteforce XX
such that XX------
does indeed satisfy the constraint!
We can expect bruteforce to work. We bruteforce $2$ characters in [a-zA-Z0-9]
which amounts to $62^2 = 3844$ possibilities. Now, the probability that the remaining $6$ characters does appear in [a-zA-Z0-9]
is $(62/256)^{6}$ which is roughly one in $4955$, which is fairly close to $3844$. In the event that no such XX
exists, we can try again with a different initial username (in the welcome message, which currently is MeowMeowMeow
).
It turns out the username elSvxrjZ
does work. In fact, it works regardless of what $k$ is, since if $\exists k’$ such that $V k’ = x’ = x - \delta_m = V k - \delta_m$, then for a new encountered key $j$, we have
$$ V j - \delta_m = V k - \delta_m - V (j - k) = V k’ - V (j - k) = V (\delta_k + j) $$
So, we can set $j’ = \delta_k + j$ as our new key.
from itertools import product
import re
import numpy as np
import random
# Compute V
V = []
for bidx in range(128):
c = Cipher(1<<bidx)
V.append([c.next() for _ in range(8*21)])
V = matrix(GF(2), np.array(V).T)
msg = b'Welcome MeowMeowMeow!'
allowed = "1234567890qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM"
for x,y in product(allowed, repeat=2):
target = f'Send flag to {x+y}aaaaaa'.encode()
ct = Cipher(key).encrypt(msg)
bs = vector(GF(2), [*map(int, ''.join([format(m^^c, "08b") for m,c in zip(target, ct)]))])
kbits = V.solve_right(bs, check=False)
key_rec = int("".join(map(str, kbits))[::-1], 2)
ctt = Cipher(key_rec).encrypt(target)
tar = [x^^y for x,y in zip(ctt[15:], target[15:])]
nt = [x^^y for x,y in zip(tar, ct[15:])]
new_target = target[:15] + bytes(nt)
assert Cipher(key_rec).encrypt(new_target) == ct
try:
if re.fullmatch(r'[a-zA-Z0-9]{8,16}', new_target.decode()[13:]):
break
except: pass
print(new_target)
# b'Send flag to elSvxrjZ'
So, before we log into admin
, we just have to create the user elSvxrjZ
, log into admin
and send the spoofed message (Send flag to elSvxrjZ
) to flag_haver
. flag_haver
will send the flag to elSvxrjZ
and we can log into elSvxrjZ
and decrypt the flag with elSvxrjZ
’s private key.
Summary of attack
- Create an account with the username
elSvxrjZ
and save the private key. - Create an account with the username
MeowMeowMeow
and save the private key. - Log into
MeowMeowMeow
156 times to get enough challenge answers to recover the random generator’s internal state. - Save the welcome message the admin sent
MeowMeowMeow
. - Locally generate the answers for the next login challenge to log into the
admin
without knowing theadmin
’s private key. - Use the saved welcome message for
MeowMeowMeow
to fake a signed message fromadmin
that tellsflag_haver
to send the flag toelSvxrjZ
.flag_haver
is prompted to send the flag to usernameelSvcrjZ
.
- Log into
elSvxrjZ
and useelSvxrjZ
’s private key to decrypt the message fromflag_haver
, which contains the flag.
Flag: DUCTF{wait_its_all_linear_algebra?...always_has_been}