# corCTF 2023 - Oil-Spill (crypto)

In this challenge, we are dealing with an instance of Unbalanced Oil and Vinegar (UOV) cryptosystem, with some hints given about the secret key. The authors's writeup is comprehensive and describes UOV and the attack in details. I will focus here on some small improvements.

## UOV

A public-key in UOV (as in most multivariate-quadratic cryptosystems) is simply a multivariate quadratic map

$~{}~P: \mathbb{F}_q^n \to \mathbb{F}_q^m : x \mapsto (P_1(x), P_2(x), \ldots, P_m(x)).$

A signature is simply a preimage of the hash digest of the message (and some other values), that is,

$~{}~P(\text{signature}) = \text{Hash}(\text{message}).$

In UOV, there exists a secret trapdoor - a linear subspace $O \subseteq \mathbb{F}_q^n$ such that $P(o) = 0$ for all $o \in O$. This allows the trapdoor owner to sign messages efficiently.

## The challenge

In the challenge, we are given a leak about the trapdoor, consisting of two parts:
1. the vector $\hat{o}=o+g$ for a randomly sampled $o \in O$ and a **binary** vector g;
2. the output vector $y=P(g)$.

Clearly, we can at most hope to recover the sampled $o$ vector. Luckily, it is sufficient to break the instance completely.

```
from sage.all import *
from sock import Sock
q, n, m = 251, 106, 44
F = GF(q)
```

```
f = Sock("be.ax 31105")
f.read_line()
data = bytes.fromhex(f.read_line().strip().decode())
data = [F(x) for x in data]
off = 0
def getdata(num):
global off
ret = data[off:off+num]
off += num
return ret
PM = []
z = zero_matrix(F, m, (n - m))
for i in range(m):
P1 = matrix(F, (n - m), (n - m), getdata((n-m)**2))
P2 = matrix(F, (n - m), m, getdata((n-m)*m))
P3 = matrix(F, m, m, getdata(m**2))
P = block_matrix([ [P1, P2], [z, P3]])
PM.append(P)
```

```
# Hints:
# P(g)
y = vector(F, bytes.fromhex(f.read_line().strip().decode()))
# ^o = o + g
og = vector(F, bytes.fromhex(f.read_line().strip().decode()))
```

```
# signature challenge
chall = vector(F, bytes.fromhex(f.read_line().strip().decode()))
```

## Unboxing the hint

Each quadratic map $P_i$ can be expressed by an upper-triangular matrix $P_i$ in $\mathbb{F}_q^{n\times n}$, such that

$~~P_i(x) = x^{\top} P_i x.$

Furthermore,

$~~P_i(x+y) = x^{\top} P_i x ~~+~~ x^{\top} (P_i + P_i^{\top}) y ~~+~~ y^{\top} P_i y.$

In particular,

$~~P_i(\hat{o}) = P_i(o+g) = o^{\top} P_i o ~~+~~ o^{\top} (P_i + P_i^{\top}) g ~~+~~ g^{\top} P_i g,$

where $o^{\top} P_i o = P_i(o) = 0$ and $g^{\top} P_i g = P_i(g) = y_i$:

$~~P_i(\hat{o}) = o^{\top} (P_i + P_i^{\top}) g ~~+~~ y_i.$

This seems to multiply the two unknowns $o$ and $g$, but we know their sum $\hat{o} = o + g$. Therefore,

$~~P_i(\hat{o}) - y_i = \hat{o}^{\top} (P_i + P_i^{\top}) g,$

where the left-hand side is known and on the right we get a known linear map applied to the binary secret $g$. Writing as a matrix-vector equation,

$~~M \times g = v,$

where

$$M = \begin{bmatrix} \hat{o}^{\top} (P_1 + P_1^{\top})\\ \hat{o}^{\top} (P_2 + P_2^{\top})\\ \vdots\\ \hat{o}^{\top} (P_m + P_m^{\top})\\ \end{bmatrix} \in \mathbb{F}_q^{m\times n}, ~~~~ v = \begin{bmatrix} P_1(\hat{o}) - y_1\\ P_2(\hat{o}) - y_2\\ \vdots\\ P_m(\hat{o}) - y_i \end{bmatrix} \in \mathbb{F}_q^{m}.$$

## LLL

Since we have a linear equation with a "small" (binary) solution, we can use the ubiquitous LLL algorithm. The author considers the lattice

$~{}~\begin{bmatrix}Id_n & M^{\top}\\0 & q\cdot Id_m\\0 & v^{\top}\end{bmatrix}$

and hopes to find the short vector

$~{}~\begin{bmatrix}g & 0\end{bmatrix}.$

This requires to use the heavier BKZ reduction to ensure that the shorest vector is found. However, we can notice that the zero-part of the vector does not bear any uncertainty. We can eliminate it with the following trick and reduce the lattice's dimension.

- Find an arbitrary solution $\overline{g}$ to $M\times \overline{g} = v$ (linear algebra over $\mathbb{F}_q$).
- Find the right kernel of $M$, denote it by $R$ (linear algebra over $\mathbb{F}_q$).
- Since $M \times g = v$ for the target binary $g$, we know that $g$ can be expressed as $\overline{g} + r$ for some $r \in R$. We are now searching for a combination of basis vectors of $R$ that is close to $\overline{g}$, up to the binary vector $g$.

We can now use the Kannan's embedding technique to solve the resulting CVP. Even without using the embedding factor, we can search for short vectors in the lattice spanned by the rows of the following matrix over integers:

$$~{}~L = \begin{bmatrix} R \in \mathbb{Z}^{(n-m) \times n}\\ q \cdot Id_n \in \mathbb{Z}^{n \times n}\\ \overline{g}^{\top}\in \mathbb{Z}^{1 \times n} \end{bmatrix}$$

(the identity matrix multiplied by $q$ corresponds the modeling the reduction modulo $q$ over the integers). Clearly, $g$ is a short vector in the lattice, since it's binary. We can make it even more efficient by replacing $\overline{g}^{\top}$ with $\overline{g}^{\top} - (1/2,1/2,\ldots,1/2)$, essentially removing the bias towards the 0 values in the target vector (we know that 0 and 1 are equally probable, so that we replace them by -1/2 and 1/2 having the same amplitude). This is a well-known trick from the subset-sum problem.

With reduced dimension of $n$ and the 1/2 trick we can use LLL instead of BKZ (with slightly increased delta):

```
M = matrix(F, [og*(Pi+Pi.T) for Pi in PM])
v = vector(F, [og*Pi*og + yi for (Pi, yi) in zip(PM, y)])
g0 = M.solve_right(v)
R = M.right_kernel().matrix()
L = R.change_ring(ZZ).stack(251 * identity_matrix(106))
L = (L*2).stack(vector(ZZ, [int(a)*2-1 for a in g0])) # 1/2 trick
Lll = L.LLL(0.99999)
```

```
for row in Lll:
if row and -1 <= min(row) <= max(row) <= 1:
sol = row
print(row)
```

(1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1, -1, 1, -1, 1, -1, -1, -1, 1, 1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1, -1, -1, -1, 1, -1, -1, 1, 1, 1, -1, 1, 1, 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, -1, -1, 1, -1, -1, 1, 1, -1)

```
g = vector(F, [(a+1)//2 for a in sol])
g
```

(1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0)

```
o = og - g
o
```

(249, 3, 228, 154, 116, 4, 184, 238, 249, 137, 23, 154, 133, 88, 58, 44, 139, 27, 230, 245, 191, 57, 66, 38, 67, 235, 236, 144, 166, 202, 21, 145, 190, 201, 214, 114, 177, 170, 225, 138, 207, 124, 145, 162, 67, 147, 111, 227, 47, 32, 38, 169, 176, 191, 110, 170, 73, 68, 79, 85, 213, 77, 28, 121, 220, 9, 18, 211, 109, 107, 89, 97, 113, 88, 67, 80, 14, 38, 120, 72, 216, 106, 52, 138, 217, 191, 94, 34, 166, 109, 180, 170, 176, 216, 131, 28, 104, 215, 14, 16, 217, 221, 73, 8, 237, 77)

```
for Pi in PM:
assert o * Pi * o == 0
```

## Recover full $O$

Now that we have $o$, we can use an (un)luckily recently published paper and code (by Pierre Pébereau) for this problem.

```
def attack(G,v) :
"""
This function is the one that completes the attack.
Given G a set of m quadratic forms admitting a common totally isotropic subspace O
of dimension at least (n-m)/2, and v in O, find a basis of O as a whole.
"""
m = len(G)
J = matrix([v*g for g in G])
B = matrix(J.right_kernel().basis())
B2 = []
for g in G :
ghat = B*g*B.transpose() #restriction of G to the kernel of J
for b in ghat.kernel().basis() :
if len(B2) == 0 or b not in span(B2) :
B2.append(b)
if len(B2) == m :
break
B3 = matrix(B2)
C = B3*B
return C
O = attack([P + P.T for P in PM], o)
O
```

44 x 106 dense matrix over Finite Field of size 251

```
for o in O:
for P in PM:
assert o*P*o == 0
```

## Forge signature

The legitimate (since we know the trapdoor $O$) forging approach follows similar ideas. Fix random $c \in \mathbb{F}_q^{n}$ and observe that

$~~P_i(c + \Delta) = P_i(c) + P_i(\Delta) + c^{\top}P_i\Delta.$

If we choose $\Delta$ from $O$, we get

$~~P_i(c + \Delta) = P_i(c) + c^{\top}P_i\Delta$

which is a linear function in $\Delta$ (when $c$ is fixed). So that we can quickly check if $P_i(c + \Delta)$ equals the target challenge for any $\Delta$ in the subspace $O$. Due to large size of $O$, we have very high chances of finding such.

```
c = random_vector(F, n)
system = matrix(F, [O*(P + P.T)*c for P in PM])
target = vector(F, [t - c * P * c for t, P in zip(chall, PM)])
s = system.solve_right(target)
answer = c + s*O
assert [answer * P * answer for P in PM] == list(chall)
```

```
f.send_line("".join("%02x" % a for a in answer))
f.read_all()
```

b'corctf{https://www.youtube.com/watch?v=eoV7lw7YBG4 "just a splash of olive oil" instead of spilling oil lets drink it out of the bottle}\n'