Introduction
This note describes some connections between algorithms for decoding linear codes and for solving LPN problems. They are quite simple but many lectures on decoding skip them, so I decided to write it down.
Let's start with definitions.
LPN problem: given a matrix , , a vector and an error rate ,
find such that . Note: usually, LPN allows an arbitrary number of random queries, in this note I'll stick to a fixed number .
Syndrome decoding: given a parity check matrix , , a syndrome and the number of errors , find such that and .
The LPN problem can be interpreted as solving a system of linear equations minimizing errors, while syndrome decoding is about finding a "short" but exact solution (short in the Hamming weight metric). Although LPN traditionally implies the binary field (), all these ideas apply to any field.
By transposing the LPN problem into with , we get a usual notation for the problem of decoding a linear code with the generator matrix , for which syndrome decoding is an equivalent (dual) formulation. This post explores this connection with relation to decoding algorithms in the two formulations.
Prange vs PooledGauss
There is a basic algorithm for each of the two problems.
- PooledGauss: For LPN, the algorithm is considered folklore and was called PooledGauss and analyzed in details in
[EKM17]. The idea is to randomly select (out of ) independent equations (rows of ), hoping that these equations have no errors in the target solution. The equations can be sampled from a small prepared set of equations (the "pool"). Solving the corresponding system , where denotes the chosen rows, yields a full candidate for which it only remains to test the number of errors on the full system .
- Prange: For syndrome decoding, the classic algorithm is due to Prange
[Pra62]. The idea is to randomly select (out of ) columns of , hoping that the target solution's columns fall inside the selection. Then, solving yields a candidate solution (by filling with zeroes), the weight of which can be checked directly (here denotes the selected columns).
Remark on Prange's method: often in lectures Prange's algorithm is presented as inverting the selected columns and computing . But solving directly (instead of matrix inversion and multiplication) can be faster in practice by a small factor. On the other hand, inversion-based approach is more useful for further optimizations and can be amortized, so it's not important in the end.
These two problems and algorithms are in fact almost the same things, and can be viewed as duals of each other. To see that, let's rephrase the LPN problem in terms of usual linear code notation. Let . Then,
which is clearly the problem of decoding a corrupted codeword from the linear -code with the generator matrix . Now, a corresponding parity check matrix is determined by , so that for all . It has dimensions .
Let be the syndrome. Note that we can compute since we know and . As a result, we converted the original (pooled) LPN problem (low-weight ) into syndrome decoding (low-weight ).
To see the connection between the two algorithms, it is convenient to stack over to form a square matrix:
Figure: Generator matrix and parity check matrix of a linear code
Then, it becomes clear that a choice of rows of (columns of ) in the PooledGauss algorithm is in fact complementary to the choice of columns of in the Prange's algorithm:
Figure: Relation between square submatrices selected in PooledGauss and Prange's method
In this picture, the "*" symbols denote potential nonzero entries of the error vector . The PooledGauss algorithm suceeds when all the errors are outside of the chosen invertible equation matrix , while Prange's algorithm succeeds when all the errors are inside the chosen invertible parity check submatrix . Clearly, such successful choices are in bijection by .
This also shows that PooledGauss is faster than Prange's algorithm whenever is long (, as in the figure), and vice versa, because the respective matrix inversion steps cost and respectively. Again, better algorithms amortize the inversion costs and it becomes irrelevant.
Blum-Kalai-Wasserman (BKW) vs Wagner's generalized birthday
Another closely related methods are BKW
[BKW00] and Wagner's generalized birthday / -sum algorithm
[Wag02].
BKW algorithm for LPN
Similarly to the previous algorithms, the idea of BKW departs from the Gaussian elimination. But now, instead of starting with potentially error-free samples, we will use more samples and allow errors in them. The idea is to add as few samples together as possible to arrive at a chosen weight-1 sample. The resulting equation would be of the form where is a constant and is a combined error (equal to to the sum of errors of the added sampleds). The combined error will be worse than the original one, but now we essentially have a 1-variable LPN instance.
After collecting enough such equations for the same variable we can learn it by taking the most frequent value . The number of samples is polynomial in , where is the probability of .
How do we perform Gaussian elimination with small number of additions? The idea of BKW is to simply perform block-wise elimination: instead of eliminating 1 variable at a time, we can eliminate variables at time given that we have many collisions in values of these variables (meaning that we need quite more than samples). Then, each reduced vector would only take additions to make.
Remark: the large sample requirement means that BKW is only applicable to decoding -codes with large ratio. However, one can "inflate" the code by randomly adding pairs of samples, at the cost of increasing the error probability.
Remark: the last row addition can be saved by not reducing the last block and using fast Walsh-Hadamard transform (FWHT) to recover the solution in time .
Wagner's -sum problem
Wagner considered the following problem as a generalization of the birthday paradox.
-sum problem. Given lists , find such that .
Wagner proposes the following recursive approach (called -tree algorithm) to solve this problem. Merge lists in pairs by adding all possible pairs of elements from the two lists and filtering them in the first bits:
Then, repeat the procedure on the resulting lists to match bits, bits, etc. For a correct relation between and , this would result in a solution to the problem.
The resulting structure of list merges form a binary tree.
Figure: Wagner's -sum algorithm
(credits:
[Wag02])
Extension 1: it is easy to see that solving for any chosen constant vector is not harder: simply replace the list with and solve the zero-sum problem for lists .
Extension 2: often in applications all the lists are the same (). Here, the zero-sum problem becomes trovial when is even. But the case of non-zero target is still similarly hard as the original problem. Furthermore, it allows an optimization of the algorithm: instead of the tree-like structure depicted above, the zero-sum
Although not usually stated as such, the -sum problem (more precisely, the two extensions/simplifications described above) can be viewed as the syndrome decoding problem: the list corresponds to the columns of the parity check matrix , and we aim to find a weight- vector such that . However, the -tree algorithm targets the "dense" case, where the number of solutions is very large and it is sufficient to get any of them, while most decoding problems only have a unique solution. Therefore, the -tree algorithm is often not directly applicable, but is still used as a subprocedure in more advanced decoding algorithms.
The relation
It may be difficult to see that the first part of the BKW algorithm and the -tree algorithm are equivalent. The problem is that (block-wise) Gaussian elimination seems iterative in nature and hides the tree-like structure of additions. However, careful observation shows that the BKW algorithm simply performs the -tree algorithm with being the rows of the equation matrix and the target vector being a chosen unit vector, e.g. .
Note that Wagner writes that BKW is closely related to his -tree algorithm in
[Wag02].