Improved (Provable) Algorithms for the Shortest Vector Problem via Bounded Distance Decoding

The most important computational problem on lattices is the Shortest Vector Problem (SVP). In this paper, we present new algorithms that improve the state-of-the-art for provable classical/quantum algorithms for SVP. We present the following results.
1. A new algorithm for SVP that provides a smooth tradeoff between time complexity and memory requirement. For any positive integer $4 \le q \le \sqrt{n}$, our algorithm takes $q^{13n+o(n)}$ time and requires $\mathrm{poly}(n) \cdot q^{16n/q^2}$ memory. This tradeoff, which ranges from enumeration ($q = \sqrt{n}$) to sieving ($q$ constant), is a consequence of a new time-memory tradeoff for Discrete Gaussian sampling above the smoothing parameter.
2. A quantum algorithm that runs in time $2^{0.9533n+o(n)}$ and requires $2^{0.5n+o(n)}$ classical memory and $\mathrm{poly}(n)$ qubits. This improves over the previously fastest classical (which is also the fastest quantum) algorithm due to [2] that has a time and space complexity $2^{n+o(n)}$.
3. A classical algorithm for SVP that runs in $2^{1.741n+o(n)}$ time and $2^{0.5n+o(n)}$ space. This improves over an algorithm of [15] that has the same space complexity.
The time complexities of our classical and quantum algorithms are expressed using a quantity related to the kissing number of a lattice. A known upper bound on this quantity is $2^{0.402n}$, but in practice for most lattices, it can be much smaller and even $2^{o(n)}$. In that case, our classical algorithm runs in time $2^{1.292n}$ and our quantum algorithm runs in time $2^{0.750n}$.

The most important computational problem on lattices is the Shortest Vector Problem (SVP). Given a basis for a lattice $\mathcal{L} \subseteq \mathbb{R}^n$, SVP asks us to compute a non-zero vector in $\mathcal{L}$ with the smallest Euclidean norm. Starting from the '80s, the use of approximate and exact solvers for SVP (and other lattice problems) gained prominence for their applications in algorithmic number theory [41], convex optimization [32,34,20], coding theory [17], and cryptanalysis [56,14,40]. The security of many cryptographic primitives is based on the worst-case hardness of (a decision variant of) approximate SVP to within polynomial factors [6,44,53,52,45,23,13], in the sense that any cryptanalytic attack on these cryptosystems that runs in time polynomial in the security parameter implies a polynomial-time algorithm to solve approximate SVP to within polynomial factors. Such cryptosystems have attracted a lot of research interest due to their conjectured resistance to quantum attacks.
SVP is a well-studied computational problem in both its exact and approximate (decision) versions. By a randomized reduction, it is known to be NP-hard to approximate within any constant factor, and hard to approximate within a factor $n^{c/\log\log n}$ for some $c > 0$ under reasonable complexity-theoretic assumptions [42,35,27]. For an approximation factor $2^{O(n)}$, one can solve SVP in time polynomial in $n$ using the celebrated LLL lattice basis reduction algorithm [41]. In general, the fastest known algorithms for approximating SVP within factors polynomial in $n$ rely on (a variant of) the BKZ lattice basis reduction algorithm [54,55,7,21,25,3], which can be seen as a generalization of the LLL algorithm and gives an $r^{n/r}$ approximation in $2^{O(r)}\,\mathrm{poly}(n)$ time. All these algorithms internally use an algorithm for solving (near) exact SVP in lower-dimensional lattices. Therefore, finding faster algorithms to solve SVP is critical to choosing security parameters of cryptographic primitives.
As one would expect from the hardness results above, all known algorithms for solving exact SVP, including the ones we present here, require at least exponential time. In fact, the fastest known algorithms also require exponential space. There has been some recent evidence [4] showing that one cannot hope to get a $2^{o(n)}$-time algorithm for SVP if one believes in complexity-theoretic conjectures such as the (Gap) Exponential Time Hypothesis. Most of the known algorithms for SVP can be broadly classified into two classes: (i) the algorithms that require memory polynomial in $n$ but run in time $n^{O(n)}$ and (ii) the algorithms that require memory $2^{O(n)}$ and run in time $2^{O(n)}$.
The first class, initiated by Kannan [34,28,26,22,48], combines basis reduction with exhaustive enumeration inside Euclidean balls. While enumerating vectors requires $2^{O(n \log n)}$ time, it is much more space-efficient than other kinds of algorithms for exact SVP.
Another class of algorithms, and currently the fastest, is based on sieving. First developed by Ajtai, Kumar, and Sivakumar [7], they generate many lattice vectors and then divide-and-sieve to create shorter and shorter vectors iteratively. A sequence of improvements [51,49,46,50,2,5] has led to a $2^{n+o(n)}$ time and space algorithm by sieving the lattice vectors and carefully controlling the distribution of output, thereby outputting a set of lattice vectors that contains the shortest vector with overwhelming probability.
An alternative approach using the Voronoi cell of the lattice was proposed by Micciancio and Voulgaris [47] and gives a deterministic $2^{2n+o(n)}$-time and $2^{n+o(n)}$-space algorithm for SVP (and many other lattice problems).

There are variants [49,46,39,11] of the above-mentioned sieving algorithms that, under some heuristic assumptions, have an asymptotically smaller (but still $2^{\Theta(n)}$) time and space complexity than their provable counterparts.

Algorithms giving a time/space tradeoff
Even though sieving algorithms are asymptotically the fastest known algorithms for SVP, the memory requirement, in high dimension, has historically been a limiting factor to run these algorithms. Some recent works [18,8] have shown how to use new tricks to make it possible to use sieving on high-dimensional lattices in practice and benefit from their efficient running time [57].
Nevertheless, it would be ideal, and has been a long-standing open question, to obtain an algorithm that achieves the "best of both worlds", i.e., an algorithm that runs in time $2^{O(n)}$ and requires memory polynomial in $n$. In the absence of such an algorithm, it is desirable to have a smooth tradeoff between time and memory requirement that interpolates between the current best sieving algorithms and the current best enumeration algorithms.
To this end, Bai, Laarhoven, and Stehlé [10] proposed the tuple sieving algorithm, providing such a tradeoff based on heuristic assumptions similar in nature to prior sieving algorithms. They conjectured a running time $k^{n+o(n)}$ and space complexity $k^{n/k+o(n)}$. One can vary the parameter $k$ to obtain a smooth time/space tradeoff. Nevertheless, it is still desirable to obtain a provable variant of this algorithm that does not rely on any heuristics. The complexity of this algorithm was later proven, under the same heuristic assumptions [29], but only for constant $k$, therefore leaving the subexponential memory regime open.
Kirchner and Fouque [36] attempted to do this. They claim an algorithm for solving SVP in time $q^{\Theta(n)}$ and in space $q^{\Theta(n/q)}$ for any positive integer $q > 1$. Unfortunately, their analysis falls short of supporting their claimed result, and the correctness of the algorithm is not clear. We refer the reader to the full version of the paper for more details.
In addition to the above, Chen, Chung, and Lai [15] propose a variant of the algorithm based on Discrete Gaussian sampling in [2]. Their algorithm runs in time $2^{2.05n+o(n)}$ and the memory requirement is $2^{0.5n+o(n)}$. The quantum variant of their algorithm runs in time $2^{1.2553n+o(n)}$ and has the same space complexity. Their algorithm has the best space complexity among known provably correct algorithms that run in time $2^{O(n)}$.
A number of works have also investigated the potential quantum speedups for lattice algorithms, and SVP in particular. A similar landscape to the classical one exists, although the quantum memory model has its importance. While quantum enumeration algorithms only require qubits [9], sieving algorithms require more powerful QRAMs [39,37].

Our results
We first present a new algorithm for SVP that provides a smooth tradeoff between the time complexity and memory requirement of SVP without any heuristic assumptions. This algorithm is obtained by giving a new algorithm for sampling lattice vectors from the Discrete Gaussian distribution that runs in time $q^{O(n)}$ and requires $q^{O(n/q^2)}$ space.
▶ Theorem 1 (Time-space tradeoff for smooth discrete Gaussian, informal). There is an algorithm that takes as input a lattice $\mathcal{L} \subset \mathbb{R}^n$, a positive integer $q$, and a parameter $s$ above the smoothing parameter of $\mathcal{L}$, and outputs $q^{16n/q^2}$ samples from $D_{\mathcal{L},s}$ using $q^{13n+o(n)}$ time and $\mathrm{poly}(q) \cdot q^{16n/q^2}$ space.

STACS 2021

Using the standard reduction from Bounded Distance Decoding (BDD) with preprocessing (where an algorithm solving the problem is allowed unlimited preprocessing time on the lattice before the algorithm receives the target vector) to Discrete Gaussian Sampling (DGS) from [16] and a reduction from SVP to BDD given in [15], we obtain the following.
▶ Theorem 2 (Time-space tradeoff for SVP). Let $n \in \mathbb{N}$ and let $q \in [4, \sqrt{n}]$ be a positive integer. Let $\mathcal{L}$ be a lattice of rank $n$. There is a randomized algorithm that solves SVP in time $q^{13n+o(n)}$ and in space $\mathrm{poly}(n) \cdot q^{16n/q^2}$.
If we take $k = q^2$, then the time complexity of the previous SVP algorithm becomes $k^{6.5n+o(n)}$ and the space complexity $\mathrm{poly}(n) \cdot k^{8n/k}$. Our tradeoff is thus the same (up to a constant in the exponents) as what was claimed by Kirchner and Fouque [36] and proven in [29] under heuristic assumptions.
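To make the tradeoff concrete, the following sketch evaluates the exponents of the bounds $q^{13n}$ and $q^{16n/q^2}$ and checks that the substitution $k = q^2$ gives the same values. The function names are hypothetical, and the $o(n)$ term and the $\mathrm{poly}(n)$ factor are simply dropped, so the numbers are only indicative.

```python
import math


def tradeoff_exponents(n: int, q: int) -> tuple[float, float]:
    """Return (log2 time, log2 memory) for the bounds q^(13n) and q^(16n/q^2).

    The o(n) term and the poly(n) factor of Theorem 2 are ignored.
    """
    if q < 4 or q * q > n:
        raise ValueError("Theorem 2 assumes 4 <= q <= sqrt(n)")
    log_q = math.log2(q)
    return 13 * n * log_q, (16 * n / q**2) * log_q


def tradeoff_exponents_in_k(n: int, k: int) -> tuple[float, float]:
    """Same bounds written with k = q^2: k^(6.5n) time and k^(8n/k) memory."""
    log_k = math.log2(k)
    return 6.5 * n * log_k, (8 * n / k) * log_k


if __name__ == "__main__":
    n = 400
    for q in (4, 8, 16, 20):
        t, m = tradeoff_exponents(n, q)
        print(f"q={q:2d}: time ~ 2^{t:.0f}, memory ~ 2^{m:.0f}")
```

Increasing $q$ raises the time exponent only logarithmically while the memory exponent shrinks roughly like $(\log q)/q^2$.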
Our second result is a quantum algorithm for SVP that improves over the current fastest quantum algorithm for SVP [2]. (Notice that the algorithm in [2] is still the fastest classical algorithm for SVP.)
▶ Theorem 3 (Quantum Algorithm for SVP). There is a quantum algorithm that solves SVP in $2^{0.9533n+o(n)}$ time and classical $2^{0.5n+o(n)}$ space with an additional number of qubits polynomial in $n$.
Our third result is a classical algorithm for SVP that improves over the algorithm from [15] and results in the fastest classical algorithm that has a space complexity $2^{0.5n+o(n)}$.
The time complexities of our second and third results are obtained using a quantity related to the kissing number of a lattice. A known upper bound on this quantity is $2^{0.402n}$, but in practice for most lattices, it can be much smaller and even $2^{o(n)}$. In that case, our classical algorithm runs in time $2^{1.292n}$ and our quantum algorithm runs in time $2^{0.750n}$. See Section 5 of the full version of the paper for more details [1].
We summarize known provable classical and quantum algorithms in Table 1. Note that all the classical algorithms are also quantum algorithms, but they do not use any quantum power.
Table 1 Comparison of algorithms for the Shortest Vector Problem. [39] uses the quantum RAM model. [15] and our quantum algorithm need only polynomially many qubits and $2^{0.5n+o(n)}$ classical space.

[Table 1. Columns: Time, Space, Reference. Row groups: Classical Algorithms; Quantum Algorithms. Each group ends with the row "This paper".]

▶ Remark 5 (Magic constants). Most of the constants that appear in this paper were calculated by optimising the complexity with respect to a quantity related to the kissing number and then instantiating with $b = 0.402$, the best known upper bound on this quantity. The details of these calculations are available in the full version, Section 5.

Roadmap
In the following, we give a high-level overview of our proofs in Section 1.2. Section 2 contains some preliminaries on lattices. The proofs of the time-space tradeoff for Discrete Gaussian sampling above the smoothing parameter and the time-space tradeoff for SVP are given in Section 3. Our classical and quantum algorithms for solving SVP with space complexity $2^{0.5n+o(n)}$ are presented in Section 4. We also show how the time complexity of our algorithms varies with a quantity related to the kissing number in Section 5 of the full version of the paper [1].

Proof overview
We now include a high-level description of our proofs. Before describing our proof ideas, we emphasize that it was shown in [16,2] that given an algorithm for DGS a constant factor $c$ above the smoothing parameter, we can solve the problem of BDD where the target vector is within distance $\alpha\lambda_1(\mathcal{L})$ of the lattice, where the constant $\alpha < 0.5$ depends on the constant $c$. Additionally, using [15], one can enumerate all lattice points within distance $p\delta$ to a target $t$ by querying $p^n$ times a BDD oracle with decoding distance $\delta$ (or $p^{n/2}$ times if we are given a quantum BDD oracle). Thus, by choosing $p = \lceil \lambda_1(\mathcal{L})/\delta \rceil$ and $t = 0$, an algorithm for BDD immediately gives us an algorithm for SVP. Therefore, it suffices to give an algorithm for DGS above the smoothing parameter.
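As a small illustration of the query count, the sketch below computes $p = \lceil 1/\alpha \rceil$ for a decoding distance $\delta = \alpha\lambda_1(\mathcal{L})$ and the resulting exponent of $p^n$ (or $p^{n/2}$ in the quantum case). The function name is hypothetical, and exact rationals are used for $\alpha$ only to avoid rounding issues in the ceiling.

```python
import math
from fractions import Fraction


def bdd_calls_for_svp(alpha: Fraction, n: int, quantum: bool = False) -> tuple[int, float]:
    """Return (p, log2 of the number of BDD calls) for enumerating around t = 0.

    p is the smallest integer with p * alpha >= 1, so that the enumerated
    radius p * alpha * lambda_1 covers a shortest vector.
    """
    if not 0 < alpha < Fraction(1, 2):
        raise ValueError("alpha must lie in (0, 1/2)")
    p = math.ceil(1 / alpha)
    log_calls = n * math.log2(p)
    return p, (log_calls / 2 if quantum else log_calls)


if __name__ == "__main__":
    for alpha in (Fraction(2, 5), Fraction(1, 3), Fraction(1, 40)):
        p, c = bdd_calls_for_svp(alpha, n=100)
        print(f"alpha={alpha}: p={p}, about 2^{c:.1f} classical calls")
```

Since $\alpha < 1/2$, the value of $p$ is always at least 3, which is the reason the base 3 appears in the running times discussed later.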

Time-space tradeoff for DGS above smoothing
Recall that efficient algorithms are known for sampling from a discrete Gaussian with a large enough parameter (width) [38,24,12]. In [2], the authors begin by sampling $N = 2^{n+o(n)}$ vectors from the Discrete Gaussian distribution with (large) parameter $s$ and then look for pairs of vectors whose sum is in $2\mathcal{L}$, or equivalently pairs of vectors that lie in the same coset $c \in \mathcal{L}/2\mathcal{L}$. Since there are $2^n$ cosets, if we take $\Omega(2^n)$ samples from $D_{\mathcal{L},s}$, almost all of the resulting vectors (except at most $2^n$ vectors) will be paired, and the averages of the pairs are statistically close to independent samples from the distribution $D_{\mathcal{L},s/\sqrt{2}}$, provided that the parameter $s$ is sufficiently above the smoothing parameter.
To reduce the space complexity, we modify the algorithm by generating random samples and checking if the sum of $d$ of those samples is in $q\mathcal{L}$ for some integer $q$. Intuitively, if we start with two lists of vectors ($L_1$ and $L_2$) of size $q^{O(n/d)}$ from $D_{\mathcal{L},s}$, where $s$ is sufficiently above the smoothing parameter, each of these vectors is contained in any coset $q\mathcal{L} + c$, for $c \in \mathcal{L}/q\mathcal{L}$, with probability roughly $1/q^n$. We therefore expect that the coset of a uniformly random $d$-combination of vectors from $L_2$ is uniformly distributed in $\mathcal{L}/q\mathcal{L}$. The proof of this statement follows from the Leftover Hash Lemma [31]. We therefore expect that for any vector $v \in L_1$, with high probability, there is a set of $d$ vectors $x_1, \ldots, x_d$ in $L_2$ that sum to a vector in $q\mathcal{L} + v$, and hence $(x_1 + \cdots + x_d - v)/q \in \mathcal{L}$. A result of Micciancio and Peikert [43] shows that this vector is statistically close to a sample from the distribution $D_{\mathcal{L},\, s\sqrt{d+1}/q}$. We can find such a combination by trying all subsets of $d$ vectors. We would like to repeat this and find $q^{O(n/d)}$ (nearly) independent vectors in $q\mathcal{L}$. It is not immediately clear how to continue since, in order to guarantee independence, one would not want to reuse the already used vectors $x_1, \ldots, x_d$, and conditioned on the choice of these vectors, the distribution of the cosets containing the remaining vectors is disturbed and is no longer nearly uniform. By using a simple combinatorial argument, we show that even after removing any $1/\mathrm{poly}(d)$ fraction of vectors from the list $L_2$, the $d$-combinations of vectors in $L_2$ cover at least $c q^n$ different cosets. This is sufficient to output $q^{O(n/d)}$ independent vectors in $q\mathcal{L}$ with overwhelming probability.
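The pairing step attributed to [2] above can be sketched on the toy lattice $\mathbb{Z}^n$ (an assumption made only for illustration). The sampler truncates the Gaussian tail, and the combiner outputs the average $(x+y)/2$ of two vectors from the same coset of $2\mathbb{Z}^n$; both helper names are hypothetical, and this is not the full algorithm of [2].

```python
import math
import random
from collections import defaultdict


def make_sampler(s: float, n: int, rng: random.Random, tail: int = 12):
    """Return a function sampling from D_{Z^n, s}, truncated to |x_i| <= tail * s."""
    bound = math.ceil(tail * s)
    support = list(range(-bound, bound + 1))
    weights = [math.exp(-math.pi * x * x / (s * s)) for x in support]

    def sample():
        # rho_s factorizes over coordinates, so coordinates are independent.
        return tuple(rng.choices(support, weights=weights, k=n))

    return sample


def pair_and_average(vectors):
    """Pair vectors lying in the same coset of 2Z^n and output (x + y) / 2."""
    buckets = defaultdict(list)
    for v in vectors:
        buckets[tuple(c % 2 for c in v)].append(v)
    out = []
    for members in buckets.values():
        for x, y in zip(members[0::2], members[1::2]):
            out.append(tuple((a + b) // 2 for a, b in zip(x, y)))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    sample = make_sampler(s=8.0, n=3, rng=rng)
    inputs = [sample() for _ in range(200)]
    outputs = pair_and_average(inputs)
    print(len(inputs), "inputs ->", len(outputs), "outputs")
```

At most one vector per coset stays unpaired, which matches the "except at most $2^n$ vectors" remark; the bucket table is also where the $2^n$ memory cost comes from.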

A new algorithm for BDD with preprocessing leads to a faster quantum algorithm for SVP
This result improves the quantum algorithm from [15]. As mentioned above, a BDD oracle from discrete Gaussian sampling can have a decoding distance $\alpha\lambda_1(\mathcal{L})$ with $\alpha < 0.5$, and, using [15], one can enumerate all lattice points within distance $p\alpha\lambda_1(\mathcal{L})$ to a target $t$ by querying $p^n$ times a BDD oracle with decoding distance $\alpha\lambda_1(\mathcal{L})$ (or $p^{n/2}$ times if we are given a quantum BDD oracle). Hence, we need to take $p = 3$ so that $p\alpha\lambda_1(\mathcal{L}) \ge \lambda_1(\mathcal{L})$, and the search space is at least $3^n$, or $3^{n/2}$ quantum queries. Thus, towards optimizing the algorithm for SVP, one should aim to solve $\alpha$-BDD for $\alpha$ slightly larger than $1/3$, since a larger value of $\alpha$ will still lead to the same running time for SVP. Using known bounds, it can be shown that such an algorithm requires $2^{0.1605n+o(n)}$ independent (preprocessed) samples from $D_{\mathcal{L},\eta_\varepsilon(\mathcal{L})}$ for $\varepsilon = 2^{-cn}$ for some constant $c$.
In [2], the authors gave an algorithm that runs in time $2^{n/2+o(n)}$ and outputs $2^{n/2+o(n)}$ samples from $D_{\mathcal{L},s}$ for any $s \ge \sqrt{2}\,\eta_{0.5}(\mathcal{L})$ (i.e., a factor $\sqrt{2}$ above the smoothing parameter). In order to obtain samples at the smoothing parameter, we construct a dense lattice $\mathcal{L}'$ of smaller smoothing parameter than $\mathcal{L}$. We then sample $2^{0.5n+o(n)}$ vectors from $D_{\mathcal{L}',s}$ and reject those that are not in $\mathcal{L}$. Using the reduction from BDD to DGS, and by repeating this algorithm, we obtain a $2^{0.661n+o(n)}$-time and $2^{0.5n+o(n)}$-space algorithm to solve $1/3$-BDD with preprocessing, where each call to BDD requires $2^{0.161n+o(n)}$ time. Thus, the total time complexity of the classical algorithm is $3^n \cdot 2^{0.161n+o(n)}$, and that of the corresponding quantum algorithm is $3^{n/2} \cdot 2^{0.161n+o(n)}$.

Covering surface of a ball by spherical caps
As we mentioned above, one can enumerate all lattice points within a distance $p\delta$ to a target $t$ by querying $p^n$ times a BDD oracle with decoding distance $\delta$. Our algorithm for BDD is obtained by preparing samples from the discrete Gaussian distribution. However, note that the BDD oracle built from discrete Gaussian samples as shown in [16] is only guaranteed to succeed if the target vector is within a radius $\alpha\lambda_1(\mathcal{L})$ for $\alpha < 1/2$ (there is a tradeoff between $\alpha$ and the number of DGS samples needed), and therefore, if we choose $t$ to be $0$, as we do in the other algorithms mentioned above, then $p$ has to be at least 3 to ensure that the shortest vector is one of the vectors output by the enumeration algorithm. We observe here that if we choose a target $t$ to be a random vector "close to" but not at the origin, then the shortest vector will be within a radius $2\delta$ from the target $t$ with some probability $P$, and thus we can find the shortest vector by making $2^n/P$ calls to the BDD oracle. An appropriate choice of the target $t$ and the factor $\alpha$ gives an algorithm that runs in time $2^n \cdot 2^{0.74n+o(n)}$, which is faster than the algorithm (running in time $3^n \cdot 2^{0.161n+o(n)}$) mentioned above.
We note that the corresponding quantum algorithm runs in time $2^{n/2} \cdot 2^{0.74n+o(n)}$, which is significantly slower than the quantum algorithm mentioned above.
We also note that the running time of this algorithm crucially depends on a quantity related to the kissing number of a lattice. Since a tight bound on this quantity is not known, the actual running time of this algorithm might be smaller than that promised above. For a more elaborate discussion on this, see Section 5 of the full version [1].
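The running times quoted in this overview can be compared by converting every bound to the form $2^{cn}$. The short script below does this arithmetic with the rounded constants 0.161 and 0.74 from the text; the variable names are illustrative, and because the inputs are rounded, the last digit of each result may differ slightly from the constants stated in the theorems.

```python
import math

LOG2_3 = math.log2(3)
BDD_CALL = 0.161  # exponent of one 1/3-BDD call, as quoted in the overview
CAPS_FACTOR = 0.74  # second factor of 2^n * 2^(0.74n) for the spherical-cap variant

exponents = {
    "classical, enumerate 3^n targets": LOG2_3 + BDD_CALL,
    "quantum, 3^(n/2) queries": LOG2_3 / 2 + BDD_CALL,
    "classical, spherical caps": 1 + CAPS_FACTOR,
    "quantum, spherical caps": 0.5 + CAPS_FACTOR,
}

for name, c in exponents.items():
    print(f"{name}: 2^({c:.4f} n)")
```

The comparison shows why the two ideas are used for different results: the spherical-cap variant wins classically, while the plain $3^{n/2}$ enumeration wins in the quantum setting.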

Preliminaries
Let $\mathbb{N} = \{1, 2, \ldots\}$. We use bold letters $\mathbf{x}$ for vectors and denote a vector's coordinates with indices $x_i$. We use $\log$ to represent the logarithm base 2 and $\ln$ to represent the natural logarithm. Throughout the paper, $n$ will always be the dimension of the ambient space $\mathbb{R}^n$.

Lattices
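A lattice $\mathcal{L}$ is a discrete subgroup of $\mathbb{R}^n$, i.e., the set $\mathcal{L}(b_1, \ldots, b_m) = \{\sum_{i=1}^{m} x_i b_i : x_i \in \mathbb{Z}\}$ of all integer combinations of $m$ linearly independent vectors $b_1, \ldots, b_m \in \mathbb{R}^n$. The lattice $\mathcal{L}$ is said to be full-rank if $n = m$. We denote by $\lambda_1(\mathcal{L})$ the first minimum of $\mathcal{L}$, defined as the length of a shortest non-zero vector of $\mathcal{L}$.
For a rank $n$ lattice $\mathcal{L} \subset \mathbb{R}^n$, the dual lattice, denoted $\mathcal{L}^*$, is defined as the set of all points in $\mathrm{span}(\mathcal{L})$ that have integer inner products with all lattice points, $\mathcal{L}^* = \{ y \in \mathrm{span}(\mathcal{L}) : \forall x \in \mathcal{L},\ \langle x, y \rangle \in \mathbb{Z} \}$.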

Probability distributions
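Given two random variables $X$ and $Y$ on a set $E$, we denote by $d_{SD}$ the statistical distance between $X$ and $Y$, which is defined (for discrete $E$) by $d_{SD}(X, Y) = \frac{1}{2}\sum_{z \in E} |\Pr[X = z] - \Pr[Y = z]|$. We write "$X$ is $\varepsilon$-close to $Y$" to denote that the statistical distance between $X$ and $Y$ is at most $\varepsilon$. Given a finite set $E$, we denote by $U_E$ a uniform random variable on $E$, i.e., for all $x \in E$, $\Pr[U_E = x] = 1/|E|$.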
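For finite distributions the statistical distance is easy to compute directly. The sketch below assumes the usual definition, half the $\ell_1$ distance between the two probability mass functions, and represents each distribution as a dictionary; the helper names are hypothetical.

```python
from fractions import Fraction


def statistical_distance(p: dict, q: dict):
    """Half the L1 distance between two probability mass functions given as dicts."""
    support = set(p) | set(q)
    return sum(abs(p.get(z, 0) - q.get(z, 0)) for z in support) / 2


def uniform(elements):
    """Probability mass function of the uniform variable U_E on a finite set E."""
    elements = list(elements)
    return {x: Fraction(1, len(elements)) for x in elements}


if __name__ == "__main__":
    u = uniform(range(4))
    biased = {0: Fraction(1, 2), 1: Fraction(1, 2)}
    print(statistical_distance(u, biased))
```

Exact rationals are used here so that statements such as "$X$ is $\varepsilon$-close to $Y$" can be checked without floating-point noise.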

Discrete Gaussian Distribution
For any $s > 0$, define $\rho_s(x) = \exp(-\pi\|x\|^2/s^2)$ for all $x \in \mathbb{R}^n$. We write $\rho$ for $\rho_1$. For a discrete set $S$, we extend $\rho$ to sets by $\rho_s(S) = \sum_{x \in S} \rho_s(x)$. Given a lattice $\mathcal{L}$, the discrete Gaussian $D_{\mathcal{L},s}$ is the distribution over $\mathcal{L}$ such that the probability of a vector $y \in \mathcal{L}$ is proportional to $\rho_s(y)$: $\Pr_{X \sim D_{\mathcal{L},s}}[X = y] = \rho_s(y)/\rho_s(\mathcal{L})$.
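As a concrete instance of these definitions, the following sketch tabulates $D_{\mathbb{Z},s}$ for the one-dimensional lattice $\mathbb{Z}$. Truncating the support at `tail * s` is an assumption of the sketch (the neglected mass is astronomically small for the default value), and the function names are hypothetical.

```python
import math


def rho(x, s: float = 1.0) -> float:
    """Gaussian mass rho_s(x) = exp(-pi * ||x||^2 / s^2) of a vector x."""
    return math.exp(-math.pi * sum(c * c for c in x) / (s * s))


def discrete_gaussian_pmf(s: float, tail: int = 12) -> dict:
    """Probability mass function of D_{Z,s}, truncated to |y| <= tail * s."""
    bound = math.ceil(tail * s)
    mass = {y: rho((y,), s) for y in range(-bound, bound + 1)}
    total = sum(mass.values())  # approximates rho_s(Z)
    return {y: m / total for y, m in mass.items()}


if __name__ == "__main__":
    pmf = discrete_gaussian_pmf(2.0)
    for y in range(-3, 4):
        print(y, round(pmf[y], 5))
```

The ratio of two probabilities depends only on $\rho_s$, for example $\Pr[X=1]/\Pr[X=0] = e^{-\pi/s^2}$, which is a convenient sanity check.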

Lattice problems
The following problem plays a central role in this paper.
▶ Definition 6. For $\delta = \delta(n) \ge 0$, $\sigma$ a function that maps lattices to non-negative real numbers, and $m = m(n) \in \mathbb{N}$, $\delta$-$\mathrm{DGS}^m_\sigma$ (the Discrete Gaussian Sampling problem) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L} \subset \mathbb{R}^n$ and a parameter $s > \sigma(\mathcal{L})$. The goal is to output a sequence of $m$ vectors whose joint distribution is $\delta$-close to $m$ independent samples from $D_{\mathcal{L},s}$.
We omit the parameter $\delta$ if $\delta = 0$, and the parameter $m$ if $m = 1$. We stress that $\delta$ bounds the statistical distance between the joint distribution of the output vectors and $m$ independent samples from $D_{\mathcal{L},s}$. We consider the following lattice problems.

▶ Definition 7. The search problem SVP (Shortest Vector Problem) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L} \subset \mathbb{R}^n$. The goal is to output a vector $y \in \mathcal{L}$ with $\|y\| = \lambda_1(\mathcal{L})$.
▶ Definition 8. The search problem CVP (Closest Vector Problem) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L} \subset \mathbb{R}^n$ and a target vector $t \in \mathbb{R}^n$. The goal is to output a vector $y \in \mathcal{L}$ with $\|y - t\| = \mathrm{dist}(t, \mathcal{L})$.
▶ Definition 9. For $\alpha = \alpha(n) < 1/2$, the search problem $\alpha$-BDD (Bounded Distance Decoding) is defined as follows: The input is a basis $B$ for a lattice $\mathcal{L} \subset \mathbb{R}^n$ and a target vector $t \in \mathbb{R}^n$ with $\mathrm{dist}(t, \mathcal{L}) \le \alpha \cdot \lambda_1(\mathcal{L})$. The goal is to output a vector $y \in \mathcal{L}$ with $\|y - t\| = \mathrm{dist}(t, \mathcal{L})$.
Note that while our other problems become more difficult as the approximation factor $\gamma$ becomes smaller, $\alpha$-BDD becomes more difficult as $\alpha$ gets larger.
For convenience, when we discuss the running time of algorithms solving the above problems, we ignore polynomial factors in the bit-length of the individual input basis vectors (i.e., we consider only the dependence on the ambient dimension $n$).
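To make the SVP definition tangible, here is a toy brute-force search in very small dimension. It is not one of the algorithms of this paper: it simply tries all integer coefficient vectors in a box, and the box size `bound` is an assumption of the sketch (for a badly skewed basis, a shortest vector may need larger coefficients).

```python
import itertools
import math


def shortest_vector_bruteforce(basis, bound: int = 3):
    """Search integer combinations with coefficients in [-bound, bound].

    `bound` is an assumption of this sketch: the result is only guaranteed to be
    a shortest vector if some shortest vector has coefficients in that box.
    """
    dim = len(basis[0])
    best, best_norm2 = None, None
    for coeffs in itertools.product(range(-bound, bound + 1), repeat=len(basis)):
        if not any(coeffs):
            continue  # SVP asks for a non-zero vector
        v = tuple(sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(dim))
        norm2 = sum(x * x for x in v)
        if best_norm2 is None or norm2 < best_norm2:
            best, best_norm2 = v, norm2
    return best, math.sqrt(best_norm2)


if __name__ == "__main__":
    print(shortest_vector_bruteforce([(2, 0), (1, 2)]))
```

The search visits $(2\cdot\texttt{bound}+1)^n$ coefficient vectors, which already hints at why exact SVP algorithms need exponential time in the dimension.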
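For a lattice $\mathcal{L}$ and $\varepsilon > 0$, the smoothing parameter $\eta_\varepsilon(\mathcal{L})$ is the smallest $s$ such that $\rho_{1/s}(\mathcal{L}^* \setminus \{0\}) \le \varepsilon$. The smoothing parameter has the following well-known property.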
The following lemma gives a bound on the smoothing parameter.
▶ Theorem 13 ([43, Theorem 3.3]). Let $\mathcal{L}$ be an $n$-dimensional lattice, $z \in \mathbb{Z}^m$ a nonzero integer vector, […], and $\mathcal{L} + c_i$ arbitrary cosets of $\mathcal{L}$ for $i = 1, \ldots, m$. Let $y_i$ be independent vectors with distributions $D_{\mathcal{L}+c_i, s_i}$, respectively. Then the distribution of […]
We will need the following reduction from $\alpha$-BDD to DGS that was shown in [16].
▶ Theorem 14 ([16]). […] Then, there exists a randomized reduction from $\mathrm{CVP}^{\phi}$ to $0.5$-$\mathrm{DGS}^m_{\eta_\varepsilon}$, where $m = O\!\left(\frac{n \log(1/\varepsilon)}{\sqrt{\varepsilon}}\right)$ and $\mathrm{CVP}^{\phi}$ is the problem of solving CVP for target vectors that are guaranteed to be within a distance $\phi(\mathcal{L})$ of the lattice. The reduction preserves the dimension, makes a single call to the DGS oracle, and runs in time $m \cdot \mathrm{poly}(n)$. Furthermore, the reduction always reduces an instance of $\mathrm{CVP}^{\phi}$ on a lattice $\mathcal{L}$ to an instance of DGS on the dual lattice $\mathcal{L}^*$.
We need the following relation between the first minimum of a lattice and the smoothing parameter of its dual lattice. We will use this to compute the decoding distance of the BDD oracle.
▶ Theorem 16 ([15, Theorem 8]). Given a basis matrix $B \in \mathbb{R}^{n \times n}$ for a lattice $\mathcal{L}(B) \subset \mathbb{R}^n$, a target vector $t \in \mathbb{R}^n$, an $\alpha$-BDD oracle $\mathrm{BDD}_\alpha$ with $\alpha < 0.5$, and an integer scalar $p > 0$. Let […]. Then the set […] contains all lattice points within distance $p\alpha\lambda_1(\mathcal{L})$ to $t$.
We will need the following theorems to sample the DGS vectors with a large width.
▶ Theorem 17 ([2, Proposition 2.17]). For any $\varepsilon \le 0.99$, there is an algorithm that takes as input a lattice $\mathcal{L} \subset \mathbb{R}^n$, $M \in \mathbb{Z}_{>0}$ (the desired number of output vectors), and $s > 2^{n \log\log n/\log n} \cdot \eta_\varepsilon(\mathcal{L})$, and outputs $M$ independent samples from $D_{\mathcal{L},s}$ in time $M \cdot \mathrm{poly}(n)$.
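The smoothing parameter that appears in Theorem 17 can be evaluated numerically for the simplest lattice $\mathbb{Z}$, which is its own dual. The sketch assumes the standard definition ($\eta_\varepsilon(\mathcal{L})$ is the smallest $s$ with $\rho_{1/s}(\mathcal{L}^* \setminus \{0\}) \le \varepsilon$) and finds it by bisection; the function names, the series truncation, and the restriction to $0 < \varepsilon < 1$ are choices of this sketch.

```python
import math


def dual_mass(s: float, terms: int = 50) -> float:
    """rho_{1/s}(Z minus {0}) = 2 * sum_{k >= 1} exp(-pi * (k*s)^2), truncated."""
    return 2 * sum(math.exp(-math.pi * (k * s) ** 2) for k in range(1, terms + 1))


def smoothing_parameter_z(eps: float, iterations: int = 100) -> float:
    """Approximate eta_eps(Z) by bisection on the decreasing function dual_mass."""
    if not 0 < eps < 1:
        raise ValueError("this sketch assumes 0 < eps < 1")
    lo, hi = 0.0, 1.0
    while dual_mass(hi) > eps:
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if dual_mass(mid) <= eps:
            hi = mid  # mid is already smooth enough
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    for log_eps in (1, 10, 20, 40):
        print(f"eps=2^-{log_eps}: eta ~ {smoothing_parameter_z(2.0 ** -log_eps):.4f}")
```

For small $\varepsilon$ the first term of the series dominates, so $\eta_\varepsilon(\mathbb{Z}) \approx \sqrt{\ln(2/\varepsilon)/\pi}$: the smoothing parameter grows only like the square root of $\log(1/\varepsilon)$.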

Probability
We need the following lemma on the distribution of vector inner products, which directly follows from the Leftover Hash Lemma [31].
▶ Lemma 20. Let $G$ be a finite abelian group, and let $f$ be a positive integer. […] Let $X, Y$ be independent and uniformly random variables on $G^f$ and $\mathcal{Y}$, respectively. Then […], where $U_G$ is uniform in $G$ and independent of $X$.
We will also need the Chernoff-Hoeffding bound [30].
▶ Lemma 21. Let $X_1, \ldots, X_M$ be independent and identically distributed random Boolean variables of expectation $p$. Then for $\varepsilon > 0$, $\Pr$ […]
For preliminaries on quantum computing, see [

Algorithms with a time-memory tradeoff for lattice problems
In this section, we present a new algorithm for Discrete Gaussian sampling above the smoothing parameter.

Algorithm for Discrete Gaussian Sampling
We now present the main result of this section.
▶ Theorem 22. […] $n]$ be positive integers, and let $\varepsilon > 0$. Let $C$ be any positive integer. Let $\mathcal{L}$ be a lattice of rank $n$, and let $s \ge 2$ […] There is an algorithm that, given […] The algorithm runs in time $C \cdot (10e \cdot d)^{8d} \cdot q^{8n+n/d+o(n)}$ and requires memory $\mathrm{poly}(d) \cdot q^{n/d}$, excluding the input and output memory.
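Proof. We prove the result for $C = 1$, and the general result follows by repeating the algorithm. Let $\{x_1, \ldots, x_N\}$ be the $N$ input vectors and let $\{c_1, \ldots, c_N\}$ be the corresponding cosets in $\mathcal{L}/q\mathcal{L}$. The algorithm does the following:
1. Initialize two lists $L_1 = \{x_1, \ldots, x_{N/2}\}$ and $L_2 = \{x_{N/2+1}, \ldots, x_N\}$, each with $N/2$ input vectors, and let $Q = 0$.
2. Let $v$ be the first vector in $L_1$.
3. Find $8d$ vectors $x_{i_1}, \ldots, x_{i_{8d}}$ in $L_2$ (by trying all $8d$-tuples) such that $x_{i_1} + \cdots + x_{i_{8d}} - v \in q\mathcal{L}$. If no such vectors exist, go to Step (6).
4. Output the vector $(x_{i_1} + \cdots + x_{i_{8d}} - v)/q \in \mathcal{L}$.
5. Remove the vectors $x_{i_1}, \ldots, x_{i_{8d}}$ from $L_2$ and increment $Q$.
6. Remove the vector $v$ from $L_1$ and repeat Steps (2) to (5).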
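The combination procedure can be sketched on the toy lattice $\mathbb{Z}^n$ (an assumption for illustration only). The parameter `t` stands in for $8d$, the input is split into two halves, and for each vector $v$ of the first half the code searches for `t` unused vectors of the second half whose sum lies in $q\mathbb{Z}^n + v$. The function name and the returned triple are hypothetical; the sketch says nothing about the output distribution.

```python
import itertools


def combine_mod_q(vectors, q: int, t: int):
    """Toy combination step on L = Z^n; t plays the role of 8d.

    Returns a list of (output, v, used) with q * output + v == sum(used).
    """
    half = len(vectors) // 2
    list1, list2 = list(vectors[:half]), list(vectors[half:])
    results = []
    for v in list1:
        match = None
        for idx in itertools.combinations(range(len(list2)), t):
            diff = [sum(list2[i][j] for i in idx) - v[j] for j in range(len(v))]
            if all(c % q == 0 for c in diff):  # the sum lies in qZ^n + v
                match = (idx, diff)
                break
        if match is None:
            continue  # no suitable tuple: drop v and move on
        idx, diff = match
        used = tuple(list2[i] for i in idx)
        results.append((tuple(c // q for c in diff), v, used))
        # Each vector of the second list is used at most once.
        list2 = [x for i, x in enumerate(list2) if i not in idx]
    return results


if __name__ == "__main__":
    vecs = [(1, 1), (0, 1), (1, 0), (2, 1), (3, 3), (0, 4)]
    print(combine_mod_q(vecs, q=2, t=2))
```

Only the lists themselves are stored, while the exhaustive search over tuples costs time: this is the time-for-memory exchange that the analysis quantifies.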

The time complexity of the algorithm is […], and the memory requirement of the algorithm is immediate. We now show correctness. Let $\varepsilon' = \varepsilon^{2d}$ so that $s \ge \sqrt{2}\,\eta_{\varepsilon'}(q\mathcal{L})$ by Lemma 12. Without loss of generality, we can assume that the vectors $x_i$ for $i \in [N]$ are sampled by first sampling $c_i \in \mathcal{L}/q\mathcal{L}$ such that $\Pr[c_i = c] = \Pr[D_{\mathcal{L},s} \in q\mathcal{L} + c]$ and then sampling the vector $x_i$ according to $D_{q\mathcal{L}+c_i,s}$. Moreover, by Corollary 11, this distribution is $2\varepsilon' N$-close to sampling $c_i$ for $i \in [N]$ independently and uniformly from $\mathcal{L}/q\mathcal{L}$, and then sampling the vectors $x_i$ according to $D_{q\mathcal{L}+c_i,s}$. We now assume that the input is sampled from this distribution.
Without loss of generality, we can assume that the algorithm initially gets only the corresponding cosets as input, and the vectors $x_{i_j} \in q\mathcal{L} + c_{i_j}$ for $j \in [8d]$, and $v \in q\mathcal{L} + c$, are sampled from $D_{q\mathcal{L}+c_{i_j},s}$ and $D_{q\mathcal{L}+c,s}$ only before such a tuple is needed in Step 4 of the algorithm. Since any input vector is used only once in Step 4, these samples are independent of all prior steps. This implies, by Theorem 13, that the vector obtained in Step 4 of the algorithm is $\varepsilon'(8d+1)$-close to being distributed as $D_{\mathcal{L},\, s\sqrt{8d+1}/q}$.
It remains to show that our algorithm finds $q^{n/d}$ vectors (with high probability). Let $N' = N/2$ be an integer, $X$ be a random variable uniform over $(\mathcal{L}/q\mathcal{L})^{N'}$, and let $Y$ be a random variable independent of $X$ and uniform over vectors in $\{0,1\}^{N'}$ with Hamming weight $8d$. The number of such vectors is […] Let $U$ be a uniformly random coset of $\mathcal{L}/q\mathcal{L}$. By Lemma 20 and (3), we have […] for a large enough value of $n$. By Markov's inequality, with probability greater than $1 - 10 \cdot q^{-5n/2}$ over the choice of $x \leftarrow X$, we have that the statistical distance between $\langle x, Y \rangle$ and $U$ is less than $q^{-n}/10$, which implies for any $v \in \mathcal{L}/q\mathcal{L}$, […] We assume that the input vectors in list $L_2$ satisfy (4), introducing a statistical distance of at most $10 \cdot q^{-5n/2}$. Notice that after the algorithm has found $i$ vectors for any $i < q^{n/d}$, it has removed $8id$ vectors from $L_2$. We will show that for each vector from $L_1$ (which is uniformly sampled from $\mathcal{L}/q\mathcal{L}$), with constant probability we will find $8d$ vectors in Step (3). After $i < q^{n/d}$ output vectors have been found, there are $M = N' - 8id$ vectors remaining in the list $L_2$. There are $\binom{M}{8d}$ different $8d$-combinations possible with vectors remaining in $L_2$.
At the beginning of the algorithm, there are $\binom{N'}{8d}$ combinations, and hence by (4), each of the $q^n$ cosets appears at least $0.9\,q^{-n}\binom{N'}{8d}$ times. After $i < q^{n/d}$ output vectors have been found, there are only $\binom{M}{8d}$ combinations left, and $\binom{N'}{8d} - \binom{M}{8d}$ possible combinations have been removed. We say that a coset $c$ disappears if there is no set of $8d$ vectors in $L_2$ that add to $c$. In order for a coset to disappear, all of the at least $0.9\,q^{-n}\binom{N'}{8d}$ combinations from the initial list must be removed. Hence, the number of cosets that disappear is at most […] $< \frac{3/5}{0.9}\,q^n = \frac{2}{3}\,q^n$ distinct cosets by (5). Hence with probability at least $1/3$, we find $8d$ vectors $x_{i_1}, \ldots, x_{i_{8d}}$ from $L_2$ such that $x_{i_1} + \cdots + x_{i_{8d}} - v \in q\mathcal{L}$. By the Chernoff-Hoeffding bound, with probability greater than $1 - e^{-d^2 q^{n/d}}$, the algorithm finds at least $q^{n/d}$ vectors. In total, the statistical distance from the desired distribution is […]
▶ Corollary 23. Let $n \in \mathbb{N}$ and let $q \in [4, \sqrt{n}]$ be an integer, and let $\varepsilon = q^{-32n/q^2}$. Let $\mathcal{L}$ be a lattice of rank $n$, and let $s \ge \eta_\varepsilon(\mathcal{L})$. There is an algorithm that outputs a list of vectors that is $q^{-\Omega(n)}$-close to $q^{16n/q^2}$ independent vectors from $D_{\mathcal{L},s}$. The algorithm runs in time $q^{13n+o(n)}$ and requires memory $\mathrm{poly}(n) \cdot q^{16n/q^2}$.
Proof. Choose $d$ so that $16d - 16 < q^2 \le 16d$, which is possible when $q \ge 4$, and let $\alpha = q/\sqrt{8d+1}$ (this is the ratio by which we decrease the Gaussian width in Theorem 22), and note that $\alpha \ge 1.2$.
Let $p = \lceil 2\sqrt{d}\,q \rceil < q^2$ and let $k$ be the smallest integer such that $\alpha^k \cdot p \ge 2^{n \log\log n/\log n}$. Thus $k = O(n \log\log n/\log n)$. Let $g = \alpha^k p s \ge 2^{n \log\log n/\log n} \cdot \eta_\varepsilon(\mathcal{L})$. By Theorem 17, in time $N_0 \cdot \mathrm{poly}(n)$, we get $N_0 = (160d^2)^k q^{n/d}$ samples from $D_{\mathcal{L},g}$.
We now iterate $k$ times the algorithm from Theorem 22. Initially we have $N_0$ vectors. At the beginning of the $i$-th iteration for $i \le k-1$, we have $N_i := N_0 \cdot (160d^2)^{-i}$ vectors that are $\Delta_i$-close to being independently distributed from $D_{\mathcal{L},\alpha^{-i}g}$, where $\alpha^{-i}g \ge \alpha p \cdot \eta_\varepsilon(\mathcal{L})$. Hence, we can apply Theorem 22 and get $N_{i+1} = N_i/160d^2$ vectors that are $\Delta_{i+1}$-close to being independently distributed from $D_{\mathcal{L},\alpha^{-(i+1)}g}$, where $\Delta_{i+1} \le \Delta_i + 4\varepsilon^{2d} N_i + 11(160d^2)^{k-i} q^{-5n/2}$. At each iteration we had $N_i \ge 160d^2 q^{n/d}$ vectors, a necessary condition to apply Theorem 22. Therefore after $k$ iterations, we have at least $N_k = N_0/(160d^2)^k = q^{n/d}$ samples that are $\Delta_k$-close to being independently distributed from $D_{\mathcal{L},\alpha^{-k}g}$, where […]
Any vector distributed as $D_{\mathcal{L},ps}$ is in $p\mathcal{L}$ with probability at least $p^{-n}$. We repeat the algorithm $2p^n = O(q^{2n})$ times to obtain $p^n \cdot 2 \cdot q^{n/d}$ vectors that are $2p^n q^{-5n/2+o(n)} = q^{-n/2+o(n)}$ close to $2p^n \cdot q^{n/d}$ independent samples from $D_{\mathcal{L},ps}$. Of these samples obtained, we only keep vectors that fall in $p\mathcal{L}$ and divide them by $p$. Let $M = p^n \cdot 2 \cdot q^{n/d}$. By Chernoff-Hoeffding (Lemma 21) with $P = p^{-n}$ and $\delta = \frac{1}{2}$, the probability to obtain fewer than $(1-\delta)PM = q^{n/d}$ samples is at most […]. Furthermore, $d \le \frac{q^2+16}{16}$ and $q \mapsto \frac{\ln q}{16+q^2}$ is decreasing for $q \ge 4$, hence for $q \le \sqrt{n}$, […] $= \Omega(n^8)$.
Hence with probability greater than $1 - e^{-\frac{1}{10} q^{n/d}} = 1 - q^{-\Omega(n^8)}$, we get $q^{n/d}$ vectors from the distribution $D_{\mathcal{L},s}$. The statistical distance from the desired distribution is $q^{-\Omega(n^8)} + q^{-n/2+o(n)} \le q^{-n/2+o(n)}$. We repeat this $\frac{q^{16n/q^2}}{q^{n/d}}$ times, to get $q^{16n/q^2}$ vectors. The total statistical distance from the desired distribution is $\frac{q^{16n/q^2}}{q^{n/d}} \cdot q^{-n/2+o(n)} \le q^{-\Omega(n)}$. The total running time is bounded by […]
The memory usage is slightly more involved: we can think of the $k$ iterations as a pipeline with $k$ intermediate lists, and we observe that as soon as a list (at any level) has more than $160d^2 q^{16n/q^2}$ elements, we can apply Theorem 22 to produce $q^{16n/q^2}$ vectors at the next level. Hence, we can ensure that at any time, each level contains at most $160d^2 q^{16n/q^2}$ vectors, so in total we only need to store at most $k \cdot 160d^2 q^{16n/q^2} = \mathrm{poly}(n)\, q^{16n/q^2}$ vectors, to which we add the memory usage of the algorithm of Theorem 22, which is bounded by $\mathrm{poly}(n) \cdot q^{n/d} \le \mathrm{poly}(n) \cdot q^{16n/q^2}$. Finally, we run the filter ($p\mathcal{L}$) on the fly at the end of the $k$ iterations to avoid storing useless samples. ◀
This tradeoff works for any $q \ge 4$, and the running time can be bounded by $c_1^{n+o(n)} \cdot q^{c_2 n}$ for some constants $c_1$ and $c_2$ that we have not tried to optimize.
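The parameter choices at the start of the proof of Corollary 23 can be checked numerically. The sketch below picks $d$ with $16d - 16 < q^2 \le 16d$, computes $\alpha = q/\sqrt{8d+1}$ and $p = \lceil 2\sqrt{d}\,q \rceil$, and evaluates the number $k$ of iterations, taking logarithms in base 2 as fixed in the preliminaries. The function names are hypothetical and the floating-point ceiling in `iterations` is only indicative.

```python
import math


def corollary_parameters(q: int):
    """Return (d, alpha, p) chosen as in the proof, for an integer q >= 4."""
    if q < 4:
        raise ValueError("the proof assumes q >= 4")
    d = -(-q * q // 16)  # ceil(q^2 / 16), so 16d - 16 < q^2 <= 16d
    alpha = q / math.sqrt(8 * d + 1)
    p = math.isqrt(4 * d * q * q - 1) + 1  # exact ceil(2 * sqrt(d) * q)
    return d, alpha, p


def iterations(n: int, q: int) -> int:
    """Smallest k >= 0 with alpha^k * p >= 2^(n log log n / log n), logs base 2."""
    _, alpha, p = corollary_parameters(q)
    target = n * math.log2(math.log2(n)) / math.log2(n)
    return max(0, math.ceil((target - math.log2(p)) / math.log2(alpha)))


if __name__ == "__main__":
    for q in (4, 5, 6, 7, 16):
        d, alpha, p = corollary_parameters(q)
        print(f"q={q}: d={d}, alpha={alpha:.4f}, p={p}, k(n=400)={iterations(400, q)}")
```

The smallest ratio occurs at $q = 6$, where $d = 3$ and $\alpha = 6/5$ exactly, which is consistent with the bound $\alpha \ge 1.2$; since $q^2 \le 16d$, the list size $q^{n/d}$ never exceeds $q^{16n/q^2}$.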

Algorithms for BDD and SVP
▶ Theorem 24. Let $n \in \mathbb{N}$ and let $q \in [4, \sqrt{n}]$ be a positive integer. Let $L$ be a lattice of rank $n$. There exists an algorithm that creates a $0.1/q$-BDD oracle in time $q^{13n+o(n)}$ and space $\mathrm{poly}(n) \cdot q^{16n/q^2}$. Every call to this oracle takes time $\mathrm{poly}(n)\, q^{16n/q^2}$.
and $s = \eta_\varepsilon(L^*)$. From Corollary 23, there exists an algorithm that outputs $q^{16n/q^2}$ vectors whose distribution is statistically close to $D_{L^*,s}$ in time $q^{13n+o(n)}$ and space $\mathrm{poly}(n) \cdot q^{16n/q^2}$. By Theorem 14, there is a reduction from $\alpha$-BDD to $2\eta_\varepsilon(L^*)\lambda_1(L)$. By repeating the algorithm from Corollary 23 $\mathrm{poly}(n)$ times, we get $m$ vectors from $D_{L^*,\eta_\varepsilon(L^*)}$. By Lemma 15, we get

Note that here we are using the fact that the reduction in Theorem 14 always reduces an instance on a lattice $L$ to an instance on the dual lattice $L^*$: this is why we generate samples from $D_{L^*,\eta_\varepsilon(L^*)}$ in the preprocessing phase, even before any call to the oracle is made. Finally, by Theorem 14, each call to the oracle takes time $m \cdot \mathrm{poly}(n) = O(q^{16n/q^2}\, \mathrm{poly}(n))$. ◀

▶ Theorem 25. Let $n \in \mathbb{N}$ and let $q \in [4, \sqrt{n}]$ be a positive integer. Let $L$ be a lattice of rank $n$. There is a randomized algorithm that solves SVP in time $q^{13n+o(n)}$ and in space $\mathrm{poly}(n) \cdot q^{16n/q^2}$.

Proof. By Theorem 24, we can construct a $0.1/q$-BDD oracle in time $q^{13n+o(n)}$ and in space $\mathrm{poly}(n) \cdot q^{16n/q^2}$. Each execution of the BDD oracle now takes $O(\mathrm{poly}(n)\, q^{16n/q^2})$ time. By Theorem 16, with $(10q)^n$ queries to the $0.1/q$-BDD oracle, we can find the shortest vector. The total time complexity is $q^{13n+o(n)} + \mathrm{poly}(n)\, q^{16n/q^2} \cdot (10q)^n = q^{13n+o(n)}$. ◀

▶ Remark 26. If we take $q = \sqrt{n}$, Theorem 25 gives an SVP algorithm that takes $n^{O(n)}$ time and $\mathrm{poly}(n)$ space. The constant in the exponent of the time complexity is worse than that of the best enumeration algorithms. When $q$ is a large enough constant, for any constant $\varepsilon > 0$, there exists a constant $C = C(\varepsilon) > 2$ such that there is a $2^{Cn}$ time and $2^{\varepsilon n}$ space algorithm for DGS and SVP. In particular, the time complexity of the algorithm in this regime is worse than that of the best sieving algorithms.
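To make the tradeoff of Theorem 25 concrete, the bounds can be rewritten as base-2 exponents per unit of $n$. The following is a rough sketch, not part of the source: `tradeoff_exponents` is a hypothetical helper, and it ignores all $o(n)$ and $\mathrm{poly}(n)$ factors. It also reports the exponent of the query phase $q^{16n/q^2} \cdot (10q)^n$ from the proof, which stays below the preprocessing exponent.

```python
import math


def tradeoff_exponents(q):
    """Return base-2 exponents per unit of n, ignoring o(n) and poly(n) factors."""
    time_exp = 13 * math.log2(q)                 # q^(13 n)             = 2^(time_exp * n)
    memory_exp = 16 * math.log2(q) / q**2        # q^(16 n / q^2)       = 2^(memory_exp * n)
    query_exp = memory_exp + math.log2(10 * q)   # q^(16n/q^2) * (10q)^n = 2^(query_exp * n)
    return time_exp, memory_exp, query_exp


if __name__ == "__main__":
    for q in (4, 8, 16):
        t, m, s = tradeoff_exponents(q)
        print(f"q={q:2d}  time 2^({t:.2f} n)  memory 2^({m:.3f} n)  query phase 2^({s:.2f} n)")
```

As $q$ grows, the memory exponent $16\log_2(q)/q^2$ shrinks while the time exponent $13\log_2 q$ grows, which is the enumeration-to-sieving range described in Remark 26.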

4 New space efficient algorithms for SVP

In this section, we present relatively space-efficient classical and quantum algorithms to find a shortest nonzero lattice vector. Our quantum algorithm is the first provable algorithm for exact-SVP that takes less than $O(2^n)$ time. Recall that there exists an algorithm [15] that, given a lattice $L$ and a target vector $t$, outputs all lattice vectors within distance $p\alpha\lambda_1(L)$ of $t$ by making $p^n$ calls to an $\alpha$-BDD oracle. We present a quantum algorithm for SVP that takes $2^{0.9533n+o(n)}$ time and $2^{0.5n+o(n)}$ space with $\mathrm{poly}(n)$ qubits. We also present a classical algorithm for SVP that takes $2^{1.741n+o(n)}$ time and $2^{0.5n+o(n)}$ space. The strategy followed by [15] is to choose $p = \lceil 1/\alpha \rceil$, take the target vector $t$ to be the origin, and sequentially compute the candidate vectors for SVP. There are two ways to reduce the time complexity: one can improve the BDD oracle or reduce the number of queries. We will show how to improve both aspects.

Quantum algorithm for SVP
In order to solve SVP by the method in [15], it is sufficient to use a BDD oracle with decoding coefficient $\alpha$ slightly greater than 1/3. In [15], the authors use a reduction from BDD to DGS by [16] and use the Gaussian sampler of [2] to obtain many samples with standard deviation equal to $\sqrt{2}\eta_{1/2}$. This allows them to construct a 0.391-BDD oracle, but each call to the BDD oracle uses many DGS samples. This is wasteful since we really only need a 1/3-BDD oracle. The reason it is so expensive is that, in the analysis, they need to find $\varepsilon$ such that $\eta_\varepsilon > \sqrt{2}\eta_{1/2}$ to apply the reduction, which requires taking $\varepsilon$ much smaller than would be strictly necessary to construct a 1/3-BDD oracle; this smaller $\varepsilon$ explains the bigger decoding radius.
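Under the choice $p = \lceil 1/\alpha \rceil$ with $p^n$ oracle calls recalled above, a larger decoding coefficient only helps when it lowers $p$. The sketch below (a hypothetical helper `queries_exponent`, using exact rationals to avoid rounding at $\alpha = 1/3$) shows that $\alpha = 0.391$ and $\alpha = 1/3$ both give $p = 3$, and that $\alpha = 0.1/q$ with $q = 4$ gives $p = 40 = 10q$, matching the $(10q)^n$ query count quoted for Theorem 25.

```python
from fractions import Fraction
from math import ceil, log2


def queries_exponent(alpha):
    """For an alpha-BDD oracle, return (p, log2(p)) with p = ceil(1/alpha).

    The enumeration then makes p^n = 2^(log2(p) * n) oracle calls.
    alpha may be a Fraction or a decimal string such as "0.391".
    """
    p = ceil(1 / Fraction(alpha))
    return p, log2(p)


if __name__ == "__main__":
    for alpha in ("0.391", Fraction(1, 3), Fraction(1, 40)):
        p, e = queries_exponent(alpha)
        print(f"alpha={alpha}: p={p}, calls = 2^({e:.4f} n)")
```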
We obtain a BDD oracle with decoding distance 1/3 by using the same reduction but making each call cheaper. This is achieved by building a sampler that directly samples at the smoothing parameter, hence avoiding the $\sqrt{2}$ factor and allowing us to take a bigger $\varepsilon$. In [2], it was shown how to construct a dense lattice $L'$ whose smoothing parameter $\eta(L')$ is $\sqrt{2}$ times smaller than that of the original lattice, and that contains all lattice points of the original lattice. Suppose that we first use such a dense lattice to construct a corresponding discrete Gaussian sampler with standard deviation equal to $s = \sqrt{2}\eta(L')$. We then do rejection sampling on the condition that the output is in the original lattice $L$. We thus have constructed a discrete Gaussian sampler of $L$ whose standard deviation is $\sqrt{2}\eta(L') = \eta(L)$. Nevertheless, $|L'/L|$ will be at least $2^{0.5n}$, which implies that this procedure needs at least $2^{0.5n}$ input vectors to produce an output vector. We use this idea to obtain the following lemma.
Proof. Let $a = \frac{n}{2} + 4$. We repeat the following until we output $m$ vectors. We use the algorithm in Lemma 19 to obtain a lattice $L' \supset L$ of index $2^a$. We then run the algorithm from Theorem 18 with input $(L', s)$ to obtain a list of vectors from $L'$. We output the vectors in this list that belong to $L$.
By Theorem 18, we obtain, in time and space $2^{(n/2)+o(n)}$, $M = 2^{n/2}$ vectors that are $2^{-\Omega(n^2)}$-close to $M$ vectors independently sampled from $D_{L',s}$. Also, by Lemma 19, with probability at least 1/2, we have $s \ge \eta_{1/3}(L) \ge \sqrt{2}\eta_{1/2}(L')$. From these $M$ vectors, we reject the vectors that are not in the lattice $L$. It is easy to see that the probability that a vector sampled from the distribution $D_{L',s}$ is in $L$ is at least $\rho_s(L)/\rho_s(L') \ge \frac{1}{2^a}$ using Lemma 10. Thus, the probability that we obtain at least one vector from $L$ (which is distributed as $D_{L,s}$) is at least $1 - (1 - 2^{-a})^{M} \ge 1 - e^{-M/2^a} = 1 - e^{-1/16}$.
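The constant success probability of one rejection round can be checked numerically. This is a small sketch under the proof's parameters $a = n/2 + 4$ and $M = 2^{n/2}$, assuming $n$ is even and that each of the $M$ samples independently lands in $L$ with probability exactly $2^{-a}$ (the proof only gives this as a lower bound); `success_probability` is a hypothetical helper name.

```python
import math


def success_probability(n):
    """Probability that at least one of M = 2^(n/2) samples lands in L,
    when each does so independently with probability 2^(-a), a = n/2 + 4.
    Assumes n is even. Uses log1p/expm1 to stay accurate for large n."""
    a = n // 2 + 4
    M = 2 ** (n // 2)
    log_miss_all = M * math.log1p(-(2.0 ** -a))   # log of (1 - 2^-a)^M
    return -math.expm1(log_miss_all)              # 1 - (1 - 2^-a)^M


if __name__ == "__main__":
    bound = -math.expm1(-1 / 16)                  # 1 - e^(-1/16)
    for n in (20, 100, 400):
        print(n, success_probability(n), bound)
```

Since $M/2^a = 1/16$ regardless of $n$, the computed probability stays essentially at $1 - e^{-1/16}$, a constant of about 0.06, which is why $O(m)$ repetitions suffice in expectation to collect $m$ vectors.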
This implies that after rejecting vectors, with constant probability we get at least one vector from $D_{L,s}$. Thus, the expected number of times we need to repeat the algorithm is $O(m)$ until we obtain vectors $y_1, \ldots, y_m$ whose distribution is statistically close to being independently distributed from $D_{L,s}$. The time and space complexity is clear from the algorithm. ◀

▶ Theorem 28. For any sufficiently large integer $n$, any integer $m > 0$, and a lattice $L \subset \mathbb{R}^n$, there exists an algorithm that creates a 1/3-BDD oracle in $2^{0.6608n+o(n)}$ time and $2^{0.5n+o(n)}$ space. Every call to this oracle takes $2^{0.1608n+o(n)}$ time and space.
Proof. See the full version [1, Theorem 30]; it is similar to Theorem 24 but uses Lemma 27. ◀

From [15], we can enumerate all vectors of length $p \cdot \frac{1}{3}\lambda_1(L)$ by making $p^n$ calls to a 1/3-BDD oracle. Although naively searching for the minimum in the set of vectors of length less than or equal to $p \cdot \frac{1}{3}\lambda_1(L)$ will find the origin with high probability, one can work around this issue by shifting the zero vector. Choosing an arbitrary nonzero lattice vector as the shift, we are guaranteed to obtain a vector of length at least $\lambda_1$ for $p \ge 3$. Hence by combining the 1/3-BDD oracle from Theorem 28 and the quantum minimum finding algorithm from [19,

1 Introduction

A lattice $L = L(b_1, \ldots, b_n) := \{\sum_{i=1}^{n} z_i b_i : z_i \in \mathbb{Z}\}$ is the set of all integer combinations of linearly independent vectors $b_1, \ldots, b_n \in \mathbb{R}^n$. We call $n$ the rank of the lattice and $(b_1, \ldots, b_n)$ a basis of the lattice.
For a lattice basis $B = (\vec{b}_1, \ldots, \vec{b}_n)$, we define the dual basis $B^* = (\vec{b}^*_1, \ldots, \vec{b}^*_n)$ to be the unique set of vectors in $\mathrm{span}(L)$ satisfying $\langle \vec{b}^*_i, \vec{b}_j \rangle = 1$ if $i = j$, and 0 otherwise. It is easy to show that $L^*$ is itself a rank $n$ lattice and $B^*$ is a basis of $L^*$. Given a basis $B = (\vec{b}_1, \ldots, \vec{b}_n)$, we denote $\|B\|_2 = \max_i \|\vec{b}_i\|$.
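The defining property of the dual basis can be checked on a small example. The sketch below is an illustration only: `dual_basis_2d` and `inner` are hypothetical helpers restricted to rank 2 in $\mathbb{R}^2$, where the dual vectors are the rows of the inverse of the matrix whose columns are $b_1, b_2$. Exact rationals are used so that the inner products come out as exactly 0 or 1.

```python
from fractions import Fraction


def dual_basis_2d(b1, b2):
    """Dual basis of a rank-2 basis (b1, b2) in R^2.

    With B = [b1 | b2] as columns, the dual vectors are the rows of B^{-1}.
    """
    (a, b), (c, d) = b1, b2
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("b1 and b2 must be linearly independent")
    d1 = (Fraction(d) / det, Fraction(-c) / det)
    d2 = (Fraction(-b) / det, Fraction(a) / det)
    return d1, d2


def inner(u, v):
    """Standard inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


if __name__ == "__main__":
    b1, b2 = (2, 0), (1, 3)
    d1, d2 = dual_basis_2d(b1, b2)
    # Gram-style table of <b*_i, b_j>; should be the identity matrix.
    print([[inner(di, bj) for bj in (b1, b2)] for di in (d1, d2)])
    # Squared version of ||B|| = max_i ||b_i||.
    print(max(inner(b, b) for b in (b1, b2)))
```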