
3.3. Learning with Errors Augmented with Auxiliary Data

In this section, we show how to inject further useful information into the error vectors of LWE instances without necessarily changing their distribution. We call this technique "message embedding" and formulate a modified LWE problem, namely the Augmented LWE (A-LWE) problem, whose error term is produced by this new approach. We show that certain instantiations of the A-LWE problem are as hard as the original LWE problem.

3.3.1. Message Embedding

We start by explaining the core functionality of our work, which leads to conceptually new cryptographic applications such as encryption schemes. In particular, we show how to generate vectors that encapsulate an arbitrary message while simultaneously following the discrete Gaussian distribution $D_{\mathbb{Z}^m,r}$. More specifically, Lemma 3.2 and Lemma 3.4 are used, which essentially state that a discrete Gaussian over the integers can be simulated by sampling a coset $\mathbf{b} \in \mathbb{Z}_p^{n'}$ uniformly at random for any preimage sampleable full-rank matrix $\mathbf{B} \in \mathbb{Z}_p^{n' \times m}$ and then invoking a discrete Gaussian sampler that outputs a vector from $\Lambda^\perp_{\mathbf{b}}(\mathbf{B}) = \mathbf{c} + \Lambda^\perp_p(\mathbf{B})$ for $\mathbf{B}\mathbf{c} \equiv \mathbf{b} \bmod p$. However, this requires knowledge of a suitable basis for $\Lambda^\perp_p(\mathbf{B})$. In fact, the random selection of the coset $\mathbf{b}$ can be made deterministic by means of a random oracle $H$ or a PRNG taking a random seed with enough entropy as input. The fact that XORing a message $\mathbf{m}$ to the output of a random function $F$ does not change the distribution allows us to hide the message within the error vector without changing its distribution. As a result, we obtain $\mathbf{e} \leftarrow_R D_{\Lambda^\perp_{\mathbf{b}}(\mathbf{B}),r}$, which is indistinguishable from $D_{\mathbb{Z}^m,r}$ for $\mathbf{b} = \mathsf{encode}(F(\mathrm{seed}) \oplus \mathbf{m})$, using a random seed and properly chosen parameters.

More formally, let the very simple operations $\mathsf{encode} : \{0,1\}^{n' \log p} \to \mathbb{Z}_p^{n'}$ and $\mathsf{decode} : \mathbb{Z}_p^{n'} \to \{0,1\}^{n' \log p}$ allow to switch bijectively between the bit and vector representations. The embedding approach is realized by use of any preimage sampleable full-rank matrix $\mathbf{B} \in \mathbb{Z}_p^{n' \times m}$. A first idea is to sample a preimage $\mathbf{x} \leftarrow_R D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$ with $\mathbf{v} = \mathsf{encode}(\mathbf{m})$ for $r \geq \eta_\epsilon(\Lambda^\perp_p(\mathbf{B}))$ and an arbitrary message $\mathbf{m} \in \{0,1\}^{n' \log p}$, such that $\mathbf{B} \cdot \mathbf{x} \bmod p = \mathsf{encode}(\mathbf{m})$ holds. Sampling from $D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{I}),r}$ for $\mathbf{B} = \mathbf{I}$ is performed very efficiently and can be reduced to samples from $D_{p\mathbb{Z}+v_i,r}$. However, since the target Gaussian distribution of many cryptographic schemes, such as the LWE encryption schemes, is required to have support $\mathbb{Z}^m$, we have to modify the coset selection to $\mathbf{m} \oplus \mathbf{r}$ for a randomly chosen vector $\mathbf{r} \leftarrow_R \{0,1\}^{n' \log p}$ prior to invoking the preimage sampler. Below, in Lemma 3.2, we show that with this setup we indeed obtain a sample $\mathbf{x}$ that is distributed just as $D_{\mathbb{Z}^m,r}$ with overwhelming probability. To illustrate this approach, let $\mathbf{e} \in \mathbb{Z}^m$ denote the error term. We split the error term $\mathbf{e} = (\mathbf{e}_1, \mathbf{e}_2) \in \mathbb{Z}^{m_1+m_2}$ into two subvectors, each serving a different purpose. The second part $\mathbf{e}_2$ is used for message embedding, whereas $\mathbf{e}_1$ provides enough entropy in order to sample a random vector $\mathbf{r}$. To this end, one has to find a proper trade-off for the choice of $m_1$ and $m_2$, since a too large value for $m_2$ implies low entropy of $\mathbf{e}_1$. A reasonably small lower bound is given by $m_1 \geq n$, since the discrete Gaussian vector $\mathbf{e}_1$ has min-entropy of at least $n-1$ bits as per [GPV08, Lemma 2.10].
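To make the componentwise case concrete, note that for $\mathbf{B} = \mathbf{I}$ the coset $\Lambda^\perp_{\mathbf{v}}(\mathbf{I})$ is simply $\mathbf{v} + p\mathbb{Z}^m$, so the preimage sampler decomposes into independent draws from $D_{p\mathbb{Z}+v_i,r}$. The following Python sketch illustrates this first idea; the toy parameters, the power-of-two choice of $p$, and all helper names are ours, and the naive enumeration-based sampler merely stands in for a proper discrete Gaussian sampler:

```python
import math
import random

p, logp = 16, 4   # toy modulus p = 2^logp (our choice)

def encode(bits):
    """Pack a bit string into a vector over Z_p (logp bits per entry)."""
    return [int(bits[i:i + logp], 2) for i in range(0, len(bits), logp)]

def decode(vec):
    """Inverse of encode: map a vector over Z_p back to a bit string."""
    return ''.join(format(x, '0{}b'.format(logp)) for x in vec)

def sample_coset_gauss(c, r, tail=12):
    """Sample from D_{pZ+c, r}: enumerate the coset points in a
    +-tail*r window and pick one with probability prop. to rho_r."""
    ks = range(int((-tail * r - c) // p), int((tail * r - c) // p) + 2)
    pts = [p * k + c for k in ks]
    wts = [math.exp(-math.pi * x * x / r ** 2) for x in pts]
    return random.choices(pts, weights=wts, k=1)[0]

# Embed a message: x follows a discrete Gaussian on the coset
# p*Z^m + encode(msg), so I*x mod p = encode(msg) recovers the message.
msg = '1010011100101101'                      # n' * logp = 16 bits
x = [sample_coset_gauss(vi, r=8.0) for vi in encode(msg)]
assert decode([xi % p for xi in x]) == msg
```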

The message embedding functionality comes at almost no cost, since it does not involve any complex procedures. One proceeds as follows. First, sample $\mathbf{e}_1 \leftarrow D_{\mathbb{Z}^{m_1},r}$ using an ordinary discrete Gaussian sampler, such as the novel one that we introduce in Chapter 5. Then, compute $\mathbf{u} = F(\mathbf{e}_1)$ for some random function $F : \{0,1\}^* \to \{0,1\}^{n' \log p}$, and finally sample a preimage $\mathbf{e}_2 \leftarrow_R D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$ for the syndrome $\mathbf{v} = \mathsf{encode}(\mathbf{m} \oplus \mathbf{u}) \in \mathbb{Z}_p^{n'}$ using a preimage sampleable matrix $\mathbf{B} \in \mathbb{Z}_p^{n' \times m_2}$. Following this approach, the message is recovered by computing $\mathbf{m} = F(\mathbf{e}_1) \oplus \mathsf{decode}(\mathbf{B} \cdot \mathbf{e}_2 \bmod p)$. In many cryptographic applications different random sources are available that can take over the role of $\mathbf{e}_1$, so that the complete bandwidth of $\mathbf{e}$ is exploited for data injection. In the following lemmas we prove that it is possible to simulate the discrete Gaussian distribution $D_{\mathbb{Z}^m,r}$ (statistically or computationally) by use of a preimage sampler for any full-rank matrix $\mathbf{B}$. For uniformly distributed error vectors, for which there also exist worst-case reductions [DMQ13, MP13], the discrete Gaussian step is omitted and the error vector is simply obtained via $\mathbf{e} = \mathsf{encode}(\mathbf{m} \oplus \mathbf{u}) \in \mathbb{Z}_p^{m_2}$, where $p$ denotes the interval width of its components.
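Put together, the procedure above amounts to a few lines of code. The sketch below runs the full embed/recover cycle for $\mathbf{B} = \mathbf{I}$ (so $n' = m_2$), with SHA-256 truncated to $m_2 \log p$ bits standing in for the random function $F$; all parameters and names are illustrative choices of ours, not fixed by the text:

```python
import hashlib
import math
import random

p, logp, m1, m2, r = 16, 4, 4, 4, 8.0   # toy parameters (our choice)

def dgauss_coset(c, q, tail=12):
    """Sample from D_{qZ+c, r} by enumeration (q = 1, c = 0: D_{Z,r})."""
    ks = range(int((-tail * r - c) // q), int((tail * r - c) // q) + 2)
    pts = [q * k + c for k in ks]
    wts = [math.exp(-math.pi * x * x / r ** 2) for x in pts]
    return random.choices(pts, weights=wts, k=1)[0]

def F(e1):
    """Random-function stand-in: hash e1 down to m2*logp bits."""
    h = hashlib.sha256(repr(e1).encode()).digest()
    return ''.join(format(b, '08b') for b in h)[:m2 * logp]

def xor(a, b):
    return ''.join('1' if x != y else '0' for x, y in zip(a, b))

def embed(msg):
    e1 = [dgauss_coset(0, 1) for _ in range(m1)]      # e1 ~ D_{Z^{m1},r}
    pad = xor(msg, F(e1))                             # m XOR u
    v = [int(pad[i:i + logp], 2)                      # encode(m XOR u)
         for i in range(0, len(pad), logp)]
    e2 = [dgauss_coset(vi, p) for vi in v]            # e2 ~ D_{Lambda_v(I),r}
    return e1 + e2

def recover(e):
    e1, e2 = e[:m1], e[m1:]
    bits = ''.join(format(x % p, '0{}b'.format(logp))  # decode(I*e2 mod p)
                   for x in e2)
    return xor(bits, F(e1))

msg = '1011000111001010'                              # m2 * logp = 16 bits
assert recover(embed(msg)) == msg
```

Note that this recovery step needs no secret: anyone who learns the error vector itself can extract $\mathbf{m}$, which is why in the A-LWE distribution below the function $F$ is additionally keyed with the secret $\mathbf{s}$.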

Lemma 3.1 ([MR04, Lemma 4.4]). Let $\Lambda$ be any $n$-dimensional lattice. Then for any $\epsilon \in (0,1)$, $s \geq \eta_\epsilon(\Lambda)$, and $\mathbf{c} \in \mathbb{R}^n$, we have
$$\rho_{s,\mathbf{c}}(\Lambda) \in \left[\tfrac{1-\epsilon}{1+\epsilon},\, 1\right] \cdot \rho_s(\Lambda).$$

Lemma 3.2 (Statistical). Let $\mathbf{B} \in \mathbb{Z}_p^{n \times m}$ be an arbitrary full-rank matrix and $\epsilon = \mathrm{negl}(n)$. The statistical distance $\Delta(D_{\mathbb{Z}^m,r}, D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r})$ for uniform $\mathbf{v} \leftarrow_R \mathbb{Z}_p^n$ and $r \geq \eta_\epsilon(\Lambda^\perp_p(\mathbf{B}))$ is negligible.

Proof. Consider the statistical distance between $D_{\mathbb{Z}^m,r}$ and $D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$, where $\mathbf{v} \in \mathbb{Z}_p^n$ is chosen uniformly at random. Since $\mathbf{B}$ is a full-rank matrix, we have $\mathbb{Z}^m = \dot{\bigcup}_{\mathbf{b} \in \mathbb{Z}_p^n} \Lambda^\perp_{\mathbf{b}}(\mathbf{B})$ and, by Lemma 3.1, $\rho_r(\mathbb{Z}^m) = \sum_{\mathbf{b} \in \mathbb{Z}_p^n} \rho_r(\Lambda^\perp_{\mathbf{b}}(\mathbf{B})) \in \left[\frac{1-\epsilon}{1+\epsilon},\, 1\right] \cdot p^n \cdot \rho_r(\Lambda^\perp_p(\mathbf{B}))$. In the latter distribution $D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$, the process of sampling $\mathbf{z} \in \mathbb{Z}^m$ can be reduced to the tasks of selecting the correct partition $\Lambda^\perp_{\mathbf{v}}(\mathbf{B})$ with probability $\frac{1}{p^n}$ and subsequently sampling $\mathbf{z}$ from $\Lambda^\perp_{\mathbf{v}}(\mathbf{B})$ with probability $\frac{\rho_r(\mathbf{z})}{\rho_r(\Lambda^\perp_{\mathbf{B}\mathbf{z}}(\mathbf{B}))}$. Following this, $D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$ outputs a sample $\mathbf{z}$ with probability $\Pr[X = \mathbf{z}] = \frac{1}{p^n} \cdot \frac{\rho_r(\mathbf{z})}{\rho_r(\Lambda^\perp_{\mathbf{B}\mathbf{z}}(\mathbf{B}))}$. Hence,

$$
\begin{aligned}
\Delta\!\left(D_{\mathbb{Z}^m,r},\, D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}\right)
&= \sum_{\mathbf{z} \in \mathbb{Z}^m} \left| \frac{\rho_r(\mathbf{z})}{\rho_r(\mathbb{Z}^m)} - \frac{1}{p^n} \cdot \frac{\rho_r(\mathbf{z})}{\rho_r(\Lambda^\perp_{\mathbf{B}\mathbf{z}}(\mathbf{B}))} \right| \\
&\overset{\text{Lemma 3.1}}{\in} \sum_{\mathbf{z} \in \mathbb{Z}^m} \left| \frac{\rho_r(\mathbf{z})}{\rho_r(\mathbb{Z}^m)} - \frac{\rho_r(\mathbf{z})}{p^n \cdot \left[\frac{1-\epsilon}{1+\epsilon},\, 1\right] \cdot \rho_r(\Lambda^\perp_p(\mathbf{B}))} \right| \\
&\overset{\text{Lemma 3.1}}{\in} \sum_{\mathbf{z} \in \mathbb{Z}^m} \left| \frac{\rho_r(\mathbf{z})}{\rho_r(\mathbb{Z}^m)} - \frac{\rho_r(\mathbf{z})}{\left[\frac{1-\epsilon}{1+\epsilon},\, \frac{1+\epsilon}{1-\epsilon}\right] \cdot \sum_{\mathbf{b} \in \mathbb{Z}_p^n} \rho_r(\Lambda^\perp_{\mathbf{b}}(\mathbf{B}))} \right| \\
&= \sum_{\mathbf{z} \in \mathbb{Z}^m} \left| \frac{\rho_r(\mathbf{z})}{\rho_r(\mathbb{Z}^m)} - \left[\tfrac{1-\epsilon}{1+\epsilon},\, \tfrac{1+\epsilon}{1-\epsilon}\right] \cdot \frac{\rho_r(\mathbf{z})}{\rho_r(\mathbb{Z}^m)} \right| \\
&\in \left[0,\, \tfrac{2\epsilon}{1-\epsilon}\right] \cdot \sum_{\mathbf{z} \in \mathbb{Z}^m} \frac{\rho_r(\mathbf{z})}{\rho_r(\mathbb{Z}^m)} \\
&\leq \frac{2\epsilon}{1-\epsilon},
\end{aligned}
$$

which is negligible for $\epsilon = \mathrm{negl}(n)$.
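Lemma 3.2 can also be checked empirically in one dimension: mixing a uniformly random coset $v \leftarrow_R \mathbb{Z}_p$ with a sample from $D_{p\mathbb{Z}+v,r}$ should be statistically close to sampling $D_{\mathbb{Z},r}$ directly. The following Monte Carlo sketch illustrates this; the toy parameters are our own choice, and the residual distance it prints reflects sampling noise rather than the bound of the lemma:

```python
import math
import random
from collections import Counter
from itertools import accumulate

p, r, N = 4, 6.0, 200_000   # toy parameters (our choice); r above eta(pZ)

def coset_table(c, q, tail=12):
    """Support points of D_{qZ+c, r} in a +-tail*r window, with
    cumulative rho_r weights for fast repeated sampling."""
    ks = range(int((-tail * r - c) // q), int((tail * r - c) // q) + 2)
    pts = [q * k + c for k in ks]
    cum = list(accumulate(math.exp(-math.pi * x * x / r ** 2) for x in pts))
    return pts, cum

tab = {(c, q): coset_table(c, q) for q in (1, p) for c in range(q)}

def dgauss_coset(c, q):
    pts, cum = tab[(c, q)]
    return random.choices(pts, cum_weights=cum, k=1)[0]

direct = Counter(dgauss_coset(0, 1) for _ in range(N))    # D_{Z,r}
mixed = Counter(dgauss_coset(random.randrange(p), p)      # uniform coset v,
                for _ in range(N))                        # then D_{pZ+v,r}

support = set(direct) | set(mixed)
sd = 0.5 * sum(abs(direct[x] - mixed[x]) / N for x in support)
print('empirical statistical distance:', round(sd, 3))    # ~0 up to MC noise
```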


Lemma 3.3. Let $X_1$ be a distribution that is indistinguishable from $X_2$, and let $M$ be an efficient non-uniform PPT algorithm. Then $M(X_1)$ is indistinguishable from $M(X_2)$.

Lemma 3.4 (Computational). Let $\mathbf{B} \in \mathbb{Z}_p^{n \times m}$ be an arbitrary full-rank matrix. If the distribution of $\mathbf{v} \in \mathbb{Z}_p^n$ is computationally indistinguishable from the uniform distribution over $\mathbb{Z}_p^n$, then $D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$ is computationally indistinguishable from $D_{\mathbb{Z}^m,r}$ for $r \geq \eta_\epsilon(\Lambda^\perp_p(\mathbf{B}))$.

Proof. Let $\mathbf{v}' \sim U(\mathbb{Z}_p^n)$ be a vector chosen uniformly at random. Towards a contradiction, assume that $\mathbf{e} \sim D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$ is distinguishable in polynomial time from $\mathbf{e}' \sim D_{\Lambda^\perp_{\mathbf{v}'}(\mathbf{B}),r}$, which by Lemma 3.2 is statistically close to $D_{\mathbb{Z}^m,r}$, for the given parameters and $\mathbf{v}$ chosen as above. Then $\mathbf{v}$ is computationally distinguishable from $\mathbf{v}'$ by Lemma 3.3 with $M(\mathbf{v}_i) = D_{\Lambda^\perp_{\mathbf{v}_i}(\mathbf{B}),r}$, a contradiction. Therefore, the distribution $D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),r}$ is computationally indistinguishable from $D_{\mathbb{Z}^m,r}$.

3.3.2. Augmented LWE - A Generic Approach

Based on the message embedding approach described above, we introduce an alternative LWE definition that extends the previous one in such a way that the error term is augmented with additional information. We show that the modified error distribution still coincides with $D_{\mathbb{Z}^m,r}$, which allows for a reduction from LWE to our new assumption. We start with a generalized description of the A-LWE distribution, where $F$ stands for a random function. Below, in Section 3.4 and Section 3.5, we describe how to instantiate $F$ in order to obtain a random oracle or standard model representation of the A-LWE problem, together with the related hardness statements.

In the following, we introduce the A-LWE distribution and the computational problems arising from this construction, in analogy to LWE.

Definition 3.5 (Augmented LWE Distribution). Let $n, n', m, m_1, m_2, k, q, p$ be integers with $m = m_1 + m_2$, where $\alpha q \geq \eta_\epsilon(\Lambda^\perp_p(\mathbf{B}))$. Let $F : \mathbb{Z}_q^n \times \mathbb{Z}^{m_1} \to \{0,1\}^{n' \log p}$ be a function. Let $\mathbf{B} \in \mathbb{Z}_p^{n' \times m_2}$ be a preimage sampleable full-rank matrix (such as $\mathbf{B} = \mathbf{I}$). For $\mathbf{s} \in \mathbb{Z}_q^n$, define the A-LWE distribution $L^{\text{A-LWE}}_{n,m_1,m_2,\alpha q}(\mathbf{m})$ with $\mathbf{m} \in \{0,1\}^{n' \log p}$ to be the distribution over $\mathbb{Z}_q^{n \times m} \times \mathbb{Z}_q^m$ obtained as follows:

• Sample $\mathbf{A} \leftarrow_R \mathbb{Z}_q^{n \times m}$ and $\mathbf{e}_1 \leftarrow_R D_{\mathbb{Z}^{m_1},\alpha q}$.

• Set $\mathbf{v} = \mathsf{encode}(F(\mathbf{s}, \mathbf{e}_1) \oplus \mathbf{m}) \in \mathbb{Z}_p^{n'}$.

• Sample $\mathbf{e}_2 \leftarrow_R D_{\Lambda^\perp_{\mathbf{v}}(\mathbf{B}),\alpha q}$.

• Return $(\mathbf{A}, \mathbf{b}^\top)$, where $\mathbf{b}^\top = \mathbf{s}^\top \mathbf{A} + \mathbf{e}^\top$ with $\mathbf{e} = (\mathbf{e}_1, \mathbf{e}_2)$.
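The sampling steps of Definition 3.5 translate almost line by line into code. The sketch below draws one A-LWE sample for the special case $\mathbf{B} = \mathbf{I}$ with $n' = m_2$, again with SHA-256 as a stand-in for $F$ (now keyed with $\mathbf{s}$); all concrete parameters are toy values of ours and far too small for security:

```python
import hashlib
import math
import random

n, m1, m2, q, p, logp = 8, 8, 4, 257, 16, 4   # toy parameters (our choice)
m, r = m1 + m2, 8.0                           # r plays the role of alpha*q

def dgauss_coset(c, mod, tail=12):
    """Sample from D_{mod*Z+c, r} by enumeration (mod = 1: D_{Z,r})."""
    ks = range(int((-tail * r - c) // mod), int((tail * r - c) // mod) + 2)
    pts = [mod * k + c for k in ks]
    wts = [math.exp(-math.pi * x * x / r ** 2) for x in pts]
    return random.choices(pts, weights=wts, k=1)[0]

def F(s, e1):
    """Stand-in for F: Z_q^n x Z^{m1} -> {0,1}^{n'*logp}."""
    h = hashlib.sha256(repr((s, e1)).encode()).digest()
    return ''.join(format(b, '08b') for b in h)[:m2 * logp]

def xor(a, b):
    return ''.join('1' if x != y else '0' for x, y in zip(a, b))

def alwe_sample(s, msg):
    A = [[random.randrange(q) for _ in range(m)] for _ in range(n)]
    e1 = [dgauss_coset(0, 1) for _ in range(m1)]     # e1 ~ D_{Z^{m1}, alpha*q}
    pad = xor(msg, F(s, e1))                         # F(s,e1) XOR m
    v = [int(pad[i:i + logp], 2)                     # v = encode(...)
         for i in range(0, len(pad), logp)]
    e2 = [dgauss_coset(vi, p) for vi in v]           # e2 ~ D_{Lambda_v(I), alpha*q}
    e = e1 + e2
    b = [(sum(s[i] * A[i][j] for i in range(n)) + e[j]) % q for j in range(m)]
    return A, b

s = [random.randrange(q) for _ in range(n)]
A, b = alwe_sample(s, '1100101001011110')            # one A-LWE sample
```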

Accordingly, we define the augmented LWE problem(s) as follows. As opposed to traditional LWE, augmented LWE blinds, in addition to the secret vector $\mathbf{s} \in \mathbb{Z}_q^n$, also some (auxiliary) data $\mathbf{m} \in \{0,1\}^{n' \log p}$. Thus, we have the additional assumption that the message $\mathbf{m}$ is hard to find given A-LWE samples. Note that the decision version requires that any polynomially bounded number of samples $(\mathbf{A}, \mathbf{b}^\top)$ from the A-LWE distribution is indistinguishable from uniform random samples in $\mathbb{Z}_q^{n \times m} \times \mathbb{Z}_q^m$. Its hardness implies that no information about $\mathbf{s}$ and $\mathbf{m}$ is leaked through A-LWE samples. In some scenarios, e.g., in the security notions of an encryption scheme, the adversary may even choose the message $\mathbf{m}$. Hence, we require in the corresponding problems that their hardness holds with respect to A-LWE distributions with adversarially chosen message(s) $\mathbf{m}$, except for the search problem of $\mathbf{m}$.

Definition 3.6 (Augmented Learning with Errors (A-LWE)).
Let $n, n', m_1, m_2, p, q$ be integers and let $\mathbf{B} \in \mathbb{Z}_p^{n' \times m_2}$ be a preimage sampleable full-rank matrix. Let P (placeholder) stand for the model underlying the respective setting, where P is replaced either by RO for a random oracle model instantiation or by S in case of the standard model variant.

The Decision Augmented Learning with Errors (decision A-LWE$^{\mathrm{P}}_{n,m_1,m_2,\alpha q}$) problem asks, upon input $\mathbf{m} \in \{0,1\}^{n' \log p}$, to distinguish in polynomial time (in $n$) between samples $(\mathbf{A}_i, \mathbf{b}_i^\top) \leftarrow_R L^{\text{A-LWE}}_{n,m_1,m_2,\alpha q}(\mathbf{m})$ and uniform random samples from $\mathbb{Z}_q^{n \times m} \times \mathbb{Z}_q^m$, for a secret $\mathbf{s} \leftarrow_R \mathbb{Z}_q^n$.

The Search-Secret Augmented Learning with Errors (search-s A-LWE$^{\mathrm{P}}_{n,m_1,m_2,\alpha q}$) problem asks, upon input $\mathbf{m} \in \{0,1\}^{n' \log p}$, to output in polynomial time (in $n$) the vector $\mathbf{s} \in \mathbb{Z}_q^n$ given polynomially many samples $(\mathbf{A}_i, \mathbf{b}_i) \leftarrow_R L^{\text{A-LWE}}_{n,m_1,m_2,\alpha q}(\mathbf{m})$ for a secret $\mathbf{s} \leftarrow_R \mathbb{Z}_q^n$.

The Search-Message Augmented Learning with Errors (search-m A-LWE$^{\mathrm{P}}_{n,m_1,m_2,\alpha q}$) problem asks to output in polynomial time (in $n$) the vector $\mathbf{m}$ given polynomially many A-LWE samples $(\mathbf{A}_i, \mathbf{b}_i)$ for a secret $\mathbf{s} \leftarrow_R \mathbb{Z}_q^n$ and $\mathbf{m} \in \{0,1\}^{n' \log p}$. We say that decision/search-s/search-m A-LWE$^{\mathrm{P}}_{n,m_1,m_2,\alpha q}$ is hard if all polynomial-time algorithms solve the decision/search-s/search-m A-LWE$^{\mathrm{P}}_{n,m_1,m_2,\alpha q}$ problem only with negligible probability.

We note that $\mathbf{B}$ can be specified to be the identity matrix $\mathbf{I} \in \mathbb{Z}_p^{m_2 \times m_2}$ for $n' = m_2$, which has some very nice properties, as we will point out in the next chapter. In the following sections, we show that if the function $F$ is instantiated by a random oracle, or by a PRNG in combination with a deterministic function, then the hardness of LWE is reducible to the hardness of A-LWE. To this end, we show that the LWE and A-LWE distributions are computationally indistinguishable, assuming that the corresponding search problem is hard and that the inputs to the function $F$ have sufficient entropy in each sample, given the previous samples.
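For intuition on why search-m hardness is the natural additional requirement, observe that a holder of $\mathbf{s}$ can strip the error term off $\mathbf{b}$ and invert the embedding, which is exactly how a decryptor would read the auxiliary data. A sketch of this recovery follows, assuming the definitions from the A-LWE sampling sketch after Definition 3.5 are in scope in the same session:

```python
def recover_message(s, A, b):
    """Recover m from one A-LWE sample using the secret s:
    e = b - s^T*A over Z_q, lifted to centered representatives,
    then m = F(s, e1) XOR decode(I*e2 mod p)."""
    e = [(b[j] - sum(s[i] * A[i][j] for i in range(n))) % q for j in range(m)]
    e = [x - q if x > q // 2 else x for x in e]   # exact for |e_j| < q/2
    e1, e2 = e[:m1], e[m1:]
    bits = ''.join(format(x % p, '0{}b'.format(logp)) for x in e2)
    return xor(bits, F(s, e1))

assert recover_message(s, A, b) == '1100101001011110'
```

Without $\mathbf{s}$, the pad $F(\mathbf{s}, \mathbf{e}_1)$ is unavailable, matching the claim above that hard decision A-LWE leaks neither $\mathbf{s}$ nor $\mathbf{m}$ through $(\mathbf{A}, \mathbf{b})$.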
