In this chapter, we prove the following theorem, which is its main result.
Theorem 18. Denote by $Q^r_{\Sigma\Delta}$ a stable $r$th-order $\Sigma\Delta$ quantizer. Let $\Phi$ be an $m\times N$ partial random circulant matrix associated to a vector with independent $L$-subgaussian entries with mean $0$ and variance $1$. Suppose that
$$N \ge m \ge (C\eta)^{\frac{1}{1-2\alpha}}\, s \log^{\frac{2}{1-2\alpha}} N \log^{\frac{2}{1-2\alpha}} s,$$
for some $\eta > 1$ and $\alpha \in [0,1/2)$. With probability exceeding $1-e^{-\eta}$, the following holds:
For all $x\in\mathbb{R}^N$ with $\|\Phi x\|_\infty \le \mu < 1$ and all $e\in\mathbb{R}^m$ with $\|e\|_\infty \le \varepsilon < 1-\mu$, the estimate $\hat{x}$ obtained by solving (1.27) satisfies
$$\|\hat{x}-x\|_2 \le C_1 \Big(\frac{m}{\ell}\Big)^{-r+1/2}\delta + C_2\,\frac{\sigma_k(x)}{\sqrt{k}} + C_3\,\sqrt{\frac{m}{\ell}}\,\varepsilon.$$
Here $C, C_1, C_2, C_3$ are constants that only depend on $r$ and $L$.
Proof. Theorem 18 is immediately obtained from Theorem 7, which requires a bound on the restricted isometry constants of $P_\ell V^* R_\Omega C_\xi$, where $\ell = m(s/m)^\alpha$, and from Proposition 4 below, which provides the required bound.
Proposition 4. Consider the same set-up and assumptions as in Theorem 18; in particular, assume that $m \ge (C\eta)^{\frac{1}{1-2\alpha}} s \log^{\frac{2}{1-2\alpha}} N \log^{\frac{2}{1-2\alpha}} s$ for some $\eta > 1$ and $\alpha \in [0,1/2)$. Setting $\ell = m(s/m)^\alpha$, we have
$$\mathbb{P}\Big(\sup_{x}\Big|\big\|\tfrac{1}{\sqrt{\ell}} P_\ell V^* R_\Omega C_x \xi\big\|_2^2 - 1\Big| > \tfrac{1}{9}\Big) < e^{-\eta},$$
where the supremum is over all $s$-sparse vectors. In other words, with probability exceeding $1-e^{-\eta}$, the matrix $\tfrac{1}{\sqrt{\ell}} P_\ell V^* R_\Omega C_\xi$ satisfies the restricted isometry property of order $s$ with constant $1/9$.
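Before the proof, the kind of concentration asserted here can be observed numerically on a toy instance. The following sketch is a simplification, not the construction of the proposition: it omits the projection $P_\ell$ and the matrix $V^*$, and simply tests how $\|\Phi x\|_2$ concentrates around $\|x\|_2$ for a partial random circulant $\Phi$ and random sparse unit vectors; all dimensions are illustrative choices, not the regime of the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m, s = 256, 128, 5

# Circulant matrix generated by a standard Gaussian (hence subgaussian,
# mean 0, variance 1) vector xi: row i is xi shifted cyclically by i.
xi = rng.standard_normal(N)
C = np.stack([np.roll(xi, i) for i in range(N)])

# Partial random circulant matrix: m rows drawn without replacement.
Omega = rng.choice(N, size=m, replace=False)
Phi = C[Omega] / np.sqrt(m)

# For random s-sparse unit vectors, ||Phi x||_2 should concentrate around 1.
ratios = []
for _ in range(200):
    x = np.zeros(N)
    x[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
    x /= np.linalg.norm(x)
    ratios.append(np.linalg.norm(Phi @ x))
ratios = np.array(ratios)
print(ratios.min(), ratios.mean(), ratios.max())
```

Since $\mathbb{E}\|\Phi x\|_2^2 = \|x\|_2^2$ for each fixed $x$, the printed ratios cluster around $1$, with fluctuations governed by the RIP-type analysis that follows.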
Proof. Note that, by the triangle inequality,
$$\begin{aligned}
\sup_x \Big| \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 - 1 \Big|
&\le \sup_x \Big| \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 - \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 \Big| \\
&\quad + \sup_x \Big| \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 \Big| \\
&\quad + \sup_x \Big| \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 - 1 \Big|, \qquad (4.6)
\end{aligned}$$
where we used that $\mathbb{E}_\xi\big[\|A\xi\|_2^2 \mid \Omega\big] = \|A\|_F^2$ for a fixed matrix $A$ and $\xi$ with independent mean-zero, unit-variance entries.
Thus, the proof of Proposition 4 reduces to controlling the three summands in (4.6). First, Lemma 3 (below) shows by direct computation that the third summand is bounded by $\frac{sm}{\ell N}$, while Lemma 4 and Lemma 5 bound the probability that the remaining summands exceed $\frac{1}{18}$ and $\frac{1}{36}$, respectively. Our bound on $m$ (potentially with an increased value of $C$) ensures that $\frac{sm}{\ell N} \le \frac{s}{\ell} = \big(\frac{s}{m}\big)^{1-\alpha} \le \frac{1}{36}$, and the result follows using a union bound.
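The identity behind the splitting in (4.6), namely $\mathbb{E}_\xi\|A\xi\|_2^2 = \|A\|_F^2$ for $\xi$ with independent mean-zero, unit-variance entries, can be checked directly; the matrix below is an arbitrary stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 20))

# Monte Carlo estimate of E ||A xi||_2^2 over xi with iid mean-0, variance-1 entries.
T = 200_000
Xi = rng.standard_normal((20, T))
est = np.mean(np.linalg.norm(A @ Xi, axis=0) ** 2)
frob2 = np.linalg.norm(A, 'fro') ** 2

print(est, frob2)  # the two agree up to Monte Carlo error
```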
Lemma 3. Given the same set-up as in Theorem 18 and Proposition 4, one has
$$\Big| \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 - 1 \Big| \le \frac{(s-1)(m-\ell)}{\ell(N-1)} \le \frac{sm}{\ell N}.$$
Proof. Denote by $c_{i,j}$ the $(i,j)$-th entry of $C_x$. Since we are sampling without replacement, for $p \ne q \in [m]$ one computes the correlation of the entries $c_{\Omega_p, j}$ and $c_{\Omega_q, j}$ directly; the last two equalities of this computation both use the fact that each row of $C_x$ is a shifted copy of $x$. Furthermore, expanding $\mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2$ and using (4.7) together with the fact that the rows of both $C_x$ and $V$ are normalized, and that $x$ is $s$-sparse, the claimed bound follows.
Lemma 4. Consider again the set-up of Theorem 18 and Proposition 4 and denote by $D_{s,N}$ the set of all $s$-sparse vectors in $\mathbb{R}^N$. Then
$$\mathbb{P}\Big(\sup_{x\in D_{s,N}}\Big| \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 - \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 \Big| > \tfrac{1}{18}\Big) \le e^{-\eta}.$$
Proof. We will apply Theorem 17 conditionally on $\Omega$, with $\mathcal{C} = \big\{\tfrac{1}{\sqrt{\ell}} P_\ell V^* R_\Omega C_x : x \in D_{s,N}\big\}$. This set is almost the same as the one considered in the proof of Theorem 4.1 in [39], the only differences being the additional projection $P_\ell$ and our normalization factor of $\tfrac{1}{\sqrt{\ell}}$ (instead of $\tfrac{1}{\sqrt{m}}$ in [39]). Indeed, since $\|P_\ell\|_{2\to 2} \le 1$, we can estimate the parameters needed for Theorem 17 exactly as in the proof of Theorem 4.1 in [39]. This yields a bound on $d_{2\to 2}(\mathcal{C})$; here, the second inequality follows from our choice of $\ell$, and the last inequality follows from our assumption on $m$ in Theorem 18 (potentially adjusting the constant $C$). Again adjusting the constant, we similarly obtain the claim after taking the expectation over $\Omega$.
Lemma 5. With the same notation as before, we have
$$\mathbb{P}\Big(\sup_{x\in D_{s,N}}\Big| \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 \Big| > \tfrac{1}{36}\Big) \le C_0\, e^{-c\eta},$$
where $c, C_0$ are constants that depend only on $L$.
Proof. The proof is a direct application of Theorem 16 to the random variable
$$Z_x := \mathbb{E}\Big[ \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 \,\Big|\, \Omega \Big],$$
in order to control the supremum of the deviation. Since Theorem 16 requires the covering number with respect to the metric $d(x,y) := \|Z_x - Z_y\|_{\Psi_2}$, we need a bound on $d(x,y)$, which we provide in Lemma 7 below.
Specifically, the first inequality in Lemma 5 follows from Theorem 16 together with Lemma 3 and Lemma 4 above. Indeed, applying Lemma 7 with $y = 0$ yields
$$\sup_{x} \|Z_x\|_{\Psi_2} \lesssim \frac{\sqrt{m}}{\ell}\,\|x\|_{\widehat{\infty}} = \frac{\sqrt{m}}{\ell}\,\|Fx\|_\infty \le \frac{\sqrt{m}}{\ell}\,\|x\|_1 \le \frac{\sqrt{sm}}{\ell}\,\|x\|_2 \le \frac{\sqrt{sm}}{\ell}. \qquad (4.8)$$
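The chain of norm inequalities in (4.8), namely $\|Fx\|_\infty \le \|x\|_1 \le \sqrt{s}\,\|x\|_2$ for $s$-sparse $x$ and the non-normalized DFT $F$, can be verified numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
N, s = 64, 6

# Non-normalized DFT: all entries have modulus 1, so |(Fx)_p| <= ||x||_1.
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

for _ in range(100):
    x = np.zeros(N)
    x[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
    x /= np.linalg.norm(x)                       # unit 2-norm, s-sparse
    hat_inf = np.abs(F @ x).max()                # ||x||_hat_inf = ||Fx||_inf
    assert hat_inf <= np.abs(x).sum() + 1e-9     # ||Fx||_inf <= ||x||_1
    assert np.abs(x).sum() <= np.sqrt(s) + 1e-9  # ||x||_1 <= sqrt(s) ||x||_2
print("norm chain verified")
```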
To bound the integral in Theorem 16, we note that
$$N\Big(D_{s,N}, \tfrac{\sqrt{m}}{\ell}\|\cdot\|_{\widehat{\infty}}, \varepsilon\Big) = N\Big(D_{s,N}, \tfrac{1}{\sqrt{m}}\|\cdot\|_{\widehat{\infty}}, \tfrac{\ell}{m}\varepsilon\Big),$$
and hence, applying the argument in [39, Section 4] scaled by $\tfrac{\ell}{m}$ (a detailed calculation is stated in Lemma 8),
$$\int_0^{\sup_x \|Z_x\|_{\Psi_2}} \sqrt{\log N\Big(D_{s,N}, \tfrac{1}{\sqrt{m}}\|\cdot\|_{\widehat{\infty}}, \tfrac{\ell}{m}\varepsilon\Big)}\; d\varepsilon \;\lesssim\; \frac{\sqrt{sm}}{\ell}\,\log N \log s.$$
For the second inequality, note that by the definition of $\ell$ and the assumed lower bound on $m$,
$$\Big(\frac{\sqrt{sm}}{\ell}\log N \log s\Big)^2 = \Big(\frac{s}{m}\Big)^{1-2\alpha} \log^2 N \log^2 s \qquad (4.9)$$
$$\le C^{-1}\eta^{-1}. \qquad (4.10)$$
The result then follows from the assumption that $\eta > 1$, as in the proof of Lemma 4.
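The covering-number rescaling used above, $N(T, a\|\cdot\|, \varepsilon) = N(T, b\|\cdot\|, \tfrac{b}{a}\varepsilon)$, is elementary; a greedy net over a finite point cloud makes it concrete. The cloud, the norm, and the scales below are arbitrary stand-ins (powers of two are used so the two comparisons are bit-for-bit identical in floating point).

```python
import numpy as np

def net_size(T, scale, eps):
    """Size of a greedy eps-net of the point cloud T under scale * ||.||_inf."""
    pts, centers = list(T), []
    while pts:
        c = pts.pop(0)
        centers.append(c)
        pts = [p for p in pts if scale * np.abs(p - c).max() > eps]
    return len(centers)

rng = np.random.default_rng(3)
T = [rng.uniform(-1, 1, size=3) for _ in range(300)]

a, b, eps = 2.0, 0.5, 0.3
# Rescaling the metric while rescaling the radius by the same factor
# leaves the covering number unchanged.
n1 = net_size(T, a, eps)
n2 = net_size(T, b, eps * b / a)
print(n1, n2)
```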
All that remains now is to prove Lemma 7. Before that, we derive a technical bound required for its proof.
Lemma 6. Let $\Omega, \Omega' \in \Xi = \{\Omega \in [N]^m : \Omega_i \ne \Omega_j \text{ for } i \ne j\}$ be such that $\Omega$ differs from $\Omega'$ in at most two components. Then the function
$$f(\Omega) := \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 - \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_y\big\|_F^2$$
satisfies
$$|f(\Omega) - f(\Omega')| \le \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}}, \qquad \text{where } \|x\|_{\widehat{\infty}} := \|Fx\|_\infty.$$
Proof. Note that, as a circulant matrix is diagonalized by the Fourier transform, $f(\Omega)$ can be rewritten in terms of the Fourier coefficients $\hat{x} = Fx$; we refer to the resulting expression as (4.11). Here $F$ denotes the non-normalized Fourier transform and $F_k^T$ its $k$-th row.
We first consider the case that $\Omega$ and $\Omega'$ differ in only one component, say the first (without loss of generality). To bound $|f(\Omega) - f(\Omega')|$ in this case, we expand the Frobenius norms row by row, with $V_j^T$ denoting the $j$-th row of $V$. Combining this with (4.11), we obtain (4.12). Observe that the right-hand side of (4.12) is a sum of four rescaled Fourier coefficients of the vector $u \in \mathbb{R}^N$ given by $u_k := |\hat{x}_k|^2 - |\hat{y}_k|^2$. In this step it is crucial that we sample without replacement, as otherwise the bound would no longer hold.
Consequently, using the Cauchy-Schwarz inequality and a bound on $|u_k|$, the first summand is controlled; arguing similarly for the other three summands in (4.12) yields the result for $\Omega$ and $\Omega'$ differing in only one component, with the constant $8$ in (4.13), i.e.,
$$|f(\Omega) - f(\Omega')| \le \frac{8}{\ell}\,\|x-y\|_{\widehat{\infty}}. \qquad (4.13)$$
If $\Omega$ and $\Omega'$ differ in two components (without loss of generality, the first and the second), one can add and subtract two intermediate tuples, say $\Omega^{(1)}$ and $\Omega^{(2)}$, each of which differs from $\Omega$ and from $\Omega'$ in only one component:
$$\Omega = (\Omega_1, \Omega_2, \dots), \qquad \Omega' = (\Omega'_1, \Omega'_2, \dots), \qquad \Omega^{(1)} = (\Omega'_1, \Omega_2, \dots), \qquad \Omega^{(2)} = (\Omega_1, \Omega'_2, \dots).$$
Then
$$\begin{aligned}
|f(\Omega) - f(\Omega')| &= \big|f(\Omega) - f(\Omega^{(1)}) + f(\Omega^{(1)}) - f(\Omega^{(2)}) + f(\Omega^{(2)}) - f(\Omega')\big| \\
&\le \big|f(\Omega) - f(\Omega^{(1)})\big| + \big|f(\Omega^{(1)}) - f(\Omega^{(2)})\big| + \big|f(\Omega^{(2)}) - f(\Omega')\big| \\
&\le \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}},
\end{aligned}$$
i.e., the triangle inequality gives the constant $24$ (three times $8$).
We are now ready to bound the distance $d(x,y) = \|Z_x - Z_y\|_{\Psi_2}$.

Lemma 7. For all $x, y \in \mathbb{R}^N$ it holds that
$$d(x,y) \le \frac{24\sqrt{m}}{\ell}\,\|x-y\|_{\widehat{\infty}}.$$

Proof. By (4.2), it suffices to show that, for all $t \ge 0$,
$$\mathbb{P}_\Omega\big(|Z_x - Z_y| > t\big) \le \exp\Big( 1 - t^2 \Big/ \Big(\tfrac{24\sqrt{m}}{\ell}\|x-y\|_{\widehat{\infty}}\Big)^2 \Big), \qquad (4.14)$$
where, as before,
$$Z_x := \mathbb{E}\Big[ \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\xi\big\|_2^2 \,\Big|\, \Omega\Big] = \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2.$$
This is proved by applying Theorem 15, with $\mathcal{F}_k$ the $\sigma$-algebra generated by $\Omega_1, \dots, \Omega_k$, to the function $f(\Omega)$ defined above,
$$f(\Omega) = \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_x\big\|_F^2 - \big\|\tfrac{1}{\sqrt{\ell}}P_\ell V^* R_\Omega C_y\big\|_F^2.$$
Let $(\Omega'_k, \dots, \Omega'_m)$ be an independent copy of $(\Omega_k, \dots, \Omega_m)$, and denote $\Omega' = (\Omega_1, \dots, \Omega_{k-1}, \Omega'_k, \dots, \Omega'_m)$ and $\Omega = (\Omega_1, \dots, \Omega_{k-1}, \Omega_k, \dots, \Omega_m)$. We then need to bound the sum of squared ranges
$$R^2 = \sup \sum_{k=1}^{m} \mathrm{ran}_k^2.$$
By definition,
$$X_k := \mathbb{E}(X \mid \mathcal{F}_k) = \mathbb{E}\big(f(\Omega) \mid \Omega_k, \dots, \Omega_1\big),$$
and
$$\begin{aligned}
\mathrm{ran}_k &:= \sup_{\Omega_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \big( X_k \mid \Omega_{k-1},\dots,\Omega_1 \big) + \sup_{\Omega_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \big( -X_k \mid \Omega_{k-1},\dots,\Omega_1 \big) \\
&= \sup_{\Omega_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \big( \mathbb{E}(f(\Omega)\mid\Omega_k,\dots,\Omega_1) \mid \Omega_{k-1},\dots,\Omega_1 \big) \\
&\qquad + \sup_{\Omega'_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \big( \mathbb{E}(-f(\Omega')\mid\Omega'_k,\dots,\Omega_1) \mid \Omega_{k-1},\dots,\Omega_1 \big) \qquad (4.15)\\
&= \sup_{\Omega_k,\Omega'_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \big( \mathbb{E}(f(\Omega)\mid\Omega_k,\Omega_{k-1},\dots,\Omega_1) + \mathbb{E}(-f(\Omega')\mid\Omega'_k,\Omega_{k-1},\dots,\Omega_1) \mid \Omega_{k-1},\dots,\Omega_1 \big) \qquad (4.16)\\
&= \sup_{\Omega_k,\Omega'_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \Big[ \mathbb{E}\big(f(\Omega)\mid\Omega_k,\Omega_{k-1},\dots,\Omega_1\big) + \mathbb{E}\big(-f(\Omega')\mid\Omega'_k,\Omega_{k-1},\dots,\Omega_1\big) \Big]. \qquad (4.17)
\end{aligned}$$
It is now essential to bound $\mathbb{E}(f(\Omega)\mid\Omega_k,\Omega_{k-1},\dots,\Omega_1) + \mathbb{E}(-f(\Omega')\mid\Omega'_k,\Omega_{k-1},\dots,\Omega_1)$, conditionally on $\Omega_{k-1},\dots,\Omega_1$, from above. One would expect to bound this by combining the two summands into
$$\mathbb{E}\big[f(\Omega) - f(\Omega') \mid \Omega_k, \Omega'_k, \Omega_{k-1},\dots,\Omega_1\big].$$
However, this cannot be done in a single step, since we are sampling without replacement: when computing the expectation over the $\Omega_j$ for $m \ge j > k$, the range $\{\Omega_{k+1},\dots,\Omega_m\}$ differs from $\{\Omega'_{k+1},\dots,\Omega'_m\}$, for there can be some $i > k$ such that $\Omega_i = \Omega'_k$ and, vice versa, some $i > k$ with $\Omega'_i = \Omega_k$. Therefore $\mathbb{E}(f(\Omega)\mid\Omega_k,\dots,\Omega_1)$ cannot be merged with $\mathbb{E}(f(\Omega')\mid\Omega'_k,\dots,\Omega_1)$ in one step. Instead, we partition the space generated by $(\Omega_i)_{i>k}$ into the events $(E_j)_{j=0}^{m-k}$ and $(E'_j)_{j=0}^{m-k}$ defined in the next paragraph.
Define the events $E_0 = \{\Omega_j \ne \Omega'_k \ \forall j > k\}$, $E'_0 = \{\Omega'_j \ne \Omega_k \ \forall j > k\}$ and, for $j \in [m-k]$, $E_j = \{\Omega_{k+j} = \Omega'_k\}$, $E'_j = \{\Omega'_{k+j} = \Omega_k\}$, and note that
$$\mathbb{P}\big[\textstyle\bigcup_{j=0}^{m-k} E_j \mid \Omega_1,\dots,\Omega_k,\Omega'_k\big] = \mathbb{P}\big[\textstyle\bigcup_{j=0}^{m-k} E'_j \mid \Omega_1,\dots,\Omega_k,\Omega'_k\big] = 1. \qquad (4.18)$$
That is, $(E_j)_{j=0}^{m-k}$ and $(E'_j)_{j=0}^{m-k}$ are two partitions of the probability space conditional on $\{\Omega_1,\dots,\Omega_k,\Omega'_k\}$, and corresponding events carry the same conditional probability:
$$\mathbb{P}\big[E_j \mid \Omega_1,\dots,\Omega_k,\Omega'_k\big] = \mathbb{P}\big[E'_j \mid \Omega_1,\dots,\Omega_k,\Omega'_k\big], \qquad j = 0,\dots,m-k. \qquad (4.19)$$
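The symmetry (4.19) between the two families of events is easy to confirm by simulation; by exchangeability of the remaining draws, both conditional probabilities equal $1/(N-k)$ for $j \ge 1$. The concrete parameters below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
N, m, k, j = 12, 8, 3, 2
T = 50_000

# Condition on Omega_1..Omega_{k-1} = 0..k-2, Omega_k = k-1, Omega'_k = k.
head = list(range(k - 1))
wk, wkp = k - 1, k
pool  = [v for v in range(N) if v not in head + [wk]]   # candidates for Omega_{k+1..m}
poolp = [v for v in range(N) if v not in head + [wkp]]  # candidates for Omega'_{k+1..m}

hits_E = hits_Ep = 0
for _ in range(T):
    tail  = rng.choice(pool,  size=m - k, replace=False)
    tailp = rng.choice(poolp, size=m - k, replace=False)
    hits_E  += tail[j - 1] == wkp    # E_j  = {Omega_{k+j}  = Omega'_k}
    hits_Ep += tailp[j - 1] == wk    # E'_j = {Omega'_{k+j} = Omega_k}

p, pp = hits_E / T, hits_Ep / T
print(p, pp, 1 / (N - k))
```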
Now we can write
$$\mathbb{E}\big[f(\Omega) \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k\big] = \sum_{j=0}^{m-k} \mathbb{E}\big[f(\Omega)\mathbf{1}_{E_j} \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k,\Omega'_k\big], \qquad (4.20)$$
and similarly
$$\mathbb{E}\big[f(\Omega') \mid \Omega_1,\dots,\Omega_{k-1},\Omega'_k\big] = \sum_{j=0}^{m-k} \mathbb{E}\big[f(\Omega')\mathbf{1}_{E'_j} \mid \Omega_1,\dots,\Omega_{k-1},\Omega'_k,\Omega_k\big]. \qquad (4.21)$$
Putting (4.20) and (4.21) together, we have
$$\mathbb{E}\big[f(\Omega) \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k\big] - \mathbb{E}\big[f(\Omega') \mid \Omega_1,\dots,\Omega_{k-1},\Omega'_k\big] = \sum_{j=0}^{m-k} \mathbb{E}\big[f(\Omega)\mathbf{1}_{E_j} - f(\Omega')\mathbf{1}_{E'_j} \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k,\Omega'_k\big]. \qquad (4.22)$$
It remains to bound the terms
$$\mathbb{E}\big[f(\Omega)\mathbf{1}_{E_j} - f(\Omega')\mathbf{1}_{E'_j} \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k,\Omega'_k\big].$$
Note that, due to the partition of the events, for $j = 0$, i.e., on the events $E_0, E'_0$, $\Omega$ and $\Omega'$ differ in at most one component, namely the $k$-th; equivalently, only $\Omega_k$ and $\Omega'_k$ can be different. Thus, by Lemma 6,
$$f(\Omega)\mathbf{1}_{E_0} - f(\Omega')\mathbf{1}_{E'_0} \le \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}} \quad \text{conditionally on } \Omega_1,\dots,\Omega_{k-1},\Omega'_k,\Omega_k. \qquad (4.23)$$
For $j > 0$, on the events $E_j, E'_j$, $\Omega$ and $\Omega'$ differ in at most two components, namely the $k$-th and the $(k+j)$-th. Thus, again by Lemma 6,
$$f(\Omega)\mathbf{1}_{E_j} - f(\Omega')\mathbf{1}_{E'_j} \le \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}} \quad \text{conditionally on } \Omega_1,\dots,\Omega_{k-1},\Omega'_k,\Omega_k. \qquad (4.24)$$
Hence, for all $j = 0,\dots,m-k$,
$$\mathbb{E}\big[f(\Omega)\mathbf{1}_{E_j} - f(\Omega')\mathbf{1}_{E'_j} \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k,\Omega'_k\big] \le \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}}\; \mathbb{P}\big[E_j \mid \Omega_1,\dots,\Omega_k,\Omega'_k\big]. \qquad (4.25)$$
Combining inequality (4.25) with the partition properties (4.18) and (4.19), we obtain
$$\begin{aligned}
\mathrm{ran}_k &= \sup_{\Omega_k,\Omega'_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \Big[ \mathbb{E}\big(f(\Omega)\mid\Omega_k,\Omega_{k-1},\dots,\Omega_1\big) + \mathbb{E}\big(-f(\Omega')\mid\Omega'_k,\Omega_{k-1},\dots,\Omega_1\big) \Big] \\
&\le \sup_{\Omega_k,\Omega'_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \Big[ \sum_{j=0}^{m-k} \mathbb{E}\big[f(\Omega)\mathbf{1}_{E_j} - f(\Omega')\mathbf{1}_{E'_j} \mid \Omega_1,\dots,\Omega_{k-1},\Omega_k,\Omega'_k\big] \Big] \\
&\le \sup_{\Omega_k,\Omega'_k \notin \{\Omega_1,\dots,\Omega_{k-1}\}} \Big[ \sum_{j=0}^{m-k} \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}}\;\mathbb{P}\big[E_j \mid \Omega_1,\dots,\Omega_k,\Omega'_k\big] \Big] \\
&\le \frac{24}{\ell}\,\|x-y\|_{\widehat{\infty}}.
\end{aligned}$$
Now, applying Theorem 15 with $\hat{r}^2 := \sup R^2 \le \sum_{k=1}^m \mathrm{ran}_k^2 \le \big(\tfrac{24\sqrt{m}}{\ell}\|x-y\|_{\widehat{\infty}}\big)^2$, one obtains
$$\mathbb{P}\big(|Z_x - Z_y| > t\big) \le 2\exp\Big( -t^2 \Big/ \Big(\tfrac{24\sqrt{m}}{\ell}\|x-y\|_{\widehat{\infty}}\Big)^2 \Big),$$
which implies (4.14). We conclude that
$$d(x,y) := \|Z_x - Z_y\|_{\Psi_2} \le \frac{24\sqrt{m}}{\ell}\,\|x-y\|_{\widehat{\infty}},$$
as desired.
We now present the details of the calculation of Dudley's integral.

Lemma 8. With the notation above,
$$\int_0^{\sup_x \|Z_x\|_{\Psi_2}} \sqrt{\log N\Big(D_{s,N}, \tfrac{\sqrt{m}}{\ell}\|\cdot\|_{\widehat{\infty}}, \varepsilon\Big)}\; d\varepsilon \;\lesssim\; \frac{\sqrt{sm}}{\ell}\,\log N \log s.$$

Proof. The integral is computed in three steps: first, Maurey's method for the larger covering radii; second, a volumetric argument for the smaller radii; and finally, inserting both bounds into Dudley's integral.
For Maurey's method, set $U = \{\pm\sqrt{2}e_1, \dots, \pm\sqrt{2}e_N\}$. By Hölder's inequality, $|\langle \sqrt{N}F_p, Z_k\rangle| \le \|\sqrt{N}F_p\|_\infty \|Z_k\|_1 \le 1\cdot\sqrt{2}$; note that the norm $\|\cdot\|_{\widehat{\infty}}$ is defined via the non-normalized discrete Fourier matrix precisely so that $\|\sqrt{N}F_p\|_\infty$ is a constant independent of $N$. Consequently,
$$\big\| \big(\langle \sqrt{N}F_p, Z_k\rangle\big)_{k=1}^M \big\|_2 \le \sqrt{2M}$$
for $p \in [N]$. By Hoeffding's inequality, conditionally on the $Z_k$,
$$\mathbb{P}\Big( \Big|\sum_{k=1}^M \varepsilon_k \langle \sqrt{N}F_p, Z_k\rangle\Big| \ge \sqrt{2M}\,t \Big) \le 2e^{-t^2/2}, \qquad t > 0,$$
where the $\varepsilon_k$ denote independent Rademacher signs. By Proposition 3 and Fubini's theorem,
$$\mathbb{E}\max_{p\in[N]} \Big|\sum_{k=1}^M \varepsilon_k \langle \sqrt{N}F_p, Z_k\rangle\Big| \le \tfrac{3}{2}\sqrt{2}\,\sqrt{\ln 8N}\,\sqrt{M}.$$
Hence, setting $A = \tfrac{3}{2}\sqrt{2}\sqrt{\ln 8N} \lesssim \sqrt{\ln N}$, Maurey's method yields
$$\log N\big(\mathrm{conv}(U), \|\cdot\|_X, \varepsilon\big) \lesssim \Big(\frac{1}{\varepsilon}\Big)^2 \ln^2 N.$$
Therefore,
$$\begin{aligned}
\log N\big(D_{s,N}, \|\cdot\|_{\Psi_2}, \varepsilon\big) &\lesssim \log N\Big(D_{s,N}, \tfrac{\sqrt{m}}{\ell}\|\cdot\|_{\widehat{\infty}}, \varepsilon\Big) \\
&\le \log N\Big(\sqrt{s}\,B_1^N(0,1), \tfrac{\sqrt{m}}{\ell}\|\cdot\|_{\widehat{\infty}}, \varepsilon\Big) = \log N\Big(B_1^N(0,1), \|\cdot\|_{\widehat{\infty}}, \tfrac{\ell\varepsilon}{\sqrt{s}\sqrt{m}}\Big) \\
&\le \log N\Big(\mathrm{conv}(U), \|\cdot\|_{\widehat{\infty}}, \tfrac{\ell\varepsilon}{\sqrt{s}\sqrt{m}}\Big) \lesssim \Big(\frac{\sqrt{sm}}{\ell\varepsilon}\Big)^2 \ln^2 N. \qquad (4.26)
\end{aligned}$$
The volumetric argument shows that
$$\begin{aligned}
\log N\big(D_{s,N}, d(x,x'), \varepsilon\big) &\le \log N\big(\sqrt{s}\,D_1^{s,N}, \|\cdot\|_{\widehat{\infty}}, \varepsilon\big) \\
&\le \log\Big[ \binom{N}{s} N\big(B_1^s(0,1), \|\cdot\|_\infty, \varepsilon\big) \Big] \\
&\le \log\Big[ \Big(\frac{eN}{s}\Big)^s N\big(B_{\widehat{\infty}}^s(0,1), \|\cdot\|_{\widehat{\infty}}, \varepsilon\big) \Big] \\
&\le \log\Big[ \Big(\frac{eN}{s}\Big)^s \Big(1 + \frac{2}{\varepsilon}\Big)^s \Big] \\
&= s\log\Big( \frac{eN}{s}\Big(1 + \frac{2}{\varepsilon}\Big) \Big) \\
&\lesssim s\log\Big(\frac{N}{s}\Big). \qquad (4.27)
\end{aligned}$$
Here $B_1^s \subset 1\cdot B_{\widehat{\infty}}^s$ holds because $\|\cdot\|_{\widehat{\infty}} \le \max_{p\in[N]}\|\sqrt{N}F_p\|_\infty\,\|\cdot\|_1 = 1\cdot\|\cdot\|_1$.
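The volumetric bound $N(B, \|\cdot\|, \varepsilon) \le (1+2/\varepsilon)^s$ used above can be observed on a small instance. The sketch below builds a greedy $\varepsilon$-separated subset of the unit $\ell_\infty$ ball in dimension $s = 2$ (discretized by a fine grid); a maximal separated set is also an $\varepsilon$-net, and its size is dominated by the volumetric bound. The dimension, radius, and grid resolution are arbitrary choices.

```python
import numpy as np
from itertools import product

s, eps = 2, 0.5

# Fine grid inside the unit l_inf ball B_inf^s(0,1).
grid = [np.array(p) for p in product(np.arange(-1.0, 1.0001, 0.05), repeat=s)]

# Greedy maximal eps-separated subset: a packing of the ball whose size
# is at most (1 + 2/eps)^s by the volumetric argument.
centers = []
for p in grid:
    if all(np.abs(p - c).max() > eps for c in centers):
        centers.append(p)

vol_bound = (1 + 2 / eps) ** s
print(len(centers), vol_bound)
```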
With the two bounds (4.26) and (4.27) above, we are ready to bound Dudley's integral. First, the supremum $e_0 = \sup_x \|Z_x\|_{\Psi_2}$ is bounded as in (4.8). Then, inserting the covering-number bounds into the integral $\int_0^{e_0} \sqrt{\log N(\cdot)}\,d\varepsilon$ and changing variables, one reaches the claimed bound, where the last equality holds by writing $\ell = m(s/m)^\alpha$. Note that, for the inequality to be meaningful, i.e., $(m/s)^{1/2-\alpha} > 1$, the parameter $\alpha$ is naturally restricted to $\alpha \in [0,1/2)$.
Chapter 5
Restricted Isometry Property of the discrete Fourier matrix
Due to the popularity of the discrete Fourier matrix, it is important to prove its RIP, and there are already several papers on this topic. We do not aim here to improve the known bounds; rather, we apply the method of Chapter 4 again, with more details, to prove the RIP of the discrete Fourier matrix, and then distill the argument into a quick test for proving RIP in Chapter 5.1.
The discrete Fourier matrix used here is non-normalized; after randomly choosing $m$ rows, we normalize by the factor $\tfrac{1}{\sqrt{m}}$. Using McDiarmid's inequality, the restricted isometry property of the partial random discrete Fourier matrix can be shown. For clarity, the restricted isometry constant is restated below:
$$\delta_s = \sup_{x\in D_{s,N}} \Big\{ \big\|\tfrac{1}{\sqrt{m}}R_\Omega F x\big\|_2^2 - 1 \Big\} = \sup_{x\in D_{s,N}} \Big\{ \big\|\tfrac{1}{\sqrt{m}}R_\Omega F x\big\|_2^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{m}}R_\Omega F x\big\|_2^2 \Big\}.$$
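For small dimensions, the restricted isometry constant (in its usual symmetric form, i.e., with absolute values) can be computed exhaustively, since it equals the worst spectral deviation of $A_S^* A_S$ from the identity over all $s$-element column supports $S$. The following sketch does this for a tiny partial DFT; the parameters are illustrative only.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(6)
N, m, s = 16, 12, 2

# Non-normalized DFT; all entries have modulus 1.
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

Omega = rng.choice(N, size=m, replace=False)
A = F[Omega] / np.sqrt(m)   # normalized partial random Fourier matrix

# Exhaustive restricted isometry constant: worst eigenvalue deviation of
# A_S^* A_S from I over all s-column submatrices A_S.
delta = max(
    np.abs(np.linalg.eigvalsh(A[:, list(S)].conj().T @ A[:, list(S)] - np.eye(s))).max()
    for S in combinations(range(N), s)
)
print(delta)
```

Since the columns of $A$ have exact unit norm, the deviation comes entirely from the off-diagonal inner products, which are partial sums of roots of unity.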
The main theorem is stated below.

Theorem 19. For $m \ge \delta^{-2}\ln(\varepsilon^{-2})\, s \log^2 s \log^2 N$, the matrix $\tfrac{1}{\sqrt{m}}R_\Omega F$ has the restricted isometry property of order $s$ with constant $\delta$, with probability larger than $1-\varepsilon$.

The proof of Theorem 19 follows directly from Lemma 11.
Lemma 9. Let $\Omega$ and $\Omega'$ differ in one component. Then the function
$$f(\Omega) := \big\|\tfrac{1}{\sqrt{m}}R_\Omega F x\big\|_2^2 - \big\|\tfrac{1}{\sqrt{m}}R_\Omega F y\big\|_2^2$$
satisfies $|f(\Omega) - f(\Omega')| \lesssim \tfrac{1}{m}\,\|x-y\|_{\widehat{\infty}}$.

Proof. The tuples $\Omega$ and $\Omega'$ determine the rows selected by the matrix $R_\Omega$; by assumption, and without loss of generality, they differ in the first component. The computation of Lemma 6, without the projection $P_\ell$ and the matrix $V$, then yields the claim.

The metric $d(x,y) = \|Z_x - Z_y\|_{\psi_2}$ is thus bounded in the following lemma.

Lemma 10. For all $x, y \in \mathbb{R}^N$, $d(x,y) \lesssim \tfrac{1}{\sqrt{m}}\,\|x-y\|_{\widehat{\infty}}$.

Proof. Given the bounded differences of Lemma 9, whose squared ranges sum to at most a constant multiple of $\big(\tfrac{1}{\sqrt{m}}\|x-y\|_{\widehat{\infty}}\big)^2$, McDiarmid's inequality yields the result.
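The bounded-difference property of Lemma 9 can be checked empirically. The sketch below uses the crude, explicit bound $|f(\Omega)-f(\Omega')| \le \tfrac{2}{m}\,(\|Fx\|_\infty + \|Fy\|_\infty)\,\|x-y\|_{\widehat{\infty}}$, which follows from $\big||a|^2 - |b|^2\big| \le |a-b|\,(|a|+|b|)$ applied to the single exchanged row; the constant is mine and not optimized.

```python
import numpy as np

rng = np.random.default_rng(5)
N, m, s = 32, 16, 4
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

def sparse_unit(rng):
    x = np.zeros(N)
    x[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
    return x / np.linalg.norm(x)

x, y = sparse_unit(rng), sparse_unit(rng)

def f(Om):
    return (np.abs(F[Om] @ x) ** 2 - np.abs(F[Om] @ y) ** 2).sum() / m

hat_inf = np.abs(F @ (x - y)).max()   # ||x - y||_hat_inf
bound = 2 / m * (np.abs(F @ x).max() + np.abs(F @ y).max()) * hat_inf

worst = 0.0
for _ in range(500):
    Om = rng.choice(N, size=m, replace=False)
    Om2 = Om.copy()
    Om2[rng.integers(m)] = rng.choice([v for v in range(N) if v not in Om])
    worst = max(worst, abs(f(Om) - f(Om2)))
print(worst, bound)
```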
Below, the volumetric argument together with Maurey's method will be applied to bound the covering number $N\big(D_{s,N}, \tfrac{\sqrt{s}}{\sqrt{m}}\|\cdot\|_{\widehat{\infty}}, \varepsilon\big)$. Dudley's inequality then yields the restricted isometry property, as stated below.
Lemma 11. With the notation above,
$$\mathbb{P}\big(\delta_s > t\big) \lesssim \exp\Big( -t^2 \Big/ \Big( \tfrac{\sqrt{s}}{\sqrt{m}}\log N \log m \Big)^2 \Big).$$

Proof. With $Z_x = \big\|\tfrac{1}{\sqrt{m}}R_\Omega F x\big\|_2^2 - \mathbb{E}\big\|\tfrac{1}{\sqrt{m}}R_\Omega F x\big\|_2^2$ as set up before, the lemma is proved by bounding the supremum of $Z_x$ over $x \in D_{s,N}$ via Dudley's inequality (Theorem 11):
$$\mathbb{P}\Big( \sup_{x\in D_{s,N}} |Z_x - Z_{x'}| > u \Big) \lesssim \exp\bigg( -u^2 \Big/ \Big( \int_0^{e_0} \sqrt{\log N\big(D_{s,N}, d(x,x'), \varepsilon\big)}\, d\varepsilon \Big)^2 \bigg), \qquad (5.2)$$
where $d(x,x') := \|Z_x - Z_{x'}\|_{\Psi_2}$ is (up to an absolute constant) the smallest value such that
$$\mathbb{P}\big(|Z_x - Z_y| > u\big) \lesssim \exp\Big( \frac{-u^2}{d^2(x,y)} \Big). \qquad (5.3)$$
This bound is derived in Lemma 10. Secondly, we need to bound the integral in the denominator, i.e., $\int_0^{e_0} \sqrt{\log N(D_{s,N}, d(x,x'), \varepsilon)}\,d\varepsilon$, by Maurey's method and the volumetric argument. First, applying Lemma 10 with $y = 0$,
$$e_0 = \sup_{x\in D_{s,N}} \|Z_x\|_{\Psi_2} \lesssim \frac{2\sqrt{2}}{\sqrt{m}}\,\|x\|_{\widehat{\infty}} = \frac{2\sqrt{2}}{\sqrt{m}}\,\|Fx\|_\infty \le \frac{2\sqrt{2}}{\sqrt{m}}\,\|x\|_1 \le \frac{2\sqrt{2s}}{\sqrt{m}}\,\|x\|_2 \le \frac{2\sqrt{2s}}{\sqrt{m}}. \qquad (5.4)$$
Secondly, we apply Maurey's method (Lemma 2), setting $U = \{\pm\sqrt{2}e_1, \pm\sqrt{2}e_2, \dots, \pm\sqrt{2}e_N\}$ and $\|x\|_X = \|x\|_{\widehat{\infty}} = \|Fx\|_\infty = \max_{p\in[N]}|\langle F_p, x\rangle|$. Then $B_1^N(0,1) \subset \mathrm{conv}(U)$, and
$$\mathbb{E}\Big\| \sum_{k=1}^M \varepsilon_k Z_k \Big\|_X = \mathbb{E}\max_{p=1,\dots,N} \Big| \sum_{k=1}^M \varepsilon_k \langle F_p, Z_k\rangle \Big|. \qquad (5.6)$$
Note that the norm $\|\cdot\|_{\widehat{\infty}}$ is defined via the non-normalized discrete Fourier matrix because $\|F_p\|_\infty$ should be a constant independent of $N$. By Hölder's inequality, $|\langle F_p, Z_k\rangle| \le \|F_p\|_\infty \|Z_k\|_1 \le 1\cdot\sqrt{2}$, so that
$$\big\| \big(\langle F_p, Z_k\rangle\big)_{k=1}^M \big\|_2 \le \sqrt{2M}, \qquad (5.7)$$
for $p \in [N]$. By Hoeffding's inequality, conditionally on the $Z_k$,
$$\mathbb{P}\Big( \Big|\sum_{k=1}^M \varepsilon_k \langle F_p, Z_k\rangle\Big| \ge \sqrt{2M}\,t \Big) \le 2e^{-t^2/2}, \qquad t > 0. \qquad (5.8)$$
By a union bound over $p \in [N]$, this gives $\mathbb{E}\max_p \big|\sum_{k=1}^M \varepsilon_k\langle F_p, Z_k\rangle\big| \lesssim \sqrt{\ln N}\,\sqrt{M}$; hence, with $A \lesssim \sqrt{\ln N}$, Maurey's method yields
$$\log N\big(\mathrm{conv}(U), \|\cdot\|_X, \varepsilon\big) \lesssim \Big(\frac{1}{\varepsilon}\Big)^2 \ln^2 N.$$
By a volumetric argument, analogous to (4.27), we have
$$\log N\big(D_{s,N}, d(x,x'), \varepsilon\big) \lesssim s\log\Big(\frac{N}{s}\Big),$$
where $B_1^s \subset 1\cdot B_{\widehat{\infty}}^s$ holds because $\|\cdot\|_{\widehat{\infty}} \le \max_{p\in[N]}\|F_p\|_\infty\,\|\cdot\|_1 = 1\cdot\|\cdot\|_1$. Combining (5.12) and (5.13) and inserting them into Dudley's integral completes the proof.
5.1 Quick test of RIP
In this section, I summarise the above method, the combination of Dudley's inequality with McDiarmid's inequality, into a quick test for proving the RIP of partial random matrices $\tfrac{1}{\sqrt{m}}R_\Omega A$ (the randomness occurring in drawing the $m$ rows out of $N$ rows), for $A$ any matrix satisfying $\mathbb{E}\big\|\tfrac{1}{\sqrt{m}}R_\Omega A x\big\|_2^2 = 1$. (In the $\Sigma\Delta$-quantization setting, one multiplies by an additional factor involving $\sqrt{m}/\ell$, as in Chapter 4.)
Lemma 12. Let $\Omega$ and $\Omega'$ differ in one component. If the function
$$f(\Omega) := \big\|\tfrac{1}{\sqrt{m}}R_\Omega A x\big\|_2^2 - \big\|\tfrac{1}{\sqrt{m}}R_\Omega A y\big\|_2^2$$
satisfies
$$|f(\Omega) - f(\Omega')| \lesssim \frac{1}{K(s,\ell,m)}\,\|x-y\|_{\widehat{\infty}}$$
for some function $K$ of the variables $s, \ell, m$, then
$$\mathbb{P}\big(\delta_s > t\big) \lesssim \exp\Big( -t^2 \Big/ \Big( \frac{\sqrt{sm}}{K(s,\ell,m)}\,\log N\log m \Big)^2 \Big).$$
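The quick test can itself be prototyped: given a candidate matrix $A$, one estimates the bounded-difference constant of Lemma 12 by trying random single swaps of the sampled index tuple. The helper below is a hypothetical illustration (the function name and the Monte Carlo design are mine, not part of the text); for the non-normalized DFT, the estimated constant stays below the crude bound $4\sqrt{s}/m$ derived in the sketch following Lemma 10.

```python
import numpy as np

def swap_constant(A, m, s, trials=300, seed=7):
    """Estimate sup |f(Omega) - f(Omega')| / ||x - y||_hat_inf over single
    swaps, where f(Om) = ||R_Om A x||^2 / m - ||R_Om A y||^2 / m, via random
    sparse x, y and random index exchanges."""
    rng = np.random.default_rng(seed)
    N = A.shape[1]
    F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

    def sparse_unit():
        x = np.zeros(N)
        x[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
        return x / np.linalg.norm(x)

    worst = 0.0
    for _ in range(trials):
        x, y = sparse_unit(), sparse_unit()
        hat_inf = np.abs(F @ (x - y)).max()
        Om = rng.choice(N, size=m, replace=False)
        Om2 = Om.copy()
        Om2[rng.integers(m)] = rng.choice([v for v in range(N) if v not in Om])

        def fv(O):
            return (np.abs(A[O] @ x) ** 2 - np.abs(A[O] @ y) ** 2).sum() / m

        worst = max(worst, abs(fv(Om) - fv(Om2)) / hat_inf)
    return worst

N, m, s = 32, 16, 3
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
K_inv = swap_constant(F, m, s)
print(K_inv, 4 * np.sqrt(s) / m)   # empirical constant vs crude upper bound
```

Plugging the estimated constant into the tail bound of Lemma 12 then gives a quick, if heuristic, prediction of the sample complexity required for RIP.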