On two Random Models in Data Analysis


Dissertation

for the award of the degree "Doctor rerum naturalium" of the Georg-August-Universität Göttingen within the doctoral program Mathematical Sciences of the Georg-August University School of Science (GAUSS)

submitted by

David James

from Fulda

Göttingen, 2016


Members of the examination board:

Referee: Prof. Dr. Felix Krahmer (1)
Co-referee: Prof. Dr. Matthias Hein (2)

Further members:
Prof. Dr. D. Russell Luke (NAM (3))
Prof. Dr. Gerlind Plonka-Hoch (NAM (3))
Prof. Dr. Anja Sturm (IMS (4))
Prof. Dr. Stephan Waack (IFI (5))

(1) Faculty of Mathematics (Chair M15, Technische Universität München)
(2) Faculty of Mathematics and Computer Science (Department of Computer Science, Universität des Saarlandes)
(3) Institute for Numerical and Applied Mathematics
(4) Institute for Mathematical Stochastics
(5) Institute of Computer Science
((3)-(5): each Faculty of Mathematics and Computer Science, Georg-August-Universität Göttingen)

Date of the oral examination: 12 January 2017


There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable.

Douglas Adams, The Restaurant at the End of the Universe


In this thesis, we study two random models with various applications in data analysis.

For our first model, we investigate subspaces spanned by biased random vectors. The underlying random model is motivated by applications in computational biology, where one aims at computing a low-rank matrix factorization involving a binary factor. In a random model with adjustable expected sparsity of the binary factor, we show for a large class of random binary factors that the corresponding factorization problem is uniquely solvable with high probability. In data analysis, such uniqueness results are of particular interest; ambiguous solutions often lack interpretability and do not give an insight into the structure of the underlying data. For proving uniqueness in this random model, small ball probability estimates are a key ingredient. Since, to the best of our knowledge, there is no such estimate suitable for our application, we prove an extension of the famous Lemma of Littlewood and Offord. Hereby, we also discover a connection between the matrix factorization problem at hand and the notion of Sperner families.

In the second part of this thesis, we will investigate a model for randomized ultrasonic data in nondestructive testing. Here, we aim at accelerating the data acquisition process by superposing ultrasonic measurements with random time shifts. To this end, we will first study the effects of randomized ultrasonic measurements in the context of the Synthetic Aperture Focusing Technique (SAFT), a widely used defect imaging method.

By adapting SAFT to our random data model, we will significantly improve its performance for randomized data. In this way, for sparse defects and with high probability, we achieve better defect reconstructions than with SAFT applied to deterministic ultrasonic data acquired in the same amount of time.


After three and a half years of working on this thesis, I want to thank everyone who made it possible.

First of all, I would like to thank my advisor Felix Krahmer for his genuine support and many inspiring discussions. Whenever I felt lost, he was there to help me back on my way. Thank you for your confidence and guidance. Moreover, I am indebted to my co-supervisor Russell Luke, who was very supportive and always had a sympathetic ear.

Many thanks to Matthias Hein for helpful discussions and challenging (but solvable) problems, and for taking the position as co-referee without hesitation.

Moreover, I gratefully acknowledge the financial support of the Federal Ministry of Education and Research (BMBF), and the Research Training Group GRK 2088 "Discovering structure in complex data: Statistics meets Optimization and Inverse Problems" for travel support and interesting workshops.

I owe my thanks to Gerlind Plonka-Hoch and all my (former) colleagues from the Mathematical Signal and Image Processing research group. You provided a warm and pleasant atmosphere, interesting talks and discussions in our research seminar, and a regular supply of tea and cake. Besides, I want to thank the research group for Applied Numerical Analysis and Optimization and Data Analysis at TU Munich. During my numerous visits, I always felt very welcome.

For providing experimental data and parts of their software, I would like to thank Oliver Nemitz and his colleagues from Salzgitter Mannesmann Forschung GmbH. In this context, I am also very thankful for fruitful and encouraging discussions with Martin Spies and Hans Rieder from the Fraunhofer Institute for Nondestructive Testing IZFP. You all contributed to my understanding of nondestructive testing.

I highly appreciate the proofreading by Robert Beinert, Sina Bittens, Renato Budinich, and Christian Kruschel. My special thanks go to Robert Beinert, who was also of great help for typesetting this thesis.

I want to thank my family, my parents, Marion and Peter, and my sister Lisa, for encouraging and supporting me, especially in the last months of my work on this thesis.

Last but not least, I thank my wife Steffi, and my children, Linus and Ella, for their love, support, and endless patience. You are my true source of inspiration and motivation.


Contents

Abstract
Acknowledgement

I. Subspaces Spanned by Biased Random Vectors
1. Introduction
2. From the Span of Binary Matrices to Sperner Families
3. The Span of Random Binary Matrices and the Lemma of Littlewood and Offord
4. Bounding $P_{k,n}$ and $P_{\pm,k,n}$ using Cardinality Estimates
5. Bounding $P_{k,n}$ and $P_{\pm,k,n}$ using the LYM-inequality

II. Ultrasonic Nondestructive Testing with Random Measurements
6. Introduction
7. Model
8. Synthetic Aperture Focusing Technique
9. Iterative Synthetic Aperture Focusing Technique
10. Numerical Results

Bibliography
Curriculum vitae


Subspaces Spanned by Biased Random Vectors

1. Introduction

The Span of Random Binary Matrices

Bernoulli random matrices have recently gained a lot of attention. Arguably, the most prominent problem in this field is to estimate the probability that an $n \times n$ Bernoulli random matrix is singular as $n$ goes to infinity. There has been tremendous progress in proving the conjecture that this probability is dominated by the probability that two columns or rows coincide [KKS95, BVW10, TV07]. A closely related problem concerns the investigation of the span of Bernoulli random matrices. Motivated by an application in neural networks [KS87], this problem was first investigated by Odlyzko in [Odl88]. He found that the probability that the linear span of the columns of a rectangular $N \times n$ Bernoulli matrix does not contain any $\{\pm 1\}$-vector besides its columns is dominated by the probability of the corresponding event for just three of its columns.

Theorem 1.1 (Odlyzko [Odl88]). Let $T$ be an $N \times n$ random matrix whose entries are independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 1] = 1/2$ and $P[\epsilon = -1] = 1/2$. If

$$ n \le \left(1 - \frac{10}{\log(N)}\right) N, $$

then the probability $P$ that there exists a vector $x \in \mathbb{R}^n$ with at least two non-vanishing entries such that $Tx$ is a $\{\pm 1\}$-vector can be bounded from above by

$$ P \le 4 \binom{n}{3} \left(\frac{3}{4}\right)^N + O\!\left(\left(\frac{7}{10}\right)^N\right), $$

as $N$ goes to infinity.


The result, however, only treats the case where all entries of $T$ are unbiased, i.e., they attain the values $\pm 1$ with equal probability. Motivated by applications related to matrix factorization (see below), we aim to transfer this result to the case of biased Bernoulli random variables, that is, random variables where the values $\pm 1$ are attained with unequal probabilities. We will show that the asymptotic behavior, i.e., the dominance by the probability that there exists a linear combination of three columns resulting in a $\{\pm 1\}$-vector, carries over to biased Bernoulli random matrices. The main result of this chapter reads as follows:

Theorem 1.2. Let $T$ be an $N \times n$ random matrix whose entries are independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 1] = p$ and $P[\epsilon = -1] = 1-p$. If there exists $\delta \in (0,1)$ with

$$ \min\{p, 1-p\} \ge N^{-(1-\delta)}, $$

then there exists a constant $C > 0$ depending only on $\delta$ such that for

$$ n \le \left(1 - \frac{C}{\log(N)}\right) N, $$

the probability $P$ that there exists a vector $x \in \mathbb{R}^n$ with at least two non-vanishing entries such that $Tx$ is a $\{\pm 1\}$-vector can be bounded from above by

$$ P \le 4 \binom{n}{3} \left(1 - p(1-p)\right)^N + o\!\left(\left(1 - p(1-p)\right)^N\right), $$

as $N$ goes to infinity.

Note that for $p = 1/2$, we recover the asymptotic behavior of Theorem 1.1. Our result also covers the observation that the probability that there exists a vector $x$ with two or more non-vanishing entries such that $Tx \in \{\pm 1\}^N$ is dominated by the corresponding event for just three columns of $T$. To see this, one can check that $(1 - p(1-p))^N$ is exactly the probability that $Tx \in \{\pm 1\}^N$ for a vector $x \in \mathbb{R}^n$ with the entries $1$, $1$ and $-1$ on its support of size $3$. We will later see that for vectors $x$ whose support set is of cardinality at least $4$ or equal to $2$, this probability is of higher order.
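The claim that a single row of $Tx$ lies in $\{\pm 1\}$ with probability exactly $1 - p(1-p)$ for $x = (1, 1, -1)$ can be verified by direct enumeration. The snippet below is an illustrative sanity check, not part of the thesis:

```python
from itertools import product

def prob_row_in_pm1(x, p):
    """Exact P[ sum_j eps_j * x_j  in {-1, +1} ], where each eps_j is an
    independent Bernoulli sign: +1 with probability p, -1 with probability 1-p."""
    prob = 0.0
    for signs in product([1, -1], repeat=len(x)):
        weight = 1.0
        for e in signs:
            weight *= p if e == 1 else (1 - p)
        if sum(e * xj for e, xj in zip(signs, x)) in (-1, 1):
            prob += weight
    return prob

# For support pattern (1, 1, -1) the row probability equals 1 - p(1-p):
for p in (0.5, 0.1, 0.9):
    assert abs(prob_row_in_pm1([1, 1, -1], p) - (1 - p * (1 - p))) < 1e-12
```

Since the $N$ rows of $T$ are independent, the probability that all of $Tx$ lands in $\{\pm 1\}^N$ is then the $N$th power of this row probability.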

A New Littlewood-Offord-type Inequality

The proof of Theorem 1.1 relied on estimating small ball probabilities of the form

$$ P\left[\left|\sum_{j=1}^{n} \epsilon_j x_j\right| = 1\right], $$ (1.1)

where $x$ is a vector of non-vanishing entries and $n \ge 2$. In the unbiased case, the probability in (1.1) can be treated via the Lemma of Littlewood and Offord, which was proven by Erdős in [Erd45].

Theorem 1.3 (Littlewood-Offord [Erd45]). Let $x \in \mathbb{R}^n$ be a vector with $\min_j |x_j| \ge c > 0$ and let $\epsilon_j$, $1 \le j \le n$, be independent copies of a Bernoulli random variable $\epsilon$ taking the values $+1$ and $-1$ with equal probability. Then for any open interval $I$ of length at most $2c$, it holds that

$$ P\left[\sum_{j=1}^{n} \epsilon_j x_j \in I\right] \le \binom{n}{\lfloor n/2 \rfloor} 2^{-n}. $$ (1.2)

In contrast to the original result of Littlewood and Offord [LO43], which was only optimal up to a logarithmic factor, the estimate in (1.2) is sharp; one can easily verify that the right hand side of (1.2) is indeed attained. For that, choose all entries of $x$ to have the same modulus $c > 0$ and $I$ to be the open interval of length $2c$ centered at $y = 0$ for $n$ even or $y = \pm c$ for $n$ odd. Erdős' proof is based on a connection between random sums as in (1.2) and Sperner families in combination with Sperner's Lemma [Spe28]. In the case where all entries of $x$ are positive, a generalization of the Lemma of Littlewood and Offord to biased random variables can be proven using the LYM-inequality [Bol65, Lub66, Meš63, Yam54] instead of Sperner's Lemma. The LYM-inequality was proven by Bollobás, Lubell, Meshalkin, and Yamamoto; it is named after the initials of the latter three.
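For small $n$, both the bound (1.2) and its sharpness can be checked by enumerating all $2^n$ sign patterns. The following standalone sketch (with a made-up test vector) does exactly that:

```python
from itertools import product
from math import comb

def prob_in_interval(x, a, b):
    """Exact P[ sum_j eps_j x_j in (a, b) ] for unbiased signs eps_j = +/-1,
    computed by enumerating all 2^n sign patterns."""
    hits = sum(1 for signs in product([1, -1], repeat=len(x))
               if a < sum(e * xj for e, xj in zip(signs, x)) < b)
    return hits / 2 ** len(x)

n, c = 6, 1.0
# Constant entries and the interval (-c, c) of length 2c attain the bound (n even):
assert prob_in_interval([c] * n, -c, c) == comb(n, n // 2) / 2 ** n
# For any x with min |x_j| >= c, the bound still holds, e.g. (arbitrary entries):
x = [1.0, -1.3, 2.7, 1.1, -4.2, 1.0]
assert prob_in_interval(x, 0.0, 2 * c) <= comb(n, n // 2) / 2 ** n
```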

Theorem 1.4 (Biased Littlewood-Offord [LL70]). Let $x \in \mathbb{R}^n$ be a real vector with $\min_j x_j \ge c > 0$, let further $\epsilon_j$, $1 \le j \le n$, be independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 1] = p$ and $P[\epsilon = -1] = 1-p$, and let $I \subset \mathbb{R}$ be an open interval of length at most $2c$. Then

$$ P\left[\sum_{j=1}^{n} \epsilon_j x_j \in I\right] \le \max_{0 \le k \le n} \binom{n}{k} p^k (1-p)^{n-k}. $$ (1.3)

As in Theorem 1.3, the estimate in (1.3) is also sharp for vectors $x$ with constant entries, but it is not applicable if one aims to bound a probability like the one in (1.1). This is due to two reasons: While the assumption in Theorem 1.4 that all entries of $x$ have a positive sign can easily be dropped in the case of $p = 1/2$, the same is not true in the biased case. Additionally, the problem of estimating the absolute value of the sum in (1.1) can be solved for $p = 1/2$ using Theorem 1.4 via a union bound, but the same method does not give meaningful probability estimates when considering highly biased Bernoulli random variables with $p$ close to $0$ or $1$. In order to handle the first issue, the following lemma was proven by Costello et al. in [CV08] using Chebyshev's inequality.

Lemma 1.5 ([CV08]). Let $x \in \mathbb{R}^n$ be arbitrary with $\min_j |x_j| \ge c > 0$, let further $\epsilon_j$, $1 \le j \le n$, be independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 1] = p$ and $P[\epsilon = -1] = 1-p$, and let $I$ be an open interval of length at most $2c$. Then there exists an absolute constant $C > 0$ such that

$$ P\left[\sum_{j=1}^{n} \epsilon_j x_j \in I\right] \le \frac{C}{\sqrt{n\mu}}, $$

where $\mu = \min\{p, 1-p\}$.

In contrast to Theorem 1.4, Lemma 1.5 also allows us to treat vectors $x$ with varying signs. Note, however, that this bound is not meaningful for small $n$ or values of $p$ which are close to $0$ or $1$, due to the constant factor $C$, which is not sharp. The following tighter variant of Lemma 1.5 can be derived from Theorem 1.4.

Corollary 1.6. Let $x \in \mathbb{R}^n$ with $\min_j |x_j| \ge c > 0$ have exactly $n_+$ strictly positive and $n_-$ strictly negative entries and let $\epsilon_j$, $1 \le j \le n$, be independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 1] = p$ and $P[\epsilon = -1] = 1-p$. Let $I \subset \mathbb{R}$ be an arbitrary open interval of length at most $2c$ and let $\bar{n} \le \max\{n_-, n_+\}$. Then

$$ P\left[\sum_{j=1}^{n} \epsilon_j x_j \in I\right] \le \max_{0 \le k \le \bar{n}} \binom{\bar{n}}{k} p^k (1-p)^{\bar{n}-k}. $$

Corollary 1.6 follows from Theorem 1.4 mainly by conditioning on the random variables whose coefficients have the less frequent sign. Nevertheless, since we did not find this bound in the literature, a proof will be provided in Section 5. Note that Corollary 1.6 still does not give meaningful results when used together with a union bound to estimate (1.1) when $p$ is close to $0$ or $1$. A main result of this chapter is the following symmetric version of Corollary 1.6, which takes into account that, for biased Bernoulli random variables, the sum in (1.1) typically does not attain the values $\pm 1$ with equal probability. As we will see, the proof is considerably more involved than the proof of Corollary 1.6.

Theorem 1.7. Let $x \in \mathbb{R}^n$ with $\min_j |x_j| \ge c > 0$ have $n_+$ strictly positive and $n_-$ strictly negative entries and let $\epsilon_j$, $1 \le j \le n$, be independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 1] = p$ and $P[\epsilon = -1] = 1-p$. Let $I \subset \mathbb{R}$ be an open interval of length at most $2c$ and let $\bar{n} \le \max\{n_-, n_+\}$. Then

$$ P\left[\sum_{j=1}^{n} \epsilon_j x_j \in I\right] \le \max_{0 \le k \le \bar{n}} \binom{\bar{n}}{k} \left( p^k (1-p)^{\bar{n}-k} + p^{\bar{n}-k} (1-p)^k \right). $$

The inequality is tight, as equality is achieved for odd $n$ and vectors $x$ with constant entries. While equality no longer holds for even $n$ or for vectors $x$ with varying sign, the bound still gives meaningful estimates in these cases, even when $n$ is small or the Bernoulli random variables $\epsilon_j$ are extremely biased.
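To get a feeling for the symmetric bound, the snippet below compares the exact probability $P[|\sum_j \epsilon_j x_j| = 1]$ with the right hand side of Theorem 1.7 for a strongly biased $p$. This is an illustrative numerical experiment with made-up numbers, not a computation from the thesis:

```python
from itertools import product
from math import comb

def prob_abs_sum_is_one(x, p):
    """Exact P[ |sum_j eps_j x_j| = 1 ], eps_j = +1 w.p. p and -1 w.p. 1-p."""
    prob = 0.0
    for signs in product([1, -1], repeat=len(x)):
        w = 1.0
        for e in signs:
            w *= p if e == 1 else (1 - p)
        if abs(sum(e * xj for e, xj in zip(signs, x))) == 1:
            prob += w
    return prob

p, x = 0.05, [1, 1, -1, 1, 1]      # heavily biased signs, varying-sign vector
nbar = 4                           # max{n_-, n_+} = max{1, 4}
sym_bound = max(comb(nbar, k) * (p**k * (1 - p)**(nbar - k)
                                 + p**(nbar - k) * (1 - p)**k)
                for k in range(nbar + 1))
exact = prob_abs_sum_is_one(x, p)
# In this example the symmetric bound comfortably dominates the exact value:
assert exact <= sym_bound
```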

Matrix Factorization with Binary Components

Low rank matrix factorization is an important tool in data analysis, which allows us to represent data as linear combinations of a small number of building blocks, often referred to as components. In matrix factorization with binary components, one aims to factor a given data matrix $D \in \mathbb{R}^{N \times n}$ into the product $BA$, where $B \in \{0,1\}^{N \times r}$ is a binary matrix, and $A \in \mathbb{R}^{r \times n}$ is an arbitrary matrix whose columns sum to $1$ and $r \ll \min\{N, n\}$. To be more precise, matrix factorization with binary components considers the problem

$$ \text{find } B \in \{0,1\}^{N \times r} \text{ and } A \in \mathbb{R}^{r \times n},\ A^T \mathbf{1}_r = \mathbf{1}_n, \text{ such that } D = BA, $$ (1.4)

where $\mathbf{1}_r$, $\mathbf{1}_n$ denote the vectors of length $r$, $n$, respectively, with all entries equal to $1$. Motivated by numerous applications, such as blind source separation in wireless communications with binary source signals [Vee97], network inference from gene expression data [LBY+03, TCX12], unmixing of cell mixtures from DNA methylation signatures [HAK+12], or clustering with overlapping [BKG+05, SBK03], it gained a lot of attention in recent years. Similar factorization problems involving binary matrices have for example been studied in [SSU03, KB08, MGNR06, ZLDZ07, MMG+08]. Note that, if we additionally demand that all entries of $A$ are non-negative, the problem (1.4) is an instance of the non-negative matrix factorization problem, see, e.g., [PT94, LS99].
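A minimal toy instance of problem (1.4), with all dimensions and entries made up purely for illustration: $B$ holds binary components, the columns of $A$ sum to $1$, and each column of $D = BA$ is therefore an affine combination of the columns of $B$.

```python
# B: binary components (N = 4 observations, r = 2 components).
B = [[1, 0],
     [0, 1],
     [1, 1],
     [1, 0]]
# A: mixing weights (r = 2 rows, n = 3 data columns); each column sums to 1.
A = [[0.3, 1.0, 0.7],
     [0.7, 0.0, 0.3]]

# D = B A, computed entrywise.
D = [[sum(B[i][k] * A[k][j] for k in range(2)) for j in range(3)]
     for i in range(4)]

# The affine constraint A^T 1_r = 1_n of (1.4):
assert all(abs(sum(A[k][j] for k in range(2)) - 1.0) < 1e-12 for j in range(3))
```

Given only $D$, recovering this pair $(B, A)$ is exactly the factorization problem whose uniqueness is discussed below.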


In [SHL13], Slawski, Hein and Lutsik proposed an algorithm to solve this problem by computing the intersection of the affine hull of the data matrix $D$, i.e.,

$$ \mathrm{aff}(D) = \left\{ Dx \,\middle|\, x \in \mathbb{R}^n,\ \sum_{j=1}^{n} x_j = 1 \right\}, $$ (1.5)

with the set of vertices $\{0,1\}^N$. Their algorithm provably finds the solution to (1.4) if $A$ has full rank, the columns of $B$ are affinely independent, i.e., for all $x \in \mathbb{R}^r$ with $\sum_j x_j = 0$, $Bx = 0$ implies that $x = 0$, and the uniqueness condition

$$ \mathrm{aff}(B) \cap \{0,1\}^N = \left\{ B_{:,1}, \dots, B_{:,r} \right\} $$ (1.6)

is satisfied. Here, $B_{:,j}$, $1 \le j \le r$, denotes the $j$th column of $B$. By a direct calculation, we can see that the combination of both conditions on $B$ is equivalent to

$$ \nexists x \in \mathbb{R}^r,\ \sum_{j=1}^{r} x_j = 1,\ \|x\|_0 \ge 2 :\ Bx \in \{0,1\}^N, $$ (1.7)

where $\|x\|_0$ denotes the number of nonzero entries of $x$. Combining this observation with properties of modulated symmetric Sperner-2 families, a notion that we will introduce in Definition 2.14 below, allows us to prove the following result.

Theorem 1.8. Let $B$ be an $N \times n$ random matrix whose entries are independent copies of a Bernoulli random variable $\epsilon$ with $P[\epsilon = 0] = p$ and $P[\epsilon = 1] = 1-p$. If there exists $\delta \in (0,1)$ with

$$ \min\{p, 1-p\} \ge N^{-(1-\delta)}, $$

then there exists a constant $C > 0$ depending only on $\delta$ such that for

$$ n \le \left(1 - \frac{C}{\log(N)}\right) N, $$

it holds that

$$ P\left[\exists x \in \mathbb{R}^n,\ \sum_{j=1}^{n} x_j = 1,\ \|x\|_0 \ge 2 :\ Bx \in \{0,1\}^N\right] \le 4 \binom{n}{3} \left(1 - p(1-p)\right)^N + o\!\left(\left(1 - p(1-p)\right)^N\right), $$ (1.8)

as $N$ goes to infinity.


As we will see later, Theorem 1.8 is a direct consequence of Theorem 1.2. Together with (1.7), it implies that the matrix factorization problem with binary components can be solved for a large class of random matrices. Note that the parameter $p$ in Theorem 1.8 now also allows us to model sparse matrices $B$ with just a few non-zero entries, which often occur in practice in the matrix factorization problem (1.4).

Organization of the Chapter

In Section 2, we will first consider deterministic versions of Theorem 1.2 and Theorem 1.8, and show a deep relation between both problems and the notion of Sperner-$k$ families, which we will also introduce in that section. In Section 3, we will pass on to the random setting and consider Theorem 1.2 in terms of probabilities involving Sperner families. Afterwards, we aim to bound these probabilities in Section 4 and Section 5, which also contains the proofs of Corollary 1.6 and Theorem 1.7.

Notation

Throughout this chapter,[n]will denote the integers from1ton, and2nwill denote the power set of[n]. The symmetric differenceA∆B of two setsA,B ⊂ [n]is defined by A∆B := (A\B)∪(B\A). For two setsA,B, we will writeA∪·B instead of the union A∪B if the two sets are disjoint. Similarly, forN setsAi, S

·

i[N]Ai will denote the the union of the setsAi if they are pairwise disjoint. SubsetsA ⊂ 2n will be referred to as families and will be denoted by calligraphic capital letters. When dealing with matrices, we denote, for anN ×nmatrixT and arbitrary setsR ⊂ [N]andC ⊂ [n], by TR,C the submatrix ofT which arises by restrictingT to the rows indexed byRand the columns indexed byC. Furthermore, we will writeT:,C instead ofT[N],C andTR,:instead ofTR,[n], and we will denote byTi,j the entry ofT with row indexi ∈ [N]and column indexj ∈[n]. The restriction of ann-dimensional vectorxto its entries indexed by a set J ⊂ [n]will be denoted byxJ. For an arbitraryn-dimensional vectorx, we will denote bysupp(x) ⊂ [n]the set containing all indicesj ∈[n]such that|xj| >0and refer to it as the support ofx. We call a vectorx ∈Rns-sparseif|supp(x)| =s; theℓ0-norm ofx is defined via kxk0 := |supp(x)|. The sign patternsgn(x)of an arbitraryx ∈ Rn with non-vanishing entries will refer to the vector in{±1}ndefined by

sgn(x)j = 

1 xj >0,

−1 xj <0.

For two sequences (an)nN and (bn)nN, we write an = o(bn) if an/bn → 0 asn →

∞. In order to to highlight that two random variablesX,Y have the same probability distribution, we will writeX ∼Y.


2. From the Span of Binary Matrices to Sperner Families

In this section, we will establish the connection between the deterministic version of Theorem 1.2 for $N \times n$ matrices $T$ with values in $\{\pm 1\}$ and a special class of families $\mathcal{A} \subset 2^{[n]}$, which was first studied by Sperner in [Spe28]. Our result generalizes the connection between Sperner families and random sums as in Theorem 1.3 and Theorem 1.4, which was discovered by Erdős in [Erd45], in multiple ways. Later, our generalization will allow us to prove our main results, Theorem 1.2 and Theorem 1.7, and also yields a uniqueness condition for matrix factorization with binary components. We first recall the definition of a Sperner-$k$ family.

Definition 2.1 (Sperner-$k$ family [Spe28]). We call any family $\mathcal{A} \subset 2^{[n]}$ a Sperner-$k$ family if there does not exist a chain of $k+1$ sets $A_1, \dots, A_{k+1} \in \mathcal{A}$ with

$$ A_1 \subsetneq A_2 \subsetneq \dots \subsetneq A_{k+1}. $$ (2.1)

For notational brevity and historical reasons, Sperner-$1$ families will simply be called Sperner families.

Example 2.2. The family $\mathcal{A}_1 = \{\{1,2\}, \{2,3\}, \{1,3\}\} \subset 2^{[3]}$ is a Sperner family, since no set contained in $\mathcal{A}_1$ is a proper subset of another set contained in $\mathcal{A}_1$. The family $\mathcal{A}_2 = \mathcal{A}_1 \cup \{\{1,2,3\}\}$ is not a Sperner family; it holds, for instance, that $\{1,2\} \subsetneq \{1,2,3\}$. However, $\mathcal{A}_2$ is a Sperner-$2$ family, as the longest chain of inclusions $A_1 \subsetneq \dots \subsetneq A_k$ with sets $A_1, \dots, A_k \in \mathcal{A}_2$ is of length $k = 2$.
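For small families, the chain condition of Definition 2.1 can be checked mechanically. The brute-force check below (illustrative code, not from the thesis) reproduces Example 2.2:

```python
from itertools import permutations

def is_sperner_k(family, k):
    """True iff no chain A_1 < A_2 < ... < A_{k+1} of k+1 sets exists in the
    family, where '<' denotes proper inclusion (Definition 2.1)."""
    sets = [frozenset(a) for a in family]
    for chain in permutations(sets, k + 1):
        if all(chain[i] < chain[i + 1] for i in range(k)):
            return False
    return True

A1 = [{1, 2}, {2, 3}, {1, 3}]
A2 = A1 + [{1, 2, 3}]
assert is_sperner_k(A1, 1)          # A1 is a Sperner family
assert not is_sperner_k(A2, 1)      # {1,2} is properly contained in {1,2,3}
assert is_sperner_k(A2, 2)          # but A2 is a Sperner-2 family
```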

While Sperner only considered Sperner families for $k = 1$, Sperner-$k$ families are well known to connect to this base case via the following lemma.

Lemma 2.3. A family $\mathcal{A} \subset 2^{[n]}$ is a Sperner-$k$ family if and only if it is the union of $k$ Sperner families.

Proof. If the family $\mathcal{A}$ is the union of $k$ Sperner families but not a Sperner-$k$ family, there must exist a chain of $k+1$ subsets $A_1, \dots, A_{k+1} \in \mathcal{A}$ with

$$ A_1 \subsetneq A_2 \subsetneq \dots \subsetneq A_{k+1}, $$

and the pigeonhole principle implies that at least two of them have to be contained in the same Sperner family, which yields a contradiction.

For the reverse implication, let $\mathcal{A} = \{A_1, \dots, A_\ell\}$ be a Sperner-$k$ family. Without loss of generality we may assume that $|A_j| \le |A_{j+1}|$ for $j \in [\ell - 1]$. We will now construct $k$ Sperner-$1$ families in an iterative way. Let $\mathcal{A}^1_0, \dots, \mathcal{A}^k_0$ be empty families. For each $j \in [\ell]$ and each $i \in [k]$ define iteratively

$$ \mathcal{A}^i_j = \begin{cases} \mathcal{A}^i_{j-1} & \text{if } \exists B \in \mathcal{A}^i_{j-1} \text{ s.th. } B \subsetneq A_j, \\ \mathcal{A}^i_{j-1} \cup \{A_j\} & \text{if } \nexists B \in \mathcal{A}^i_{j-1} \text{ s.th. } B \subsetneq A_j \text{ and } A_j \notin \mathcal{A}^{i'}_j \text{ for } 1 \le i' \le i-1. \end{cases} $$

By construction, each of the families $\mathcal{A}^i := \mathcal{A}^i_\ell$, $i \in [k]$, is a Sperner-$1$ family. We now claim that $\mathcal{A} = \bigcup_{i=1}^{k} \mathcal{A}^i$. Suppose for contradiction that there exists a set $B_{k+1} \in \mathcal{A}$ which is not assigned to any family $\mathcal{A}^i$, $i \in [k]$. Hence, there exists $B_k \in \mathcal{A}^k$ such that $B_k \subsetneq B_{k+1}$. Since for arbitrary $i \in [k]$, $i \ne 1$, and arbitrary $B_i \in \mathcal{A}^i$, there must exist a set $B_{i-1} \in \mathcal{A}^{i-1}$ with $B_{i-1} \subsetneq B_i$, we can therefore construct a chain of inclusions as in (2.1), contradicting the assumption that $\mathcal{A}$ is a Sperner-$k$ family. This completes the proof.
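The iterative construction in the proof can be implemented directly: process the sets by increasing cardinality and put each one into the first family containing no proper subset of it. A short sketch of this greedy assignment (illustrative, not thesis code):

```python
def decompose(family, k):
    """Greedily split a Sperner-k family into k Sperner families, following
    the construction in the proof of Lemma 2.3: sets are processed by
    increasing cardinality and placed into the first admissible layer."""
    layers = [[] for _ in range(k)]
    for A in sorted((frozenset(a) for a in family), key=len):
        for layer in layers:
            if not any(B < A for B in layer):   # no proper subset present
                layer.append(A)
                break
    return layers

A2 = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]        # the Sperner-2 family of Example 2.2
L1, L2 = decompose(A2, 2)
assert {1, 2, 3} in [set(s) for s in L2]        # the top set lands in the second layer
```

Each layer is a Sperner family by construction, since a set is only added to a layer that contains none of its proper subsets.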

Note that Lemma 2.3 also implies that every Sperner-$k$ family is also a Sperner-$\ell$ family for $\ell \ge k$. To establish the connection between Sperner families and binary matrices, we will introduce the following operation.

Definition 2.4. For any $A \subset [n]$ and any $\xi \in \{\pm 1\}^n$, define the modulation

$$ A^\xi = \{ j \mid j \in A,\ \xi_j = 1 \} \cup \{ j \mid j \notin A,\ \xi_j = -1 \} \subset [n]; $$

for any family $\mathcal{A} \subset 2^{[n]}$, $\mathcal{A} = \{A_1, \dots, A_m\}$, denote by $\mathcal{A}^\xi$ the family given by

$$ \mathcal{A}^\xi = \{ A_1^\xi, \dots, A_m^\xi \}. $$

Remark 2.5. By definition, it holds for arbitrary $\xi \in \{\pm 1\}^n$ that

$$ \emptyset^\xi = \{ j \mid \xi_j = -1 \}. $$

Also note that for any $A \subset [n]$, it holds that $A^{-\mathbf{1}} = A^c$, where $-\mathbf{1}$ denotes the $n$-dimensional vector with constant entries equal to $-1$. For this reason, we will denote by $\mathcal{A}^{-\mathbf{1}}$ the family of all sets complementing the sets of $\mathcal{A}$.

Remark 2.6. By definition, the operation $(\cdot)^\xi$ for a sign pattern $\xi \in \{\pm 1\}^n$ is union compatible, i.e., for two families $\mathcal{A}, \mathcal{B} \subset 2^{[n]}$, it holds that

$$ (\mathcal{A} \cup \mathcal{B})^\xi = \mathcal{A}^\xi \cup \mathcal{B}^\xi $$

and

$$ (\mathcal{A} \setminus \mathcal{B})^\xi = \mathcal{A}^\xi \setminus \mathcal{B}^\xi. $$

Example 2.7. For the family $\mathcal{A}_1 \subset 2^{[3]}$ as in Example 2.2 and $\xi = (-1, 1, -1)^T$, it holds that

$$ \mathcal{A}_1^\xi = \{ \{1,2\}^\xi, \{2,3\}^\xi, \{1,3\}^\xi \} = \{ \{2,3\}, \{1,2\}, \emptyset \}, $$

which is not a Sperner family, since $\emptyset \subsetneq \{1,2\}$.
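Definition 2.4 translates into a few lines of code: $j$ belongs to $A^\xi$ exactly when $j \in A$ agrees with $\xi_j = 1$. The snippet below (illustrative, not from the thesis) reproduces Example 2.7:

```python
def modulate(A, xi):
    """A^xi = {j in A : xi_j = +1} union {j not in A : xi_j = -1}
    (Definition 2.4); j is kept iff membership in A agrees with xi_j = +1."""
    n = len(xi)
    return {j for j in range(1, n + 1) if (j in A) == (xi[j - 1] == 1)}

xi = (-1, 1, -1)
family = [{1, 2}, {2, 3}, {1, 3}]
assert [modulate(A, xi) for A in family] == [{2, 3}, {1, 2}, set()]
```

Equivalently, `modulate(A, xi)` is the symmetric difference of $A$ with $\{j \mid \xi_j = -1\}$, which is the first claim of Proposition 2.8.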

We now present some useful properties of the operation defined in Definition 2.4.

Proposition 2.8. Let $A \subset [n]$ and $\xi \in \{\pm 1\}^n$ be arbitrary. Then

$$ A^\xi = A \,\Delta\, \{ j \mid \xi_j = -1 \}. $$

Consequently, for $A, B \subset [n]$ and any $\xi \in \{\pm 1\}^n$, it holds that

$$ A^\xi \,\Delta\, B^\xi = A \,\Delta\, B $$

and

$$ (A^\xi)^\nu = A^{\xi\nu} = (A^\nu)^\xi, $$

where $\xi\nu \in \{\pm 1\}^n$ denotes the entrywise product of $\xi$ and $\nu$.

Remark 2.9. It is a direct consequence of Definition 2.4 that, for an arbitrary sign pattern $\xi \in \{\pm 1\}^n$, all properties in Proposition 2.8 concerning a set $A \subset [n]$ can be lifted to analogous properties of a family $\mathcal{A} \subset 2^{[n]}$.

Proof of Proposition 2.8. First, we observe that, for any $A \subset [n]$ and $\xi \in \{\pm 1\}^n$, the set $A^\xi$ can be written as

$$ A^\xi = \{ j \mid j \in A,\ \xi_j = 1 \} \cup \{ j \mid j \notin A,\ \xi_j = -1 \} = \bigl(A \setminus \{ j \mid \xi_j = -1 \}\bigr) \cup \bigl(\{ j \mid \xi_j = -1 \} \setminus A\bigr) = A \,\Delta\, \{ j \mid \xi_j = -1 \} = A \,\Delta\, N_\xi, $$

where

$$ N_\xi := \{ j \mid \xi_j = -1 \} $$

for arbitrary $\xi \in \{\pm 1\}^n$. This establishes the first claim of the proposition. By the associativity and commutativity of the symmetric difference, it follows for any $A, B \subset [n]$ and $\xi \in \{\pm 1\}^n$ that

$$ A^\xi \,\Delta\, B^\xi = (A \,\Delta\, N_\xi) \,\Delta\, (B \,\Delta\, N_\xi) = A \,\Delta\, B \,\Delta\, (N_\xi \,\Delta\, N_\xi) = A \,\Delta\, B \,\Delta\, \emptyset = A \,\Delta\, B, $$

which establishes the second claim of the proposition. Furthermore, we observe that for any $A \subset [n]$ and any $\xi, \nu \in \{\pm 1\}^n$ it holds that

$$ (A^\xi)^\nu = (A \,\Delta\, N_\xi) \,\Delta\, N_\nu = A \,\Delta\, (N_\xi \,\Delta\, N_\nu) = A \,\Delta\, \{ j \mid \text{exactly one of } \xi_j, \nu_j \text{ equals } -1 \} = A \,\Delta\, \{ j \mid (\xi\nu)_j = -1 \} = A \,\Delta\, N_{\xi\nu} = A^{\xi\nu}. $$

Since the entrywise product is commutative, the last claim now follows by interchanging the roles of $\xi$ and $\nu$.

Based on the notion of modulation, we now introduce a variant of Sperner-$k$ families, which will play an essential role in the proof of our main results.

Definition 2.10 (Symmetric Sperner-$k$ family). Let $\mathcal{A} \subset 2^{[n]}$ be arbitrary. For even $k$, we call $\mathcal{A}$ a symmetric Sperner-$k$ family if $\mathcal{A}$ admits a decomposition of the form

$$ \mathcal{A} = \bigcup_{l=1}^{k/2} \left( \mathcal{A}_l \cup \mathcal{A}_l^{-\mathbf{1}} \right), $$

where $\mathcal{A}_l \subset 2^{[n]}$ is a Sperner family for each $l \in [k/2]$.

For odd $k$, we call $\mathcal{A}$ a symmetric Sperner-$k$ family if

$$ \mathcal{A} = \mathcal{A}_0 \cup \bigcup_{l=1}^{\lfloor k/2 \rfloor} \left( \mathcal{A}_l \cup \mathcal{A}_l^{-\mathbf{1}} \right), $$

where $\mathcal{A}_l \subset 2^{[n]}$ is a Sperner family for each $0 \le l \le \lfloor k/2 \rfloor$ and $\mathcal{A}_0$ additionally satisfies $\mathcal{A}_0 = \mathcal{A}_0^{-\mathbf{1}}$.

Note that, by Lemma 2.3, every symmetric Sperner-$k$ family is indeed a Sperner-$k$ family.

Example 2.11. Considering again the family $\mathcal{A}_1 = \{\{1,2\}, \{2,3\}, \{1,3\}\} \subset 2^{[3]}$ defined in Example 2.2, it follows that

$$ \mathcal{A}_3 = \mathcal{A}_1 \cup \mathcal{A}_1^{-\mathbf{1}} = \{\{1,2\}, \{2,3\}, \{1,3\}\} \cup \{\{3\}, \{1\}, \{2\}\} $$

is a symmetric Sperner-$2$ family.
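Membership in these classes can again be checked by brute force on small ground sets. The check below (illustrative code, not from the thesis) confirms Example 2.11:

```python
from itertools import permutations

def is_sperner_k(family, k):
    """True iff no chain of k+1 properly nested sets exists in the family."""
    sets = [frozenset(a) for a in family]
    return not any(all(c[i] < c[i + 1] for i in range(k))
                   for c in permutations(sets, k + 1))

ground = {1, 2, 3}
A1 = [{1, 2}, {2, 3}, {1, 3}]
A3 = A1 + [ground - a for a in A1]     # A1 together with all complements
assert is_sperner_k(A3, 2)             # symmetric Sperner-2, hence Sperner-2
assert not is_sperner_k(A3, 1)         # but not a Sperner family: {1} < {1,2}
```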

The next lemma establishes a link between $\{\pm 1\}$-valued matrices and Sperner families. It generalizes the observations of Erdős in [Erd45].


Lemma 2.12. Let $T$ be an $N \times n$ binary matrix with values in $\{\pm 1\}$ and let $x \in \mathbb{R}^n$, $\min_j |x_j| \ge c > 0$, be a vector such that $Tx \in V^N$, where $V$ is the union of $k$ open intervals of length at most $2c$. Let $\mathcal{A} \subset 2^{[n]}$ be the family containing the sets

$$ A_i = \{ j \mid T_{i,j} = 1 \}, \quad i \in [N]. $$ (2.2)

Then $\mathcal{A}^{\mathrm{sgn}(x)}$ is a Sperner-$k$ family. If $V$ additionally is symmetric, i.e., $V = -V$, it follows that $\mathcal{A}^{\mathrm{sgn}(x)}$ is a subfamily of a symmetric Sperner-$k$ family.

Proof. Set $\xi := \mathrm{sgn}(x)$. We may assume that $k < N$, since otherwise the first assertion of the lemma is trivial. Suppose for contradiction that $\mathcal{A}^\xi = \{A_1^\xi, \dots, A_N^\xi\}$ is not a Sperner-$k$ family. Then, after a possible permutation of the indices, it must hold that $A_1^\xi \subsetneq A_2^\xi \subsetneq \dots \subsetneq A_{k+1}^\xi$. We define, for $1 \le i \le k+1$,

$$ y_i := (Tx)_i = \sum_{j \in A_i} x_j - \sum_{j \in A_i^c} x_j \in V. $$ (2.3)

Since all entries of $x$ which make a positive contribution to this sum are contained in $A_i^\xi$ and all entries of $x$ which make a negative contribution to this sum are contained in $(A_i^c)^\xi = (A_i^\xi)^c$, one can write

$$ y_i = \sum_{j \in A_i^\xi} |x_j| - \sum_{j \in (A_i^\xi)^c} |x_j| \in V. $$

Recall that $V$ is the union of $k$ intervals of length at most $2c$. Therefore, by the pigeonhole principle, there must be $v < w$ such that $y_v$, $y_w$ are contained in the same interval. This, in turn, implies that $|y_v - y_w| < 2c$. As $A_v^\xi \subsetneq A_w^\xi$, there exists a non-empty set $S \subset [n] \setminus A_v^\xi$ such that $A_v^\xi \cup S = A_w^\xi$, and thus $(A_w^\xi)^c \cup S = (A_v^\xi)^c$. It follows that

$$ y_w = \sum_{j \in A_w^\xi} |x_j| - \sum_{j \in (A_w^\xi)^c} |x_j| = \sum_{j \in A_v^\xi} |x_j| + \sum_{j \in S} |x_j| - \left( \sum_{j \in (A_v^\xi)^c} |x_j| - \sum_{j \in S} |x_j| \right) = \left( \sum_{j \in A_v^\xi} |x_j| - \sum_{j \in (A_v^\xi)^c} |x_j| \right) + 2 \sum_{j \in S} |x_j| = y_v + 2 \sum_{j \in S} |x_j|, $$

which translates to

$$ y_w - y_v = 2 \sum_{j \in S} |x_j| \ge 2 |S| \min_j |x_j| \ge 2c, $$

contradicting our finding that $|y_v - y_w| < 2c$. The family $\mathcal{A}^\xi$ therefore must be a Sperner-$k$ family, which proves the first part of the lemma.

It remains to show that, if $V$ is symmetric, then $\mathcal{A}^\xi$ is contained in a symmetric Sperner-$k$ family. To see this, let $Y = \{y_1, \dots, y_k\}$ be the set of the centers of the $k$ open intervals of $V$. We can assume without loss of generality that they are distinct and that $Y = -Y$ is a symmetric set. Therefore, there exists a permutation $\pi$ of $[k]$ such that $y_i = -y_{\pi(i)}$ and $\pi$ has at most one fixpoint, i.e., a $1$-cycle corresponding to $y_i = 0$. All remaining cycles have length $2$. Hence, if $k$ is even, one can write

$$ V = \bigcup_{\ell=1}^{k/2} V_\ell \cup (-V_\ell), $$ (2.4)

where the sets $V_\ell$, $\ell \in [k/2]$, are open intervals of length at most $2c$ whose centers are contained in the positive real axis. If $k$ is odd, one can write

$$ V = V_0 \cup \bigcup_{\ell=1}^{\lfloor k/2 \rfloor} V_\ell \cup (-V_\ell), $$ (2.5)

where $V_0$ is an open interval of length at most $2c$ centered at zero and the sets $V_\ell$, $\ell \in [\lfloor k/2 \rfloor]$, are intervals of length at most $2c$ whose centers are contained in the positive real axis.

The same decomposition can now be applied to the matrix $T$, and for $\ell$ as in (2.4) or (2.5), we denote by $T^{(\ell)}$ the submatrix of $T$ containing the maximum number of rows $t$ of $T$ such that $\langle t, x \rangle \in V_\ell \cup (-V_\ell)$. By permuting the rows of each of the matrices $T^{(\ell)}$ and possibly adding further rows, we may assume that $T^{(\ell)} = -T^{(\ell)}$. Denote by $\mathcal{A}^{(\ell)}$ the family which arises by applying the construction described in (2.2) to the matrix $T^{(\ell)}$. We now claim that $-T^{(\ell)} = T^{(\ell)}$ also implies that $\mathcal{A}^{(\ell)} = (\mathcal{A}^{(\ell)})^{-\mathbf{1}}$. This directly follows from the observation that, for arbitrary $U \subset \mathbb{R}$ and $B \subset [n]$, multiplying

$$ \left( \sum_{j \in B} x_j - \sum_{j \in B^c} x_j \right) \in U $$

by $(-1)$ corresponds to exchanging the roles of $B$ and $B^c = B^{-\mathbf{1}}$.

The first part of the proof now implies that $\mathcal{A}_\ell^\xi$ is a Sperner-$1$ family for each $\ell \in [\lfloor k/2 \rfloor]$ and $\xi = \mathrm{sgn}(x)$, where $\mathcal{A}_\ell$ denotes the family arising from the rows $t$ of $T^{(\ell)}$ with $\langle t, x \rangle \in V_\ell$. Since we only applied row permutations or added further rows in order to construct the matrices $T^{(\ell)}$ from $T$, the decomposition in (2.4), resp. (2.5), now translates to

$$ \mathcal{A} \subset \bigcup_{\ell=1}^{k/2} \mathcal{A}_\ell \cup (-\mathcal{A}_\ell) = \bigcup_{\ell=1}^{k/2} \mathcal{A}_\ell \cup \mathcal{A}_\ell^{-\mathbf{1}} $$

in the case where $k$ is even, and

$$ \mathcal{A} \subset \mathcal{A}_0 \cup \bigcup_{\ell=1}^{\lfloor k/2 \rfloor} \mathcal{A}_\ell \cup \mathcal{A}_\ell^{-\mathbf{1}} $$

in the case where $k$ is odd. As the operation $(\cdot)^\xi$ is union compatible (see Remark 2.6), this completes the proof.

The assertions of Lemma 2.12 can also be transferred to binary matrices with values in $\{0,1\}$.

Lemma 2.13. Let $B$ be an $N \times n$ binary matrix with values in $\{0,1\}$ and let $x \in \mathbb{R}^n$, $\min_j |x_j| \ge c > 0$, be a vector such that $Bx \in V^N$, where $V$ is the union of $k$ open intervals of length at most $c$. Let $\mathcal{A} \subset 2^{[n]}$ be the family containing the sets

$$ A_i = \{ j \mid B_{i,j} = 1 \}, \quad i \in [N]. $$ (2.6)

Then $\mathcal{A}^{\mathrm{sgn}(x)}$ is a Sperner-$k$ family. If, in addition, the set

$$ \tilde{V} := V - \frac{1}{2} \sum_{j=1}^{n} x_j $$ (2.7)

is symmetric, it follows that $\mathcal{A}^{\mathrm{sgn}(x)}$ is a subfamily of a symmetric Sperner-$k$ family.

Proof. Let $T$ be the $N \times n$ matrix defined by

$$ T_{i,j} := \begin{cases} -1 & B_{i,j} = 0, \\ 1 & B_{i,j} = 1. \end{cases} $$

Since, in matrix form,

$$ B = \frac{1}{2} \left( \mathbf{1}_{N \times n} + T \right), $$ (2.8)

where $\mathbf{1}_{N \times n}$ is the $N \times n$ matrix where all entries are equal to $1$, for arbitrary $x \in \mathbb{R}^n$ it follows that

$$ (Bx)_i = \frac{1}{2} \left( \sum_{j=1}^{n} x_j + T_{i,:} x \right). $$

With $2V := \{ 2v \mid v \in V \}$, which is the union of $k$ open intervals of length at most $2c$, it therefore holds for arbitrary $i \in [N]$ that

$$ (Tx)_i \in 2V \quad \Leftrightarrow \quad (Bx)_i \in V + \frac{1}{2} \sum_{j=1}^{n} x_j. $$ (2.9)

The result now directly follows from Lemma 2.12, applied to $T$ and the shifted set $2V - \sum_{j=1}^{n} x_j = 2\tilde{V}$, by noting that, by (2.9), $Bx \in V^N$ is equivalent to $Tx \in (2\tilde{V})^N$, and $2\tilde{V}$ is symmetric if and only if the set $\tilde{V}$ in (2.7) is symmetric.

We will now introduce a second variant of Sperner families.

Definition 2.14. A family $\mathcal{A} \subset 2^{[n]}$ is a modulated Sperner-$k$ family if there exist a sign pattern $\xi \in \{\pm 1\}^n$ and a Sperner-$k$ family $\mathcal{B} \subset 2^{[n]}$ such that $\mathcal{A} = \mathcal{B}^\xi$. If $\mathcal{B}$ is a symmetric Sperner-$k$ family, we call $\mathcal{A}$ a modulated symmetric Sperner-$k$ family.

We will now prove a result which lays the foundation for the proof of Theorem 1.2. It is based on the following definition.

Definition 2.15. For arbitrary $\mathcal{A} \subset 2^{[n]}$ and $J \subset [n]$, let $\mathcal{A} \sqcap J \subset 2^J$ be defined as

$$ \mathcal{A} \sqcap J = \{ A \cap J \mid A \in \mathcal{A} \}. $$

Corollary 2.16. Let $T$ be an $N \times n$ binary matrix with values in $\{\pm 1\}$ and let $\mathcal{A} \subset 2^{[n]}$ be the family defined in Lemma 2.12. If there exist an $s$-sparse vector $x \in \mathbb{R}^n$ with

$$ \min_{j \in \mathrm{supp}(x)} |x_j| \ge c $$

and a set $V \subset \mathbb{R}$ which is the union of $k$ open intervals of length at most $2c$ such that $Tx \in V^N$, then there exists a set $J \subset [n]$, $|J| = s$, such that $\mathcal{A} \sqcap J$ is a modulated Sperner-$k$ family. If $V$ is symmetric, it follows that $\mathcal{A} \sqcap J$ is a modulated symmetric Sperner-$k$ family.

Proof. We will present the proof only for the symmetric case; the general case is similar. Suppose that there exist an $s$-sparse $x \in \mathbb{R}^n$ with

$$ \min_{j \in \mathrm{supp}(x)} |x_j| \ge c $$

and a symmetric set $V \subset \mathbb{R}$ which is the union of $k$ open intervals of length at most $2c$ such that $Tx \in V^N$. Set $J = \mathrm{supp}(x)$. Then $T_{:,J} x_J \in V^N$ and Lemma 2.12 implies that $(\mathcal{A} \sqcap J)^{\mathrm{sgn}(x_J)}$ is a subfamily of a symmetric Sperner-$k$ family. Since $\left((\mathcal{A} \sqcap J)^{\mathrm{sgn}(x_J)}\right)^{\mathrm{sgn}(x_J)} = \mathcal{A} \sqcap J$ (Proposition 2.8), it follows that $\mathcal{A} \sqcap J$ is a modulated symmetric Sperner-$k$ family.

Remark 2.17. Note that for an arbitrary s-sparse x ∈ R^n, any discrete set V with |V| = k is a subset of

    ∪_{y ∈ V} (y − c, y + c),

where c = min_{j ∈ supp(x)} |x_j|. Therefore, the assertion of Corollary 2.16 also holds for s-sparse x ∈ R^n and discrete sets V with |V| = k.

With Corollary 2.16, we are able to derive the following condition implying (1.7). It can be read directly from the matrix B without considering the affine hull.

Theorem 2.18. Let B be an N × n binary matrix with values in {0,1} and let A ⊂ 2^n be the family containing the sets

    A_i = {j | B_{i,j} = 1},  i ∈ {1, . . . , N}.

If none of the families

    {A ⊓ J | J ⊂ {1, . . . , n}, 2 ≤ |J| ≤ n}

is a subfamily of a modulated symmetric Sperner-2 family, then

    ∄ x ∈ R^n with Σ_{j=1}^n x_j = 1 and ‖x‖_0 ≥ 2 such that Bx ∈ {0,1}^N.    (2.10)

Proof. We can write the affine hull aff(B) as

    aff(B) = { Bx | x ∈ R^n, Σ_{j=1}^n x_j = 1 } = ∪_{s=0}^n aff_s(B),

where

    aff_s(B) = { Bx | x ∈ R^n, ‖x‖_0 = s, Σ_{j=1}^n x_j = 1 }.

In order to show (2.10), it suffices to prove

    aff_s(B) ∩ {0,1}^N = ∅  for all s with 2 ≤ s ≤ n.    (2.11)

Let s ≥ 2 be arbitrary, and let T ∈ {±1}^{N×n} be as in the proof of Lemma 2.13. Corollary 2.16 together with Remark 2.17 now implies that there do not exist an s-sparse vector x ∈ R^n and a symmetric set V with |V| = 2 such that Tx ∈ V^N. In particular,

    ∅ = {Tx | x ∈ R^n, ‖x‖_0 = s} ∩ {±1}^N = aff_s(B) ∩ {0,1}^N;

the last equality holds by (2.9). As s was arbitrary, this now implies (2.11) and completes the proof.

Remark 2.19. From now on, we will only consider {±1}-valued binary matrices. By Lemma 2.13, all results for {±1}-valued binary matrices carry over to {0,1}-valued binary matrices by adjusting the right hand sides accordingly.

3. The Span of Random Binary Matrices and the Lemma of Littlewood and Offord

In this section, we pass from the setting of deterministic binary N × n matrices to Bernoulli random matrices. As in the previous section, they induce random sets, which we will call Bernoulli random sets.

Definition 3.1.
• A Bernoulli random vector ϵ^(n) of parameter p ∈ (0,1) is a random vector whose entries ϵ_j, j ∈ [n], are independent copies of a Bernoulli random variable taking the values 1 and −1 with probability p and 1 − p, respectively.

• A Bernoulli random matrix E^(N,n) of parameter p ∈ (0,1) is an N × n random matrix in which each row is an independent copy of a Bernoulli random vector ϵ^(n) of parameter p.

• For any finite set J, the Bernoulli random set S^(J) of parameter p ∈ (0,1) is a random subset of J such that for any A ⊂ J,

    P[ S^(J) = A ] = p^{|A|} (1 − p)^{|J|−|A|}.    (3.1)

That is, each element is included independently with probability p.


Remark 3.2. For all random variables described above, we will sometimes omit the upper indices if they are clear from the context. Furthermore, we write q instead of 1 − p and S^(n) instead of S^([n]).

The connection between Bernoulli random vectors and matrices and Bernoulli random sets is evident from their definitions.

Remark 3.3. Let ϵ^(n) be a Bernoulli random vector with parameter p. Then the random set

    S^(n) = {j | ϵ_j = 1} ⊂ [n]

is a Bernoulli random set with the same parameter; we will call S^(n) the Bernoulli random set corresponding to ϵ^(n). Also note that, since for an arbitrary finite set J and a Bernoulli random set S^(J) of parameter p it holds that

    Σ_{A ∈ 2^J} P[ S^(J) = A ] = Σ_{s=0}^{|J|} (|J| choose s) p^s q^{|J|−s} = (p + q)^{|J|} = 1,

it follows that (3.1) actually defines a probability distribution.
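As a small illustration, the normalization of (3.1) can be confirmed by direct enumeration. The Python sketch below (with arbitrary choices of J and p; not part of the thesis) sums p^|A| q^(|J|−|A|) over all subsets A of J.

```python
# Check that (3.1) defines a probability distribution: the weights
# p^|A| * q^(|J|-|A|) over all subsets A of a finite set J sum to 1.
from itertools import combinations

def total_mass(J, p):
    q = 1 - p
    return sum(p ** len(A) * q ** (len(J) - len(A))
               for s in range(len(J) + 1)
               for A in combinations(J, s))

for p in (0.1, 0.5, 0.9):
    assert abs(total_mass(range(5), p) - 1.0) < 1e-12
```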

With the next lemma we can transfer the problem of bounding small ball probabilities as in (1.1) to the domain of Bernoulli random sets and (symmetric) Sperner-k families; it generalizes the ideas used by Erdős to prove the Lemma of Littlewood and Offord in [Erd45].

Lemma 3.4. Let ϵ^(n) be a Bernoulli random vector with parameter p and let S^(n) be the corresponding Bernoulli random set. If V is the union of k open intervals of length at most 2c and x ∈ R^n is an arbitrary vector with min_{j ∈ [n]} |x_j| ≥ c > 0, then

    P[ ⟨ϵ^(n), x⟩ ∈ V ] ≤ max_{A ⊂ 2^n, A Sperner-k} P[ S^(n) ∈ A^{sgn(x)} ] ≤ P_{k,n}(p),

where we set

    P_{k,n}(p) := max_{ξ ∈ {±1}^n} max_{A ⊂ 2^n, A Sperner-k} P[ S^(n) ∈ A^ξ ].

If V is symmetric, we only need to consider symmetric Sperner-k families; we denote the corresponding probability by P_{±,k,n}(p).


Proof. We will only prove the theorem in the case where V is symmetric; the general case follows analogously. For x ∈ R^n with min_j |x_j| ≥ c, let E be the matrix whose rows are all vectors e ∈ {±1}^n with ⟨e, x⟩ ∈ V. For the family B = {B_i | i ∈ [N]} with

    B_i = {j | E_{i,j} = 1} ⊂ [n],

Lemma 2.12 now implies that B^{sgn(x)} is a subfamily of a symmetric Sperner-k family. Since, by construction, it holds that

    ⟨e, x⟩ ∈ V  ⟺  e is a row of E  ⟺  {j | e_j = 1} ∈ B,

it follows with Remark 3.3 that

    P[ ⟨ϵ^(n), x⟩ ∈ V ] = P[ S^(n) ∈ B ].    (3.2)

Bearing in mind that the family B^{sgn(x)} is a subfamily of a symmetric Sperner-k family and that (A^ξ)^ξ = A for A ⊂ 2^n and ξ ∈ {±1}^n, we can bound (3.2) from above by

    P[ S^(n) ∈ B ] = P[ S^(n) ∈ (B^{sgn(x)})^{sgn(x)} ]
                   ≤ max_{A ⊂ 2^n, A symmetric Sperner-k} P[ S^(n) ∈ A^{sgn(x)} ]
                   ≤ max_{ξ ∈ {±1}^n} max_{A ⊂ 2^n, A symmetric Sperner-k} P[ S^(n) ∈ A^ξ ].

This completes the proof.
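The identity (3.2) can also be checked by exact enumeration for small n. The Python sketch below (with arbitrary example choices of x, V, and p; not part of the thesis) computes both sides of (3.2) over all 2^n sign vectors.

```python
# Exact check of the identity (3.2) for small n: the probability that
# <eps, x> lands in V equals the probability that the corresponding
# Bernoulli random set lies in the family B of supports. The vector x,
# the interval V, and p are arbitrary example choices.
from itertools import product

p, q = 0.3, 0.7
n = 4
x = [0.5, 1.0, -0.75, 2.0]

def in_V(t):
    return -1.5 < t < 1.5  # V = (-1.5, 1.5), one open interval

hits = [e for e in product((1, -1), repeat=n)
        if in_V(sum(ei * xi for ei, xi in zip(e, x)))]

# Left-hand side of (3.2): sum the probabilities of the sign vectors.
lhs = sum(p ** e.count(1) * q ** e.count(-1) for e in hits)

# Right-hand side: the family B of supports {j : e_j = 1}, weighted by (3.1).
B = {frozenset(j for j in range(n) if e[j] == 1) for e in hits}
rhs = sum(p ** len(A) * q ** (n - len(A)) for A in B)

assert abs(lhs - rhs) < 1e-12
```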

Remark 3.5. As in Remark 2.17, Lemma 3.4 implies for a Bernoulli random vector ϵ^(n) of parameter p, an arbitrary s-sparse x ∈ R^n, and an arbitrary discrete set V ⊂ R with |V| = k that

    P[ ⟨ϵ^(n), x⟩ ∈ V ] ≤ max_{A ⊂ 2^J, A Sperner-k} P[ S^(J) ∈ A^{sgn(x)} ] ≤ P_{k,s}(p),

where J = supp(x). If V is symmetric, we similarly obtain

    P[ ⟨ϵ^(n), x⟩ ∈ V ] ≤ max_{A ⊂ 2^J, A symmetric Sperner-k} P[ S^(J) ∈ A^{sgn(x)} ] ≤ P_{±,k,s}(p).

The quantities P_{k,n}(p) and P_{±,k,n}(p) will play an important role in the remainder of the chapter. A key distinction between the general and the symmetric case is that for the latter, the following basic monotonicity property no longer holds in general; see Remark 3.11 below.


Lemma 3.6. For arbitrary integers k and m ≤ n and arbitrary p ∈ (0,1), it holds that

    P_{k,m}(p) ≥ P_{k,n}(p).

Proof. By induction, it is enough to prove the statement for m = n − 1. For arbitrary A ⊂ 2^n, define

    A_0 = {A ⊂ [n−1] | A ∈ A} ⊂ 2^{n−1}  and  A_1 = {A ⊂ [n−1] | A ∪ {n} ∈ A} ⊂ 2^{n−1}.

If A is a Sperner-k family, then both A_0 ⊂ 2^{n−1} and A_1 ⊂ 2^{n−1} are Sperner-k families. Now let S^(n) and S^(n−1) be Bernoulli random sets with parameter p, let ξ ∈ {±1}^n be a sign pattern, and let ν ∈ {±1}^{n−1} be the restriction of ξ to the first n − 1 entries. If ξ_n = 1, we have

    P[ S^(n) ∈ A^ξ ] = q · P[ S^(n−1) ∈ A_0^ν ] + p · P[ S^(n−1) ∈ A_1^ν ] ≤ q P_{k,n−1}(p) + p P_{k,n−1}(p) = P_{k,n−1}(p).    (3.3)

If ξ_n = −1, inequality (3.3) holds true with the roles of p and q interchanged. This completes the proof.
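For small n, the quantity P_{k,n}(p) can be computed by brute force, giving a quick numerical confirmation of the monotonicity in Lemma 3.6. The sketch below is not taken from the thesis; it assumes the standard characterization of a Sperner-k family as one containing no chain of k+1 nested sets, and models the modulation A^ξ as the symmetric difference of each member with N_ξ = {j : ξ_j = −1}; both are assumptions about definitions stated earlier in the thesis and not restated here.

```python
# Brute-force P_{k,n}(p) for small n, as a numerical check of Lemma 3.6.
# Assumptions: a Sperner-k family contains no chain of k+1 nested sets;
# modulation A^xi takes the symmetric difference of each member with
# N_xi = {j : xi_j = -1}.
from itertools import combinations

def p_kn(k, n, p):
    q = 1 - p
    subsets = [frozenset(c) for s in range(n + 1)
               for c in combinations(range(n), s)]

    def longest_chain(fam, A):
        # length of the longest nested chain in fam starting at A
        ext = [longest_chain(fam, B) for B in fam if A < B]
        return 1 + max(ext, default=0)

    best = 0.0
    for mask in range(1 << len(subsets)):
        fam = [subsets[i] for i in range(len(subsets)) if mask >> i & 1]
        if any(longest_chain(fam, A) > k for A in fam):
            continue  # contains a chain of k+1 sets: not Sperner-k
        for flip in subsets:  # all modulations xi
            prob = sum(p ** len(A ^ flip) * q ** (n - len(A ^ flip))
                       for A in fam)
            best = max(best, prob)
    return best

p = 0.3
assert abs(p_kn(1, 2, p) - (p ** 2 + (1 - p) ** 2)) < 1e-12  # Lemma 3.9
assert p_kn(1, 3, p) <= p_kn(1, 2, p)                        # Lemma 3.6
assert p_kn(2, 3, p) <= p_kn(2, 2, p)                        # Lemma 3.6
```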

The next definition is required to transfer the ideas of Corollary 2.16 to the random matrix case.

Definition 3.7. Let F_{k,n} ⊂ 2^{2^n} be the set of all maximal modulated Sperner-k families, that is, the set of all modulated Sperner-k families A ⊂ 2^n which are not a proper subfamily of any other modulated Sperner-k family.

Furthermore, denote by F_{±,k,n} ⊂ 2^{2^n} the set of all maximal modulated symmetric Sperner-k families.

Remark 3.8. Definition 3.7 also enables us to rewrite the probability P_{k,n}(p) in terms of the set F_{k,n}, since for a Bernoulli random set S^(n) with parameter p it holds that

    P_{k,n}(p) = max_{ξ ∈ {±1}^n} max_{A ⊂ 2^n, A Sperner-k} P[ S^(n) ∈ A^ξ ] = max_{A ∈ F_{k,n}} P[ S^(n) ∈ A ],

and similarly for P_{±,k,n}(p) and F_{±,k,n}.

For arbitrary p ∈ (0,1), we will now compute the probabilities P_{1,2}(p) and P_{±,2,2}(p), which we will need later.


Lemma 3.9. For arbitrary p ∈ (0,1), it holds that

    P_{±,2,2}(p) = P_{1,2}(p) = p² + q².

Proof. Let p ∈ (0,1) be arbitrary and let F be the set of all Sperner families A ⊂ 2^2. Then

    F = { ∅, {∅}, {{1}}, {{2}}, {{1,2}}, {{1},{2}} }.

Direct calculations yield that the set of all modulated Sperner families of subsets of {1,2} is given by

    F′ = F ∪ { {∅,{1,2}} }.

Consequently, the set of all maximal modulated Sperner families is given by

    F_{1,2} = { {{1},{2}}, {∅,{1,2}} }.

Next, we will compute F_{±,2,2}. By Definition 2.10, Definition 2.14 and Remark 2.6, every modulated symmetric Sperner-2 family of subsets of [n] is of the form

    (A ∪ A^{−1})^ξ = A^ξ ∪ A^{−ξ},

where A ⊂ 2^n is a Sperner-1 family and ξ ∈ {±1}^n is a sign pattern. Since for A ∈ F_{1,2} one has A = A^{−1}, and hence

    F_{±,2,2} = { A ∪ A^{−1} | A ∈ F_{1,2} } = F_{1,2},

we can conclude that P_{±,2,2}(p) = P_{1,2}(p). It remains to show that P_{1,2}(p) = p² + q². For that, note that

    P_{1,2}(p) = max_{A ∈ F_{1,2}} P[ S^(2) ∈ A ] = max{ 2pq, p² + q² } = p² + q²,

where the last equality is implied by

    p² + q² − 2pq = (p − q)² ≥ 0.

This completes the proof.
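The enumeration of F_{1,2} in the proof above can be reproduced mechanically. The following Python sketch (not part of the thesis; it assumes that a Sperner family is an antichain and that modulation acts by symmetric difference with the set of flipped coordinates) recovers exactly the two maximal modulated Sperner families {{1},{2}} and {∅,{1,2}}.

```python
# Enumerate all modulated Sperner families over {1,2} and pick out the
# maximal ones, as in the proof of Lemma 3.9. Assumptions: a Sperner
# family is an antichain; modulation A^xi takes the symmetric difference
# of each member with the set of flipped coordinates.
from itertools import combinations

subsets = [frozenset(c) for s in range(3) for c in combinations((1, 2), s)]

def is_antichain(fam):
    # no member is a proper subset of another
    return all(not (A < B) for A in fam for B in fam)

modulated = set()
for mask in range(1 << len(subsets)):
    fam = [subsets[i] for i in range(len(subsets)) if mask >> i & 1]
    if is_antichain(fam):
        for flip in subsets:
            modulated.add(frozenset(A ^ flip for A in fam))

maximal = [F for F in modulated if not any(F < G for G in modulated)]

expected = {
    frozenset({frozenset({1}), frozenset({2})}),
    frozenset({frozenset(), frozenset({1, 2})}),
}
assert set(maximal) == expected

# With p = 0.3, the larger of the two probabilities is p^2 + q^2.
p, q = 0.3, 0.7
probs = [sum(p ** len(A) * q ** (2 - len(A)) for A in F) for F in maximal]
assert abs(max(probs) - (p * p + q * q)) < 1e-12
```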


Lemma 3.10. We have

    |F_{±,2,3}| = 4,

and for arbitrary p ∈ (0,1) it holds that

    P_{±,2,3}(p) = 1 − pq.

Remark 3.11. While the probabilities P_{k,n}(p) are non-increasing with respect to n (Lemma 3.6), the same does not necessarily hold true for P_{±,k,n}(p). Lemma 3.9 and Lemma 3.10 imply for arbitrary p ∈ (0,1) that

    P_{±,2,3}(p) − P_{±,2,2}(p) = 1 − pq − (p² + q²) = (p + q)² − pq − (p² + q²) = pq > 0.
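Remark 3.11 can be verified numerically. The sketch below (not part of the thesis) uses the characterization from the proof of Lemma 3.9, namely that every modulated symmetric Sperner-2 family has the form (A ∪ A^{−1})^ξ with A a Sperner-1 (antichain) family and A^{−1} the family of complements; how the general definitions are stated earlier in the thesis may differ slightly.

```python
# Brute-force check of Remark 3.11: P_{+-,2,2}(p) = p^2 + q^2 while
# P_{+-,2,3}(p) = 1 - pq, so the symmetric quantity is not monotone in n.
# Assumption (from the proof of Lemma 3.9): every modulated symmetric
# Sperner-2 family has the form (A u A^{-1})^xi with A an antichain,
# A^{-1} the family of complements, and modulation by symmetric difference.
from itertools import combinations

def p_sym_2(p, n):
    q = 1 - p
    ground = frozenset(range(n))
    subsets = [frozenset(c) for s in range(n + 1)
               for c in combinations(range(n), s)]

    def is_antichain(fam):
        return all(not (A < B) for A in fam for B in fam)

    best = 0.0
    for mask in range(1 << len(subsets)):
        A = [subsets[i] for i in range(len(subsets)) if mask >> i & 1]
        if not is_antichain(A):
            continue
        fam = set(A) | {ground - S for S in A}   # A u A^{-1}
        for flip in subsets:                     # all modulations xi
            mod = {S ^ flip for S in fam}
            best = max(best,
                       sum(p ** len(S) * q ** (n - len(S)) for S in mod))
    return best

p, q = 0.3, 0.7
assert abs(p_sym_2(p, 2) - (p * p + q * q)) < 1e-12   # Lemma 3.9
assert abs(p_sym_2(p, 3) - (1 - p * q)) < 1e-12       # Lemma 3.10
assert p_sym_2(p, 3) > p_sym_2(p, 2)                  # non-monotone in n
```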

Proof of Lemma 3.10. Since the families

    A_1 = { {1}, {2}, {3} }  and  A_1^{−1} = { {2,3}, {1,3}, {1,2} }

are Sperner families, the set

    F = { A_1^ξ ∪ (A_1^{−1})^ξ | ξ ∈ {±1}^3 }

consists of modulated symmetric Sperner-2 families. We claim that F = F_{±,2,3}. To this end, observe that by union compatibility, it holds for arbitrary ξ ∈ {±1}^3 that

    (A_1 ∪ A_1^{−1})^ξ = (2^3 \ {∅,{1,2,3}})^ξ = (2^3)^ξ \ {∅,{1,2,3}}^ξ = 2^3 \ {∅,{1,2,3}}^ξ.

Since {∅^ξ | ξ ∈ {±1}^n} = 2^n, it therefore follows that

    F = { 2^3 \ {A, A^c} | A ∈ 2^3 } = { 2^3 \ {A, A^c} | A ∈ 2^3, |A| ≤ 1 },    (3.4)

where the last equality holds by the symmetry of the set {A, A^c}. Consequently, all the modulated symmetric Sperner-2 families contained in F are maximal. Indeed, for any A ∈ F, adding a single set B ∈ 2^3 \ A results in a family which is not symmetric, and adding both missing sets results in 2^3, which is not a modulated symmetric Sperner-2 family. The same arguments also show that there are no modulated symmetric Sperner-2 families of cardinality larger than 6. For a modulated symmetric Sperner-2 family A ⊂ 2^3 of cardinality smaller than 6, its complement must also be symmetric, which shows that A is a subfamily of some A′ ∈ F. Hence, A cannot be maximal, and one has F_{±,2,3} = F. Since each A ∈ 2^3 with |A| ≤ 1 yields a different set 2^3 \ {A, A^c}, (3.4)
