Alternating Projections and Sparse-Polyhedral Feasibility

6 Angles, Polyhedral Sets, and Sparsity

6.3 Alternating Projections and Sparse-Polyhedral Feasibility

Define for every J ∈ 𝒥 the gap vector

g_J := P_{C−A_J}(0). (6.11)

Then, for every J ∈ 𝒥, the intersection (C − g_J) ∩ A_J is linearly regular. Further, for every J ∈ 𝒥, the intersection (C − g_J) ∩ A_s is locally linearly regular.

Proof. Choose an arbitrary J ∈ 𝒥 and an arbitrary x ∈ ℝⁿ. Then, at the intersection (C − g_J) ∩ A_s, we have the intersection of finitely many faces of a convex polyhedron. Each of these faces is locally an affine subspace. Hence, their intersection with a linear subspace is linearly regular. Further, the intersection with a union of finitely many linear subspaces is locally linearly regular.

6.3 Alternating Projections and Sparse-Polyhedral Feasibility

We want to give a complete analysis of the behavior of alternating projections applied to problem (6.7), starting with preparatory lemmata.

Lemma 6.3.1. Let Ω₁, Ω₂ ⊂ ℝⁿ be closed subsets. If there exists x ∈ Ω₁ such that d_{Ω₂}(x) = d_{Ω₁}(P_{Ω₂}x), then x ∈ Fix(P_{Ω₁}P_{Ω₂}).

Proof. This follows from the definitions of the projector (2.17) and fixed points (Definition 2.1.4).

Lemma 6.3.2. Let (x, y) ∈ C × A_s be a local best approximation pair (Definition 2.3.7). Then y ∈ StFix(P_{A_s}P_C). Further, for all x ∈ Fix(P_{A_s}P_C), the point x is a local best approximation point to C in A_s. Not every local best approximation point to C in A_s is contained in Fix(P_{A_s}P_C).

Proof. Let (x, y) ∈ C × A_s be a local best approximation pair. Then y ∈ P_{A_s}x and x ∈ P_C y, and there are neighborhoods of x and y in which no point is closer to the other set than x and y, respectively. Hence, y is a stable fixed point. Now, let x ∈ Fix(P_{A_s}P_C). Since x ∈ P_{A_s}P_C x, we know that ‖y − P_C x‖ ≥ ‖x − P_C x‖ for all y ∈ A_s. We consider two cases:

1. If x ∈ A_s ∩ C, then x is by definition a local best approximation point.

2. If x ∉ A_s ∩ C, then x has sparsity exactly equal to s since it is in the image of the projection onto A_s. Hence, there exists some δ > 0 such that A_s ∩ 𝔹_δ(x) = A_J ∩ 𝔹_δ(x), where J = I(x) (see (3.10)). In other words, the set A_J ∩ 𝔹_δ(x) is convex. The vector x − P_C x is, since x ∈ Fix(P_{A_s}P_C), normal to A_J. Since the sets A_J and C are convex, there exists a separating hyperplane H between A_J and C, which is normal to x − P_C x. This hyperplane can be shifted such that P_C x ∈ H. Because x − P_C x is also normal to A_J, we know that P_C x + A_J ⊂ H, i.e., A_J is parallel to H, and x − P_C x is the gap between H and A_J. From this we conclude that any point z ∈ 𝔹_δ(x) ∩ A_J must satisfy d_C(z) ≥ d_C(x). Hence, x is a local best approximation point.


We give an example that shows that not every local best approximation point to C in A_s is a fixed point of P_{A_s}P_C. Let A_s be the union of the x-axis and the y-axis in ℝ², and let C = {(2, 1)}. Then the point (0, 1) is a local best approximation point to C in A_s, but

P_{A_s}P_C(0, 1) = (2, 0) ≠ (0, 1).

This completes the proof.
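The counterexample in the proof can be checked numerically. The following Python sketch is an illustration under assumptions made here: the helper name project_sparse_1 is hypothetical, and the local optimality of (0, 1) is only sampled along the y-axis, which coincides with A_s in a small ball around (0, 1).

```python
import numpy as np

def project_sparse_1(x):
    """One nearest 1-sparse point: keep the entry of largest magnitude."""
    y = np.zeros_like(x)
    i = int(np.argmax(np.abs(x)))
    y[i] = x[i]
    return y

c = np.array([2.0, 1.0])   # C = {c}, so P_C maps every point to c
x = np.array([0.0, 1.0])   # candidate point on the y-axis

# Sample d_C along the y-axis near x; the minimum should sit at t = 1.
ts = np.linspace(0.5, 1.5, 101)
dists = np.hypot(0.0 - c[0], ts - c[1])
local_min_t = ts[np.argmin(dists)]

# One step of alternating projections started at x.
next_iterate = project_sparse_1(c)
print(local_min_t, next_iterate)
```

The sampled minimizer is t = 1, so (0, 1) is locally nearest to C, while the next iterate lies on the x-axis and differs from (0, 1).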

The following theorem gives an overview of the possibilities that can occur when dealing with alternating projections and the set of sparse vectors. A note on the alternating projections sequence in the following theorem: it may happen that the projection P_{A_s}x of a point x onto the set A_s is not single-valued. Then it is important to consider every point in P_{A_s}x because the sequence may continue favorably from one of them. See Figure 6.1 for an example that illustrates this issue.

Figure 6.1: If we start at the point x in the graphic and project onto the set of 1-sparse vectors, then we can choose between the points y and z. If we choose z, then projecting back onto the polyhedron C results in x again, while choosing y would lead to an alternating projections sequence that converges to the intersection C ∩ A₁. This example shows the necessity of different kinds of fixed points in Definition 2.1.4.

Theorem 6.3.3. Let C be a polyhedral set as defined in (6.5), and let s ∈ ℕ be such that 0 ≤ s ≤ min{m, n}. Let {xⁿ}_{n∈ℕ} be a sequence generated by the alternating projections operator P_{A_s}P_C, with x⁰ arbitrary. Then the sequence {xⁿ}_{n∈ℕ} has finitely many cluster points x̄₁, …, x̄_ν satisfying P_C x̄₁ = ⋯ = P_C x̄_ν.

Proof. First, we remind the reader of the result in Theorem 4.2.3 and that the set A_s can be written as a union of finitely many linear subspaces, i.e.,

A_s = ⋃_{J ⊂ {1,…,n}, #J = s} A_J, (6.12)

where A_J is the span of the vectors of the standard basis of ℝⁿ indexed by J. Proceeding, we note that the polyhedron C has finitely many faces. These faces will be denoted by F_I with I ∈ 2^{{1,…,m}}, where I is an index set corresponding to rows of the matrix M, i.e.,

F_I := C ∩ B_I := C ∩ {x ∈ ℝⁿ | M_I x = p_I}. (6.13)

Further, define the sets

B_I := aff F_I for all I ∈ 2^{{1,…,m}}. (6.14)

The dimension of F_I will be defined as the dimension of its affine hull B_I. For every index set I ∈ 2^{{1,…,m}} and for every index set J as in (6.12), there exist, by Lemma 2.3.11, nearest points of the affine hull B_I to the subspace A_J. We observe that we have finitely many affine hulls B_I of faces of C and finitely many linear subspaces A_J. This gives us finitely many gap vectors g_{IJ} between the B_I and A_J as well as finitely many Friedrichs angles between the B_I and A_J, whose cosines are given by c_{IJ} := c(B_I, A_J).

[…] step of the alternating projections sequence, we have

P_C x^k = P_{B_{I_k}} x^k. (6.15)

If we had equality, then, by Theorem 4.2.2, we would have a fixed point of alternating projections between B_{I_k} − g_{I_k J_k} and A_{J_k} and, hence, a best approximation point to B_{I_k} in A_{J_k}. The only possibility for the point x^k not to be a best […]

Therefore, suppose that (6.16) holds. Further, d_{A_s} […]


Further, we know by Theorem 4.2.3 that

d_C(P_{A_{J_k}}P_C x^k) − ‖g_{I_k J_k}‖ ≤ ρ_k (d_C(x^k) − ‖g_{I_k J_k}‖)
⇔ d_C(P_{A_{J_k}}P_C x^k) ≤ ρ_k (d_C(x^k) − ‖g_{I_k J_k}‖) + ‖g_{I_k J_k}‖ (6.19)

with ρ_k < 1. In other words, in step k, the sequence approaches a local best approximation point to B_{I_k} in A_{J_k}. The constant ρ_k depends on the Friedrichs angle between A_{J_k} and B_{I_k}. Since there are only finitely many different Friedrichs angles between the affine hulls B_I of faces F_I and linear subspaces A_J, there are only finitely many different constants ρ_k. For simplicity, we define

ρ := max{ρ_k | ρ_k < 1, k ∈ ℕ}. (6.20)

If the cosine of the Friedrichs angle of B_{I_k} − g and A_{J_k} is zero, then, by Lemma 6.2.6, either one of the spaces is contained in the other one or the spaces fulfill an orthogonality condition. In both cases, the sequence would attain a fixed point after one more step.

2. Let x^k be a fixed point of P_{A_s}P_C. If P_{A_s}P_C x^k ⊂ Fix P_{A_s}P_C, then, for every AP step, we can choose the next iterate among the points in P_{A_s}P_C x^k and, hence, obtain the subsequences {x^{n_{k_1}}}_{k_1∈ℕ}, …, {x^{n_{k_ν}}}_{k_ν∈ℕ} ⊂ {xⁿ}_{n∈ℕ}, where ν is the cardinality of P_{A_s}P_C x^k. Because P_{A_s}P_C x^k ⊂ Fix P_{A_s}P_C and since C is convex, we have P_C x = P_C x^k for all x ∈ P_{A_s}P_C x^k.

If P_{A_s}P_C x^k contains some y which is not a fixed point of P_{A_s}P_C, then, as soon as y = x^{k+1} is chosen as the next iterate, we are in a situation similar to the previous case.
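The rates ρ_k in (6.19) depend on the Friedrichs angles between the B_I and the A_J. As an illustration only, the following Python sketch computes the cosine of the Friedrichs angle between two linear subspaces from the singular values of Q_Aᵀ Q_B. The function name friedrichs_cosine, the tolerance tol, and the test subspaces are assumptions made here; the inputs are assumed to be basis matrices of full column rank, and for an affine hull B_I one would pass a basis of its direction space. How ρ_k is obtained from this cosine is the content of Theorem 4.2.3 and is not reproduced.

```python
import numpy as np

def friedrichs_cosine(A, B, tol=1e-10):
    """Cosine of the Friedrichs angle between span(A) and span(B).

    Singular values of Q_A^T Q_B are the cosines of the principal angles.
    Values equal to 1 (up to tol) belong to the intersection of the two
    subspaces and are discarded; the largest remaining value is returned.
    If nothing remains, the cosine is 0.
    """
    QA, _ = np.linalg.qr(A)            # orthonormal basis of span(A)
    QB, _ = np.linalg.qr(B)            # orthonormal basis of span(B)
    s = np.linalg.svd(QA.T @ QB, compute_uv=False)   # sorted descending
    s = s[s < 1.0 - tol]
    return float(s[0]) if s.size else 0.0

# span{e1, e2} and span{e1, e2 + e3} in R^3 share the line span{e1}.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
c = friedrichs_cosine(A, B)
print(c)   # cosine of 45 degrees

# One subspace contained in the other: the cosine is zero.
c_contained = friedrichs_cosine(A[:, :1], A)
```

Angles whose cosine lies within tol of 1 are treated as belonging to the intersection, so the tolerance has to be chosen with the conditioning of the data in mind.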

What remains to be shown is that, by this procedure, the sequence actually converges.

We know that the sequence {δⁿ}_{n∈ℕ} with

δⁿ := d_C(xⁿ) (6.21)

is nonnegative and monotonically decreasing (Lemma 4.2.4). This is already sufficient for the sequence {δⁿ}_{n∈ℕ} to converge. We denote the limit by

δ := lim_{n→∞} δⁿ. (6.22)
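The monotone behavior of δⁿ can be observed on a small instance. The sketch below is an illustration under assumptions made here: a single halfspace in ℝ² stands in for the polyhedron C, the set of sparse vectors is A₁, and the helper names project_C, dist_C, and project_A1 as well as the starting point are hypothetical.

```python
import numpy as np

a, b = np.array([1.0, 1.0]), 2.0      # C = {x : <a, x> >= b}, a halfspace

def project_C(x):
    """Orthogonal projection onto the halfspace C."""
    gap = b - a @ x
    return x if gap <= 0 else x + gap / (a @ a) * a

def dist_C(x):
    """Distance d_C(x)."""
    return float(np.linalg.norm(x - project_C(x)))

def project_A1(x):
    """One nearest 1-sparse point: keep the entry of largest magnitude."""
    y = np.zeros_like(x)
    i = int(np.argmax(np.abs(x)))
    y[i] = x[i]
    return y

x = np.array([0.3, 0.1])
deltas = []
for _ in range(30):
    x = project_A1(project_C(x))      # one step of P_{A_1} P_C
    deltas.append(dist_C(x))          # delta^n = d_C(x^n)

print(x, deltas[:4])
```

In this instance the iterates stay on the x-axis and approach the point (2, 0) of C ∩ A₁, so the limit δ is zero, and consecutive distances shrink by the factor 1/2, which is the squared cosine of the 45° angle between the boundary line of C and the x-axis.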

Hence, there exists a sequence {εⁿ}_{n∈ℕ} ⊂ ℝ such that for all n ∈ ℕ, the term d_C(xⁿ)² can be written as δ² + εⁿ with εⁿ ≥ 0 and lim_{n→∞} εⁿ = 0. Now, let n ∈ ℕ be such that εⁿ is sufficiently small. By monotonicity of the distance of the iterates xⁿ to the set C, there exists εⁿ⁺¹ satisfying 0 ≤ εⁿ⁺¹ ≤ εⁿ such that

d_C(xⁿ)² = δ² + εⁿ,
d_C(xⁿ⁺¹)² = δ² + εⁿ⁺¹.


Together with convexity (see (2.31)) of the set C, we can give a bound for the squared distance between P_C xⁿ and P_C xⁿ⁺¹:

[…]

As already noted, there exist finitely many rates ρ_{IJ} in Equation (6.19), together with a smallest positive ρ, defined in Equation (6.20). We distinguish between two cases:

1. If the number of iterates n such that d_C(xⁿ⁺¹) […] there can only be finitely many cluster points of the sequence {xⁿ}_{n∈ℕ}. Their projection onto C, however, is identical, which was claimed.

2. If the number of iterates n such that d_C(xⁿ⁺¹) = d_C(xⁿ) is finite, then we can choose n̄ large enough such that d_C(xⁿ⁺¹) < d_C(xⁿ) for all n ≥ n̄. If we write d_C(xⁿ) = δ + τⁿ, then, by Equation (6.20), we know that τⁿ⁺¹ ≤ ρτⁿ for all n ≥ n̄.

To adapt our notation to (6.3), we can write

d_C(xⁿ)² = δ² + εⁿ = δ² + τⁿ(2δ + τⁿ), i.e., εⁿ = τⁿ(2δ + τⁿ).


[…] = √εⁿ · 1/(1 − √ρ).

From this, we deduce that the sequence {P_C xⁿ}_{n∈ℕ} is bounded. Hence, it is a subset of a compact subset of ℝⁿ, and there exists a cluster point y ∈ C of the sequence {P_C xⁿ}_{n∈ℕ}. Further, this cluster point must satisfy d_{A_s}(y) = δ. Finally, we show by contradiction that the cluster point y is unique. Assume there is another cluster point y′ ≠ y, y′ ∈ C, of the sequence {P_C xⁿ}_{n∈ℕ}. Then the distance between these points is d := ‖y − y′‖₂ > 0. Since the sequence {εⁿ}_{n∈ℕ} converges monotonically to zero, we can find some n ∈ ℕ such that √εⁿ · 1/(1 − √ρ) ≤ d/2 and such that ‖P_C xⁿ − y‖ < d/4. Hence, there cannot be an infinite subsequence {P_C x^k}_{k∈I} ⊂ {P_C xⁿ}_{n∈ℕ} having y′ as a cluster point because there exist neighborhoods of y′ such that there exists no k such that P_C x^k is contained in these neighborhoods.

This contradicts the fact that y′ is a cluster point. We conclude that y ∈ C is the only cluster point of {P_C xⁿ}_{n∈ℕ}. As before, since P_{A_s}P_C xⁿ is of finite cardinality, there can only be finitely many cluster points of the sequence {xⁿ}_{n∈ℕ}.

This completes the proof.

Remark 6.3.4. The novelty in Theorem 6.3.3 is that we have shown convergence of the alternating projections in sparse polyhedral feasibility without any further restriction. In particular, if C is a compact set, then classical convergence theory applies. In Theorem 6.3.3 we do not require any compactness of the polyhedral set.

We present examples for different cases in Theorem 6.3.3. A good example is the objective of finding the nearest point of a line L in three dimensions to the set of 1-sparse vectors A₁ ⊂ ℝ³. Suppose that L has no common point with A₁. Then there exists a best approximation pair between L and A₁.

Figure 6.2: An example of an alternating projections sequence that finds a best approximation pair between the set of 1-sparse vectors in ℝ² and the polyhedron C in finitely many steps. The proximal normal cone at x̄ is shown in green.

The following can occur when the nearest point x in a polyhedron to the set of sparse vectors is a corner of the polyhedron. The alternating projections sequence behaves as if it is finding the best approximation pair relative to one of the faces of the polyhedron. Then it may happen that the sequence contains points which are normal to another face of the polyhedron. The following projection onto the polyhedron will be in the intersection of these two faces. Figure 6.2 shows an example of this case.

Another example is as follows: take C = {(1, 1)} and find the nearest point in the set of 1-sparse vectors. Generate the alternating projections sequence with an arbitrary initial point in ℝ². The projection P_C is always the point (1, 1), while P_{A₁}(1, 1) = {(1, 0), (0, 1)}. Hence, we can choose a point in P_{A₁}(1, 1). So the AP sequence can be any randomly chosen sequence of the points (1, 0) and (0, 1), which is not a best approximation pair (see Figure 6.3).

Figure 6.3: An example of a best approximation pair between the set of 1-sparse vectors in ℝ² and the polyhedron C where the pair is not unique.
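The set-valued projection onto A_s used in these examples can be made explicit by enumerating the subspaces A_J from (6.12). The following Python sketch is a brute-force illustration, feasible only for small n; the function name sparse_projections and the tolerance tol are assumptions made here.

```python
import numpy as np
from itertools import combinations

def sparse_projections(x, s, tol=1e-12):
    """All nearest points to x in A_s, by brute force over supports J."""
    x = np.asarray(x, dtype=float)
    best, points = np.inf, []
    for J in combinations(range(x.size), s):
        y = np.zeros_like(x)
        y[list(J)] = x[list(J)]              # projection of x onto A_J
        d = np.linalg.norm(x - y)
        if d < best - tol:                   # strictly closer: start over
            best, points = d, [y]
        elif abs(d - best) <= tol:           # tie: keep it if it is new
            if not any(np.array_equal(y, q) for q in points):
                points.append(y)
    return points

ties = sparse_projections([1.0, 1.0], s=1)     # two nearest points
unique = sparse_projections([2.0, 1.0], s=1)   # a single nearest point
print(ties, unique)
```

For the point (1, 1) the function returns both (1, 0) and (0, 1), matching the example above, whereas for (2, 1) the projection is single-valued.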

Now we can formulate a necessary condition for the method of alternating projections to converge globally.

Theorem 6.3.5. Let B and A_s be given as in (3.6) and (3.5), respectively. Suppose A_s ∩ B ≠ ∅. Let

A_s = ⋃_{J∈𝒥_s} A_J, (6.23)

where 𝒥_s is given by (3.11). Denote by (x_J, x^s_J) ∈ B × A_s the best approximation pairs between B and A_J for every J ∈ 𝒥_s. Then, for an arbitrary x⁰ ∈ ℝⁿ, the sequence generated by

x^{k+1} = T_AP x^k = P_{A_s}P_B x^k (6.24)

converges to a point x̄ ∈ A_s ∩ B if and only if, for all x_J ∉ B ∩ A_s, we have

d_{A_s}(x_J) < d_{A_J}(x_J) for all J ∈ 𝒥_s. (6.25)

Proof. First, we note that the set B is by definition a polyhedral set. By Theorem 6.3.3, the method of alternating projections converges at a linear rate to a best approximation pair. Now define

δ := min_{J∈𝒥_s} d(A_J, B). (6.26)


Suppose now that, for every best approximation pair (x_J, x^s_J), the property (6.25) is fulfilled. Then, after finitely many steps, say k, we have d_B(x^k) < δ. From then on the sequence must converge to A_s ∩ B at a linear rate.

On the other hand, suppose property (6.25) is not fulfilled for some I ∈ 𝒥_s. Then we can just consider the alternating projections sequence initiated at x⁰ = x_I. Since d_{A_s}(x_I) = d_{A_I}(x_I), we can choose P_{A_s}x^k = x^s_I. Then x^{k+1} = x_I = x^k, and the sequence will not converge at all to a solution of (3.8). This completes the proof.
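The iteration (6.24) is short to write down once both projectors are available. The following Python sketch is an illustration under assumptions made here: B is taken to be an affine set {x : Mx = p}, the matrix M, the vector p, the starting point, and the helper names project_affine and project_sparse are hypothetical, and ties in the sparse projection are broken by index order rather than explored.

```python
import numpy as np

def project_affine(x, M, p):
    """Orthogonal projection onto B = {x : M x = p} via the pseudoinverse."""
    return x - np.linalg.pinv(M) @ (M @ x - p)

def project_sparse(x, s):
    """One element of P_{A_s} x: keep the s entries of largest magnitude."""
    y = np.zeros_like(x)
    keep = np.argsort(-np.abs(x), kind="stable")[:s]
    y[keep] = x[keep]
    return y

M = np.array([[1.0, 1.0, 1.0]])       # B = {x in R^3 : x1 + x2 + x3 = 1}
p = np.array([1.0])
x = np.array([1.0, 0.2, 0.0])

for _ in range(200):
    x = project_sparse(project_affine(x, M, p), s=1)   # x <- P_{A_s} P_B x

print(x)
```

In this instance every coordinate axis meets B, and the iterates converge to the point (1, 0, 0) in A₁ ∩ B.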

As seen before, there are strong sufficient conditions for alternating projections to converge globally in the sparse affine feasibility problem. The condition (5.13) can also be formulated in terms of angles.

Lemma 6.3.6. Property (5.13) holds true if and only if we have

c(A_J, ker(M)) ≤ √(1 − δ_{2s}) for all J ∈ 𝒥_s. (6.27)

Proof. By (5.13), we have

(1 − δ_{2s})‖x‖₂² ≤ ‖M^†Mx‖₂² for all x ∈ A_{2s}. (6.28)

In other words, cos(α) ≤ √(1 − δ_{2s}) for all J ∈ 𝒥_s.

To close this chapter, we consider an example for a setup where an alternating projections sequence will converge to the intersection of the set of sparse vectors with an affine subspace.

Example 6.3.7. Consider the alternating projections algorithm applied to the set of 2-sparse vectors in ℝⁿ and the one-dimensional affine space defined by

B(λ) =

Apparently, the only solution to (3.8) is the point with its first entry equal to µ and all other entries equal to zero. For a thought experiment, we determine all best approximation pairs between B and the subspaces obtained by the decomposition of A₂ (see Equation (3.12)).

First, we consider

[…]

where e_i are the standard unit vectors in ℝⁿ. Let us determine the best approximation pairs between B and span{e_i, e_j} for i, j ≠ 1. The case where i = 1 is trivial due to the existing intersection. Without loss of generality, let i = 2, j = 3. A best approximation pair between B and span{e_i, e_j} will be of the shape ((λ + x, x, …, x), (0, x, x, 0, …, 0)). The projection […] first-order necessary condition for optimality yields

d/dy […] = 2λ + 2ny − 4x = 0
⇔ y = (2x − λ)/n. (6.32)

Now, in order to have a best approximation pair, the equality x = (2x − λ)/n […]

In other words, the projection of […]. From now on, the alternating projections sequence will converge to the solution of (3.8). Using Theorem 6.3.3, we know that any sequence of alternating projections between A₂ and a convex polyhedral set converges to best approximation pairs. But initiated at a best approximation pair, the sequence approaches the intersection A₂ ∩ B. This shows that the affine subspace B is a prototype for an affine subspace where, independent of how it is shifted from the origin, alternating projections converges globally.

In document: Projection Methods in Sparse and Low Rank Feasibility (pages 77-87)