Analysis of the outer optimization problem

2.3 First order analysis via reduction technique

2.3.6 Analysis of the outer optimization problem

By means of the preliminary work of Paragraph 2.3.5, it is possible to derive the derivative of the reduced objective $F$ of the reduced bilevel optimization problem (2.38). This is step two of the general recipe of Paragraph 2.3.2 within the analysis of the set optimal control problem (2.30).

By means of the geometry-to-solution operator $G$ (cf. Paragraph 2.3.4), it is possible to reduce the bilevel optimization problem BiOP to (2.38). The detailed analysis of the inner optimization problem (cf. Paragraph 2.3.5) yields necessary and sufficient conditions, which enable an easy evaluation of the geometry-to-solution operator. Altogether, the set optimal control problem (2.30) is equivalent to a strongly reduced shape/topology optimization problem.

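Theorem 6 (Set optimal control problem as shape/topology optimization problem):

The set optimal control problem (2.30) is equivalent to the shape/topology optimization problem

\begin{align}
\text{minimize}\quad F(B) := {}& \tfrac{1}{2}\,\|\bar y_J - y_d\|^2_{L^2(J)} + \tfrac{1}{2}\,\|y^{\max}_{\min} - y_d\|^2_{L^2(\mathring{B})} \notag\\
& + \tfrac{1}{2\lambda}\,\|\bar p_J\|^2_{L^2(J)} + \tfrac{1}{2\lambda}\,\|p^{\max}_{\min}\|^2_{L^2(\mathring{B})} \tag{2.45a}
\end{align}

subject to

\begin{align}
B &\in \mathcal{O}, \tag{2.45b}\\
y_{\min} < \bar y_J &< y_{\max} && \text{in } J, \tag{2.45c}\\
-\Delta \bar y_J + \bar y_J + \tfrac{1}{\lambda}\,\bar p_J &= u_d && \text{in } J, \tag{2.45d}\\
\partial_n \bar y_J &= 0 && \text{on } \Gamma, \tag{2.45e}\\
\bar y_J|_\beta &= y^{\max}_{\min}|_\beta && \text{on } \beta, \tag{2.45f}\\
\partial_n^J \bar y_J &= \partial_n^J y^{\max}_{\min} && \text{on } \beta, \tag{2.45g}\\
-\Delta \bar p_J + \bar p_J - \bar y_J &= -y_d && \text{in } J, \tag{2.45h}\\
\partial_n \bar p_J &= 0 && \text{on } \Gamma, \tag{2.45i}\\
\bar p_J \in L^2(J,\Delta), \quad \bar y_J &\in H^2(J), \tag{2.45j}
\end{align}

in the following sense:

Let $(A; \bar u_I, \bar u_A, \bar y_I, \bar y_A)$ be the optimal solution of (2.30) and let $\bar B$ be the optimal solution of (2.45). Then

\[
A = \bar B, \qquad \bar u_A = -\tfrac{1}{\lambda}\,p^{\max}_{\min} + u_d, \qquad \bar y_A = y^{\max}_{\min}, \qquad \bar u_I = -\tfrac{1}{\lambda}\,\bar p_J + u_d, \qquad \bar y_I = \bar y_J.
\]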


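In particular, (2.45) is uniquely solvable. Here, the coefficient function $p^{\max}_{\min} \in H^2(\Omega)$ is constructed in the same way as $y^{\max}_{\min}$ in Lemma 4, but such that

\[
p^{\max}_{\min}(x) =
\begin{cases}
\lambda\bigl(\Delta y_{\max}(x) - y_{\max}(x) + u_d(x)\bigr), & x \text{ in a neighborhood of } B_{\max},\\
\lambda\bigl(\Delta y_{\min}(x) - y_{\min}(x) + u_d(x)\bigr), & x \text{ in a neighborhood of } B_{\min},
\end{cases}
\qquad \partial_n p^{\max}_{\min} = 0 \ \text{ on } \Gamma. \tag{2.46}
\]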

Remarks:

Although the set optimal control problem (2.30) and the shape/topology optimization problem look similar, there is an essential discrepancy: the boundary value problem (2.45d)–(2.45i) is uniquely solvable for any given $B \in \mathcal{O}$, whereas (2.30h)–(2.30l) is not. Consequently, the set optimal control problem requires optimization with respect to the function space variables, whereas its reduced counterpart does not.

The strict inequality constraint (2.45c) plays the role of a constraint here, which influences the admissibility of the geometrical splitting $\Omega = B \,\dot\cup\, J$. That is to say, the constraint is a state constraint in shape/topology optimization. Moreover, in view of the discussion of Paragraph 2.2.4, it is expected that it has no effect on first order necessary conditions. This actually turns out to be true in Paragraph 2.3.7.

Proof. The set-OCP (2.30) is equivalent to the bilevel optimization problem (2.36), (2.37) according to Theorem 4. By strict convexity of the inner optimization problem (2.37) (confer the proof of Theorem 3), its first order necessary conditions (2.42) are sufficient, too. Hence, the inner optimization problem can equivalently be replaced by its optimality system within the bilevel optimization problem.

However, one is only interested in the optimal primal variables $\bar u_J$, $\bar u_B$, $\bar y_J$, $\bar y_B$ and not in the dual ones.

Consequently, it is sufficient to execute the first three solving steps of Lemma 7:

1. assign $\bar y_B = y^{\max}_{\min}$,

2. solve (2.44), i.e. solve (2.45d)–(2.45i),

3. assign $\bar u_J = -\tfrac{1}{\lambda}\,\bar p_J + u_d$ and $\bar u_B = -\Delta y^{\max}_{\min} + y^{\max}_{\min} = -\tfrac{1}{\lambda}\,p^{\max}_{\min} + u_d$.

Plugging these results into the objective $\mathcal{J}$ in (2.30a) yields (2.45a):

\[
F(B) := \mathcal{J}(B; \bar u_J, \bar u_B, \bar y_J, \bar y_B) = \mathcal{J}\bigl(B; -\tfrac{1}{\lambda}\,\bar p_J + u_d,\ -\tfrac{1}{\lambda}\,p^{\max}_{\min} + u_d,\ \bar y_J,\ y^{\max}_{\min}\bigr).
\]

All in all, one has reached the reduced reformulation (2.45).
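The substitution in step 3 can be traced term by term. The following sketch assumes that the objective (2.30a), which is not reproduced in this section, is of tracking type with control cost $\tfrac{\lambda}{2}\|u - u_d\|^2$ on each of the two subdomains; under that assumption the control terms turn into the adjoint terms of (2.45a), and the two expressions for $\bar u_B$ in step 3 agree because of (2.46).

```latex
% Assumption: (2.30a) contains the control costs
%   (lambda/2) ||u_J - u_d||^2_{L^2(J)} + (lambda/2) ||u_B - u_d||^2_{L^2(B)}.
\begin{align*}
\frac{\lambda}{2}\,\|\bar u_J - u_d\|^2_{L^2(J)}
  &= \frac{\lambda}{2}\,\Bigl\|-\frac{1}{\lambda}\,\bar p_J\Bigr\|^2_{L^2(J)}
   = \frac{1}{2\lambda}\,\|\bar p_J\|^2_{L^2(J)}, \\
\frac{\lambda}{2}\,\|\bar u_B - u_d\|^2_{L^2(\mathring{B})}
  &= \frac{\lambda}{2}\,\Bigl\|-\frac{1}{\lambda}\,p^{\max}_{\min}\Bigr\|^2_{L^2(\mathring{B})}
   = \frac{1}{2\lambda}\,\|p^{\max}_{\min}\|^2_{L^2(\mathring{B})}, \\
-\frac{1}{\lambda}\,p^{\max}_{\min} + u_d
  &= -\bigl(\Delta y^{\max}_{\min} - y^{\max}_{\min} + u_d\bigr) + u_d
   = -\Delta y^{\max}_{\min} + y^{\max}_{\min}
   \qquad \text{(by (2.46), near } B\text{)}.
\end{align*}
```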

As already mentioned, the scope of this paragraph is to execute the second step of the general recipe of Paragraph 2.3.2. The first part therein is to prove differentiability of the control-to-state operator $S$, which means proving shape differentiability of the equality constraints in the present context.

However, differentiability of the constraints could not yet be proven (see Appendix C for a more detailed discussion) and has to be assumed here.

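Lemma 8 (Shape derivative of the constraints):

Let the family of admissible sets $\mathcal{O}$ be given by Definition 4. Let $B \in \mathcal{O}$ be given such that the solution $(\bar y_J, \bar p_J)$ of (2.45d)–(2.45j) lies in $H^2(J) \times H^1(J)$.¹² Moreover, let $p^{\max}_{\min}$ be defined as in (2.46). Additionally, assume that the boundary value problem (2.45d)–(2.45j) is shape differentiable.

Then for each

\[
V \in \mathcal{V} := \{\, W \in C^{1,1}(\Omega, \mathbb{R}^2) \mid W \cdot n = 0 \text{ on } \Gamma \,\} \tag{2.47}
\]

the (local¹³) shape derivatives $y_J'[V] \in H^2(J)$ and $p_J'[V] \in L^2(J,\Delta)$ are given as the unique solution of the boundary value problem

\begin{align}
-\Delta y_J'[V] + y_J'[V] &= -\tfrac{1}{\lambda}\,p_J'[V] && \text{in } J, \tag{2.48a}\\
\partial_n y_J'[V] &= 0 && \text{on } \Gamma, \tag{2.48b}\\
y_J'[V]|_\beta &= 0 && \text{on } \beta, \tag{2.48c}\\
\partial_n^J y_J'[V] &= V \cdot n_J\,\tfrac{1}{\lambda}\bigl(p^{\max}_{\min}|_\beta - \bar p_J|_\beta\bigr) && \text{on } \beta, \tag{2.48d}\\
-\Delta p_J'[V] + p_J'[V] &= y_J'[V] && \text{in } J, \tag{2.48e}\\
\partial_n p_J'[V] &= 0 && \text{on } \Gamma. \tag{2.48f}
\end{align}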

Remark:

1. The definition of the space of velocity fields $\mathcal{V}$ is advisedly chosen:

• it ensures that the holdall $\Omega$ remains unchanged under the action of $V$;

• $C^{1,1}$ regularity of the transported candidate active set $B_t$ is preserved; see Paragraph 2.6.1;

• the regularity assumptions (V) (cf. [44, Chp. 4, Eq. (5.5)]), required for the definition of Hadamard differentiability (cf. [44, Chp. 9, Def. 3.1]), which is the basis for the definition of shape differentiability (cf. [44, Chp. 9, Def. 3.4]), are fulfilled. In particular, see [44, Chp. 4, Rem. 5.2 and the introduction to Sec. 5.2].

2. As already mentioned in the 5th item of the Remarks on page 36, the low regularity of $\bar p_J$ is crucial. In particular, it has not yet been possible to prove Lemma 8 without the additional regularity assumption on $\bar p_J$. From the perspective of necessary conditions of the set-OCP (2.30), the assumption is unproblematic, since it is fulfilled at the optimum. However, from an algorithmic standpoint, the assumption may be a true restriction, since the optimality system of iOP and its local shape derivative system (2.48) have to be solved at non-optimal configurations as well; see Algorithm 1.

3. The notation $(\cdot)'[V]$ is used here for the local shape derivative. The explicit usage of the velocity field $V$ indicates that this object is a semiderivative and hence requires a “direction”.

Proof. The proof consists of two parts. Firstly, it is shown that the shape derivatives $y_J'[V]$ and $p_J'[V]$ are solutions to the coupled BVP (2.48). Afterwards, unique solvability of the system is provided.

1) Since each component of the BVP (2.45d)–(2.45j), except the Neumann interface condition, is standard, the reader is referred to the rules for shape differentiation of boundary value problems [147, Lem. 14, Lem. 15] or [151, Prop. 3.1, Prop. 3.3]. For convenience, the derivation for the non-standard Neumann boundary condition (2.45g) is given here. Its special character is that, although the function $y^{\max}_{\min}$ does not depend on the choice of $B$ locally (cf. the Remark to Lemma 4), its normal derivative $\partial_n^J y^{\max}_{\min}$ does.

Before the derivation of the shape derivative can be addressed, it is useful to note the following:

In contrast to the Neumann trace operator $\partial_n^J(\cdot)$, the tangential gradient $\nabla_\beta(\cdot)$ and the Laplace-Beltrami operator $\Delta_\beta(\cdot)$ act directly on the submanifold $\beta \subset \mathbb{R}^2$. That is, they act on the image space of the Dirichlet trace operator $\tau_\beta(\cdot) = (\cdot)|_\beta$. Consequently, there holds

\[
\varphi \in H^2(J) \ \text{with} \ \varphi|_\beta \equiv 0 \ \text{on } \beta \quad\Rightarrow\quad \nabla_\beta \varphi \equiv 0 \ \text{ and } \ \Delta_\beta \varphi \equiv 0,
\]

whereas

\[
\varphi \in H^2(J) \ \text{with} \ \varphi|_\beta \equiv 0 \ \text{on } \beta \quad\not\Rightarrow\quad \partial_n^J \varphi = 0.
\]

Transferred to the Dirichlet boundary condition (2.45f), this yields

\[
\nabla_\beta(\bar y_J - y^{\max}_{\min}) \equiv 0 \quad\text{and}\quad \Delta_\beta(\bar y_J - y^{\max}_{\min}) \equiv 0. \tag{2.49}
\]

According to [147, Lem. 15], which provides the derivative of Neumann boundary conditions, and with use of the notation $\partial_{nn}$ for the binormal derivative (cf. Definition 2), there holds

\begin{align}
\partial_n^J\bigl(y_J'[V] - \underbrace{{y^{\max}_{\min}}'[V]}_{=0}\bigr)
&= -V \cdot n_J\,\partial_{nn}(\bar y_J - y^{\max}_{\min}) + \underbrace{\nabla_\beta(\bar y_J - y^{\max}_{\min})}_{=0} \cdot \nabla_\beta(V \cdot n_J) \notag\\
&= V \cdot n_J \Bigl[\Delta(y^{\max}_{\min} - \bar y_J)|_\beta - \underbrace{\Delta_\beta(y^{\max}_{\min} - \bar y_J)}_{=0} - \underbrace{\partial_n^J(y^{\max}_{\min} - \bar y_J)}_{=0}\,\kappa_J\Bigr] \notag\\
&= V \cdot n_J \Bigl[\Delta y^{\max}_{\min} - \underbrace{\bar y_J}_{=y^{\max}_{\min}} + u_d - \tfrac{1}{\lambda}\,\bar p_J\Bigr]_\beta \notag\\
&= V \cdot n_J\,\tfrac{1}{\lambda}\bigl(p^{\max}_{\min}|_\beta - \bar p_J|_\beta\bigr), \tag{2.50}
\end{align}

where (2.45d), (2.49) and the identity [151, Prop. 2.68]

\[
\partial_{nn}(\cdot) = \Delta(\cdot)|_\beta - \Delta_\beta(\cdot) - \partial_n^J(\cdot)\,\kappa_J \tag{2.51}
\]

are applied. Here $\kappa_J$ denotes the mean curvature of $\beta$, where $\beta$ is interpreted as boundary of $J$; cf. [44, p. 74]¹⁴.

¹² Actually, this condition is fulfilled for the active set $B = A$ at least; see Corollary 2. Note in addition that higher regularity at the optimum can be proven without knowledge of shape differentiability, since it only relies on weak continuity of the optimal control across the optimal interface $\gamma$.

¹³ A detailed background to this notion can be found in Paragraph 2.4.2.
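The identity (2.51) can be checked by hand on a circular interface. The sketch below assumes that $\beta$ is a circle of radius $R$ centered at the origin, that $n_J$ is the radial unit vector $e_r$, and that the sign convention is $\kappa_J = \operatorname{div} n_J = 1/R$; the smooth function $\varphi(r,\theta)$ in polar coordinates is hypothetical. With the opposite orientation both $\partial_n^J\varphi$ and $\kappa_J$ change sign, so their product is unaffected.

```latex
% Assumptions: beta = {r = R}, n_J = e_r, kappa_J = div n_J = 1/R.
\begin{align*}
\Delta\varphi &= \varphi_{rr} + \tfrac{1}{r}\,\varphi_r + \tfrac{1}{r^2}\,\varphi_{\theta\theta},
\qquad
\Delta_\beta\varphi = \tfrac{1}{R^2}\,\varphi_{\theta\theta}\big|_{r=R},
\qquad
\partial_n^J\varphi = \varphi_r\big|_{r=R}, \\[4pt]
\Delta\varphi|_\beta - \Delta_\beta\varphi - \partial_n^J\varphi\,\kappa_J
&= \Bigl[\varphi_{rr} + \tfrac{1}{R}\,\varphi_r + \tfrac{1}{R^2}\,\varphi_{\theta\theta}\Bigr]
 - \tfrac{1}{R^2}\,\varphi_{\theta\theta} - \tfrac{1}{R}\,\varphi_r
 = \varphi_{rr}\big|_{r=R} = \partial_{nn}\varphi .
\end{align*}
```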

2) Unique solvability of the BVP is ensured by the following reasoning. Consider the auxiliary strictly convex optimization problem

Since $V \in \mathcal{V}$ and since the normal vector field $n_J$ is Lipschitzian (cf. Definition 2), their scalar product $V \cdot n_J$ is Lipschitz continuous, too. In consequence of [69, Thm. 1.4.1.1] and of $\bar p_J \in H^1(J)$, there holds

\[
\lambda\, V \cdot n_J\,(p^{\max}_{\min} - \bar p_J) \in H^1(\Omega).
\]

Furthermore, the right hand side of the inhomogeneous Neumann boundary condition on $\beta$ within the definition of $U$ is an element of $H^{1/2}(\beta)$. Lemma 1 now ensures that $U$ is nonempty.

The auxiliary optimization problem is uniquely solvable, which is obtained with the same proof as that of Theorem 3. Furthermore, with the same reasoning as in the proofs of Theorem 5 and Lemma 7, one recognizes that the BVP (2.48) can be interpreted as the reduced first order necessary and sufficient conditions of the auxiliary problem. Unique solvability of (2.48) is then a consequence of unique solvability of the auxiliary problem.

Lemma 8 offers the opportunity to derive the shape derivative of the reduced functional $F$.

Lemma 9 (Shape differentiability of $F$):

Let the family of admissible sets $\mathcal{O}$ be given by Definition 4 and let $B \in \mathcal{O}$ be such that the assumptions of Lemma 8 are fulfilled. Furthermore, let $V \in \mathcal{V}$ (see (2.47)) be arbitrarily chosen.

Then the shape semiderivative of the reduced objective $F$ (see (2.45a)) in the direction $V$ is given by

$dF(B;V) =$

Proof. Due to the rules of shape calculus (cf. [151, Eq. (2.168)]), the first summand can be differentiated.

Since $y_d$ is not dependent on the shape $J$, since $y_J'[V] \in H^2(J)$ is well-defined (see Lemma 8) and since

The second summand of $F$ yields

since neither $y_d$ nor $y^{\max}_{\min}$ is dependent on the shape $B$ (at least locally; cf. the Remark of Lemma 4). The analogous results for the remaining two terms of the sum lead to (2.52).

¹⁴ Note that the sign of the mean curvature depends on the choice of the orientation of the normal vector field of the boundary. Hence, since $n_B = -n_J$, there holds $\kappa_B = -\kappa_J$.
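As a hedged sketch of how the four summands of (2.45a) can be differentiated, assume the standard rule for domain integrals, $\frac{d}{dt}\int_{D_t} f_t\,dx\big|_{t=0} = \int_D f'[V]\,dx + \int_{\partial D} f\, V \cdot n_D\,ds$, and that the local shape derivatives of $y_d$, $y^{\max}_{\min}$ and $p^{\max}_{\min}$ vanish. Boundary contributions on $\Gamma$ drop out because $V \cdot n = 0$ there, and the two boundary terms involving $y$ cancel because of $n_B = -n_J$ and (2.45f). The arrangement of terms may differ from the form stated in (2.52).

```latex
% Assumptions: domain-integral rule as stated in the lead-in;
% y_d, y^max_min, p^max_min have vanishing local shape derivative.
\begin{align*}
dF(B;V)
 &= \int_J (\bar y_J - y_d)\,y_J'[V]\,dx
  + \tfrac{1}{2}\int_\beta (\bar y_J - y_d)^2\,V \cdot n_J\,ds
  + \tfrac{1}{2}\int_\beta (y^{\max}_{\min} - y_d)^2\,V \cdot n_B\,ds \\
 &\quad + \tfrac{1}{\lambda}\int_J \bar p_J\,p_J'[V]\,dx
  + \tfrac{1}{2\lambda}\int_\beta \bar p_J^{\,2}\,V \cdot n_J\,ds
  + \tfrac{1}{2\lambda}\int_\beta (p^{\max}_{\min})^2\,V \cdot n_B\,ds \\
 &= \int_J (\bar y_J - y_d)\,y_J'[V]\,dx
  + \tfrac{1}{\lambda}\int_J \bar p_J\,p_J'[V]\,dx
  + \tfrac{1}{2\lambda}\int_\beta \bigl(\bar p_J^{\,2} - (p^{\max}_{\min})^2\bigr)\,V \cdot n_J\,ds .
\end{align*}
```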

In view of the general recipe to derive first order necessary conditions in Paragraph 2.3.2, a closer look at the representation (2.52) of the shape semiderivative reveals that the Hadamard form (cf. [44, Chp. 9, Thm. 3.6, Chp. 9, Cor. 1]), i.e. a gradient representation, has not yet been obtained. Generally speaking, it is necessary to identify the adjoint operator of the derivative of the geometry-to-solution operator, such that adjoint states can be derived; see (2.34).

Lemma 10 ($L^1$-shape gradient of $F$):

Let the family of admissible sets $\mathcal{O}$ be given by Definition 4, let $B \in \mathcal{O}$ be chosen such that the assumptions of Lemma 8 are fulfilled, and let $p^{\max}_{\min}$ be defined as in (2.46). Furthermore, let the shape adjoint states $Y_J$ and $P_J$ be the unique solution to the shape adjoint equation

−∆Y_J + Y_J + ¹

Then the shape semiderivative of the reduced objective $F$ (see (2.45a)) evaluated at the set $B$ in the direction $V \in \mathcal{V}$ can be expressed as

dF(B;V) =

Thus, the ($L^1$-)shape gradient can be identified with

∇F(B) = ¹

With respect to higher regularity of the shape gradient, confer the Remark to Theorem 7.

Remark:

In defiance of the original usage of the notion of the shape gradient [44, Chp. 9, Def. 3.4 and Thm. 3.6], the associated, but strictly speaking distinct, scalar distribution (2.54) (cf. [44, Chp. 9, Cor. 1]) is called shape gradient in the following. See also the 15th item of the discussion on page 77.

Proof. 1) This part concerns the unique solvability of the shape adjoint system (2.53).

Since it has the same form as (2.48), the assertion follows along the lines of the second part of the proof of Lemma 8, in which one uses the auxiliary strictly convex optimization problem

minimize $f(U) :=$

where $S: L^2(J) \to H^2(J)$ is the solution operator for the boundary value problem

\begin{align*}
-\Delta Y + Y &= U && \text{in } J,\\
\partial_n Y &= 0 && \text{on } \Gamma,\\
Y &= 0 && \text{on } \beta.
\end{align*}

2) Let $V \in \mathcal{V}$ be arbitrary, but fixed. One recognizes that the shape semiderivative of the reduced objective can be transformed into Hadamard form by means of the shape adjoint system (2.53) and the local shape derivative BVP (2.48).

From this, the shape gradient (2.54) can be identified by means of the fundamental lemma of calculus of variations.

Remark (Constructive heuristic to derive the shape adjoint system):

The shape adjoint boundary value problem (2.53) can be obtained constructively by means of the following heuristic:

• Multiply the homogeneous PDEs (2.48a) and (2.48e), which define the shape derivatives $y_J'[V]$ and $p_J'[V]$, by $P_J$ and $Y_J$, respectively, integrate, and add the two terms to $dF(B;V)$ (2.52).

The Hadamard form is obtained in Lemma 10 at the price of solving the shape adjoint boundary value problem (2.53). This drawback can be overcome.
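One way to see why no additional adjoint solve is needed is to test the derivative system with the adjoint state $\bar p_J$ that is already available. The following formal sketch rests on two assumptions: the shape semiderivative has the form obtained from the standard domain-integral rule (first line below, possibly arranged differently from (2.52)), and $\bar p_J$ is regular enough for Green's formula. The result agrees with the representation (2.55) when the gradient is paired with $V \cdot n_J$.

```latex
% Assumed starting point (domain-integral rule, V.n = 0 on Gamma, n_B = -n_J):
\begin{align*}
dF(B;V) &= \int_J (\bar y_J - y_d)\,y_J'[V]\,dx
  + \tfrac{1}{\lambda}\int_J \bar p_J\,p_J'[V]\,dx
  + \tfrac{1}{2\lambda}\int_\beta \bigl(\bar p_J^{\,2} - (p^{\max}_{\min})^2\bigr)\,V \cdot n_J\,ds, \\[4pt]
\int_J (\bar y_J - y_d)\,y_J'[V]\,dx
 &= \int_J (-\Delta \bar p_J + \bar p_J)\,y_J'[V]\,dx
   && \text{by (2.45h)} \\
 &= \int_J \bar p_J\,\bigl(-\Delta y_J'[V] + y_J'[V]\bigr)\,dx
   + \int_\beta \bar p_J\,\partial_n^J y_J'[V]\,ds
   && \text{Green, (2.45i), (2.48b), (2.48c)} \\
 &= -\tfrac{1}{\lambda}\int_J \bar p_J\,p_J'[V]\,dx
   + \tfrac{1}{\lambda}\int_\beta \bar p_J\,(p^{\max}_{\min} - \bar p_J)\,V \cdot n_J\,ds
   && \text{by (2.48a), (2.48d)} \\[4pt]
\Longrightarrow\quad dF(B;V)
 &= \tfrac{1}{\lambda}\int_\beta \Bigl[\bar p_J\,p^{\max}_{\min} - \tfrac{1}{2}\,\bar p_J^{\,2} - \tfrac{1}{2}\,(p^{\max}_{\min})^2\Bigr]\,V \cdot n_J\,ds
  = -\tfrac{1}{2\lambda}\int_\beta \bigl(p^{\max}_{\min} - \bar p_J\bigr)^2\,V \cdot n_J\,ds .
\end{align*}
```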

Theorem 7 (Shape gradient of $F$ without shape adjoints):

Let the family of admissible sets $\mathcal{O}$ be given by Definition 4 and let $B \in \mathcal{O}$ be chosen such that the assumptions of Lemma 8 are fulfilled.

Then the shape gradient of the reduced objective $F$ (see (2.45a)) evaluated at the set $B$ has the representation

\[
\nabla F(B) = -\frac{1}{2\lambda}\bigl(p^{\max}_{\min}|_\beta - \bar p_J|_\beta\bigr)^2 \in L^1(\beta), \tag{2.55}
\]

where $p^{\max}_{\min}$ is defined in (2.46) and $\bar p_J$ is given by (2.45d)–(2.45j). In particular, the shape gradient comes without shape adjoint variables.

Proof. Lemma 10 ensures that the shape adjoint system (2.53) is uniquely solvable. A closer look reveals that the unique solution to the shape adjoint system is given by $P_J = \bar p_J$ and $Y_J = 0$. Thus, one finally

the product of two $H^2$ functions is $H^2$-regular as well, and this fact can be carried over to the trace spaces. Indeed, Corollary 2 ensures that $\bar p_I \in H^2(I)$, and thus the shape gradient is $H^{3/2}$-regular at least at the optimum.

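• As a result, one can identify a so-called Sobolev gradient of the reduced objective $F$ (see [130], [146, Sec. 5.3]). To this end, let $\nabla F(B) \in L^2(\beta)$ and consider the variational problem

\[
\int_\beta \nabla_\beta\bigl(\nabla_S F(B)\bigr) \cdot \nabla_\beta \varphi + \nabla_S F(B)\,\varphi \, ds = \int_\beta \nabla F(B)\,\varphi \, ds \qquad \text{for all } \varphi \in H^1(\beta),
\]

which is associated with the surface PDE problem of the Laplace-Beltrami operator

\[
-\Delta_\beta\bigl(\nabla_S F(B)\bigr) + \nabla_S F(B) = \nabla F(B) \qquad \text{a.e. on } \beta.
\]

This surface PDE is known to be uniquely solvable, cf. for instance [49].

• Sobolev gradients have the appealing property that the operator which maps $\nabla F$ to $\nabla_S F$ may be used for preconditioning in steepest descent algorithms, cf. for instance [146, 56].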
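As a rough illustration of the smoothing step, the following Python sketch solves a finite-difference analogue of the surface PDE $-\Delta_\beta g_S + g_S = g$ on a closed curve. It assumes that $\beta$ is a single closed curve parametrized by arclength and sampled at equidistant nodes, so that $\Delta_\beta$ reduces to the second derivative with respect to arclength. The function name `sobolev_gradient_closed_curve`, the periodic three-point stencil and the dense elimination are illustrative choices and are not taken from the source.

```python
import math


def sobolev_gradient_closed_curve(g, length):
    """Solve -g_S'' + g_S = g on a closed curve of the given length.

    g holds nodal values at n equidistant arclength nodes; the second
    derivative is replaced by the periodic three-point stencil.
    Assumption: beta is one closed curve (no end points).
    """
    n = len(g)
    h = length / n
    w = 1.0 / h ** 2

    # Assemble A = I - Laplace_h with periodic wrap-around.
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] += 1.0 + 2.0 * w
        a[i][(i + 1) % n] -= w
        a[i][(i - 1) % n] -= w
    rhs = [float(v) for v in g]

    # Gaussian elimination without pivoting; this is safe here because
    # A is strictly diagonally dominant.
    for col in range(n):
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            if factor != 0.0:
                for j in range(col, n):
                    a[row][j] -= factor * a[col][j]
                rhs[row] -= factor * rhs[col]

    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = rhs[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / a[i][i]
    return x


if __name__ == "__main__":
    # Hypothetical data: a smooth mode plus a high-frequency perturbation
    # on the unit circle (length 2*pi).
    n, length = 64, 2.0 * math.pi
    h = length / n
    raw = [math.cos(i * h) + 0.3 * math.cos(20 * i * h) for i in range(n)]
    smooth = sobolev_gradient_closed_curve(raw, length)
    print(max(abs(v) for v in raw), max(abs(v) for v in smooth))
```

For a single discrete Fourier mode the solve only rescales the amplitude by $1/(1+\mu_k)$ with $\mu_k = (2 - 2\cos(kh))/h^2$, so high-frequency components are damped most strongly. This is the smoothing effect that makes the map from $\nabla F$ to $\nabla_S F$ attractive as a preconditioner.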

The assertion of Theorem 7 is significantly affected by the observation that the shape adjoint variables are either zero or already given by the adjoint state of the inner optimization problem. This is a special case of a more general result concerning bilevel optimization problems with the following structure:

\begin{align}
&\text{minimize} && J(b; u_b, y_b) \notag\\
&\text{subject to} && b \in M \subset B, \tag{2.56}\\
& && (u_b, y_b) := \operatorname*{arg\,min}_{(u,y) \in U \times Y} J(b; u, y) \quad \text{subject to} \quad T(b; u, y) = 0 \ \text{ in } Z. \notag
\end{align}

It is assumed here that there exist operators

\begin{align*}
S_b &: U \to Y, \quad u \mapsto y = S_b(u) \ \text{ with } \ T(b; u, S_b(u)) = 0, \quad \forall\, b \in B,\\
G &: B \to U \times Y, \quad b \mapsto (u_b, y_b),
\end{align*}

and that all operators are sufficiently smooth for the following analysis. In particular, the parametrized inner optimization problem is uniquely solvable for each parameter $b \in B$. In other words, the bilevel optimization problem can equivalently be formulated as

\[
\text{minimize} \quad J(b; G(b)) \qquad \text{subject to} \quad b \in M \subset B.
\]

Moreover, it is assumed that the first order necessary conditions for the parametrized inner optimization problem are also sufficient. Then the effect of the operator $G$ is the same as solving the optimality system.

In view of Paragraph 2.3.2, the optimality system can be written as

\[
\mathcal{T}(b; u, y, p) = 0.
\]

Unique solvability of the optimality system for arbitrarily chosen $b \in B$ induces

\[
\mathcal{G}: B \to U \times Y \times Z^*, \quad b \mapsto (u_b, y_b, p_b) = (G(b), p_b).
\]

All in all, the bilevel optimization problem can equivalently be replaced by

\begin{align*}
&\text{minimize} && J(b; u_b, y_b)\\
&\text{subject to} && b \in M \subset B,\\
& && \mathcal{T}(b; u_b, y_b, p_b) = 0.
\end{align*}

Referring to the definition (2.34) of the adjoint state $p$, one can introduce another adjoint $P = (P^u, P^y, P^p)$ in $Z^* \times Y^{**} \times U^{**}$ via

\[
\bigl(\partial_{u,y,p} \mathcal{T}(b, \mathcal{G}(b))\bigr)^* P = -\nabla_{u,y,p} J(b, \mathcal{G}(b)),
\]

where the objective $J$ is only formally dependent on $p$. This yields



Remark:

One recognizes that the bilevel structure of the set optimal control problem (confer (2.30) and (2.36), (2.37)) fits into the more general framework (2.56). Additionally, the reduction of the optimality system of the inner optimization problem (cf. Lemma 7) has no impact on the results in principle. The reduction is only for convenience, in order to avoid large systems. Hence, the whole procedure of Paragraphs 2.3.4–2.3.6 is not restricted to the special case under consideration. In particular, one might think of state-constrained optimal control problems with multiple controls and/or states, where the reduction step of Lemma 7 is not applicable.
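The structural statement can be tried out on a scalar toy problem. The sketch below is an illustration under stated assumptions and is not part of the source: it takes $J(b;u,y) = \tfrac{1}{2}(y - y_d)^2 + \tfrac{\lambda}{2}u^2$ and $T(b;u,y) = y - b\,u$, reads (2.56) as using the same objective on both levels, and adopts the Lagrangian convention $L = J + p\,T$ (the sign convention of (2.34) may differ). All function names are hypothetical. Under these assumptions the derivative of the reduced objective is obtained from the inner adjoint state alone, namely $\frac{d}{db}J(b;G(b)) = p_b\,\partial_b T = -p_b\,u_b$, and the code compares this with a finite difference quotient.

```python
def inner_solution(b, y_d, lam):
    """Closed-form solution of the toy inner problem for fixed b.

    minimize 0.5*(y - y_d)**2 + 0.5*lam*u**2  subject to  y - b*u = 0.
    Returns control u_b, state y_b and multiplier p_b for L = J + p*(y - b*u).
    """
    u = b * y_d / (b * b + lam)
    y = b * u
    p = y_d - y  # stationarity of L with respect to y: (y - y_d) + p = 0
    return u, y, p


def objective(u, y, y_d, lam):
    """Toy objective J, used on both levels (assumption)."""
    return 0.5 * (y - y_d) ** 2 + 0.5 * lam * u ** 2


def reduced_objective(b, y_d, lam):
    """Outer objective after inserting the inner solution, b -> J(b; G(b))."""
    u, y, _ = inner_solution(b, y_d, lam)
    return objective(u, y, y_d, lam)


def reduced_derivative_via_inner_adjoint(b, y_d, lam):
    """Derivative of the reduced objective using only the inner adjoint p_b.

    d/db J(b; G(b)) = p_b * d/db (y - b*u) = -p_b * u_b; no further adjoint
    equation is solved.
    """
    u, _, p = inner_solution(b, y_d, lam)
    return -p * u


if __name__ == "__main__":
    b, y_d, lam, eps = 0.7, 1.3, 0.5, 1e-6
    fd = (reduced_objective(b + eps, y_d, lam)
          - reduced_objective(b - eps, y_d, lam)) / (2.0 * eps)
    print(fd, reduced_derivative_via_inner_adjoint(b, y_d, lam))
```

The agreement of the two printed numbers reflects the envelope-type argument behind the general result: since the inner solution is stationary for the same objective, the sensitivities of $u_b$ and $y_b$ with respect to $b$ do not contribute any additional adjoint equation.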

From the document: Shape Calculus Applied to State-Constrained Elliptic Optimal Control Problems (pages 50-58)