
\[
\int_\Omega |y(T)-y_d|^2\,\mathrm dx, \qquad J_2 : U \to \mathbb R, \quad J_2(u) := \frac12\sum_{k=1}^{k}\gamma_k\int_0^T |u_k(t)|^2\,\mathrm dt.
\]

Since $y_n \to \bar y$ but only $u_n \rightharpoonup \bar u$ as $n\to\infty$, we treat the functions $J_1$ and $J_2$ separately.

Note that nonlinear continuous functions are not necessarily weakly continuous.
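This can be illustrated numerically. The following sketch is an added example, not part of the text; the horizon, grid and test function are arbitrary choices. The sequence $u_n(t) = \sin(nt)$ converges weakly to $0$ in $L^2(0,2\pi)$, yet the continuous functional $u \mapsto \int |u|^2\,\mathrm dt$ keeps the value $\pi$, so it is not weakly continuous, only weakly lower semicontinuous.

```python
import numpy as np

# u_n(t) = sin(n t) converges weakly to 0 in L^2(0, 2*pi): its pairing with
# any fixed test function tends to 0.  The nonlinear functional
# u -> \int_0^T u(t)^2 dt nevertheless stays at pi, the squared L^2 norm.
T = 2 * np.pi
t = np.linspace(0.0, T, 200_001)
dt = t[1] - t[0]

def l2_inner(f, g):
    """Trapezoidal approximation of the L^2(0, T) inner product."""
    h = f * g
    return dt * (h.sum() - 0.5 * (h[0] + h[-1]))

phi = np.exp(-t)                      # an arbitrary fixed test function
for n in (1, 10, 100):
    u_n = np.sin(n * t)
    print(f"n={n:3d}  <u_n, phi> = {l2_inner(u_n, phi):+.4f}"
          f"   ||u_n||^2 = {l2_inner(u_n, u_n):.4f}")
# The pairing shrinks like 1/n while ||u_n||^2 stays near pi, i.e. the value
# of the functional at the weak limit (0) is strictly below the limit values.
```

The same behaviour is the reason why only lower semicontinuity can be used in the argument below.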

The function $J_2$ is convex and consequently weakly lower semicontinuous; see [45, Theorem 2.12]. That is, we have $\liminf_{n\to\infty} J_2(u_n) \ge J_2(\bar u)$ as $u_n \rightharpoonup \bar u$. The following estimate finishes the proof:
\[
\bar J = \lim_{n\to\infty} J(y_n,u_n) = \lim_{n\to\infty} J_1(y_n) + \liminf_{n\to\infty} J_2(u_n)
= J_1(\bar y) + \liminf_{n\to\infty} J_2(u_n) \ge J_1(\bar y) + J_2(\bar u) = J(\bar y,\bar u).
\]

Obviously, to show optimality of $(\bar y,\bar u)$ we did not need the specific structure of the cost functional. Lower semicontinuity of $J$ would have been sufficient.

Remark 2.8. The cost functional is convex. If (SE) is linear, problem (P) is strictly convex with respect to $u$, but in the case of a semilinear equation it may be non-convex. Hence, there may exist several (local) optimal controls, and further assumptions would be necessary to prove uniqueness; compare, e.g., [45].

2.4 First- and Second-Order Derivatives

For the numerical solution of the original problem (P) we will work with the equivalent nonlinear reduced problem (). We turn to the first- and second-order derivatives of the reduced cost functional. For an introduction to the generalization of the notion of differentiability to Banach spaces we refer to [21, 45].

First, we name the derivatives of the cost functional $J : W(0,T)\times U \to \mathbb R$. Further below, the derivatives of the reduced cost functional $\hat J$ are computed following a Lagrangian function based approach. The chain rule, see [45, Theorem 2.20], is applied; this requires the derivatives of the control-to-state operator. We restrict the dimension of the spatial domain to $m := 2$, but we will point out when this restriction is actually needed.

Recall the Hilbert space $U = L^2(0,T;\mathbb R^k)$. We identify $U'$ with $U$ via $\langle\cdot,\cdot\rangle_{U',U} = \langle\cdot,\cdot\rangle_U$. Let $Y_1 := W(0,T)$ and $U_1 := L^\infty(0,T;\mathbb R^k)$.

Cost functional:

Let $y, v, v_1, v_2 \in W(0,T)$ and $u, w, w_1, w_2 \in U$. The Fréchet derivatives of $J$ are given by
\[
\langle J_y(y,u), v\rangle_{Y_1',Y_1} = \langle y(T)-y_d,\, v(T)\rangle_H,\qquad
\langle J_{yy}(y,u)v_2, v_1\rangle_{Y_1',Y_1} = \langle v_1(T), v_2(T)\rangle_H,
\]
\[
\langle J_u(y,u), w\rangle_U = \int_0^T \sum_{k=1}^{k}\gamma_k u_k(t)\,w_k(t)\,\mathrm dt,\qquad
\langle J_{uu}(y,u)w_2, w_1\rangle_U = \int_0^T \sum_{k=1}^{k}\gamma_k w_{2,k}(t)\,w_{1,k}(t)\,\mathrm dt,
\]
while the second-order mixed derivatives vanish. The linear mapping $y\in W(0,T)\mapsto y(T)\in H$ is continuous due to the embedding $W(0,T)\hookrightarrow C([0,T];H)$.

The Riesz representation for $J_u(y,u)$ is directly visible within the third of the above equations: for $t\in[0,T]$ a.e. it holds
\[
J_u(y,u)(t) = D_\gamma u(t)\quad\text{with } D_\gamma := \operatorname{diag}(\gamma_1,\dots,\gamma_k),\tag{2.8}
\]
and thus
\[
J_{uu}(y,u)w(t) = D_\gamma w(t).\tag{2.9}
\]
Remark 2.9. $J$ is twice continuously Fréchet differentiable with Lipschitz continuous second-order derivative because $J_{uu}(y,u)$ and $J_{yy}(y,u)$ are both independent of $(y,u)$.

We continue by considering the control-to-state operator. To obtain the desired differentiability we require

(A3) $N$ is twice differentiable with locally Lipschitz continuous second-order derivative.

Assumption (A3) implies local Lipschitz continuity of $N'$ and $N$ by using the mean value theorem. Therefore assumption (A1) holds.

Remark 2.10. The second-order derivative $N''(y) = 6y$ of the function $N(y) = y^3$ is obviously globally Lipschitz continuous.

The given control-to-state operator is differentiable as a mapping from $L^s(0,T;\mathbb R^k)$ to $Y$ with $s > m+1$; compare [45, Chapter 5]. In advance of the derivative computation we give the following theorem, where we restrict ourselves to $L^\infty(0,T;\mathbb R^k)\supset U_{\mathrm{ad}}$.

Theorem 2.11. Suppose that (A2)-(A3) hold. Then the control-to-state operator is twice continuously Fréchet differentiable as a function from $L^\infty(0,T;\mathbb R^k)$ to $Y$.

Proof. We have the continuous linear operator $\mathcal B$ from the proof of Theorem 2.4, so that Theorem 5.15 in [45] yields the claim.

By the chain rule it follows:

Corollary 2.12. With (A2)-(A3) holding, the reduced cost functional $\hat J$ is twice continuously Fréchet differentiable on $L^\infty(0,T;\mathbb R^k)$.

Unfortunately, the control-to-state operator and hence the reduced cost functional are not differentiable from the Hilbert space $U$ to $Y$ and to $\mathbb R$, respectively. Here we encounter the two-norm discrepancy, which is well known to occur in optimal control problems governed by semilinear parabolic PDEs; see [24, 45].

In [24] the problem is overcome by using continuous extensions. Let $u\in L^\infty(0,T;\mathbb R^k)$ be arbitrary but fixed. Motivated by the argumentation given in [24], the following will be shown:

• We can view $y'(u)\in\mathcal L(L^\infty(0,T;\mathbb R^k), Y)$ as a continuous linear operator from $U$ to $W(0,T)$, so that its dual operator maps $W(0,T)'$ continuously to $U'\cong U$.

• $\hat J'(u)\in L^\infty(0,T;\mathbb R^k)'$ belongs to $U'\cong U$.

• $\hat J''(u)$ maps $U$ continuously to $U'\cong U$.

For the derivative computation we write (SE) elegantly as a nonlinear operator equation `$e(y,u) = 0$'. We use the two abbreviations $L^2(V) := L^2(0,T;V)$ and $L^2(V') := L^2(0,T;V')$, so that $L^2(V')' = L^2(V)$ holds. We introduce the required mappings:

• Define $F\in L^2(V')$ by
\[
\langle F,\varphi\rangle_{L^2(V'),L^2(V)} = \int_0^T\langle F(t),\varphi(t)\rangle_{V',V}\,\mathrm dt := \int_0^T\int_\Omega f(t,x)\,\varphi(t,x)\,\mathrm dx\,\mathrm dt\quad\text{for }\varphi\in L^2(V).
\]

• The continuous linear operator $\mathcal A : L^2(V)\to L^2(V')$,
\[
\langle\mathcal A y,\varphi\rangle_{L^2(V'),L^2(V)} = \int_0^T\langle A y(t),\varphi(t)\rangle_{V',V}\,\mathrm dt := \int_0^T a(y(t),\varphi(t))\,\mathrm dt\quad\text{for } y,\varphi\in L^2(V),
\]
with the symmetric and bounded bilinear form $a : V\times V\to\mathbb R$,
\[
a(\varphi_1,\varphi_2) := \int_\Omega\nabla\varphi_1(x)\cdot\nabla\varphi_2(x)\,\mathrm dx + q\int_\Gamma\varphi_1(x)\,\varphi_2(x)\,\mathrm dx\quad\text{for }\varphi_1,\varphi_2\in V.
\]
The boundedness of $a$ is transferred to $\mathcal A$; therefore, the operator $\mathcal A$ is indeed continuous; compare [21, p. 90] or see also [45].

• The continuous linear operator $\mathcal B : U\to L^2(0,T;V')$,
\[
\langle\mathcal B u,\varphi\rangle_{L^2(V'),L^2(V)} = \int_0^T\langle B u(t),\varphi(t)\rangle_{V',V}\,\mathrm dt := \int_0^T\sum_{k=1}^{k} u_k(t)\int_\Gamma\chi_k(x)\,\varphi(t,x)\,\mathrm dx\,\mathrm dt\quad\text{for } u\in U,\ \varphi\in L^2(V).
\]

• The nonlinear operator $\mathcal N : L^\infty(Q)\to L^2(V')$,
\[
\langle\mathcal N(y),\varphi\rangle_{L^2(V'),L^2(V)} = \int_0^T\langle N(y)(t),\varphi(t)\rangle_{V',V}\,\mathrm dt := \int_0^T\int_\Omega N\bigl(y(t,x)\bigr)\,\varphi(t,x)\,\mathrm dx\,\mathrm dt\quad\text{for } y\in L^\infty(Q),\ \varphi\in L^2(V).
\]

In addition, $y\in W(0,T)$ implies $y_t\in L^2(V')$. Hence, the weak formulation (2.3) defines the operator $e = (e_1,e_2) : Y\times U\to Z := L^2(V')\times H$,
\[
e(y,u) := \begin{pmatrix} c_p y_t + \mathcal A y + \mathcal N(y) - \mathcal B u - F\\ y(0) - y_0\end{pmatrix},
\]
so that (SE) is equivalent to the operator equation $e(y,u) = 0$.

Recall that a linear and bounded operator is Fréchet differentiable and that the derivative is given by the operator itself; see [45]. Thus, concerning differentiability, the only delicate term in the above definition is the nonlinear operator $\mathcal N$.

Lemma 2.13. With (A3) holding, the operator $\mathcal N : L^\infty(Q)\to L^2(V')$ is twice continuously Fréchet differentiable. The action of the derivatives reads
\[
\langle\mathcal N'(y)v,\,\varphi\rangle_{L^2(V'),L^2(V)} = \int_0^T\int_\Omega N'\bigl(y(t,x)\bigr)\,v(t,x)\,\varphi(t,x)\,\mathrm dx\,\mathrm dt,
\]
\[
\langle\mathcal N''(y)(v_1,v_2),\,\varphi\rangle_{L^2(V'),L^2(V)} = \int_0^T\int_\Omega N''\bigl(y(t,x)\bigr)\,v_1(t,x)\,v_2(t,x)\,\varphi(t,x)\,\mathrm dx\,\mathrm dt.
\]

Proof. First, one has to verify that the expressions above represent the Fréchet derivatives. This can be shown by using the estimation techniques from [45, Sections 4.3, 4.9], where Tröltzsch considers Nemytskii operators and their first- and second-order derivatives as mappings from $L^\infty(Q)$ to $L^\infty(Q)$.

We briefly show that the derivatives can be continuously extended. Let $y\in L^\infty(Q)$. Local Lipschitz continuity of $N'$ and $N''$ implies $N'(y(\cdot,\cdot)), N''(y(\cdot,\cdot))\in L^\infty(Q)$. The extension is then a consequence of Hölder's inequality; see [21, Lemma 1.13]. Boundedness is given due to $W(0,T)\hookrightarrow C([0,T];H)$ and $L^2(V)\hookrightarrow L^2(0,T;L^q(\Omega))$ for $2\le q\le 6$. The latter embedding is true for $m\le 3$ by the Sobolev embedding theorem; see [21, Theorem 1.14].

Now, we can name the derivatives of the operator $e$:
\[
e'(y,u)(v,w) =
\underbrace{\begin{pmatrix} c_p v_t + \mathcal A v + \mathcal N'(y)v\\ v(0)\end{pmatrix}}_{=\,e_y(y,u)v}
+ \underbrace{\begin{pmatrix} -\mathcal B w\\ 0\end{pmatrix}}_{=\,e_u(y,u)w},
\qquad
e''(y,u)\bigl((v_1,w_1),(v_2,w_2)\bigr) = \begin{pmatrix}\mathcal N''(y)(v_1,v_2)\\ 0\end{pmatrix},
\]
for $y, v, v_1, v_2\in Y$ and $u, w, w_1, w_2\in U$.

Remark 2.14. Let $(y,u)\in Y\times U$. The above formula for $e_y(y,u)v$ and Lemma 2.13 show that $e_y(y,u)$ can be viewed as a continuous linear operator from $W(0,T)$ to $Z = L^2(V')\times H$. So its dual operator maps $Z' = L^2(V)\times H$ continuously to $W(0,T)'$.

Control-to-state operator:

From the chain rule and Theorem 2.11 it follows that the equation $e(y(u),u) = 0$ can be differentiated in a direction $w\in L^\infty(0,T;\mathbb R^k)$. This yields
\[
e_y(y(u),u)\,y'(u)w + e_u(y(u),u)\,w = 0.\tag{2.13}
\]
Thus, the sensitivity $v := y'(u)w$ is given by the solution to the linearized state equation
\[
e_y(y(u),u)\,v = -e_u(y(u),u)\,w.
\]

Let $y_u := y(u)$. Written in expanded form, the linearized state equation reads
\[
\begin{pmatrix} c_p v_t + \mathcal A v + \mathcal N'(y_u)v\\ v(0)\end{pmatrix} = \begin{pmatrix}\mathcal B w\\ 0\end{pmatrix}.
\]
This is the weak formulation of
\[
\text{(LSE)}\qquad
\begin{aligned}
c_p v_t - \Delta v + N'(y_u(\cdot,\cdot))\,v &= 0 &&\text{in } Q,\\
\frac{\partial v}{\partial\nu} + q v &= \sum_{k=1}^{k} w_k\chi_k &&\text{on }\Sigma,\\
v(0) &= 0 &&\text{in }\Omega.
\end{aligned}
\]
We investigate the solvability of (LSE):

1. $w\in U \Rightarrow v\in W(0,T)$: The operator $\mathcal B$ is linear and continuous. From (A2)-(A3) we can deduce that the function $(t,x)\mapsto N'(y_u(t,x))\ge 0$ belongs to $L^\infty(Q)$. Hence, the linearized state equation has a continuous linear solution operator $w\mapsto v$ from $U$ to $W(0,T)$; see [11, Chapter XVIII].

2. $w\in L^\infty(0,T;\mathbb R^k)\Rightarrow v\in Y$: If the control $w$ belongs to $L^\infty(0,T;\mathbb R^k)$, we even obtain $v\in C(\bar Q)$ and hence $v\in Y$; see [45, Chapter 5].

Remark 2.15. (1) The above point 2 would allow us to apply the implicit function theorem, see [21, Theorem 1.41], to prove Theorem 2.11, i.e. differentiability of the control-to-state operator.

(2) Let $u\in L^\infty(0,T;\mathbb R^k)$. Point 1 above justifies writing
\[
y'(u) = -e_y(y(u),u)^{-1}e_u(y(u),u)\in\mathcal L(U, W(0,T)).\tag{2.14}
\]
Consequently, the dual operator $y'(u)^*$ maps $W(0,T)'$ continuously to $U$. Point 2 above yields $y'(u)w\in Y$ if $w\in L^\infty(0,T;\mathbb R^k)$.

(3) The operator $e'(y,u) = (e_y(y,u), e_u(y,u))$ is surjective for all $(y,u)\in Y\times U$ because the operator $e_y(y,u)$ is bijective. In order to see this, we consider the linearized state equation with an arbitrary right-hand side: surjectivity of $e_y(y,u)$ means precisely that for all $(g,v_0)\in Z$ there exists $v\in Y$ such that $e_y(y,u)v = (g,v_0)$. The reference from point 1 above yields the existence of a weak solution $v\in W(0,T)$, which is even unique. By a bootstrap argument the regularity of $v$ can be improved such that $v\in C(\bar Q)$ is satisfied.

Hence, a so-called regular point condition is fulfilled and provides the existence of a Lagrange multiplier $p = (p_1,p_2)\in Z'$ associated with (SE) in the context of Karush-Kuhn-Tucker theory; see [32, Theorem 4.1]. By variational arguments it follows that $p_{1,t}$ belongs to $L^2(V')$. Thus, we even have $p_1\in W(0,T)$.

(4) Differentiating equation (2.13) with $w_1 := w$ once again in another direction $w_2\in L^\infty(0,T;\mathbb R^k)$ yields an equation for the second-order derivative $y''(u)(w_1,w_2)$. We will not have to compute this derivative, but we need that $y''(u)$ can also be applied to elements $w_1,w_2\in U$. Therefore, let us name a representation formula, which is also stated in [45, Theorem 5.16]. The application $v := y''(u)(w_1,w_2)$ is the solution to
\[
\begin{aligned}
c_p v_t - \Delta v + N'(y_u(\cdot,\cdot))\,v &= -N''(y_u(\cdot,\cdot))\,v_1 v_2 &&\text{in } Q,\\
\frac{\partial v}{\partial\nu} + qv &= 0 &&\text{on }\Sigma,\\
v(0) &= 0 &&\text{in }\Omega,
\end{aligned}\tag{2.15}
\]
with $v_i = y'(u)w_i$, $i = 1,2$. For $v_1,v_2\in W(0,T)$ we obtain $N''(y_u(\cdot,\cdot))v_1v_2\in L^2(Q)$: the embedding $W(0,T)\hookrightarrow L^4(0,T;L^4(\Omega))\cong L^4(Q)$ for $m = 2$ gives
\[
\int_0^T\int_\Omega \bigl|v_1(t,x)\,v_2(t,x)\bigr|^2\,\mathrm dx\,\mathrm dt
\le \int_0^T \|v_1(t)\|_{L^4(\Omega)}^2\,\|v_2(t)\|_{L^4(\Omega)}^2\,\mathrm dt
\le \|v_1\|_{L^4(Q)}^2\,\|v_2\|_{L^4(Q)}^2.
\]
Thus, the reference from point 1 above ensures that equation (2.15) has a unique weak solution $v\in W(0,T)$. Let us mention that the use of bootstrapping even yields $v\in L^\infty(0,T;H^2(\Omega))\cap H^1(0,T;H)$.
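The last step of the estimate above is just the Cauchy-Schwarz inequality applied to $v_1^2$ and $v_2^2$, and it can be sanity-checked discretely (an added sketch; the grid over $Q$ and the data are arbitrary assumptions):

```python
import numpy as np

# Discrete check of  \int_Q |v1 v2|^2 <= ||v1||_{L^4(Q)}^2 ||v2||_{L^4(Q)}^2
# on a uniform grid over Q = (0,T) x Omega with constant cell volume dV.
# With equal weights this is exactly Cauchy-Schwarz for v1^2, v2^2.
rng = np.random.default_rng(1)
v1 = rng.standard_normal((64, 64))        # sampled "function" on Q
v2 = rng.standard_normal((64, 64))
dV = (1.0 / 64) ** 2                      # cell volume on the unit square

lhs = ((v1 * v2) ** 2).sum() * dV
l4_norm = lambda v: ((v ** 4).sum() * dV) ** 0.25
rhs = l4_norm(v1) ** 2 * l4_norm(v2) ** 2
print(lhs <= rhs)
```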

Reduced cost functional:

Now, we can compute $\hat J'(u)$ for any $u\in L^\infty(0,T;\mathbb R^k)$ using a Lagrangian function based approach.

We introduce the Lagrange function $L : Y\times L^\infty(0,T;\mathbb R^k)\times Z'\to\mathbb R$ by
\[
L(y,u,p) := J(y,u) + \langle p, e(y,u)\rangle_{Z',Z}
= J(y,u) + \langle e_1(y,u), p_1\rangle_{L^2(V'),L^2(V)} + \langle p_2, e_2(y,u)\rangle_H,
\]
where $p = (p_1,p_2)\in Z' = L^2(V)\times H$.

For any $u\in L^\infty(0,T;\mathbb R^k)$ and $p\in Z'$ we have
\[
\hat J(u) = J(y(u),u) = J(y(u),u) + \langle p, e(y(u),u)\rangle_{Z',Z} = L(y(u),u,p),
\]
because $e(y(u),u) = 0$ holds. The first-order derivative of $\hat J$ thus reads
\[
\langle\hat J'(u), w_1\rangle_{U_1',U_1} = \langle L_u(y(u),u,p), w_1\rangle_{U_1',U_1} + \langle L_y(y(u),u,p),\, y'(u)w_1\rangle_{Y',Y}\tag{2.16}
\]
for $w_1\in L^\infty(0,T;\mathbb R^k)$.

The left term $L_u(y(u),u,p)$ is given by
\[
\begin{aligned}
\langle L_u(y(u),u,p), w_1\rangle_{U_1',U_1}
&= \langle J_u(y(u),u), w_1\rangle_U + \langle e_{1u}(y(u),u)w_1,\, p_1\rangle_{L^2(V'),L^2(V)}\\
&= \langle J_u(y(u),u), w_1\rangle_U + \langle -\mathcal B w_1,\, p_1\rangle_{L^2(V'),L^2(V)}\\
&= \langle J_u(y(u),u) - \mathcal B^* p_1,\, w_1\rangle_U,
\end{aligned}
\]
with the dual operator $\mathcal B^*$ of $\mathcal B$. Thus, $L_u(y(u),u,p)$ can be identified with the element
\[
L_u(y(u),u,p) = J_u(y(u),u) - \mathcal B^* p_1\in U.\tag{2.17}
\]
We determine the dual operator $\mathcal B^* : L^2(V)\to U$ of $\mathcal B$, satisfying
\[
\langle\mathcal B u,\varphi\rangle_{L^2(V'),L^2(V)} = \langle u, \mathcal B^*\varphi\rangle_U\quad\text{for all } (u,\varphi)\in U\times L^2(V).
\]
Actually, it can be directly read off from the definition of $\mathcal B$. We obtain
\[
(\mathcal B^*\varphi)(t) = \begin{pmatrix}\int_\Gamma\chi_1(x)\,\varphi(t,x)\,\mathrm dx\\ \vdots\\ \int_\Gamma\chi_k(x)\,\varphi(t,x)\,\mathrm dx\end{pmatrix}\quad\text{for all }\varphi\in L^2(V),\ \text{a.e. in } [0,T].\tag{2.18}
\]
For the second term in (2.16) we introduce the adjoint state $p(u)\in Z'$ associated with the control $u$: it is given by the solution to

\[
L_y(y(u),u,p(u)) = 0.
\]
Let $v\in Y$. We have
\[
\begin{aligned}
\langle L_y(y(u),u,p(u)), v\rangle_{Y',Y}
&= \langle J_y(y(u),u), v\rangle_{Y_1',Y_1} + \langle p(u), e_y(y(u),u)v\rangle_{Z',Z}\\
&= \langle J_y(y(u),u) + e_y(y(u),u)^* p(u),\, v\rangle_{Y_1',Y_1}.
\end{aligned}\tag{2.19}
\]
The second equality holds since $e_y(y(u),u)$ maps from $W(0,T)$ to $Z$; see Remark 2.14. This shows that $L_y(y(u),u,p(u))$ belongs to $W(0,T)'$. We can deduce $\hat J'(u)\in U'\cong U$ via equation (2.16) and equation (2.17) with $y'(u)\in\mathcal L(U, W(0,T))$.

Let us determine the adjoint state $p(u) = (p(u)_1, p(u)_2)$. For clarity we write $p_1 := p(u)_1$ and $p_2 := p(u)_2$. The adjoint state can be viewed as a Lagrange multiplier associated with (SE); hence, its existence with $p_1\in W(0,T)$ follows from Remark 2.15(3). Equation (2.19) says that the adjoint state is given by the solution to the adjoint equation
\[
e_y(y(u),u)^*\, p(u) = -J_y(y(u),u).
\]

Let $y_u := y(u)$. Written in variational form, the adjoint equation reads
\[
\langle e_{1y}(y_u,u)v,\, p_1\rangle_{L^2(V'),L^2(V)} + \langle e_{2y}(y_u,u)v,\, p_2\rangle_H = -\langle J_y(y_u,u), v\rangle_{Y_1',Y_1}\quad\text{for all } v\in Y_1.
\]
Expanding $e_{1y}(y_u,u)$, $e_{2y}(y_u,u)$ and inserting $J_y(y_u,u)$ gives
\[
\langle c_p v_t + \mathcal A v + \mathcal N'(y_u)v,\, p_1\rangle_{L^2(V'),L^2(V)} + \langle p_2, v(0)\rangle_H = -\langle y_u(T)-y_d,\, v(T)\rangle_H.
\]
The formula of integration by parts applied to the term $\langle c_p v_t, p_1\rangle_{L^2(V'),L^2(V)}$ yields
\[
\begin{aligned}
&\langle c_p p_1(T), v(T)\rangle_H - \langle c_p p_1(0), v(0)\rangle_H\\
&\quad + \langle -c_p p_{1,t} + \mathcal A p_1 + \mathcal N'(y_u)p_1,\, v\rangle_{L^2(V'),L^2(V)} + \langle p_2, v(0)\rangle_H = -\langle y_u(T)-y_d,\, v(T)\rangle_H.
\end{aligned}
\]
Notice that the equality $\mathcal A^* = \mathcal A$ was used, which follows from the symmetry of the bilinear form $a$.

Recall the space $C_c^\infty((0,T);V)$, which consists of all functions in $C^\infty((0,T);V)$ with compact support in $(0,T)$. The space $C_c^\infty((0,T);V)\subset W(0,T)$ is dense in $L^2(V)$; see [21, p. 80]. Therefore, the above equation is equivalent to
\[
\langle -c_p p_{1,t} + \mathcal A p_1 + \mathcal N'(y_u)p_1,\, v\rangle_{L^2(V'),L^2(V)} = 0\quad\text{for all } v\in L^2(0,T;V),
\]
\[
c_p p_1(T) = -(y_u(T)-y_d)\in H,\qquad p_2 = c_p p_1(0).
\]

This is the weak formulation of
\[
\text{(AE1)}\qquad
\begin{aligned}
-c_p p_{1,t} - \Delta p_1 + N'(y_u(\cdot,\cdot))\,p_1 &= 0 &&\text{in } Q,\\
\frac{\partial p_1}{\partial\nu} + q p_1 &= 0 &&\text{on }\Sigma,\\
c_p p_1(T) &= -(y_u(T)-y_d) &&\text{in }\Omega.
\end{aligned}
\]

We already know that $p_1\in W(0,T)$ exists, and Lemma 3.17 in [45] provides unique existence of a weak solution to (AE1). From $y_u(T)-y_d\in C(\bar\Omega)$ it even follows that $p_1\in C(\bar Q)$. This implies $p_1\in Y$; see [45, p. 279]. Hence, the unique adjoint state associated with $u$ is given by $p(u) = (p_1, c_p p_1(0))$.

We insert the adjoint state $p(u)$ into equation (2.16). Thus, the derivative $\hat J'(u)$ is given by formula (2.17), and only the first component of $p(u)$ is needed. This is why we set $p_u := p(u)_1$. In the following we will refer to only this first component as the adjoint state associated with $u$. Since we identified $\hat J'(u)\in U'$ with an element in $U$, we refer to it as the gradient of the reduced cost functional $\hat J$ at $u\in L^\infty(0,T;\mathbb R^k)$.

We derived the following computation scheme:

Computation of the representation of $\hat J'(u)$ in $U$:

Require: $u\in L^\infty(0,T;\mathbb R^k)$;

1. Solve (SE) to get $y_u = y(u)$;
2. Solve (AE1) to get $p_u = p(u)_1$;
3. Insert $u$ and $p_u$ into
\[
\hat J'(u)(t) = D_\gamma u(t) - (\mathcal B^* p_u)(t)
= \begin{pmatrix}\gamma_1 u_1(t)\\ \vdots\\ \gamma_k u_k(t)\end{pmatrix}
- \begin{pmatrix}\int_\Gamma\chi_1(x)\,p_u(t,x)\,\mathrm dx\\ \vdots\\ \int_\Gamma\chi_k(x)\,p_u(t,x)\,\mathrm dx\end{pmatrix}.\tag{2.20}
\]
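The structure of this scheme, one forward state solve, one backward adjoint solve, then pointwise assembly of the gradient, can be sketched in code. The block below is an assumed toy analogue, not the thesis' boundary control problem: the state equation is replaced by the semilinear ODE system $y' = -Ay - y^3 + Bu$ (so $N(y)=y^3$, $N'(y)=3y^2$), explicit Euler replaces the parabolic solver, and $B^\top$ plays the role of $\mathcal B^*$; all matrices and parameters are invented for illustration. A finite-difference quotient of the cost serves as an independent check of the adjoint-based gradient.

```python
import numpy as np

# Toy analogue of scheme (2.20): state y' = -A y - y^3 + B u, y(0) = y0,
# cost J(u) = 0.5*|y(T) - yd|^2 + 0.5*gamma * \int |u(t)|^2 dt.
rng = np.random.default_rng(3)
nx, nu, nt, dt, gamma = 3, 2, 400, 1.0 / 400, 0.5
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
B = rng.standard_normal((nx, nu))
y0, yd = rng.standard_normal(nx), rng.standard_normal(nx)

def solve_state(u):                     # forward Euler "state solve" (SE)
    y = np.empty((nt + 1, nx)); y[0] = y0
    for i in range(nt):
        y[i + 1] = y[i] + dt * (-A @ y[i] - y[i] ** 3 + B @ u[i])
    return y

def cost(u):
    y = solve_state(u)
    return 0.5 * np.sum((y[-1] - yd) ** 2) + 0.5 * gamma * dt * np.sum(u ** 2)

def gradient(u):                        # scheme (2.20): adjoint-based gradient
    y = solve_state(u)
    p = -(y[-1] - yd)                   # adjoint terminal value, cf. (AE1)
    g = np.empty_like(u)
    for i in reversed(range(nt)):
        g[i] = dt * (gamma * u[i] - B.T @ p)          # D_gamma u - B^T p
        p = p - dt * (A.T @ p + 3.0 * y[i] ** 2 * p)  # backward adjoint step
    return g

u = 0.1 * rng.standard_normal((nt, nu))
w = rng.standard_normal((nt, nu))
eps = 1e-6
fd = (cost(u + eps * w) - cost(u - eps * w)) / (2 * eps)
print(abs(fd - np.sum(gradient(u) * w)))   # agreement up to fd error
```

Because the backward recursion is the exact discrete adjoint of the forward Euler step, the computed gradient is the exact derivative of the discrete cost, which is what the finite-difference comparison confirms.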

Let us turn to the second-order derivative. We differentiate equation (2.16) in the direction $w_2\in L^\infty(0,T;\mathbb R^k)$, omit the mixed derivatives, which vanish for the given problem, and use point (4) from Remark 2.15:
\[
\begin{aligned}
\langle\hat J''(u)w_2, w_1\rangle_{U_1',U_1}
= {}&\langle L_{uu}(y(u),u,p)w_2, w_1\rangle_{U_1',U_1}\\
&+ \langle L_{yy}(y(u),u,p)\,y'(u)w_2,\, y'(u)w_1\rangle_{Y',Y}
+ \langle L_y(y(u),u,p),\, y''(u)(w_1,w_2)\rangle_{Y_1',Y_1}.
\end{aligned}
\]
By inserting the adjoint state $p = p(u)$, satisfying $L_y(y(u),u,p(u)) = 0$, we obtain
\[
\langle\hat J''(u)w_2, w_1\rangle_{U_1',U_1}
= \langle L_{uu}(y(u),u,p(u))w_2, w_1\rangle_{U_1',U_1}
+ \langle L_{yy}(y(u),u,p(u))\,y'(u)w_2,\, y'(u)w_1\rangle_{Y',Y}.\tag{2.21}
\]
We verify that the controls $w_1, w_2$ in equation (2.21) can also belong to $U$:

The first term $L_{uu}(y(u),u,p(u))$ equals $J_{uu}(y(u),u)\in U'\cong U$ since the state equation is linear in $u$; compare equation (2.17). This gives
\[
L_{uu}(y(u),u,p(u))w(t) = D_\gamma w(t)\quad\text{for } w\in U,\ \text{a.e. in } [0,T].\tag{2.22}
\]
The second term in (2.21) requires some effort. For any $v,\varphi\in Y$ we have
\[
\begin{aligned}
\langle L_{yy}(y(u),u,p(u))v,\varphi\rangle_{Y',Y}
&= \langle J_{yy}(y(u),u)v,\varphi\rangle_{Y_1',Y_1} + \langle e_{1yy}(y(u),u)(v,\varphi),\, p(u)_1\rangle_{L^2(V'),L^2(V)}\\
&= \langle v(T),\varphi(T)\rangle_H + \langle\mathcal N''(y(u))(v,\varphi),\, p(u)_1\rangle_{L^2(V'),L^2(V)}.
\end{aligned}
\]
Hence, $v$ and $\varphi$ can also belong to $W(0,T)$, and we can view $L_{yy}(y(u),u,p(u))$ as a mapping from $W(0,T)$ to $W(0,T)'$. Consequently, the controls $w_1, w_2$ in (2.21) can belong to $U$ and we have
\[
\langle L_{yy}(y(u),u,p(u))\,y'(u)w_2,\, y'(u)w_1\rangle_{Y_1',Y_1} = \langle y'(u)^* L_{yy}(y(u),u,p(u))\,y'(u)w_2,\, w_1\rangle_U.
\]

Let $w\in U$. The computation of $y'(u)^* L_{yy}(y(u),u,p(u))\,y'(u)w$ needs to be done in several steps. We define
\[
v := y'(u)w \overset{(2.14)}{=} -e_y(y(u),u)^{-1}e_u(y(u),u)w,
\qquad h := L_{yy}(y(u),u,p)\,v.
\]
Thus, we obtain
\[
y'(u)^* L_{yy}(y(u),u,p(u))\,y'(u)w = y'(u)^* h.
\]
With $y_u := y(u)$ and $p_u := p(u)_1$ this means:

1. The sensitivity $v = y'(u)w$ is the solution to (LSE).
2. The application of $h$ is given by
\[
\langle h,\varphi\rangle_{Y_1',Y_1} = \langle v(T),\varphi(T)\rangle_H + \langle\mathcal N''(y_u)(v,\varphi),\, p_u\rangle_{L^2(V'),L^2(V)}.
\]
3. The formula for $y'(u)^* h$ reads
\[
y'(u)^* h = -e_u(y(u),u)^*\, e_y(y(u),u)^{-*}\, h.
\]

Hence, this requires one adjoint equation solve. We define $p := -e_y(y(u),u)^{-*}h$. This gives the adjoint equation with $J_y(y(u),u)$ replaced by $h$. Let $p_1\in W(0,T)$ be the unique weak solution to
\[
\text{(AE2)}\qquad
\begin{aligned}
-c_p p_{1,t} - \Delta p_1 + N'(y_u(\cdot,\cdot))\,p_1 &= -N''(y_u(\cdot,\cdot))\,v\,p_u &&\text{in } Q,\\
\frac{\partial p_1}{\partial\nu} + q p_1 &= 0 &&\text{on }\Sigma,\\
c_p p_1(T) &= -v(T) &&\text{in }\Omega.
\end{aligned}
\]
Thus, $p$ is given by $p = (p_1, c_p p_1(0))$. Note that $w\in L^\infty(0,T;\mathbb R^k)$ first implies $v\in Y$ and hence $p_1\in Y$. Consequently,
\[
y'(u)^* h = e_u(y(u),u)^*\, p = -\mathcal B^* p_1.
\]

We summarize the computation of the application of the Hessian $\hat J''(u)$, $u\in L^\infty(0,T;\mathbb R^k)$, to a vector $w\in U$:

Computation of Hessian-vector products $\hat J''(u)w$:

Require: $w\in U$, $y_u = y(u)$ and $p_u = p(u)_1$ (with $u\in L^\infty(0,T;\mathbb R^k)$);

1. Solve (LSE) to get $v$;
2. Solve (AE2) to get $p_1$;
3. Insert $w$ and $p_1$ into
\[
(\hat J''(u)w)(t) = D_\gamma w(t) - (\mathcal B^* p_1)(t)
= \begin{pmatrix}\gamma_1 w_1(t)\\ \vdots\\ \gamma_k w_k(t)\end{pmatrix}
- \begin{pmatrix}\int_\Gamma\chi_1(x)\,p_1(t,x)\,\mathrm dx\\ \vdots\\ \int_\Gamma\chi_k(x)\,p_1(t,x)\,\mathrm dx\end{pmatrix}.\tag{2.23}
\]
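The two extra solves behind a Hessian-vector product can likewise be sketched on a drastically simplified analogue. The block below is an assumed toy model, not the thesis' PDE setting: the state equation is the semilinear ODE system $y' = -Ay - y^3 + Bu$ (so $N'(y)=3y^2$, $N''(y)=6y$), discretized by explicit Euler, with $B^\top$ in the role of $\mathcal B^*$; all data are invented. It performs one linearized solve (the (LSE) analogue), one second adjoint solve with source term $-N''(y)\,v\,p$ and terminal value $-v(T)$ (the (AE2) analogue), and assembles $D_\gamma w - B^\top q$; a finite-difference derivative of the gradient serves as an independent check.

```python
import numpy as np

# Toy analogue of scheme (2.23): Hessian-vector products of the reduced cost.
rng = np.random.default_rng(4)
nx, nu, nt, dt, gamma = 3, 2, 400, 1.0 / 400, 0.5
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
B = rng.standard_normal((nx, nu))
y0, yd = rng.standard_normal(nx), rng.standard_normal(nx)

def solve_state(u):                       # forward Euler, N(y) = y^3
    y = np.empty((nt + 1, nx)); y[0] = y0
    for i in range(nt):
        y[i + 1] = y[i] + dt * (-A @ y[i] - y[i] ** 3 + B @ u[i])
    return y

def solve_adjoint(y):                     # (AE1) analogue, full trajectory
    p = np.empty((nt + 1, nx)); p[nt] = -(y[nt] - yd)
    for i in reversed(range(nt)):
        p[i] = p[i + 1] - dt * (A.T @ p[i + 1] + 3.0 * y[i] ** 2 * p[i + 1])
    return p

def gradient(u):                          # gradient, cf. (2.20)
    y = solve_state(u); p = solve_adjoint(y)
    return dt * (gamma * u - p[1:] @ B)

def hess_vec(u, w):                       # scheme (2.23)
    y = solve_state(u); p = solve_adjoint(y)
    v = np.zeros((nt + 1, nx))            # 1. linearized state, (LSE) analogue
    for i in range(nt):
        v[i + 1] = v[i] + dt * (-A @ v[i] - 3.0 * y[i] ** 2 * v[i] + B @ w[i])
    q = np.empty((nt + 1, nx)); q[nt] = -v[nt]   # 2. second adjoint, (AE2)
    for i in reversed(range(nt)):
        q[i] = q[i + 1] - dt * (A.T @ q[i + 1] + 3.0 * y[i] ** 2 * q[i + 1]
                                + 6.0 * y[i] * v[i] * p[i + 1])  # -N''(y)v p
    return dt * (gamma * w - q[1:] @ B)   # 3. D_gamma w - B^T q

u = 0.1 * rng.standard_normal((nt, nu))
w = rng.standard_normal((nt, nu))
eps = 1e-5
fd = (gradient(u + eps * w) - gradient(u - eps * w)) / (2 * eps)
print(np.max(np.abs(hess_vec(u, w) - fd)))   # small: Hvp matches dJ'/du in w
```

Note that each Hessian-vector product costs exactly one linearized and one adjoint solve, so the Hessian never has to be formed; this is what makes the scheme attractive inside Newton-CG-type methods.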

Remark 2.16. Consider the linear case, i.e. let $N$ be a linear function, so that $N'$ is constant and $N''$ equals zero. Then, via (SE) and (AE1), $\hat J'(\cdot)$ is linear and, via (LSE) and (AE2), $\hat J''(\cdot)$ is constant, as expected.
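The remark can be mirrored in a hypothetical finite-dimensional analogue (all matrices below are assumptions for illustration): if the control-to-state map is affine, $u\mapsto Su + c$, the reduced cost $\hat J(u) = \tfrac12\|Su + c - y_d\|^2 + \tfrac12 u^\top D_\gamma u$ has gradient $S^\top(Su + c - y_d) + D_\gamma u$ and the constant Hessian $S^\top S + D_\gamma$, independent of $u$.

```python
import numpy as np

# Linear-case analogue of Remark 2.16: constant Hessian S^T S + D_gamma.
rng = np.random.default_rng(2)
n, m = 8, 5
S = rng.standard_normal((n, m))            # linear solution operator (assumed)
c = rng.standard_normal(n)                 # affine offset from f and y0
yd = rng.standard_normal(n)
D_gamma = np.diag(rng.uniform(0.5, 2.0, m))

def grad(u):
    return S.T @ (S @ u + c - yd) + D_gamma @ u

def hess_vec(u, w, eps=1e-6):
    return (grad(u + eps * w) - grad(u)) / eps   # directional derivative

u1, u2 = rng.standard_normal(m), rng.standard_normal(m)
w = rng.standard_normal(m)
H = S.T @ S + D_gamma
print(np.allclose(hess_vec(u1, w), H @ w, atol=1e-4),
      np.allclose(hess_vec(u1, w), hess_vec(u2, w), atol=1e-4))
```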