Numerical Analysis of Optimality-System POD for Constrained Optimal Control


Chapter 1


Eva Grimm, Martin Gubisch, and Stefan Volkwein^∗

Abstract In this work linear-quadratic optimal control problems for parabolic equations with control and state constraints are considered. Utilizing a Lavrentiev regularization we obtain a linear-quadratic optimal control problem with mixed control-state constraints. For the numerical solution a Galerkin discretization is applied utilizing proper orthogonal decomposition (POD). Based on a perturbation method, an a-posteriori error analysis determines how far the suboptimal control, computed on the basis of the POD method, is from the (unknown) exact one. POD basis updates are computed by optimality-system POD. Numerical examples illustrate the theoretical results for control and state constrained optimal control problems.

1.1 Introduction

In this paper we consider a certain class of linear-quadratic optimal control problems governed by linear evolution equations together with control and state constraints.

Such linear-quadratic problems are especially interesting as they occur, for example, as subproblems in each step of sequential quadratic programming (SQP) methods for solving nonlinear problems. For the numerical solution we apply a Galerkin approximation, which is based on proper orthogonal decomposition (POD), a method for deriving reduced-order models of dynamical systems; see [7, 11, 19], for instance. In order to ensure that the POD suboptimal solutions are sufficiently accurate, we derive an a-posteriori error estimate for the difference between the exact (unknown) optimal control and its suboptimal POD approximations. The proof relies on a perturbation argument [5] and extends the results of [8, 22, 25].

S. Volkwein

University of Konstanz, Department of Mathematics and Statistics, Universitätsstr. 10, 78457 Konstanz, Germany, e-mail: Stefan.Volkwein@uni-konstanz.de

∗ This work was supported by the DFG project A-Posteriori-POD Error Estimators for Nonlinear Optimal Control Problems governed by Partial Differential Equations, grant VO 1658/2-1.


Konstanzer Online-Publikations-System (KOPS) URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-275129


However, to obtain the state data underlying the POD reduced-order model, the full state system has to be solved once, and consequently the POD approximations depend on the parameters chosen for this solve. More precisely, the choice of the initial control turned out to be essential. With an arbitrary control, the obtained accuracy was unsatisfactory even for a large number of basis functions, whereas an optimal POD basis (computed from the FE optimally controlled state) led to far better results. To overcome this problem, different techniques for improving the POD basis have been proposed. Here, we apply the so-called optimality-system POD (OS-POD) introduced in [17]. The idea of OS-POD is straightforward: include the equations determining the POD basis in the optimization process. A basis obtained in this way is optimal for the considered problem.

We follow the ideas in [6, 26], where OS-POD is combined efficiently with an a-posteriori error estimation to compute a better initializing control. The POD basis is then determined from this control and the a-posteriori error estimate ensures that the optimal control problem is solved up to a desired accuracy. Let us refer to [1], where the trust-region POD method is introduced as a different update strategy for the POD basis.

The paper is organized in the following manner: In Section 1.2 we introduce our optimal control problem with control and state constraints. To deal numerically with the state constraints a Lavrentiev regularization is utilized in Section 1.3. The POD method is explained briefly in Section 1.4. In Section 1.5 the existing a-posteriori error analysis is extended to our state-constrained control problem. The combination of the a-posteriori error estimation and OS-POD is explained in Section 1.6. In Section 1.7 we propose two algorithms to solve the reduced optimal control problem. Numerical examples are presented in Section 1.8.

1.2 The state-constrained optimal control problem

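Suppose that $\Omega\subset\mathbb R^d$, $d\in\{1,2,3\}$, is an open and bounded domain with Lipschitz-continuous boundary $\Gamma=\partial\Omega$. Let $V$ be a Hilbert space with $H_0^1(\Omega)\subset V\subset H^1(\Omega)$. We endow the Hilbert spaces $H=L^2(\Omega)$ and $V$ with the usual inner products

\[
\langle\varphi,\psi\rangle_H=\int_\Omega\varphi\psi\,\mathrm dx,\qquad
\langle\varphi,\psi\rangle_V=\int_\Omega\varphi\psi+\nabla\varphi\cdot\nabla\psi\,\mathrm dx.
\]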

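Let $T>0$ be the final time. We introduce a continuous bilinear form $a(\cdot\,,\cdot):V\times V\to\mathbb R$ satisfying

\[
a(\varphi,\varphi)\ge\alpha_1\|\varphi\|_V^2-\alpha_2\|\varphi\|_H^2\quad\text{for all }\varphi\in V
\]

for constants $\alpha_1>0$ and $\alpha_2\ge0$. Let us mention that the results extend in a straightforward way to time-dependent bilinear forms. Recall the Hilbert space $W(0,T)=\{\varphi\in L^2(0,T;V)\,|\,\varphi_t\in L^2(0,T;V')\}$ endowed with the common inner product [4, pp. 472-479]. Let $D$ be a bounded subset of $\mathbb R^d$ with $d\in\mathbb N$. Then the control space is given by $U=L^2(D;\mathbb R^m)$ for $m\in\mathbb N$. By $U_{ad}\subset U$ we denote the closed, convex and bounded subset $U_{ad}=\{u\in U\,|\,u_a\le u\le u_b\text{ in }U\}$, where $u_a,u_b\in U$ holds with $u_a\le u_b$. In particular, we identify $U$ with its dual space $U'$. For $u\in U_{ad}$, $y_\circ\in H$ and $f\in L^2(0,T;V')$ we consider the linear evolution problem

\[
\begin{aligned}
\frac{\mathrm d}{\mathrm dt}\langle y(t),\varphi\rangle_H+a(y(t),\varphi)&=\langle(f+\mathcal Bu)(t),\varphi\rangle_{V',V}&&\forall\varphi\in V\text{ in }(0,T],\\
y(0)&=y_\circ&&\text{in }H,
\end{aligned}
\tag{1.1}
\]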

where $\langle\cdot\,,\cdot\rangle_{V',V}$ stands for the dual pairing between $V$ and its dual space $V'$ and $\mathcal B:U\to L^2(0,T;V')$ is a continuous, linear operator. It is known that for every $f\in L^2(0,T;V')$, $u\in U$ and $y_\circ\in H$ there is a unique weak solution $y\in W(0,T)$ satisfying (1.1) and

\[
\|y\|_{W(0,T)}\le C\big(\|y_\circ\|_H+\|f\|_{L^2(0,T;V')}+\|u\|_U\big)
\tag{1.2}
\]

for a constant $C>0$ which is independent of $y_\circ$, $f$ and $u$. For a proof of the existence of a unique solution we refer to [4, pp. 512-520]. The a-priori estimate (1.2) follows from standard variational techniques and energy estimates.

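Remark 1. Let $\hat y\in W(0,T)$ be the unique solution to the problem

\[
\frac{\mathrm d}{\mathrm dt}\langle y(t),\varphi\rangle_H+a(y(t),\varphi)=\langle f(t),\varphi\rangle_{V',V}\quad\forall\varphi\in V\text{ in }(0,T],\qquad y(0)=y_\circ\text{ in }H.
\]

We introduce the bounded, linear solution operator $\mathcal S:L^2(0,T;V')\to W(0,T)$: for $g\in L^2(0,T;V')$ the function $y=\mathcal Sg\in W(0,T)$ is the unique solution to

\[
\frac{\mathrm d}{\mathrm dt}\langle y(t),\varphi\rangle_H+a(y(t),\varphi)=\langle g(t),\varphi\rangle_{V',V}\quad\forall\varphi\in V\text{ in }(0,T],\qquad y(0)=0\text{ in }H.
\]

Then, the unique solution to (1.1) is given by $y=\hat y+\mathcal S\mathcal Bu$. ♦

We set $W=L^2(0,T;\mathbb R^n)$. Let us introduce the set of admissible states

\[
\tilde Y_{ad}=\big\{y\in W(0,T)\,\big|\,y_a\le\mathcal Iy\le y_b\text{ in }W\big\},
\]

where $\mathcal I:L^2(0,T;V)\to W$ is a bounded, linear operator with $n\in\mathbb N$, $y_a,y_b\in W$ with $y_a\le y_b$. It follows that $\tilde Y_{ad}$ is closed and convex in $W(0,T)$. We introduce the Hilbert space $\tilde X=W(0,T)\times U$ endowed with the natural product topology. Moreover, we define the closed and convex subset $\tilde X_{ad}=\tilde Y_{ad}\times U_{ad}\subset\tilde X$. The cost function $\tilde J:\tilde X\to\mathbb R$ is given by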

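\[
\tilde J(y,u)=\frac{\sigma_\Omega}{2}\,\|y(T)-y_\Omega\|_H^2+\frac{\sigma_Q}{2}\int_0^T\|y(t)-y_Q(t)\|_H^2\,\mathrm dt+\frac{\sigma_u}{2}\,\|u\|_U^2
\tag{1.3}
\]

for $x=(y,u)\in\tilde X$, where $\sigma_Q,\sigma_\Omega$ are nonnegative weighting parameters, $\sigma_u>0$ is a regularization parameter and $y_Q\in L^2(0,T;H)$, $y_\Omega\in H$ are given desired states.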

Then, we consider the following convex optimal control problem

\[
\min\tilde J(x)\quad\text{subject to (s.t.)}\quad x\in F(\mathbf P)
\tag{P}
\]

with the set $F(\mathbf P)=\{(\hat y+\mathcal S\mathcal Bu,u)\in\tilde X_{ad}\}$ of feasible solutions. By (1.2) the cost functional is radially unbounded. Since $\tilde J$ is weakly lower semicontinuous, (P) admits a global optimal solution $\bar x=(\bar y,\bar u)$ provided $F(\mathbf P)$ is nonempty. Since $\sigma_u>0$ holds, $\bar x$ is uniquely determined; uniqueness follows from the strict convexity of the objective functional on $\tilde X_{ad}$. For a proof we refer to [12, Section 1.5.2] or [24], for instance.
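The affine splitting $y=\hat y+\mathcal S\mathcal Bu$ from Remark 1, on which the feasible set is built, can be checked numerically. The following is a minimal sketch, assuming a semi-discrete system $M\dot y+Ay=\text{rhs}$ with generic matrices `M` and `A` and an implicit Euler time stepping; the function names `implicit_euler` and `split_state` are hypothetical and not taken from the paper.

```python
import numpy as np

def implicit_euler(M, A, rhs, y0, dt):
    """Solve M y' + A y = rhs(t), y(0) = y0 by the implicit Euler method.

    rhs has shape (n, nt + 1); column j holds the right-hand side at t_j.
    """
    n, ncols = rhs.shape
    Y = np.zeros((n, ncols))
    Y[:, 0] = y0
    lhs = M + dt * A
    for j in range(1, ncols):
        Y[:, j] = np.linalg.solve(lhs, M @ Y[:, j - 1] + dt * rhs[:, j])
    return Y

def split_state(M, A, F, Bu, y0, dt):
    """Return (y_hat, S_Bu): the control-free part and the control part."""
    y_hat = implicit_euler(M, A, F, y0, dt)                 # data f and y0, no control
    S_Bu = implicit_euler(M, A, Bu, np.zeros_like(y0), dt)  # control only, zero initial value
    return y_hat, S_Bu
```

Because the discrete scheme is linear, the sum of both parts coincides with the solution for the right-hand side `F + Bu`, so `y_hat` has to be computed only once for all controls.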

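Example 1 (Boundary control without state constraints). For $T>0$ we set $Q=(0,T)\times\Omega$ and $\Sigma=(0,T)\times\Gamma$. Let $V=H^1(\Omega)$. For the control space we choose $D=\Sigma$ and $m=1$, i.e., $U=L^2(\Sigma)$. Then, for given control $u\in U$ and initial condition $y_\circ\in H$ we consider

\begin{align}
c_p\,y_t(t,x)-\Delta y(t,x)&=\tilde f(t,x)&&\text{in }Q,\tag{1.4a}\\
\frac{\partial y}{\partial n}(t,x)+q\,y(t,x)&=u(t,x)&&\text{on }\Sigma,\tag{1.4b}\\
y(0,x)&=y_\circ(x)&&\text{in }\Omega.\tag{1.4c}
\end{align}

In (1.4) we suppose $c_p>0$, $q\ge0$ and $\tilde f\in L^2(0,T;H)$. Setting $f=\tilde f/c_p$ and introducing the bounded (symmetric) bilinear form $a:V\times V\to\mathbb R$ by

\[
a(\varphi,\psi)=\frac{1}{c_p}\int_\Omega\nabla\varphi(x)\cdot\nabla\psi(x)\,\mathrm dx+\frac{q}{c_p}\int_\Gamma\varphi(x)\psi(x)\,\mathrm dx\quad\text{for }\varphi,\psi\in V
\]

and the linear, bounded operator $\mathcal B:U\to L^2(0,T;V')$ by

\[
\langle(\mathcal Bu)(t),\varphi\rangle_{V',V}=\frac{1}{c_p}\int_\Gamma u(t,x)\varphi(x)\,\mathrm dx\quad\text{for }\varphi\in V,\ t\in[0,T],
\]

the weak formulation of (1.4) can be expressed in the form (1.1). More details on this example can be found in [6]. ♦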

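Example 2 (Distributed control with state constraints). Let $\Omega$, $\Gamma$, $T$, $Q$, $\Sigma$ be as in Example 1. Let $\chi_i\in H$, $1\le i\le m$, denote given control shape functions. For the control space we choose $D=(0,T)$ and set $U=L^2(0,T;\mathbb R^m)$. Then, for given control $u\in U$, initial condition $y_\circ\in H$ and inhomogeneity $f\in L^2(0,T;H)$ we consider the linear heat equation

\[
\begin{aligned}
y_t(t,x)-\nu\Delta y(t,x)+\beta\cdot\nabla y(t,x)&=f(t,x)+\sum_{i=1}^m u_i(t)\chi_i(x)&&\text{in }Q,\\
y(t,x)&=0&&\text{on }\Sigma,\\
y(0,x)&=y_\circ(x)&&\text{in }\Omega
\end{aligned}
\tag{1.5}
\]

with $\nu>0$ and $\beta\in\mathbb R^d$. We introduce the bounded (symmetric) bilinear form

\[
a(\varphi,\psi)=\int_\Omega\nabla\varphi\cdot\nabla\psi\,\mathrm dx\quad\text{for }\varphi,\psi\in V
\]

and the bounded, linear operator $\mathcal B:U\to L^2(0,T;H)\hookrightarrow L^2(0,T;V')$ as

\[
(\mathcal Bu)(t,x)=\sum_{i=1}^m u_i(t)\chi_i(x)\quad\text{for }(t,x)\in Q\text{ and }u\in U.
\]

It follows that the weak formulation of (1.5) can be expressed in the form (1.1). We choose certain shape functions $\pi_1,\ldots,\pi_n\in H$ and introduce the operator $\mathcal I:L^2(0,T;V)\to W$ by

\[
(\mathcal I\varphi)(t)=\begin{pmatrix}(\mathcal I_1\varphi)(t)\\\vdots\\(\mathcal I_n\varphi)(t)\end{pmatrix}
\quad\text{with}\quad(\mathcal I_i\varphi)(t)=\int_\Omega\pi_i(x)\varphi(t,x)\,\mathrm dx
\]

for $\varphi\in L^2(0,T;V)$. Then, the state constraints have the form

\[
y_{a,i}(t)\le\int_\Omega\pi_i(x)y(t,x)\,\mathrm dx\le y_{b,i}(t)\quad\text{in }[0,T]\text{ and for }1\le i\le n,
\]

where $(y,w)\in W(0,T)\times W$ holds; see also [7]. ♦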
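On a space-time grid, the operators $\mathcal B$ and $\mathcal I$ of Example 2 reduce to small matrix products. The sketch below is only an illustration under assumptions that are not part of the paper: a uniform one-dimensional spatial grid, shape functions sampled at the grid points, and a trapezoidal rule for the spatial integrals. The names `apply_B` and `apply_I` are hypothetical.

```python
import numpy as np

def apply_B(chi, u):
    """Evaluate (Bu)(t, x) = sum_i u_i(t) chi_i(x) on the grid.

    chi: (m, nx) control shape functions sampled at the grid points,
    u:   (m, nt) control components at the time instances.
    Returns an (nx, nt) array.
    """
    return chi.T @ u

def apply_I(pi, y, dx):
    """Evaluate (I_i y)(t) = int pi_i(x) y(t, x) dx by the trapezoidal rule.

    pi: (n, nx) state-constraint shape functions, y: (nx, nt) state values.
    Returns an (n, nt) array.
    """
    weights = np.full(pi.shape[1], dx)
    weights[0] = weights[-1] = 0.5 * dx
    return (pi * weights) @ y
```

The state constraints then amount to componentwise bounds on the array returned by `apply_I`.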

1.3 The Lavrentiev regularization

It is well-known that the (sufficient) first-order optimality conditions for (P) involve a measure-valued Lagrange multiplier associated with the state constraint $\bar y\in\tilde Y_{ad}$; see [12, Section 1.7.3]. To develop fast numerical solution methods (by combining semismooth Newton techniques with reduced-order modelling) we apply a Lavrentiev regularization of the state constraints. For that purpose we introduce an additional (artificial) control variable and approximate the pure state constraints by mixed control-state constraints, which enjoy $L^2$-regularity; see [23].

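Instead of $\tilde X$ we consider the Hilbert space $X=W(0,T)\times U\times W$, again supplied with the product topology. For given $\varepsilon>0$ the subset $\tilde X_{ad}$ is replaced by the closed and convex subset

\[
X^\varepsilon_{ad}=\big\{(y,u,w)\in X\,\big|\,y_a\le\varepsilon w+\mathcal Iy\le y_b\text{ in }W,\ u\in U_{ad}\big\}.
\tag{1.6}
\]

For a chosen weight $\sigma_w>0$ we also extend the cost functional $\tilde J$ by defining $J:X\to\mathbb R$ with

\[
J(y,u,w)=\tilde J(y,u)+\frac{\sigma_w}{2}\,\|w\|_W^2,\qquad x=(y,u,w)\in X.
\]

Now the regularized optimal control problem has the following form

\[
\min J(x)\quad\text{s.t.}\quad x\in F(\mathbf P^\varepsilon)
\tag{$\mathbf P^\varepsilon$}
\]

with the feasible set $F(\mathbf P^\varepsilon)=\{(\hat y+\mathcal S\mathcal Bu,u,w)\in X^\varepsilon_{ad}\}$. If $F(\mathbf P^\varepsilon)\neq\emptyset$ holds, it follows by similar arguments as above that $(\mathbf P^\varepsilon)$ possesses a unique global optimal solution $\bar x$.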

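Let us define the control space $\mathcal V=U\times W$. We introduce the reduced cost functional $\hat J$ by $\hat J(v)=J(\hat y+\mathcal S\mathcal Bu,u,w)$ for $v=(u,w)\in\mathcal V$. By Remark 1 the solution to (1.1) can be expressed as $y=\hat y+\mathcal S\mathcal Bu$. Thus, the set of admissible controls is given by

\[
\mathcal V^\varepsilon_{ad}=\big\{v=(u,w)\in\mathcal V\,\big|\,u\in U_{ad}\text{ and }\hat y_a\le\varepsilon w+\mathcal I\mathcal S\mathcal Bu\le\hat y_b\text{ in }W\big\}
\]

with $\hat y_a=y_a-\mathcal I\hat y$ and $\hat y_b=y_b-\mathcal I\hat y$. Now, $(\mathbf P^\varepsilon)$ is equivalent to the reduced problem

\[
\min\hat J(v)\quad\text{s.t.}\quad v\in\mathcal V^\varepsilon_{ad}.
\tag{$\hat{\mathbf P}^\varepsilon$}
\]

The control $\bar v=(\bar u,\bar w)$ is the unique solution to $(\hat{\mathbf P}^\varepsilon)$ if and only if $\bar x=(\hat y+\mathcal S\mathcal B\bar u,\bar v)$ is the unique solution to $(\mathbf P^\varepsilon)$.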

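Next we formulate first-order sufficient optimality conditions for $(\mathbf P^\varepsilon)$ (see [24], for instance):

Theorem 1. Suppose that the feasible set $F(\mathbf P^\varepsilon)$ is nonempty. The point $\bar x=(\bar y,\bar u,\bar w)\in X^\varepsilon_{ad}$ is a (global) optimal solution to $(\mathbf P^\varepsilon)$ if and only if there are unique Lagrange multipliers $(\bar p,\bar\lambda_u,\bar\lambda_y)\in X$ satisfying the dual equations

\[
\begin{aligned}
-\frac{\mathrm d}{\mathrm dt}\langle\bar p(t),\varphi\rangle_H+a(\varphi,\bar p(t))+\langle(\mathcal I^\star\bar\lambda_y)(t),\varphi\rangle_{V',V}&=\sigma_Q\langle(y_Q-\bar y)(t),\varphi\rangle_H&&\forall\varphi\in V\text{ in }[0,T),\\
\bar p(T)&=\sigma_\Omega\big(y_\Omega-\bar y(T)\big)&&\text{in }H,
\end{aligned}
\tag{1.7}
\]

and the optimality conditions

\[
\sigma_u\bar u-\mathcal B^\star\bar p+\bar\lambda_u=0\text{ in }U,\qquad\sigma_w\bar w+\varepsilon\bar\lambda_y=0\text{ in }W,
\]

where $\mathcal I^\star:W\to L^2(0,T;V')$ and $\mathcal B^\star:L^2(0,T;V)\to U$ denote the adjoint operators of $\mathcal I$ and $\mathcal B$, respectively. For the Lagrange multipliers $\bar\lambda_u$ and $\bar\lambda_y$ we have

\[
\begin{aligned}
\bar\lambda_u&=\max\big(0,\bar\lambda_u+\gamma_u(\bar u-u_b)\big)+\min\big(0,\bar\lambda_u+\gamma_u(\bar u-u_a)\big)&&\text{in }U,\\
\bar\lambda_y&=\max\big(0,\bar\lambda_y+\gamma_w(\varepsilon\bar w+\mathcal I\bar y-y_b)\big)+\min\big(0,\bar\lambda_y+\gamma_w(\varepsilon\bar w+\mathcal I\bar y-y_a)\big)&&\text{in }W,
\end{aligned}
\]

where $\gamma_u,\gamma_w>0$ are arbitrarily chosen.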
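The max/min representation of the multipliers encodes feasibility and the usual sign and complementarity conditions in a single equation. As a pointwise illustration (a sketch for one scalar value with $u_a<u_b$; the helper name `multiplier_residual` is hypothetical), the residual of that equation vanishes exactly in the three admissible configurations: inactive with zero multiplier, upper bound active with nonnegative multiplier, lower bound active with nonpositive multiplier.

```python
def multiplier_residual(lam, u, ua, ub, gamma):
    """Residual of lam = max(0, lam + gamma*(u - ub)) + min(0, lam + gamma*(u - ua))."""
    upper = max(0.0, lam + gamma * (u - ub))
    lower = min(0.0, lam + gamma * (u - ua))
    return lam - (upper + lower)

# Inactive point with zero multiplier: residual is zero.
r_inactive = multiplier_residual(0.0, 0.5, 0.0, 1.0, gamma=2.0)
# Upper bound active with positive multiplier: residual is zero.
r_upper = multiplier_residual(3.0, 1.0, 0.0, 1.0, gamma=2.0)
# Lower bound active with negative multiplier: residual is zero.
r_lower = multiplier_residual(-2.0, 0.0, 0.0, 1.0, gamma=2.0)
# Inactive point with nonzero multiplier: residual is nonzero.
r_violated = multiplier_residual(1.0, 0.5, 0.0, 1.0, gamma=1.0)
```

Since the residual is a piecewise linear function of `lam` and `u`, it is a natural starting point for semismooth Newton methods.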

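Remark 2. 1) Analogous to Remark 1 we split the adjoint variable into one part depending on the fixed desired states and into two other parts, which depend linearly on the control variable and on the multiplier $\lambda$. Recall that we have defined $\hat y$ as well as the operator $\mathcal S$ in Remark 1. For given $y_Q\in L^2(0,T;H)$ and $y_\Omega\in H$ let $\hat p\in W(0,T)$ denote the unique solution to the adjoint equation

\[
\begin{aligned}
-\frac{\mathrm d}{\mathrm dt}\langle\hat p(t),\varphi\rangle_H+a(\varphi,\hat p(t))&=\sigma_Q\langle(y_Q-\hat y)(t),\varphi\rangle_H&&\forall\varphi\in V\text{ in }[0,T),\\
\hat p(T)&=\sigma_\Omega\big(y_\Omega-\hat y(T)\big)&&\text{in }H.
\end{aligned}
\]

Further, we define the linear, bounded operators $\mathcal A_1:U\to W(0,T)$ and $\mathcal A_2:W\to W(0,T)$ as follows: for any $u\in U$ the function $p=\mathcal A_1u$ is the unique solution to

\[
\begin{aligned}
-\frac{\mathrm d}{\mathrm dt}\langle p(t),\varphi\rangle_H+a(\varphi,p(t))&=-\sigma_Q\langle(\mathcal S\mathcal Bu)(t),\varphi\rangle_H&&\forall\varphi\in V\text{ in }[0,T),\\
p(T)&=-\sigma_\Omega(\mathcal S\mathcal Bu)(T)&&\text{in }H
\end{aligned}
\]

and for given $\lambda\in W$ the function $p=\mathcal A_2\lambda$ uniquely solves $p(T)=0$ in $H$ and

\[
-\frac{\mathrm d}{\mathrm dt}\langle p(t),\varphi\rangle_H+a(\varphi,p(t))+\langle(\mathcal I^\star\lambda)(t),\varphi\rangle_{V',V}=0\quad\forall\varphi\in V\text{ in }[0,T).
\]

Then, the solution to (1.7) can be expressed as $\bar p=\hat p+\mathcal A_1\bar u+\mathcal A_2\bar\lambda_y$.

2) To solve $(\mathbf P^\varepsilon)$ numerically for fixed $\varepsilon>0$ we use a primal-dual active set strategy. This method is equivalent to a locally superlinearly convergent semismooth Newton algorithm applied to the first-order optimality conditions [8, 9, 10]. ♦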
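To illustrate the mechanics of a primal-dual active set iteration, the following minimal sketch treats a generic finite-dimensional box-constrained quadratic program rather than the discretized problem $(\mathbf P^\varepsilon)$ itself. The matrix `H`, the vector `f` and the function name `pdas_box_qp` are assumptions made for this illustration; the stopping rule (repetition of the active sets) is the standard one for this class of methods.

```python
import numpy as np

def pdas_box_qp(H, f, a, b, gamma=1.0, max_iter=50):
    """Primal-dual active set iteration for
        min 0.5 * u^T H u - f^T u   subject to   a <= u <= b,
    with H symmetric positive definite and a < b componentwise."""
    H = np.asarray(H, dtype=float)
    f = np.asarray(f, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    u = np.zeros(f.size)
    lam = np.zeros(f.size)
    previous = None
    for _ in range(max_iter):
        act_b = lam + gamma * (u - b) > 0.0   # predicted upper active set
        act_a = lam + gamma * (u - a) < 0.0   # predicted lower active set
        inactive = ~(act_a | act_b)
        current = (act_a, act_b)
        if previous is not None and all(
            np.array_equal(p, c) for p, c in zip(previous, current)
        ):
            break  # active sets repeated: (u, lam) satisfies the optimality system
        previous = current
        u = np.where(act_a, a, np.where(act_b, b, 0.0))
        if inactive.any():
            rhs = f[inactive] - H[np.ix_(inactive, ~inactive)] @ u[~inactive]
            u[inactive] = np.linalg.solve(H[np.ix_(inactive, inactive)], rhs)
        lam = f - H @ u          # stationarity: H u - f + lam = 0
        lam[inactive] = 0.0
    return u, lam
```

For example, with `H = [[4, 1], [1, 3]]`, `f = [10, -1]` and bounds `0 <= u <= 1`, the iteration fixes the first component at the upper and the second at the lower bound.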

1.4 The POD method

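Let $Z$ be either the space $H$ or the space $V$. In $Z$ we denote by $\langle\cdot\,,\cdot\rangle_Z$ and $\|\cdot\|_Z=\langle\cdot\,,\cdot\rangle_Z^{1/2}$ the inner product and the associated norm, respectively. For fixed $\wp\in\mathbb N$ let the so-called snapshots $z^k(t)\in Z$ be given for $t\in[0,T]$ and $1\le k\le\wp$. To avoid a trivial case we suppose that at least one of the $z^k$'s is nonzero. Then, we introduce the linear subspace

\[
Z^\wp=\operatorname{span}\big\{z^k(t)\,\big|\,t\in[0,T]\text{ and }1\le k\le\wp\big\}\subset Z
\tag{1.8}
\]

with dimension $d\ge1$. We call the set $Z^\wp$ snapshot subspace. The method of POD consists in choosing a complete orthonormal basis $\{\psi_i\}_{i=1}^\infty$ in $Z$ such that for every $\ell\le d$ the mean square error between the $\wp$ elements $z^k$ and their corresponding $\ell$-th partial Fourier sum is minimized:

\[
\left\{
\begin{aligned}
&\min\sum_{k=1}^{\wp}\int_0^T\Big\|z^k(t)-\sum_{i=1}^{\ell}\langle z^k(t),\psi_i\rangle_Z\,\psi_i\Big\|_Z^2\,\mathrm dt\\
&\text{s.t. }\{\psi_i\}_{i=1}^\ell\subset Z\text{ and }\langle\psi_i,\psi_j\rangle_Z=\delta_{ij},\ 1\le i,j\le\ell.
\end{aligned}
\right.
\tag{1.9}
\]

In (1.9) the symbol $\delta_{ij}$ denotes the Kronecker symbol satisfying $\delta_{ii}=1$ and $\delta_{ij}=0$ for $i\neq j$. An optimal solution $\{\bar\psi_i\}_{i=1}^\ell$ to (1.9) is called a POD basis of rank $\ell$.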


Remark 3. In real computations, we do not have the whole trajectories $z^k(t)$ at hand for all $t\in[0,T]$ and $1\le k\le\wp$. Here we apply a discrete variant of the POD method; see [7, 16] for more details. ♦

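To solve (1.9) we define the linear operator $\mathcal R:Z\to Z^\wp$ as follows:

\[
\mathcal R\psi=\sum_{k=1}^{\wp}\int_0^T\langle\psi,z^k(t)\rangle_Z\,z^k(t)\,\mathrm dt\quad\text{for }\psi\in Z.
\tag{1.10}
\]

Then, $\mathcal R$ is a compact, nonnegative and selfadjoint operator. Suppose that $\{\bar\lambda_i\}_{i=1}^\infty$ and $\{\bar\psi_i\}_{i=1}^\infty$ denote the nonnegative eigenvalues and associated orthonormal eigenfunctions of $\mathcal R$ satisfying

\[
\mathcal R\bar\psi_i=\bar\lambda_i\bar\psi_i,\qquad\bar\lambda_1\ge\ldots\ge\bar\lambda_d>\bar\lambda_{d+1}=\ldots=0.
\tag{1.11}
\]

Then, for every $\ell\le d$ the first $\ell$ eigenfunctions $\{\bar\psi_i\}_{i=1}^\ell$ solve (1.9) and

\[
\sum_{k=1}^{\wp}\int_0^T\Big\|z^k(t)-\sum_{i=1}^{\ell}\langle z^k(t),\bar\psi_i\rangle_Z\,\bar\psi_i\Big\|_Z^2\,\mathrm dt=\sum_{i=\ell+1}^{d}\bar\lambda_i.
\]

For more details we refer the reader to [11, 13] and [7, Chapter 2], for instance.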
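A discrete analogue of (1.10), (1.11) and of the error formula can be sketched in a few lines. The sketch assumes a single snapshot ensemble stored column-wise in a matrix `Y`, the Euclidean inner product in place of $\langle\cdot\,,\cdot\rangle_Z$, and quadrature weights `alpha` replacing the time integral; the function names are hypothetical and this is not the discrete variant of [7, 16] in every detail.

```python
import numpy as np

def pod_basis(Y, alpha, ell):
    """Return all eigenvalues (descending) and the first ell POD modes.

    Y:     (n, nt) snapshot matrix, column j is the snapshot at time t_j,
    alpha: (nt,) nonnegative quadrature weights for the time integral.
    """
    R = (Y * alpha) @ Y.T                 # discrete version of the operator R
    lam, Psi = np.linalg.eigh(R)          # symmetric eigenvalue problem
    order = np.argsort(lam)[::-1]         # sort eigenvalues in descending order
    lam = np.clip(lam[order], 0.0, None)  # remove round-off negatives
    return lam, Psi[:, order[:ell]]

def projection_error(Y, alpha, Psi):
    """Weighted sum of squared projection errors of the snapshots."""
    residual = Y - Psi @ (Psi.T @ Y)
    return float(np.sum(alpha * np.sum(residual**2, axis=0)))
```

For any rank `ell`, `projection_error(Y, alpha, Psi)` agrees with `lam[ell:].sum()` up to round-off, which mirrors the identity stated above.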

Remark 4. a) In the context of the optimal control problem $(\mathbf P^\varepsilon)$ a reasonable choice for the snapshots is $z^1=y$ and $z^2=p$. Utilizing new POD error estimates for evolution problems [3, 20] and optimal control problems [14, 25], convergence and rate of convergence results are derived for linear-quadratic control constrained problems in [7] for the choices $Z=H$ and $Z=V$.

b) For the numerical realization the space $Z$ has to be discretized by, e.g., finite element discretizations. In this case the Hilbert space $Z$ has to be replaced by a Euclidean space $\mathbb R^l$ endowed with a weighted inner product; see [7].

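If a POD basis $\{\psi_i\}_{i=1}^\ell$ of rank $\ell$ is computed, we set $V^\ell=\operatorname{span}\{\psi_1,\ldots,\psi_\ell\}$. Then, one can derive a reduced-order model (ROM) for (1.1): for any $g\in L^2(0,T;V')$ the function $q^\ell=\mathcal S^\ell g$ is given by $q^\ell(0)=0$ in $H$ and

\[
\frac{\mathrm d}{\mathrm dt}\langle q^\ell(t),\psi\rangle_H+a(q^\ell(t),\psi)=\langle g(t),\psi\rangle_{V',V}\quad\forall\psi\in V^\ell\text{ in }(0,T].
\]

For any $u\in U_{ad}$ the POD approximation $y^\ell$ for the state solution is $y^\ell=\hat y+\mathcal S^\ell\mathcal Bu$. Analogously, a ROM can be derived for the adjoint equation; see, e.g., [7]. The POD Galerkin approximation of $(\hat{\mathbf P}^\varepsilon)$ is given by

\[
\min\hat J^\ell(v)=J(\hat y+\mathcal S^\ell\mathcal Bu,v)\quad\text{s.t.}\quad v=(u,w)\in\mathcal V^{\varepsilon,\ell}_{ad}
\tag{$\hat{\mathbf P}^{\varepsilon,\ell}$}
\]

where the set of admissible controls is

\[
\mathcal V^{\varepsilon,\ell}_{ad}=\big\{v=(u,w)\in\mathcal V\,\big|\,u\in U_{ad}\text{ and }\hat y_a\le\varepsilon w+\mathcal I\mathcal S^\ell\mathcal Bu\le\hat y_b\text{ in }W\big\}.
\]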
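In matrix form, the reduced-order operator $\mathcal S^\ell$ corresponds to a Galerkin projection of the semi-discrete system onto the span of the POD modes. The sketch below assumes a full-order system $M\dot q+Aq=g$ with generic mass and stiffness matrices, modes stored column-wise in `Psi`, and implicit Euler time stepping; `full_solve` and `rom_solve` are hypothetical names used only for this illustration.

```python
import numpy as np

def full_solve(M, A, G, dt):
    """Implicit Euler for M q' + A q = g(t), q(0) = 0; G has shape (n, nt + 1)."""
    n, ncols = G.shape
    Q = np.zeros((n, ncols))
    lhs = M + dt * A
    for j in range(1, ncols):
        Q[:, j] = np.linalg.solve(lhs, M @ Q[:, j - 1] + dt * G[:, j])
    return Q

def rom_solve(M, A, G, dt, Psi):
    """Galerkin ROM: test and trial space are spanned by the columns of Psi."""
    M_r = Psi.T @ M @ Psi      # reduced mass matrix (ell x ell)
    A_r = Psi.T @ A @ Psi      # reduced stiffness matrix
    G_r = Psi.T @ G            # reduced right-hand side
    return Psi @ full_solve(M_r, A_r, G_r, dt)
```

If `Psi` spans the whole space, the ROM reproduces the full-order trajectory; for `ell` much smaller than `n` only small dense systems have to be solved in each time step.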


1.5 A-posteriori error analysis

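Let us consider (P) with control, but no state constraints. Based on a perturbation argument [5] it is derived in [25] how far the suboptimal POD control $\bar u^\ell$, computed on the basis of the POD model, is from the (unknown) exact $\bar u$. The error estimate reads as follows:

\[
\|\bar u^\ell-\bar u\|_U\le\frac{1}{\sigma_u}\,\|\zeta^\ell\|_U,
\tag{1.12}
\]

where the computable perturbation function $\zeta^\ell\in U$ is given by

\[
\zeta^\ell=
\begin{cases}
-\min\big(0,\sigma_u\bar u^\ell-\mathcal B^\star\tilde p^\ell\big)&\text{in }\mathcal A^\ell_a=\{s\in D\,|\,\bar u^\ell(s)=u_a(s)\},\\
-\max\big(0,\sigma_u\bar u^\ell-\mathcal B^\star\tilde p^\ell\big)&\text{in }\mathcal A^\ell_b=\{s\in D\,|\,\bar u^\ell(s)=u_b(s)\},\\
-\big(\sigma_u\bar u^\ell-\mathcal B^\star\tilde p^\ell\big)&\text{in }D\setminus\big(\mathcal A^\ell_a\cup\mathcal A^\ell_b\big),
\end{cases}
\]

with $\tilde p^\ell=\hat p+\mathcal A_1\bar u^\ell$. It is shown in [7, 25] that $\|\zeta^\ell\|_U$ tends to zero as $\ell$ tends to infinity. Hence, increasing the number of POD ansatz functions leads to more accurate POD suboptimal controls.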
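Once the residual $\sigma_u\bar u^\ell-\mathcal B^\star\tilde p^\ell$ is available at the discretization nodes, the perturbation and the bound (1.12) are cheap to evaluate. The following sketch assumes nodal values and a weighted discrete $L^2$ norm with quadrature weights `weights`; both the discretization and the names `perturbation` and `error_bound` are assumptions of this illustration.

```python
import numpy as np

def perturbation(residual, u, ua, ub, tol=1e-12):
    """Nodewise perturbation zeta for residual = sigma_u * u - B^* p_tilde."""
    zeta = -residual.copy()                         # inactive nodes
    at_a = np.abs(u - ua) <= tol                    # lower bound active
    at_b = np.abs(u - ub) <= tol                    # upper bound active
    zeta[at_a] = -np.minimum(0.0, residual[at_a])
    zeta[at_b] = -np.maximum(0.0, residual[at_b])
    return zeta

def error_bound(residual, u, ua, ub, sigma_u, weights):
    """A-posteriori bound ||zeta|| / sigma_u in a weighted discrete L2 norm."""
    zeta = perturbation(residual, u, ua, ub)
    return float(np.sqrt(np.sum(weights * zeta**2))) / sigma_u
```

At a node where the lower bound is active only a negative residual contributes, at a node where the upper bound is active only a positive one, and at inactive nodes the full residual enters the estimate.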

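Estimate (1.12) can be generalized to the mixed control-state constraints. First-order sufficient optimality conditions for $(\hat{\mathbf P}^\varepsilon)$ are of the form

\[
\langle\hat J'(\bar v),v-\bar v\rangle_{\mathcal V}\ge0\quad\text{for all }v\in\mathcal V^\varepsilon_{ad},
\tag{1.13}
\]

where the gradient at a point $v=(u,w)\in\mathcal V$ is given by [24]

\[
\langle\hat J'(v),v_\delta\rangle_{\mathcal V}=\langle\sigma_uu-\mathcal B^\star(\hat p+\mathcal A_1u),u_\delta\rangle_U+\langle\sigma_ww,w_\delta\rangle_W\quad\forall v_\delta=(u_\delta,w_\delta)\in\mathcal V.
\]

Let us introduce the bounded, linear transformation $\mathcal T:\mathcal V\to\mathcal V$ as

\[
\mathcal T(v)=(u,\varepsilon w+\mathcal I\mathcal S\mathcal Bu)\quad\text{for }v=(u,w)\in\mathcal V.
\tag{1.14}
\]

We assume that $\mathcal T$ is continuously invertible. For sufficient conditions we refer to [8, Lemma 2.1]. Then, $v=(u,w)$ belongs to $\mathcal V^\varepsilon_{ad}$ if and only if $\mathfrak v=(\mathfrak u,\mathfrak w)=\mathcal T(v)$ satisfies

\[
u_a\le\mathfrak u\le u_b\text{ in }U\quad\text{and}\quad\hat y_a\le\mathfrak w\le\hat y_b\text{ in }W.
\tag{1.15}
\]

Notice that (1.13) can be expressed equivalently as

\[
\big\langle\mathcal T^{-\star}\hat J'(\mathcal T^{-1}\bar{\mathfrak v}),\mathfrak v-\bar{\mathfrak v}\big\rangle_{\mathcal V}\ge0\quad\text{for all }\mathfrak v\in\mathcal V\text{ satisfying (1.15)},
\tag{1.16}
\]

where $\mathcal T^{-\star}$ denotes the inverse of the operator $\mathcal T^\star$. Suppose that $\bar v^\ell=(\bar u^\ell,\bar w^\ell)\in\mathcal V^{\varepsilon,\ell}_{ad}$ is the solution to $(\hat{\mathbf P}^{\varepsilon,\ell})$. Our goal is to estimate the norm $\|\bar v-\bar v^\ell\|_{\mathcal V}$ without the knowledge of the optimal solution $\bar v=\mathcal T^{-1}\bar{\mathfrak v}$. We set $\bar{\mathfrak v}^\ell=\mathcal T\bar v^\ell=(\bar u^\ell,\varepsilon\bar w^\ell+\mathcal I\mathcal S\mathcal B\bar u^\ell)$. If $\bar v^\ell\neq\bar v$ holds, then $\bar{\mathfrak v}^\ell\neq\bar{\mathfrak v}$. In particular, $\bar{\mathfrak v}^\ell$ does not satisfy the sufficient optimality condition (1.13). However, there exists a function $\zeta^\ell\in\mathcal V$ such that

\[
\big\langle\mathcal T^{-\star}\hat J'(\mathcal T^{-1}\bar{\mathfrak v}^\ell)+\zeta^\ell,\mathfrak v-\bar{\mathfrak v}^\ell\big\rangle_{\mathcal V}\ge0\quad\text{for all }\mathfrak v\in\mathcal V\text{ satisfying (1.15)}.
\tag{1.17}
\]

Choosing $\mathfrak v=\bar{\mathfrak v}^\ell$ in (1.16), $\mathfrak v=\bar{\mathfrak v}$ in (1.17) and adding both inequalities we infer that

\[
\begin{aligned}
0&\le\big\langle\mathcal T^{-\star}\big(\hat J'(\mathcal T^{-1}\bar{\mathfrak v}^\ell)+\mathcal T^\star\zeta^\ell-\hat J'(\mathcal T^{-1}\bar{\mathfrak v})\big),\bar{\mathfrak v}-\bar{\mathfrak v}^\ell\big\rangle_{\mathcal V}\\
&=\big\langle\hat J'(\bar v^\ell)-\hat J'(\bar v)+\mathcal T^\star\zeta^\ell,\mathcal T^{-1}(\bar{\mathfrak v}-\bar{\mathfrak v}^\ell)\big\rangle_{\mathcal V}\\
&=\big\langle\big(\sigma_u(\bar u^\ell-\bar u)-\mathcal B^\star\mathcal A_1(\bar u^\ell-\bar u),\sigma_w(\bar w^\ell-\bar w)\big)+\mathcal T^\star\zeta^\ell,\bar v-\bar v^\ell\big\rangle_{\mathcal V}\\
&\le-\sigma\,\|\bar v-\bar v^\ell\|_{\mathcal V}^2+\langle\mathcal B^\star\mathcal A_1(\bar u-\bar u^\ell),\bar u-\bar u^\ell\rangle_U+\langle\mathcal T^\star\zeta^\ell,\bar v-\bar v^\ell\rangle_{\mathcal V}
\end{aligned}
\]

with $\sigma=\min(\sigma_u,\sigma_w)>0$. In [8, Lemma 2.2] it is shown that $\langle\mathcal B^\star\mathcal A_1(\bar u-\bar u^\ell),\bar u-\bar u^\ell\rangle_U\le0$ holds. Consequently,

\[
0\le-\sigma\,\|\bar v-\bar v^\ell\|_{\mathcal V}^2+\langle\mathcal T^\star\zeta^\ell,\bar v-\bar v^\ell\rangle_{\mathcal V}\le-\sigma\,\|\bar v-\bar v^\ell\|_{\mathcal V}^2+\|\mathcal T^\star\zeta^\ell\|_{\mathcal V}\,\|\bar v-\bar v^\ell\|_{\mathcal V},
\]

which implies the following proposition.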

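Proposition 1. Let the operator $\mathcal T$, introduced in (1.14), possess a bounded inverse. Suppose that $\bar v$ and $\bar v^\ell$ are the optimal solutions to $(\hat{\mathbf P}^\varepsilon)$ and $(\hat{\mathbf P}^{\varepsilon,\ell})$, respectively, satisfying $\bar{\mathfrak v}^\ell=\mathcal T\bar v^\ell\in\mathcal V^\varepsilon_{ad}$. Then, there is a perturbation $\zeta^\ell=(\zeta^\ell_u,\zeta^\ell_w)\in\mathcal V$ satisfying

\[
\|\bar v-\bar v^\ell\|_{\mathcal V}\le\frac{1}{\sigma}\,\|\mathcal T^\star\zeta^\ell\|_{\mathcal V}\quad\text{with }\sigma=\min(\sigma_u,\sigma_w)>0.
\tag{1.18}
\]

The perturbation $\zeta^\ell=(\zeta^\ell_u,\zeta^\ell_w)$ can be computed as follows: Let $\xi^\ell=(\xi^\ell_u,\xi^\ell_w)=\mathcal T^{-\star}\hat J'(\bar v^\ell)\in\mathcal V$. Then, $\xi^\ell$ solves the linear system

\[
\begin{pmatrix}\mathrm{id}_U&\mathcal B^\star\mathcal S^\star\mathcal I^\star\\0&\varepsilon\,\mathrm{id}_W\end{pmatrix}
\begin{pmatrix}\xi^\ell_u\\\xi^\ell_w\end{pmatrix}
=
\begin{pmatrix}\sigma_u\bar u^\ell-\mathcal B^\star(\hat p+\mathcal A_1\bar u^\ell)\\\sigma_w\bar w^\ell\end{pmatrix},
\]

where, e.g., $\mathrm{id}_U:U\to U$ stands for the identity operator. Note that (1.17) can be written as $\langle\xi^\ell+\zeta^\ell,\mathfrak v-\bar{\mathfrak v}^\ell\rangle_{\mathcal V}\ge0$ for all $\mathfrak v\in\mathcal V$ satisfying (1.15). We find

\[
\zeta^\ell_u=
\begin{cases}
-\min(0,\xi^\ell_u)&\text{in }\mathcal A^u_a=\{\bar u^\ell=u_a\}\subset U,\\
-\max(0,\xi^\ell_u)&\text{in }\mathcal A^u_b=\{\bar u^\ell=u_b\}\subset U,\\
-\xi^\ell_u&\text{in }U\setminus(\mathcal A^u_a\cup\mathcal A^u_b)
\end{cases}
\]

and

\[
\zeta^\ell_w=
\begin{cases}
-\min(0,\xi^\ell_w)&\text{in }\mathcal A^w_a=\{\varepsilon\bar w^\ell+\mathcal I\mathcal S\mathcal B\bar u^\ell=\hat y_a\}\subset W,\\
-\max(0,\xi^\ell_w)&\text{in }\mathcal A^w_b=\{\varepsilon\bar w^\ell+\mathcal I\mathcal S\mathcal B\bar u^\ell=\hat y_b\}\subset W,\\
-\xi^\ell_w&\text{in }W\setminus(\mathcal A^w_a\cup\mathcal A^w_b).
\end{cases}
\]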
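In a finite-dimensional setting the block-triangular system for $\xi^\ell$ is solved by back substitution, and the estimator (1.18) follows by applying $\mathcal T^\star$ to $\zeta^\ell$. The sketch below assumes that the discretized operator $\mathcal I\mathcal S\mathcal B$ is available as a matrix `K` (an assumption; in practice one would apply it matrix-free) and uses Euclidean norms in place of the weighted norms of $U$ and $W$. All function names are hypothetical.

```python
import numpy as np

def solve_for_xi(K, eps, grad_u, grad_w):
    """Solve T^* xi = (grad_u, grad_w) with T^* = [[id, K^T], [0, eps * id]]."""
    xi_w = grad_w / eps               # second block row
    xi_u = grad_u - K.T @ xi_w        # back substitution into the first row
    return xi_u, xi_w

def box_perturbation(xi, z, lower, upper, tol=1e-12):
    """Perturbation for the bounds lower <= z <= upper (componentwise)."""
    zeta = -xi.copy()
    at_lower = np.abs(z - lower) <= tol
    at_upper = np.abs(z - upper) <= tol
    zeta[at_lower] = -np.minimum(0.0, xi[at_lower])
    zeta[at_upper] = -np.maximum(0.0, xi[at_upper])
    return zeta

def estimator(K, eps, sigma, zeta_u, zeta_w):
    """Evaluate ||T^* zeta|| / sigma with T^* zeta = (zeta_u + K^T zeta_w, eps * zeta_w)."""
    t_u = zeta_u + K.T @ zeta_w
    t_w = eps * zeta_w
    return float(np.sqrt(t_u @ t_u + t_w @ t_w)) / sigma
```

Here `box_perturbation` is called once with $(\xi^\ell_u,\bar u^\ell,u_a,u_b)$ and once with $(\xi^\ell_w,\varepsilon\bar w^\ell+K\bar u^\ell,\hat y_a,\hat y_b)$.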


1.6 Optimality-system POD

The accuracy of the reduced-order model can be controlled by the a-posteriori error analysis presented in Section 1.5. However, if the POD basis is created from a reference trajectory containing features which are quite different from those of the optimally controlled trajectory, a rather large number of POD ansatz functions has to be included in the reduced-order model. This fact may lead to inefficient reduced-order models and numerical instabilities. To avoid these problems the POD basis is generated in an initialization step utilizing optimality system POD (OS-POD) introduced in [17]. In OS-POD the POD basis is updated in the direction of the minimum of the cost. Recall that the POD basis is computed from the state $y=\hat y+\mathcal S\mathcal Bu^0$ with some control $u^0\in U_{ad}$. Thus, the reduced-order Galerkin projection depends on the state variable and hence on the control $u$ at which the eigenvalue problem $\mathcal R\psi_i=\lambda_i\psi_i$ for $i=1,\ldots,\ell$ is solved for the basis $\{\psi_i\}_{i=1}^\ell$. If the optimal control $\bar u$ differs significantly from the initially chosen control $u^0$, the POD basis does not reflect the dynamics of the system in a sufficiently accurate manner. Therefore, we consider the extended problem:

\[
\min\hat J^\ell(v)\quad\text{s.t.}\quad
\begin{cases}
v=(u,w)\in\mathcal V^{\varepsilon,\ell}_{ad},\\
(\psi,\lambda)\text{ satisfies (1.11) for }\wp=1\text{ and }z^1=\hat y+\mathcal S\mathcal Bu.
\end{cases}
\tag{$\hat{\mathbf P}^{\varepsilon,\ell}_{os}$}
\]

Notice that the first line of the constraints in $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ coincides with the constraints in $(\hat{\mathbf P}^{\varepsilon,\ell})$, whereas the second line of the constraints in $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ is the infinite-dimensional eigenvalue problem defining the POD basis. For the optimal solution the problem formulation $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ has the property that the associated POD reduced system is computed from the trajectory corresponding to the optimal control and thus, differently from $(\hat{\mathbf P}^{\varepsilon,\ell})$, the problem of unmodelled dynamics is removed. Of course, $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ is more complicated than $(\hat{\mathbf P}^{\varepsilon,\ell})$. For practical realization an operator splitting approach is used in [17], where also sufficient conditions are given so that $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ possesses a unique optimal solution, which can be characterized by first-order necessary optimality conditions; compare [17] for more details. Convergence results for OS-POD are studied in [18]. The combination of OS-POD and a-posteriori error analysis is suggested in [26] and tested successfully in [6]. The resulting strategy is presented in the next section.

1.7 Algorithms

For pure control constraints, i.e., $\hat J^\ell$ depends only on $u$, a variable splitting is proposed, where a good POD basis is initialized by applying a few projected gradient steps [15]. Then, the POD basis is kept fixed and $(\hat{\mathbf P}^{\varepsilon,\ell})$ is solved. If the a-posteriori error estimator $\|\zeta^\ell\|_U/\sigma_u$ is too large (compare (1.12)), the number $\ell$ of POD basis elements is increased and a new solution to $(\hat{\mathbf P}^{\varepsilon,\ell})$ is computed. This process is repeated until we obtain convergence; see Algorithm 1. Let us mention that we also utilize snapshots of the adjoint variable in order to compute a POD basis as described in Remark 4-a).

Algorithm 1 (OS-POD with a-posteriori error estimation for control constraints)
Require: Maximal number $\ell_{\max}$ of POD basis elements, $\ell_{\min}<\ell_{\max}$, initial control $u^0$, and a-posteriori error tolerance $\varepsilon_{apo}>0$;
1: Determine the state $y=\hat y+\mathcal S\mathcal Bu^0$ and adjoint $p=\hat p+\mathcal A_1u^0$;
2: Compute a POD basis $\{\psi_i(u)\}_{i=1}^{\ell}$ as described in Remark 4-a);
3: Perform $k\ge0$ projected gradient steps (PGS) with an Armijo line search for $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ in order to get $u^k$ and associated POD basis $\{\psi_i(u^k)\}_{i=1}^{\ell_{\max}}$; set $\ell=\ell_{\min}$;
4: Solve $(\hat{\mathbf P}^{\varepsilon,\ell})$ for $\bar u^\ell\in U_{ad}$ by the primal-dual active set strategy;
5: Compute the perturbation $\zeta^\ell=\zeta^\ell(\bar u^\ell)$ as explained in Section 1.5;
6: if $\|\zeta^\ell\|_U/\sigma_u>\varepsilon_{apo}$ and $\ell<\ell_{\max}$ then
7:   Enlarge $\ell$ and go back to step 4;
8: end if
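The rank adaptation in steps 4 to 8 of Algorithm 1 is a simple loop around the reduced solver and the error estimator. The following sketch shows only this control flow; `solve_rom` and `estimator` are hypothetical callbacks standing for the primal-dual active set solve of the reduced problem and for the evaluation of $\|\zeta^\ell\|_U/\sigma_u$, and the increment `step` is an assumption since the text does not specify by how much $\ell$ is enlarged.

```python
def adaptive_rank_loop(solve_rom, estimator, ell_min, ell_max, tol, step=1):
    """Increase the POD rank until the a-posteriori estimate meets the tolerance.

    solve_rom(ell)    -> suboptimal control for rank ell   (hypothetical callback)
    estimator(ell, u) -> a-posteriori error estimate       (hypothetical callback)
    Returns the final rank, the control and the estimate.
    """
    ell = ell_min
    while True:
        u = solve_rom(ell)              # step 4: solve the reduced problem
        err = estimator(ell, u)         # step 5: evaluate the error estimate
        if err <= tol or ell >= ell_max:
            return ell, u, err          # tolerance reached or basis exhausted
        ell = min(ell + step, ell_max)  # step 7: enlarge the rank


# Toy usage with an artificial estimator decaying like 2^(-ell).
rank, control, estimate = adaptive_rank_loop(
    solve_rom=lambda ell: ell,
    estimator=lambda ell, u: 2.0 ** (-ell),
    ell_min=2, ell_max=10, tol=0.01,
)
```

If the tolerance cannot be reached, the loop returns with `ell_max` and the last estimate, so the caller can decide whether the POD basis itself needs to be updated.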

For the mixed constraints, this iteration turns out to be not efficient enough. The gradient steps do not lead to a satisfactorily fast and accurate POD basis. Therefore, we invest more effort in the gradient steps by alternating between the projected gradient method and the primal-dual active set strategy (PDASS). In contrast to the situation of pure control constraints, we can provide basis updates based on the more accurate PDASS controls. The strategy is explained in Algorithm 2.

Algorithm 2 (OS-POD with a-posteriori error estimation for state constraints)
Require: Maximal number $\ell_{\max}$ of POD basis elements, $\ell<\ell_{\max}$, initial control $u^0$, and a-posteriori error tolerance $\varepsilon_{apo}>0$;
1: Determine the state $y=\hat y+\mathcal S\mathcal Bu^0$ and adjoint $p=\hat p+\mathcal A_1u^0$;
2: Compute a POD basis $\{\psi_i(u)\}_{i=1}^{\ell}$ as described in Remark 4-a);
3: Solve $(\hat{\mathbf P}^{\varepsilon,\ell})$ for $\bar v^\ell=(\bar u^\ell,\bar w^\ell)\in\mathcal V^{\varepsilon,\ell}_{ad}$ by the primal-dual active set strategy;
4: Perform $k\ge0$ projected gradient steps with an Armijo line search for $(\hat{\mathbf P}^{\varepsilon,\ell}_{os})$ in order to get $u^k$ and associated POD basis $\{\psi_i(u^k)\}_{i=1}^{\ell}$;
5: Compute the perturbation $\zeta=\zeta(\bar v^\ell)$ as explained in Section 1.5;
6: if $\|\mathcal T^\star\zeta\|_{\mathcal V}/\sigma>\varepsilon_{apo}$ and $\ell<\ell_{\max}$ then
7:   Enlarge $\ell$ and go back to step 3;
8: else
9:   Set $\ell=\ell_{\min}$ and go back to step 1;
10: end if


1.8 Numerical experiments

In this section we carry out numerical tests illustrating the efficiency of the combination of OS-POD and a-posteriori error estimation. The evolution problem is approximated by a standard finite element (FE) method with piecewise linear finite elements for the spatial discretization. The time integration is done by the implicit Euler method. All programs are written in MATLAB utilizing the PARTIAL DIFFERENTIAL EQUATION TOOLBOX for the FE discretization.

Run 1 (Example 1) We choose $d=2$ and consider the unit square $\Omega=(0,1)\times(0,1)\subset\mathbb R^2$ as spatial domain with time interval $[0,T]=[0,1]$. The FE triangulation with maximal edge length $h=0.06$ leads to 498 degrees of freedom. For the time integration we choose an equidistant time grid $t_j=j\Delta t$ for $j=0,\ldots,250$ with $\Delta t=0.004$. Motivated by the discretization error we set $\varepsilon_{apo}=\max(h^2,\Delta t)=\Delta t$. In (1.4) we choose the data $c_p=10$, $q=0.01$, $\tilde f=0$ and $y_\circ(x_1,x_2)=3-4(x_2-0.5)^2$; see left plot in Figure 1.1. We use $\sigma_Q=0$, $\sigma_\Omega=1$ and the regularization $\sigma_u=0.1$ in the cost function (1.3) to approximate the desired terminal state $y_\Omega(x_1,x_2)=2+2|2x_1-x_2|$; see right plot in Figure 1.1. The control constraints are chosen to be $u_a=0$ and $u_b=1$. The FE primal-dual active set strategy needs five iterations and 860.75 seconds. The optimal FE control is presented in Figure 1.2. We apply Algorithm 1 with $\ell_{\max}=40$, $\ell=10$ and initial control $u^0=0$. First we do not perform any OS-POD strategy (i.e., $k=0$ in Algorithm 1). The method stops after 110.77 seconds with $\ell=35<\ell_{\max}$ ansatz functions with $\|\zeta^\ell\|_U/\sigma_u\approx0.0034<\varepsilon_{apo}$. Each solve of $(\hat{\mathbf P}^{\varepsilon,\ell})$ needs four or five iterations to determine the suboptimal POD solutions.

Fig. 1.1 Run 1: The initial condition $y_\circ$ (left) and the desired terminal state $y_\Omega$ (right).

If we initialize Algorithm 1 with the optimal FE control $\bar u^{FE}$ as initial control and perform no OS-POD strategy, only $\ell=13$ POD basis functions are required. We get $\|\zeta^\ell\|_U/\sigma_u\approx0.0019<\varepsilon_{apo}$ and the CPU time is 11.48 seconds, which is about ten times faster than with the initial control $u^0=0$. With one OS-POD gradient step, the tolerance $\varepsilon_{apo}$ is not reached with the available $\ell_{\max}=40$ basis functions. Although we make an effort in the direction of the optimal control, the algorithm seems to perform even worse than with the basis corresponding to the uncontrolled state. This can be seen in the higher control errors that cause the algorithm to run up to $\ell_{\max}=40$ ansatz functions. We can see, however, that the errors in the suboptimal state are one order smaller than without gradient steps, so the POD basis did improve after all. After $k=2$ gradient steps, the performance is considerably better: the algorithm already terminates with a ROM rank of $\ell=13$ as in the optimal case. In Table 1.1 we provide the required CPU times and final errors. Additionally, consider Table 1.2, where we compare the errors for the POD suboptimal solutions for fixed rank $\ell=15$. Here, we also provide the number of nodes that are restricted by the box constraints either in the suboptimal control $\bar u^{15}$ or in the FE optimal control $\bar u^{FE}$, but not in both. It tells us how many of the restricted nodes are mistaken. This number decreases to 21 by the gradient steps.

Fig. 1.2 Run 1: FE optimal control along the boundary parts $x_1=0$, $x_1=1$, $x_2=0$, and $x_2=1$.

                                        k=0          k=1          k=2          with $\bar u^{FE}$
Required $\ell$                         35           40           13           13
CPU time                                110.77 s     147.14 s     18.39 s      11.48 s
$\|\zeta^\ell\|_U/\sigma_u$             3.43·10⁻³    1.14·10⁻²    2.82·10⁻³    1.94·10⁻³
$\|\bar u^\ell-\bar u^{FE}\|_U$         3.15·10⁻³    9.53·10⁻³    2.62·10⁻³    1.93·10⁻³

Table 1.1 Run 1: Performance of Algorithm 1.

                                        k=0          k=1          k=2          with $\bar u^{FE}$
$\|\zeta^\ell\|_U/\sigma_u$             2.50·10⁻²    1.45·10⁻²    2.27·10⁻³    1.59·10⁻³
$\|\bar u^\ell-\bar u^{FE}\|_U$         2.06·10⁻²    1.19·10⁻²    2.07·10⁻³    1.59·10⁻³
different $u_a$                         96           67           15           16
different $u_b$                         63           38           6            4

Table 1.2 Run 1: Comparison of POD suboptimal solutions for $\ell=15$ and $k$ OS-POD steps.

Next we are interested in the approximation of the active sets. The computations are done with 68 triangulation nodes at the boundary and 251 time steps; that is a total amount of $68\cdot251=17068$ boundary nodes in the time interval $[0,T]$. The FE optimal control is restricted by $u_a$ at 2233 and by $u_b$ at 3891 nodes. In Table 1.3 we present the number of nodes where $u^k$ is restricted to the lower or upper bound and, in parentheses, how many of these nodes are actually restricted correctly, i.e. equal to $\bar u^{FE}$, which amounts to more than 99%.

                  $u^1$            $u^2$            $\bar u^{FE}$
$u^k=u_a$         1321 (1321)      1814 (1812)      2233
$u^k=u_b$         986 (986)        3632 (3627)      3891

Table 1.3 Run 1: Number of active nodes. In parentheses the number of nodes, where $u^k=\bar u^{FE}=u_a$ or $u^k=\bar u^{FE}=u_b$, respectively.

Finally, let us illustrate the changes achieved in the POD basis by the OS-POD steps. The left plot of Figure 1.3 shows how the decay of normalized eigenvalues differs depending on the control used for snapshot generation. The eigenvalues corresponding to the uncontrolled state decay faster and further than those corresponding to the more or less optimally controlled state; increasing the utilized rank further than $\ell=35$ yields no more improvement. The difference caused by one gradient step is significant. A lot more basis functions still contain relevant information for the reduced-order models. After the second gradient step the course is equal to the optimal situation, at least for the considered rank $\ell\le40$. The right plot of Figure 1.3 shows the a-posteriori error for the suboptimal control. By one gradient step the control error first decreases, but then stagnates at this level. Without any gradient step, the error is higher at the beginning, but between 30 and 35 basis functions it jumps down once more and therefore the algorithm can reach the tolerance. However, the right plot shows that the absolute error in state stays far above the OS-POD results.

Fig. 1.3 Run 1: Comparison of eigenvalue decay for POD basis generated with $u^k$ after $k$ gradient steps or with $\bar u^{FE}$ (left) and a-posteriori error for increasing $\ell$ (right).

In Figure 1.4 we compare the first four POD basis functions obtained either with u⁰=0,u₂or ¯u^FE. In the first POD basis function associated with the uncontrolled equation (u=0) we recognize the initial condition; see left plot of Figure 1.1. The

[Figure 1.4: plots not reproduced. Three rows of four plots, each titled "POD basis function ψ_i" (i = 1, …, 4), over the axes x₁ and x₂ on [0, 1] × [0, 1].]

Fig. 1.4 Run 1: First four POD basis functions associated with the initial control u⁰ = 0 (top), with the control gained after k = 2 OS-POD steps (middle) and with the optimal FE control ū^FE (bottom).

optimal state is richer in dynamics, which is reflected by a different shape of the POD basis functions. After two OS-POD steps the basis has changed significantly and at least the first four modes can hardly be distinguished from the optimal ones. ♦

Run 2 (Example 2) As a second test, we study a distributed control problem with control and state constraints. In Example 2 we choose d = 1, ν = 1, β = −5, N_t = 400 time points in the time interval [0, 1], N_x = 600 grid points in the domain Ω = [0, 3], m = 50 control components and n = 800, i.e. pointwise state constraints. For the data, we choose f = 0, y◦ = ½ χ_[1.2,1.8] and y_Q(t, x) = (1/9)(6x + 6tx − 2x²) for t < 1 − x/3, y_Q(t, x) = 0 elsewhere, and σ_Q = 1, σ_Ω = 0, σ_u = σ_w = ε = 7.5e-02. The control and state bounds are u_a = −1, u_b = 4 and y_a = 0.05, y_b = 0.5. Compared to the situation in Run 1, additional challenges arise here:

1. If the convection parameter β, which resembles the dispersal speed of the initial profile, is dominant, a rapid decay of the singular values of the POD operator R is prevented. This results in a slower decay of the POD error ℓ ↦ ‖v̄ − v̄^ℓ‖_V, so larger POD basis ranks are required to ensure a good approximation.

2. The transport term β y_x requires further considerations for the full-order solution techniques. For instance, central differences lead to a stable discretization if ν∆t ≤ ∆x²/2 holds true; nevertheless, strong oscillations of the discrete solution may occur if the condition |β∆x/ν| < 2 is violated; see, e.g., [21]. An upwind scheme for β y_x, which combines forward and backward differences, prevents oscillations, but is only of convergence order one.

3. When the a-posteriori error estimator is evaluated, the active set equations ū^ℓ = u_a and ū^ℓ = u_b defining the control perturbation ζ_u^ℓ are fulfilled exactly by construction since v̄₁^ℓ = ū^ℓ holds. This is not the case for the state perturbation ζ_w^ℓ: Here, a high-order solution operation is required to calculate v̄₂^ℓ = ε w̄^ℓ + I S B ū^ℓ and to determine the active sets v̄₂^ℓ = ŷ_a and v̄₂^ℓ = ŷ_b, respectively. We propose to replace the active set equalities by ‖v̄₂^ℓ − ŷ_{a,b}‖_W < ε_acc, where ε_acc is the accuracy of the full-order model.

4. If the penalized state constraint shall resemble a pointwise pure state constraint, one may choose a fine partition (Ω_j)_{1≤j≤n_y} ⊆ Ω of Ω and π_j(x) = |Ω_j|⁻¹ for x ∈ Ω_j as well as 0 otherwise. In this case, we have (I_j y)(t) = |Ω_j|⁻¹ ∫_{Ω_j} y(t, x) dx ≈ y(t, x_j). Now, choosing ε ≪ 1 and σ_w ≫ 1 ensures εw + Iy ≈ y: The penalty w cannot compensate strong violations of the state constraint any more. A small ε leads to bad condition numbers of the optimality system matrices already for the full-order model, which causes not only stability problems, but also less regular state solutions. Since the convergence of POD solutions to the full-order ones requires additional regularity of the snapshot ensemble, a good accuracy of the POD model can be expected only if additional effort is spent on finding appropriate snapshots.
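Items 2 and 4 involve quantities that can be checked with a few lines of arithmetic. The sketch below is a minimal illustration under stated assumptions: it presumes uniform grids that include both end points, so that ∆x = 3/(N_x − 1) and ∆t = 1/(N_t − 1), and the helper names grid_numbers and cell_average as well as the test profile y(x) = x² are hypothetical choices that do not appear in the chapter. The first helper reports the cell Péclet number |β∆x/ν| and the ratio ν∆t/∆x² for the Run 2 parameters; the second compares the cell average |Ω_j|⁻¹ ∫_{Ω_j} y dx with the midpoint value y(x_j) on 800 equal cells of [0, 3], matching the value n = 800 quoted for Run 2.

```python
def grid_numbers(nu, beta, length, horizon, n_x, n_t):
    """Step sizes and the two quantities from item 2 (uniform grids assumed)."""
    dx = length / (n_x - 1)   # assumption: grid points include both end points
    dt = horizon / (n_t - 1)  # assumption: time points include t = 0 and t = T
    return {
        "dx": dx,
        "dt": dt,
        "cell_peclet": abs(beta) * dx / nu,    # item 2 asks for a value below 2
        "diffusion_number": nu * dt / dx ** 2,  # item 2 asks for a value of at most 1/2
    }


def cell_average(y, left, right, n_quad=64):
    """Approximate |cell|^{-1} * integral of y over [left, right] (midpoint rule)."""
    h = (right - left) / n_quad
    return sum(y(left + (i + 0.5) * h) for i in range(n_quad)) / n_quad


def profile(x):
    """Hypothetical smooth test profile, not data from the chapter."""
    return x * x


# Item 2 with the Run 2 parameters quoted in the text.
run2 = grid_numbers(nu=1.0, beta=-5.0, length=3.0, horizon=1.0, n_x=600, n_t=400)
print(run2)

# Item 4: cell averages versus midpoint values on 800 equal cells of [0, 3].
n_cells = 800
width = 3.0 / n_cells
worst_gap = max(
    abs(cell_average(profile, j * width, (j + 1) * width) - profile((j + 0.5) * width))
    for j in range(n_cells)
)
print(worst_gap)
```

Under these assumptions the cell Péclet number lies far below 2, while ν∆t/∆x² exceeds 1/2; the latter bound is the one familiar from explicit time stepping, and this passage does not state which time integrator is used. For the smooth test profile the averages and the midpoint values differ by roughly |Ω_j|²/12, which illustrates why a fine partition makes Iy behave like a pointwise evaluation.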

The uncontrolled FE state is plotted in the left plot of Figure 1.5. The discontinuous

[Figure 1.5: plots not reproduced. Two plots over time t ∈ [0, 1] and space x ∈ [0, 3]. Left panel: "convection diffusion equation", state y = y(t, x). Right panel: "desired state", y_Q(t, x).]

Fig. 1.5 Run 2: The uncontrolled state (left) and the desired state y_Q (right).

desired state y_Q is presented in the right plot of Figure 1.5. The optimal FE solution to (P^ε) is shown in Figure 1.6. The primal-dual active set strategy (PDASS) required a rather large number of iterations to converge. The complex structure of the active and inactive sets is shown in Figure 1.7. In this example, 39 updates of