Optimal control and model-order reduction of an abstract parabolic system containing a controlled bilinear form

Applied to the example of a controlled advection term in an advection-diffusion equation

Stefan Banholzer, Dennis Beermann

Department of Mathematics and Statistics, University of Konstanz, 78457 Konstanz, Germany (E-Mail: Dennis.Beermann@uni-konstanz.de)

Abstract

In the present paper, a linear parabolic evolution equation is considered whose bilinear form is controlled from a general Banach space. The control-to-state operator and some important properties thereof are presented.

For a quadratic objective function, the gradient in the control space is derived. A-posteriori error estimators are presented for a given reduced-order model (ROM) with respect to both the cost function and the gradient.

Keywords: Partial differential equations, optimal control, reduced-order modelling, a-posteriori error analysis

1 Introduction

Many real-world, time-dependent systems can be modeled by parabolic evolution equations. The resulting systems usually contain one or several algebraic parameters that have to be specified in an optimal way to suit the demands of the application. Two of the most prominent examples are:

1. Parameter identification: Identify the set of parameters such that the solution of the model best approximates the measured data from the real-life system. Usually, this leads to inverse problems, see Isakov (2006) or Vogel (1999).

2. Optimal control: Use external influences to control the system in a way that the solution of the model approximates a predefined desired state, see Tröltzsch (2010) or Hinze et al. (2009).

A parabolic system is usually defined by an operator and an inhomogeneity. In the standard literature on optimal control, the external influence on the system tends to take the form of an inhomogeneity, meaning that it is additively separated from the solution variable. In this paper, we consider the case where the operator itself is controlled along with the inhomogeneity, thereby leading the way to more complex optimization models. We only treat linear parabolic equations in this paper. However, even if the operator depends linearly on the control variable, the control-to-state operator itself will be nonlinear. As an example application, we consider a source-free advection-diffusion equation where the advective term may be controlled in order to steer the system.

Since in an optimal control run large systems have to be solved repeatedly, it is often advisable to employ model-order reduction in order to save computation time. There is extensive literature offering a general overview. For Reduced Basis (RB) techniques, we refer to Patera and Rozza (2006), Hesthaven et al. (2016) and Quarteroni et al. (2016). As far as Proper Orthogonal Decomposition (POD) is concerned, Holmes et al. (2012) and Gubisch and Volkwein (2016) offer an excellent introduction. For model-order reduction techniques to work properly, error estimators are required to measure the quality of the reduced quantities during the optimization. We present a general error estimator for a quadratic cost function and also a specific one for the gradients of the two most common cost functions in the case of the advection-diffusion equation.


This paper is organized as follows:

In Section 2, we introduce the general parabolic equation containing a control variable. Under given coercivity assumptions, we show that the system admits a unique solution for every control, allowing us to define the (nonlinear) control-to-state operator and to present an energy estimate for the solution. We then show that under certain requirements as to how the control variable influences the operator and the inhomogeneity, this operator is continuous and Fréchet differentiable. Gradients of generalized quadratic cost functions are derived in Section 3. Section 4 focuses on reduced-order modelling and introduces general error estimators for the state solution and a quadratic cost function from Section 3. In Section 5, we consider the concrete example of controlling an advection term in an advection-diffusion equation: as cost functions, we consider tracking both the entire trajectory and the state at the final time. For both cases, we derive ROM error estimators for the gradients.

2 The general parabolic equation

Throughout these pages, we consider the following linear parabolic evolution system:

$$y_t(t) + A(u)(t)\,y(t) = f(u)(t) \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad (1a)$$

$$y(0) = y_0 \quad\text{in } H, \qquad (1b)$$

where $V \hookrightarrow H = H' \hookrightarrow V'$ is a Gelfand triple and $T>0$ is the final time. The control space is given by $\mathcal{U} := L^2(0,T;U)$, where $U$ is a Hilbert space. Let $A:\mathcal{U}\to L^\infty(0,T;\mathcal{L}(V,V'))$ and $f:\mathcal{U}\to L^2(0,T;V')$ be the control-dependent bilinear form and inhomogeneity of equation (1a). Lastly, $y_0\in H$ is the initial value.

If we fix a control $u\in\mathcal U$, we may define $B := A(u)$ and $g := f(u)$ so that (1) reads:

$$y_t(t) + B(t)\,y(t) = g(t) \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad (2a)$$

$$y(0) = y_0 \quad\text{in } H. \qquad (2b)$$

Observe that $\|B(t)\|_{\mathcal L(V,V')}\le C$ f.a.a. $t\in(0,T)$ with $C = \|A(u)\|_{L^\infty(0,T;\mathcal L(V,V'))}$. We start by giving a solvability result on (2):

2.1 Theorem. Assume that there exist constants $\alpha>0$, $\beta\ge 0$ such that $B\ne 0$ is uniformly coercive:

$$\langle B(t)\varphi,\varphi\rangle_{V'\times V} \;\ge\; \alpha\|\varphi\|_V^2 - \beta\|\varphi\|_H^2 \quad\text{f.a.a. } t\in(0,T),\ \text{for all }\varphi\in V. \qquad (3)$$

Then there exists a unique solution $y\in W(0,T) := L^2(0,T;V)\cap H^1(0,T;V')$ of (2) that satisfies

$$\|y(T)\|_H^2 + \|y\|_{L^2(0,T;V)}^2 + \|y_t\|_{L^2(0,T;V')}^2 \;\le\; C\left(\|y_0\|_H^2 + \|g\|_{L^2(0,T;V')}^2\right), \qquad (4)$$

where the constant $C>0$ depends continuously on $\|B\|_{L^\infty(0,T;\mathcal L(V,V'))}$ and $\alpha,\beta$ in (3). In particular, it holds

$$\|y(T)\|_H^2 + \|y\|_{L^2(0,T;V)}^2 \;\le\; \frac{e^{2\beta T}}{\alpha}\left(\|y_0\|_H^2 + \frac{1}{\alpha}\|g\|_{L^2(0,T;V')}^2\right). \qquad (5)$$

Furthermore, the mapping $(g,y_0)\mapsto y$ is linear from $L^2(0,T;V')\times H$ to $W(0,T)$.

Proof. We define the time-dependent bilinear form

$$b:V\times V\times(0,T)\to\mathbb{R}, \qquad b(\varphi,\psi;t) := \langle B(t)\varphi,\psi\rangle_{V'\times V}$$

and observe that

$$|b(\varphi,\psi;t)| \;\le\; \|B(t)\|_{\mathcal L(V,V')}\cdot\|\varphi\|_V\cdot\|\psi\|_V \;\le\; \underbrace{\|B\|_{L^\infty(0,T;\mathcal L(V,V'))}}_{\ne 0 \text{ since } B\ne 0}\cdot\|\varphi\|_V\cdot\|\psi\|_V.$$


So the bilinear form $b$ is $t$-uniformly continuous and, because of (3), $t$-uniformly coercive. Equation (2a) is now equivalent to

$$\langle y_t(t),\varphi\rangle_{V'\times V} + b(y(t),\varphi;t) = \langle g(t),\varphi\rangle_{V'\times V} \quad\text{for all }\varphi\in V,\ \text{f.a.a. } t\in(0,T).$$

As shown in (Hinze et al., 2009, Theorem 1.37), a unique solution $y\in W(0,T)$ of this problem exists. The fact that $y$ is linear in $(g,y_0)$ can easily be verified by hand. For the energy estimates (4) and (5), we start by utilizing (2) along with Young's inequality:

$$\frac{d}{dt}\|y(t)\|_H^2 = 2\langle y_t(t),y(t)\rangle_{V'\times V} = 2\Big[\langle g(t),y(t)\rangle_{V'\times V} - \langle B(t)y(t),y(t)\rangle_{V'\times V}\Big]$$

$$\le 2\Big[\frac{1}{2\alpha}\|g(t)\|_{V'}^2 + \frac{\alpha}{2}\|y(t)\|_V^2 - \alpha\|y(t)\|_V^2 + \beta\|y(t)\|_H^2\Big] = \frac{1}{\alpha}\|g(t)\|_{V'}^2 - \alpha\|y(t)\|_V^2 + 2\beta\|y(t)\|_H^2. \qquad (6)$$

This especially implies

$$\frac{d}{dt}\|y(t)\|_H^2 \le \frac{1}{\alpha}\|g(t)\|_{V'}^2 + 2\beta\|y(t)\|_H^2.$$

Utilizing Gronwall's Lemma, we obtain

$$\|y(t)\|_H^2 \le e^{2\beta t}\left(\|y_0\|_H^2 + \frac{1}{\alpha}\int_0^t\|g(s)\|_{V'}^2\,ds\right). \qquad (7)$$

We now return to (6) and integrate over $(0,T)$:

$$\|y(T)\|_H^2 - \|y(0)\|_H^2 + \alpha\int_0^T\|y(t)\|_V^2\,dt \le \frac{1}{\alpha}\int_0^T\|g(t)\|_{V'}^2\,dt + 2\beta\int_0^T\|y(t)\|_H^2\,dt.$$

By utilizing (2b) and (7), this leads to

$$\|y(T)\|_H^2 + \alpha\int_0^T\|y(t)\|_V^2\,dt \le \|y_0\|_H^2 + \frac{1}{\alpha}\|g\|_{L^2(0,T;V')}^2 + 2\beta\int_0^T e^{2\beta t}\left(\|y_0\|_H^2 + \frac{1}{\alpha}\int_0^t\|g(s)\|_{V'}^2\,ds\right)dt$$

$$\le \left(1+2\beta\int_0^T e^{2\beta t}\,dt\right)\|y_0\|_H^2 + \frac{1}{\alpha}\|g\|_{L^2(0,T;V')}^2 + \frac{2\beta}{\alpha}\int_0^T e^{2\beta t}\int_0^T\|g(s)\|_{V'}^2\,ds\,dt = e^{2\beta T}\left(\|y_0\|_H^2 + \frac{1}{\alpha}\|g\|_{L^2(0,T;V')}^2\right). \qquad (8)$$

In particular, this proves (5).

To obtain the estimate for $y_t$, let $v\in V$ be given arbitrarily with $\|v\|_V = 1$. We observe, using (2a):

$$|\langle y_t(t),v\rangle_{V'\times V}| \le \|g(t)-B(t)y(t)\|_{V'} \le \big(1+\|B\|_{L^\infty(0,T;\mathcal L(V,V'))}\big)\big(\|g(t)\|_{V'}+\|y(t)\|_V\big).$$

This implies:

$$\|y_t\|_{L^2(0,T;V')}^2 \le \underbrace{2\big(1+\|B\|_{L^\infty(0,T;\mathcal L(V,V'))}\big)^2}_{=:C_t}\Big(\|g\|_{L^2(0,T;V')}^2 + \|y\|_{L^2(0,T;V)}^2\Big),$$

and if we utilize (5), we can further deduce:

$$\|y_t\|_{L^2(0,T;V')}^2 \le C_t\left(\|g\|_{L^2(0,T;V')}^2 + \frac{e^{2\beta T}}{\alpha}\Big(\|y_0\|_H^2 + \frac{1}{\alpha}\|g\|_{L^2(0,T;V')}^2\Big)\right) = C_t\,\frac{e^{2\beta T}}{\alpha}\,\|y_0\|_H^2 + C_t\Big(1+\frac{e^{2\beta T}}{\alpha^2}\Big)\|g\|_{L^2(0,T;V')}^2$$

$$\le \underbrace{C_t\Big(1+\frac{\alpha+1}{\alpha^2}\,e^{2\beta T}\Big)}_{=:\tilde C}\Big(\|y_0\|_H^2 + \|g\|_{L^2(0,T;V')}^2\Big). \qquad (9)$$


We can easily verify that $e^{2\beta T}/\alpha\le\tilde C$. Therefore, by setting $C:=2\tilde C$, (4) holds true and $C$ depends continuously on $\alpha$, $\beta$ and $\|B\|_{L^\infty(0,T;\mathcal L(V,V'))}$.

To derive from Theorem 2.1 the solvability of equation (1) for any given control $u\in\mathcal U$, we have to make some assumptions on $A$:

2.2 Corollary. Assume that for every control $u\in\mathcal U$, there are $\alpha_u>0$ and $\beta_u\ge 0$ such that

$$\langle A(u)(t)\varphi,\varphi\rangle_{V'\times V} \ge \alpha_u\|\varphi\|_V^2 - \beta_u\|\varphi\|_H^2 \quad\text{f.a.a. } t\in(0,T),\ \text{for all }\varphi\in V. \qquad (10)$$

Then for every control $u\in\mathcal U$, there exists a unique solution $y\in W(0,T)$ of (1) which satisfies

$$\|y(T)\|_H^2 + \|y\|_{L^2(0,T;V)}^2 + \|y_t\|_{L^2(0,T;V')}^2 \le C_u\left(\|y_0\|_H^2 + \|f(u)\|_{L^2(0,T;V')}^2\right), \qquad (11)$$

where the constant $C_u$ depends continuously on $\|A(u)\|_{L^\infty(0,T;\mathcal L(V,V'))}$ as well as $\alpha_u,\beta_u$. We write $y = G(u)$ and have a solution operator $G:\mathcal U\to W(0,T)$.
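For intuition (this sketch is ours and not part of the paper), evaluating the solution operator $G$ in practice typically means discretizing (1) in space, which yields a linear ODE system $M\dot y + \mathbf A(u,t)\,y = \mathbf f(u,t)$, and integrating it with an implicit scheme. All matrices, the time grid and the control dependence below are hypothetical placeholders.

```python
import numpy as np

def solve_state(M, A_of, f_of, y0, u, T=1.0, n_t=100):
    """Implicit-Euler sketch of the control-to-state map y = G(u).

    M        : (n, n) mass matrix of the spatial Galerkin basis
    A_of(u,t): returns the (n, n) discretized operator A(u)(t)
    f_of(u,t): returns the (n,) load vector f(u)(t)
    y0       : (n,) initial coefficient vector
    u        : control, passed through to A_of and f_of
    """
    dt = T / n_t
    y = np.empty((n_t + 1, len(y0)))
    y[0] = y0
    for k in range(n_t):
        t = (k + 1) * dt
        # (M + dt*A(u)(t)) y_{k+1} = M y_k + dt*f(u)(t)
        lhs = M + dt * A_of(u, t)
        rhs = M @ y[k] + dt * f_of(u, t)
        y[k + 1] = np.linalg.solve(lhs, rhs)
    return y

# toy example: 1D heat equation with a scalar control scaling the diffusion
n = 50
h = 1.0 / (n + 1)
M = h * np.eye(n)                                    # lumped mass matrix
K = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h              # 1D stiffness matrix
y0 = np.sin(np.pi * np.arange(1, n + 1) * h)
y = solve_state(M, lambda u, t: u * K, lambda u, t: np.zeros(n), y0, u=0.1)
print(y[-1].max())
```

Implicit Euler is a common default for such parabolic sketches because it remains stable for the coercive operators considered here.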

Next we are interested in properties of the solution operator. For this we first need a result about the coercivity constants $\alpha_u,\beta_u$ in (10).

2.3 Lemma. Assume that (10) is satisfied for a control $\bar u\in\mathcal U$ and that $A$ is continuous in $\bar u$. Then for every $\varepsilon>0$, there is $\delta>0$ such that

$$\langle A(u)(t)\varphi,\varphi\rangle_{V'\times V} \ge (\alpha_{\bar u}-\varepsilon)\|\varphi\|_V^2 - \beta_{\bar u}\|\varphi\|_H^2 \quad\text{for all }\varphi\in V \qquad (12)$$

holds for all $u\in\mathcal U$ with $\|u-\bar u\|_{\mathcal U}<\delta$. In this sense, the coercivity constants $\alpha,\beta$ from (10) are continuous with respect to $u$.

Proof. We have for all $\varphi\in V$:

$$\langle A(u)(t)\varphi,\varphi\rangle_{V'\times V} = \langle A(\bar u)(t)\varphi,\varphi\rangle_{V'\times V} + \langle[A(u)-A(\bar u)](t)\varphi,\varphi\rangle_{V'\times V}$$

$$\ge \alpha_{\bar u}\|\varphi\|_V^2 - \beta_{\bar u}\|\varphi\|_H^2 - \|[A(u)-A(\bar u)](t)\|_{\mathcal L(V,V')}\cdot\|\varphi\|_V^2 \ge \big(\alpha_{\bar u} - \|A(u)-A(\bar u)\|_{L^\infty(0,T;\mathcal L(V,V'))}\big)\|\varphi\|_V^2 - \beta_{\bar u}\|\varphi\|_H^2.$$

Now, since $A$ is continuous in $\bar u$, there is $\delta>0$ such that for $\|u-\bar u\|_{\mathcal U}<\delta$, it is $\|A(u)-A(\bar u)\|_{L^\infty(0,T;\mathcal L(V,V'))}<\varepsilon$. This proves the lemma.

2.4 Lemma. Let (10) be satisfied and $A$ and $f$ be continuous mappings on $\mathcal U$. Then the solution operator $G$ is continuous. If $A$ and $f$ are locally Lipschitz continuous, then $G$ is locally Lipschitz continuous.

Proof. Consider two controls $u_1,u_2\in\mathcal U$ with $\|u_1-u_2\|_{\mathcal U}<\varepsilon$ and their according solutions $y_1,y_2\in W(0,T)$, i.e. $y_i = G(u_i)$ ($i=1,2$). Then the difference $y := y_1-y_2$ solves the following differential equation for almost all $t\in(0,T)$ in $V'$:

$$y_t(t) + A(u_1)(t)y_1(t) - A(u_2)(t)y_2(t) = f(u_1)(t) - f(u_2)(t)$$

$$\Leftrightarrow\quad y_t(t) + A(u_1)(t)y(t) = \underbrace{[f(u_1)-f(u_2)](t)}_{=:g_1(t)}\;\underbrace{-\,[A(u_1)-A(u_2)](t)\,y_2(t)}_{=:g_2(t)}$$

along with $y(0) = y_0-y_0 = 0$. We know that $g_1\in L^2(0,T;V')$. Furthermore,

$$\|g_2(t)\|_{V'} \le \|[A(u_1)-A(u_2)](t)\|_{\mathcal L(V,V')}\cdot\|y_2(t)\|_V \le \|A(u_1)-A(u_2)\|_{L^\infty(0,T;\mathcal L(V,V'))}\cdot\|y_2(t)\|_V,$$

which yields $g_2\in L^2(0,T;V')$ with

$$\|g_2\|_{L^2(0,T;V')} \le \|A(u_1)-A(u_2)\|_{L^\infty(0,T;\mathcal L(V,V'))}\cdot\|y_2\|_{L^2(0,T;V)}.$$


We can therefore apply Theorem 2.1 and obtain the estimate

$$\|y\|_{W(0,T)}^2 \le C_{A(u_1)}\Big(\|f(u_1)-f(u_2)\|_{L^2(0,T;V')} + \|A(u_1)-A(u_2)\|_{L^\infty(0,T;\mathcal L(V,V'))}\cdot\|y_2\|_{L^2(0,T;V)}\Big)^2$$

$$\le 2\,C_{A(u_2)}\Big(\|f(u_1)-f(u_2)\|_{L^2(0,T;V')} + \|A(u_1)-A(u_2)\|_{L^\infty(0,T;\mathcal L(V,V'))}\cdot\|y_2\|_{L^2(0,T;V)}\Big)^2.$$

The last inequality holds if $\varepsilon$ is small enough. This is due to the fact that by Theorem 2.1, $C_{A(u)}$ depends continuously on the coercivity constants $\alpha_u$ and $\beta_u$ as well as $\|A(u)\|_{L^\infty(0,T;\mathcal L(V,V'))}$. It was shown in Lemma 2.3 that these parameters can be seen as depending continuously on $u$, which makes the constant $C_{A(u)}$ depend continuously on $u$. By the continuity of $f$ and $A$ from $\mathcal U$, we can see that $\|y\|_{W(0,T)}\to 0$ as $u_1\to u_2$, implying that $G$ is continuous in $u_2$. If, in addition, $f$ and $A$ are locally Lipschitz continuous, we obtain for small enough $\varepsilon$:

$$\|y\|_{W(0,T)}^2 \le 2\,C_{A(u_2)}\Big(L_f(u_2) + L_A(u_2)\,\|y_2\|_{L^2(0,T;V)}\Big)^2\cdot\|u_1-u_2\|_{\mathcal U}^2,$$

where $L_f(u_2)$ and $L_A(u_2)$ are the local Lipschitz constants in $u_2$. Therefore, $G$ is locally Lipschitz continuous in $u_2$.

2.5 Lemma. Let (10) be satisfied and the mappings $A$ and $f$ be Fréchet differentiable (in $u$). Then the solution operator $G$ is Fréchet differentiable and its derivative is given by $G'(u)h = y^h$ for $u,h\in\mathcal U$, where $y^h\in W(0,T)$ satisfies the system

$$y_t^h(t) + A(u)(t)\,y^h(t) = (f'(u)h)(t) - (A'(u)h)(t)\,\bar y(t) \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad (13a)$$

$$y^h(0) = 0 \quad\text{in } H, \qquad (13b)$$

and $\bar y$ is the solution for the control $u$, i.e. $\bar y = G(u)$.

Proof. Consider a control $u\in\mathcal U$ and a direction $h\in\mathcal U$. We define $y := G(u+h)-G(u)$. We proceed similarly to the proof of Lemma 2.4 and observe that $y$ satisfies the system

$$y_t(t) + A(u)(t)y(t) = [f(u+h)-f(u)](t) - [A(u+h)-A(u)](t)\,[G(u+h)(t)] \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad (14a)$$

$$y(0) = 0 \quad\text{in } H. \qquad (14b)$$

Now, since $f$ and $A$ are Fréchet differentiable, this means

$$f(u+h)-f(u) = f'(u)h + r_f(u,h), \qquad A(u+h)-A(u) = A'(u)h + r_A(u,h)$$

with $r_f(u,h)\in L^2(0,T;V')$, $r_A(u,h)\in L^\infty(0,T;\mathcal L(V,V'))$ such that

$$\frac{\|r_f(u,h)\|_{L^2(0,T;V')}}{\|h\|_{\mathcal U}}\to 0, \qquad \frac{\|r_A(u,h)\|_{L^\infty(0,T;\mathcal L(V,V'))}}{\|h\|_{\mathcal U}}\to 0 \qquad\text{as } \|h\|_{\mathcal U}\to 0. \qquad (15)$$

Inserting this into the right-hand side of (14a) yields

$$[f(u+h)-f(u)](t) - [A(u+h)-A(u)](t)[G(u+h)(t)] = [f'(u)h + r_f(u,h)](t) - [A'(u)h + r_A(u,h)](t)[G(u+h)(t)]$$

$$= \big[(f'(u)h)(t) - (A'(u)h)(t)G(u)(t)\big] + \big[r_f(u,h)(t) - r_A(u,h)(t)G(u+h)(t) - (A'(u)h)(t)(G(u+h)-G(u))(t)\big].$$

So now $y$ solves the system

$$y_t(t) + A(u)(t)y(t) = g_1(t) + g_2(t) \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad (16a)$$

$$y(0) = 0 \quad\text{in } H, \qquad (16b)$$

with

$$g_1(t) := (f'(u)h)(t) - (A'(u)h)(t)G(u)(t),$$

$$g_2(t) := r_f(u,h)(t) - r_A(u,h)(t)G(u+h)(t) - (A'(u)h)(t)(G(u+h)-G(u))(t).$$


Note that the term $y = G(u+h)-G(u)$ still appears on the right-hand side of (16a) within $g_2$, yet is treated as an inhomogeneity of (1a). This is possible since we already know that $y$ exists and is a solution to (16). It is straightforward to show that $g_1,g_2\in L^2(0,T;V')$, so Theorem 2.1 is applicable. Therefore, the solution of this equation depends linearly on the right-hand side of (16a), meaning that $y$ allows for the decomposition $y = y^h + y^\delta$, where $y^h,y^\delta\in W(0,T)$ satisfy the systems

$$y_t^h(t) + A(u)(t)y^h(t) = g_1(t) \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad y^h(0) = 0 \quad\text{in } H,$$

as well as

$$y_t^\delta(t) + A(u)(t)y^\delta(t) = g_2(t) \quad\text{in } V' \text{ f.a.a. } t\in(0,T), \qquad y^\delta(0) = 0 \quad\text{in } H.$$

We want to show that $y^h = G'(u)h$. For this, we have to show that $y^h$ is linear and continuous in $h$ and that $\|y^\delta\|_{W(0,T)}/\|h\|_{\mathcal U}\to 0$ as $\|h\|_{\mathcal U}\to 0$.

We start with $y^h$. The linearity follows directly since $g_1$ depends linearly on $h$ and $y^h$ depends linearly on $g_1$, as was shown in Theorem 2.1. For the continuity, we infer from (4) that

$$\|y^h\|_{W(0,T)}^2 \le C_{A(u)}\,\|f'(u)h - (A'(u)h)(\cdot)G(u)(\cdot)\|_{L^2(0,T;V')}^2 \le C_{A(u)}\Big(\|f'(u)\|_{\mathcal L(\mathcal U,L^2(0,T;V'))} + \|A'(u)\|_{\mathcal L(\mathcal U,L^\infty(0,T;\mathcal L(V,V')))}\cdot\|G(u)\|_{L^2(0,T;V)}\Big)^2\cdot\|h\|_{\mathcal U}^2. \qquad (17)$$

The above estimate shows that $y^h$ is continuous in $h=0$ which, along with the linearity in $h$, implies that $y^h$ is continuous in $h$ everywhere. Altogether, we have shown that

$$(h\mapsto y^h)\in\mathcal L(\mathcal U,W(0,T)).$$

We now turn to $y^\delta$. Again from the energy estimate (4), we infer that

$$\|y^\delta\|_{W(0,T)}^2 \le C_{A(u)}\,\|r_f(u,h) - r_A(u,h)(\cdot)G(u+h)(\cdot) - (A'(u)h)(\cdot)(G(u+h)-G(u))(\cdot)\|_{L^2(0,T;V')}^2$$

$$\le C_{A(u)}\Big(\|r_f(u,h)\|_{L^2(0,T;V')} + \|r_A(u,h)\|_{L^\infty(0,T;\mathcal L(V,V'))}\cdot\|G(u+h)\|_{L^2(0,T;V)} + \|A'(u)\|_{\mathcal L(\mathcal U,L^\infty(0,T;\mathcal L(V,V')))}\cdot\|h\|_{\mathcal U}\cdot\|G(u+h)-G(u)\|_{L^2(0,T;V)}\Big)^2.$$

This yields

$$\left(\frac{\|y^\delta\|_{W(0,T)}}{\|h\|_{\mathcal U}}\right)^2 \le C_{A(u)}\left(\frac{\|r_f(u,h)\|_{L^2(0,T;V')}}{\|h\|_{\mathcal U}} + \frac{\|r_A(u,h)\|_{L^\infty(0,T;\mathcal L(V,V'))}}{\|h\|_{\mathcal U}}\cdot\|G(u+h)\|_{L^2(0,T;V)} + \|A'(u)\|_{\mathcal L(\mathcal U,L^\infty(0,T;\mathcal L(V,V')))}\cdot\|G(u+h)-G(u)\|_{L^2(0,T;V)}\right)^2.$$

Again with the fact that $G$ is continuous in $u$ and (15), we obtain $\|y^\delta\|_{W(0,T)}/\|h\|_{\mathcal U}\to 0$ as $\|h\|_{\mathcal U}\to 0$. This shows that the derivative of $G$ at $u$ in the direction $h$ is indeed given by $y^h$, and we can write $G'(u)h = y^h$.
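As an illustration of Lemma 2.5 (our own sketch, not taken from the paper), the directional derivative $y^h = G'(u)h$ can be approximated by solving the discretized sensitivity system (13) with the same implicit-Euler idea as above; `dA_of` and `df_of` stand for hypothetical user-supplied directional derivatives of $A$ and $f$.

```python
import numpy as np

def solve_sensitivity(M, A_of, dA_of, df_of, y_bar, u, h, T=1.0, n_t=100):
    """Implicit-Euler sketch for y^h = G'(u)h from system (13):
        y^h_t + A(u) y^h = f'(u)h - (A'(u)h) y_bar,   y^h(0) = 0.

    y_bar          : (n_t+1, n) precomputed state trajectory G(u)
    dA_of(u, h, t) : directional derivative (A'(u)h)(t) as an (n, n) matrix
    df_of(u, h, t) : directional derivative (f'(u)h)(t) as an (n,) vector
                     (both are hypothetical callables supplied by the user)
    """
    dt = T / n_t
    n = y_bar.shape[1]
    yh = np.zeros((n_t + 1, n))              # (13b): homogeneous initial value
    for k in range(n_t):
        t = (k + 1) * dt
        rhs_vec = df_of(u, h, t) - dA_of(u, h, t) @ y_bar[k + 1]
        lhs = M + dt * A_of(u, t)
        yh[k + 1] = np.linalg.solve(lhs, M @ yh[k] + dt * rhs_vec)
    return yh
```

A standard check is to compare the result with the difference quotient $(G(u+\varepsilon h)-G(u))/\varepsilon$ for a few values of $\varepsilon$.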

3 Quadratic cost functions

In this section, let us assume that we are given a cost function of the form

$$\hat J:\mathcal U\to\mathbb R, \qquad \hat J(u) := \tfrac12\|\Phi G(u)-y_d\|_X^2, \qquad (18)$$


where $G:\mathcal U\to W(0,T)$ is the solution operator, $X$ is a Hilbert space and $\Phi\in\mathcal L(W(0,T),X)$ is an observation operator. The vector $y_d\in X$ is called a desired state. We would like to derive a gradient representation of $\hat J$. First of all, we observe that we can use the decomposition $\hat J = J\circ G$ with the non-reduced cost function

$$J:W(0,T)\to\mathbb R, \qquad J(y) := \tfrac12\|\Phi y-y_d\|_X^2.$$

The derivative of this function is easy to compute:

3.1 Lemma. The function $J$ is Fréchet differentiable and the derivative is given by

$$J'(y)h_y = \langle\Phi y-y_d,\Phi h_y\rangle_X \quad\text{for all } y,h_y\in W(0,T). \qquad (19)$$

Proof. Let $y,h_y\in W(0,T)$ be arbitrarily given. Then we observe

$$J(y+h_y)-J(y) = \tfrac12\big(\|\Phi y+\Phi h_y-y_d\|_X^2 - \|\Phi y-y_d\|_X^2\big) = \tfrac12\big(\|\Phi y-y_d\|_X^2 + 2\langle\Phi y-y_d,\Phi h_y\rangle_X + \|\Phi h_y\|_X^2 - \|\Phi y-y_d\|_X^2\big)$$

$$= \langle\Phi y-y_d,\Phi h_y\rangle_X + \tfrac12\|\Phi h_y\|_X^2,$$

and clearly, $h_y\mapsto\langle\Phi y-y_d,\Phi h_y\rangle_X$ is a linear and continuous mapping from $W(0,T)$ to $\mathbb R$. Furthermore,

$$\frac{\tfrac12\|\Phi h_y\|_X^2}{\|h_y\|_{W(0,T)}} \le \tfrac12\|\Phi\|_{\mathcal L(W(0,T),X)}^2\cdot\|h_y\|_{W(0,T)} \to 0 \quad\text{as } \|h_y\|_{W(0,T)}\to 0.$$

This implies that $J$ is differentiable in $y$ with the proposed derivative.

Using Lemma 3.1, we can immediately see that $\hat J$ is differentiable:

3.2 Corollary. Let (10) be satisfied and the mappings $A$ and $f$ be Fréchet differentiable from $\mathcal U$. Then the cost function $\hat J$ is Fréchet differentiable from $\mathcal U$ to $\mathbb R$ and the derivative is given by

$$\hat J'(u)h = \langle\Phi\bar y-y_d,\Phi y^h\rangle_X, \qquad (20)$$

where $\bar y := G(u)\in W(0,T)$ is the state solution and $y^h := G'(u)h$ the solution to (13).

Proof. Seeing as $\hat J = J\circ G$, and that $J$ and $G$ themselves are differentiable as was shown in Lemmata 2.5 and 3.1, $\hat J$ is differentiable by the chain rule and the derivative is given by

$$\hat J'(u)h = J'(G(u))[G'(u)h] \quad\text{for all } u,h\in\mathcal U.$$

It was shown in Lemma 2.5 that $y^h := G'(u)h$ satisfies the system (13). Inserting this into the representation (19), we end up with

$$\hat J'(u)h = \langle\Phi G(u)-y_d,\Phi y^h\rangle_X. \qquad (21)$$

We have given a representation for the derivative of the reduced cost function. For the purposes of optimization, the notion of a gradient is additionally required:

3.3 Corollary. Let (10) be satisfied and the mappings $A$ and $f$ be Fréchet differentiable. Then the cost function $\hat J$ has a gradient, which is given by

$$\nabla\hat J:\mathcal U\to\mathcal U, \qquad \nabla\hat J(u) = (\Phi G'(u))^\star(\Phi G(u)-y_d), \qquad (22)$$

where $(\Phi G'(u))^\star\in\mathcal L(X,\mathcal U)$ is the Hilbert adjoint of the operator $\Phi G'(u)\in\mathcal L(\mathcal U,X)$.

Proof. Follows directly from the representation (21) by definition of the Hilbert adjoint.
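A common sanity check for the derivative representation (20) (again our addition, not part of the paper) is to compare it against central difference quotients of the reduced cost; `J_hat` and `J_hat_prime` are hypothetical callables implementing $\hat J$ and $\hat J'(u)h$ for a discretized control.

```python
import numpy as np

def gradient_check(J_hat, J_hat_prime, u, h, eps_list=(1e-2, 1e-3, 1e-4)):
    """Compare the derivative (20) with central difference quotients of J_hat.

    J_hat(u)          : evaluates the reduced cost functional
    J_hat_prime(u, h) : evaluates the derivative from (20) in direction h
    u, h              : arrays representing a control and a direction
    """
    dJ = J_hat_prime(u, h)
    for eps in eps_list:
        fd = (J_hat(u + eps * h) - J_hat(u - eps * h)) / (2 * eps)
        print(f"eps={eps:.0e}  analytic={dJ:+.6e}  central FD={fd:+.6e}  "
              f"abs. error={abs(dJ - fd):.2e}")
```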


4 Reduced Order Modelling

In this chapter, we assume that we are given a finite-dimensional subspace $V^\ell\subset V$ which is spanned by a $V$-orthonormal basis $\{\varphi_1,\dots,\varphi_\ell\}$. We employ a Galerkin projection of (1) onto $V^\ell$ and look for a function $y^\ell$ satisfying

$$\langle y_t^\ell(t),\varphi\rangle_{V'\times V} + \langle A(u)(t)y^\ell(t),\varphi\rangle_{V'\times V} = \langle f(u)(t),\varphi\rangle_{V'\times V} \quad\text{f.a.a. } t\in(0,T),\ \text{for all }\varphi\in V^\ell, \qquad (23a)$$

$$y^\ell(0) = \mathcal P^\ell y_0 \quad\text{in } H, \qquad (23b)$$

where $\mathcal P^\ell\in\mathcal L(V)$ is a projection operator onto $V^\ell$. We express $y^\ell$ through the basis of $V^\ell$:

4.1 Lemma. Assume that we are given a coefficient function $a^\ell\in H^1(0,T;\mathbb R^\ell)$ such that

$$y^\ell:(0,T)\to V, \qquad y^\ell(t) := \sum_{i=1}^\ell a_i^\ell(t)\,\varphi_i \quad\text{f.a.a. } t\in(0,T). \qquad (24)$$

Then it is $y^\ell\in H^1(0,T;V)\subset W(0,T)$ with derivative $y_t^\ell(t) = \sum_{i=1}^\ell \dot a_i^\ell(t)\,\varphi_i$ for almost all $t\in(0,T)$.

Proof. Since $W(0,T) = L^2(0,T;V)\cap H^1(0,T;V')$, we proceed in two steps:

(i) For almost all $t\in(0,T)$, we have

$$\|y^\ell(t)\|_V \le \sum_{i=1}^\ell |a_i^\ell(t)|\cdot\|\varphi_i\|_V \le \Big(\sum_{i=1}^\ell |a_i^\ell(t)|^2\Big)^{1/2}\cdot\underbrace{\Big(\sum_{i=1}^\ell\|\varphi_i\|_V^2\Big)^{1/2}}_{=:C_V} = C_V\,\|a^\ell(t)\|_{\mathbb R^\ell}$$

and, since $a^\ell\in L^2(0,T;\mathbb R^\ell)$, this yields $y^\ell\in L^2(0,T;V)$.

(ii) For almost all $t\in(0,T)$, it holds that

$$\Big\|y^\ell(t+h)-y^\ell(t)-h\sum_{i=1}^\ell \dot a_i^\ell(t)\varphi_i\Big\|_V \le \sum_{i=1}^\ell\big|a_i^\ell(t+h)-a_i^\ell(t)-\dot a_i^\ell(t)h\big|\cdot\|\varphi_i\|_V \le C_V\,\big\|a^\ell(t+h)-a^\ell(t)-\dot a^\ell(t)h\big\|_{\mathbb R^\ell},$$

which implies

$$\frac{\big\|y^\ell(t+h)-y^\ell(t)-h\sum_{i=1}^\ell \dot a_i^\ell(t)\varphi_i\big\|_V}{|h|} \le C_V\,\frac{\big\|a^\ell(t+h)-a^\ell(t)-\dot a^\ell(t)h\big\|_{\mathbb R^\ell}}{|h|} \to 0 \quad\text{as } |h|\to 0.$$

Furthermore, the mapping $h\mapsto h\sum_{i=1}^\ell \dot a_i^\ell(t)\varphi_i$ is obviously linear and continuous from $\mathbb R$ to $V$.

So indeed $y^\ell\in H^1(0,T;V)$ with the proposed derivative.

Inserting $y^\ell$ from (24) into (23) yields a system for $a^\ell$:

$$\dot a^\ell(t) + A^\ell(u)(t)\,a^\ell(t) = f^\ell(u)(t) \quad\text{in } \mathbb R^\ell \text{ f.a.a. } t\in(0,T), \qquad (25a)$$

$$a^\ell(0) = a_0 \quad\text{in } \mathbb R^\ell, \qquad (25b)$$

where $a_0\in\mathbb R^\ell$ is the basis representation of $\mathcal P^\ell y_0$ in the basis of $V^\ell$. Furthermore, $A^\ell(u):(0,T)\to\mathbb R^{\ell\times\ell}$ and $f^\ell(u):(0,T)\to\mathbb R^\ell$ for all $u\in\mathcal U$ with

$$A_{ij}^\ell(u)(t) = \langle A(u)(t)\varphi_j,\varphi_i\rangle_{V'\times V}, \qquad f_i^\ell(u)(t) = \langle f(u)(t),\varphi_i\rangle_{V'\times V} \quad\text{for all } u\in\mathcal U,\ \text{f.a.a. } t\in(0,T).$$
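A possible numerical realization of the reduced system (25) (our sketch; the basis matrix, the full-order matrices and the projection used for $a_0$ are assumptions) assembles $A^\ell(u)(t)$ and $f^\ell(u)(t)$ by projection and integrates the $\ell$-dimensional ODE with implicit Euler:

```python
import numpy as np

def solve_rom(Phi, M, A_full, f_full, y0, u, T=1.0, n_t=100):
    """Sketch of the reduced system (25): project onto the columns of Phi
    and integrate the resulting ell-dimensional ODE with implicit Euler.

    Phi         : (n, ell) matrix of reduced basis vectors
    M           : (n, n) full-order mass matrix
    A_full(u,t) : (n, n) full-order Galerkin matrix of A(u)(t)
    f_full(u,t) : (n,) full-order load vector of f(u)(t)
    """
    dt = T / n_t
    M_r = Phi.T @ M @ Phi                     # reduced mass matrix
    a = np.empty((n_t + 1, Phi.shape[1]))
    # one possible discrete projection of y0 (M-weighted least squares)
    a[0] = np.linalg.solve(M_r, Phi.T @ (M @ y0))
    for k in range(n_t):
        t = (k + 1) * dt
        A_r = Phi.T @ A_full(u, t) @ Phi      # A^l_ij = <A(u) phi_j, phi_i>
        f_r = Phi.T @ f_full(u, t)            # f^l_i  = <f(u), phi_i>
        a[k + 1] = np.linalg.solve(M_r + dt * A_r, M_r @ a[k] + dt * f_r)
    return a, a @ Phi.T                       # coefficients and lifted states
```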


4.2 Lemma. The system (25) admits a unique solution $a^\ell\in H^1(0,T;\mathbb R^\ell)$. The function $y^\ell$ from (24) then solves the system (23) and is its unique solution. We define the solution operator $G^\ell:\mathcal U\to W(0,T)$, $u\mapsto y^\ell$.

Proof. The evolution equation (25a) can equivalently be understood in $(\mathbb R^\ell)'$ since $\mathbb R^\ell\cong(\mathbb R^\ell)'$. We will therefore utilize the trivial Gelfand triple $W(0,T;\mathbb R^\ell;\mathbb R^\ell)$, i.e. $W(0,T;H;V)$ where $H=V=\mathbb R^\ell$. Furthermore, it is

$$\|A^\ell(u)(t)\|_{\mathcal L(\mathbb R^\ell,\mathbb R^\ell)} \le C\max_{i,j=1,\dots,\ell}\big|A_{ij}^\ell(u)(t)\big| \le C\,\|A(u)(t)\|_{\mathcal L(V,V')} \le C\,\|A(u)\|_{L^\infty(0,T;\mathcal L(V,V'))},$$

where we have used the fact that in the finite-dimensional space $\mathcal L(\mathbb R^\ell,\mathbb R^\ell)$ all norms are equivalent and that the system $\{\varphi_1,\dots,\varphi_\ell\}$ is $V$-orthonormal. Therefore, we have shown that $A^\ell(u)\in L^\infty(0,T;\mathcal L(\mathbb R^\ell,(\mathbb R^\ell)'))$ for every $u\in\mathcal U$. Furthermore, it is easy to show that $A^\ell(u)$ satisfies (10) due to the fact that $\mathbb R^\ell$ is finite-dimensional and for all $u\in\mathcal U$ we have

$$\big|A_{ij}^\ell(u)(t)\big| \le \|A(u)(t)\|_{\mathcal L(V,V')} \le \|A(u)\|_{L^\infty(0,T;\mathcal L(V,V'))} \quad\text{for all } i,j=1,\dots,\ell,\ \text{f.a.a. } t\in(0,T).$$

At last, we have

$$\big\|f^\ell(u)(t)\big\|_{(\mathbb R^\ell)'}^2 = \big\|f^\ell(u)(t)\big\|_{\mathbb R^\ell}^2 = \sum_{i=1}^\ell\big|\langle f(u)(t),\varphi_i\rangle_{V'\times V}\big|^2 \le C_V^2\,\|f(u)(t)\|_{V'}^2$$

with the constant $C_V$ from Lemma 4.1. Therefore, $f^\ell(u)\in L^2(0,T;(\mathbb R^\ell)')$. Utilizing Corollary 2.2, this means that (25) admits a unique solution $a^\ell\in W(0,T;\mathbb R^\ell;(\mathbb R^\ell)')$. Due to the fact that $(\mathbb R^\ell)'\cong\mathbb R^\ell$, this also means that $a^\ell\in H^1(0,T;\mathbb R^\ell)$. By Lemma 4.1, this implies $y^\ell\in H^1(0,T;V)$ and it satisfies (23).

The fact that (23) has a unique solution can be proven in the very same way as in Section 2, along with every other result in that section, by replacing the space $V$ with $V^\ell$.

4.3 Theorem. Assume that (10) holds for every control $u\in\mathcal U$ and let $y\in W(0,T)$ satisfy the full system (1) and $y^\ell\in W(0,T)$ the reduced system (23). Then the following a-posteriori error estimate holds true:

$$\|y(T)-y^\ell(T)\|_H^2 + \|y-y^\ell\|_{L^2(0,T;V)}^2 \le \frac{e^{2\beta_u T}}{\alpha_u}\left(\|(1-\mathcal P^\ell)y_0\|_H^2 + \frac{1}{\alpha_u}\|R^\ell\|_{L^2(0,T;V')}^2\right), \qquad (26)$$

where $\alpha_u,\beta_u$ are the coercivity constants of the operator $A(u)$ from (10) and the residual $R^\ell\in L^2(0,T;V')$ is given by

$$R^\ell(t) = y_t^\ell(t) + A(u)(t)y^\ell(t) - f(u)(t) \;\in V' \quad\text{f.a.a. } t\in(0,T). \qquad (27)$$

Proof. We define the error $e := y-y^\ell\in W(0,T)$ and observe that for every $\varphi\in V$, it holds:

$$\langle e_t(t),\varphi\rangle_{V'\times V} + \langle A(u)(t)e(t),\varphi\rangle_{V'\times V} = \langle f(u)(t),\varphi\rangle_{V'\times V} - \langle y_t^\ell(t),\varphi\rangle_{V'\times V} - \langle A(u)(t)y^\ell(t),\varphi\rangle_{V'\times V} = -\langle R^\ell(t),\varphi\rangle_{V'\times V},$$

$$e(0) = (1-\mathcal P^\ell)y_0 \quad\text{in } H.$$

It follows from (5) applied to $e$ that

$$\|e(T)\|_H^2 + \|e\|_{L^2(0,T;V)}^2 \le \frac{e^{2\beta_u T}}{\alpha_u}\left(\|(1-\mathcal P^\ell)y_0\|_H^2 + \frac{1}{\alpha_u}\|R^\ell\|_{L^2(0,T;V')}^2\right).$$
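A discrete version of the bound (26) (our sketch, with several simplifications: the residual (27) is evaluated on a time grid with a backward difference for $y_t^\ell$, and the $V'$-norm is approximated through the inverse Gramian of a discrete $V$ inner product) could look as follows; all inputs are hypothetical.

```python
import numpy as np

def error_bound(Phi, a, M, V_ip, A_full, f_full, y0, u,
                alpha_u, beta_u, T=1.0):
    """Sketch of the a-posteriori bound (26) for a ROM trajectory.

    a               : (n_t+1, ell) reduced coefficients, e.g. from solve_rom
    M               : (n, n) mass matrix (discrete H inner product)
    V_ip            : (n, n) matrix of the discrete V inner product
    A_full, f_full  : full-order Galerkin matrix/vector of A(u)(t), f(u)(t)
    alpha_u, beta_u : coercivity constants of A(u) from (10)
    """
    n_t = a.shape[0] - 1
    dt = T / n_t
    Y = a @ Phi.T                                    # (n_t+1, n) lifted states
    res_sq = 0.0
    for k in range(n_t):
        t = (k + 1) * dt
        dYdt = (Y[k + 1] - Y[k]) / dt                # backward difference y^l_t
        # residual (27) tested against the full basis
        r = M @ dYdt + A_full(u, t) @ Y[k + 1] - f_full(u, t)
        # ||R^l(t)||_{V'}^2 ~ r^T V_ip^{-1} r, rectangle rule in time
        res_sq += dt * r @ np.linalg.solve(V_ip, r)
    e0 = y0 - Y[0]                                   # initial error (1 - P^l) y0
    init_sq = e0 @ (M @ e0)                          # squared H-norm of e0
    return np.exp(2 * beta_u * T) / alpha_u * (init_sq + res_sq / alpha_u)
```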

In addition to a quadratic cost function $\hat J$ as defined in (18), we define the according reduced-order cost function

$$\hat J^\ell:\mathcal U\to\mathbb R, \qquad \hat J^\ell(u) := J(G^\ell(u)) = \tfrac12\|\Phi G^\ell(u)-y_d\|_X^2.$$

We will use the inequality (26) to estimate the error made in the cost function between the full and reduced-order model:


4.4 Corollary. Assume that (10) holds for every control $u\in\mathcal U$ and consider a quadratic cost function $\hat J$ as defined in (18). Then the following estimate holds for the cost function:

$$\big|\hat J(u)-\hat J^\ell(u)\big| \le \tfrac12\|\Phi(y-y^\ell)\|_X^2 + \|\Phi(y-y^\ell)\|_X\cdot\sqrt{2\hat J^\ell(u)}. \qquad (28)$$

If either $\Phi y = y\in L^2(0,T;L^2(\Omega))$ or $\Phi y = y(T)\in L^2(\Omega)$, we can use estimate (26) for $\|\Phi(y-y^\ell)\|_X$.

Proof. We begin by utilizing the third binomial formula:

$$\big|\hat J(u)-\hat J^\ell(u)\big| = \tfrac12\big|\|\Phi G(u)-y_d\|_X^2 - \|\Phi G^\ell(u)-y_d\|_X^2\big| \le \tfrac12\|\Phi y-y_d-(\Phi y^\ell-y_d)\|_X\cdot\big(\|\Phi y-y_d\|_X + \|\Phi y^\ell-y_d\|_X\big)$$

$$\le \tfrac12\|\Phi(y-y^\ell)\|_X\cdot\big(\|\Phi(y-y^\ell)\|_X + 2\|\Phi y^\ell-y_d\|_X\big) = \tfrac12\|\Phi(y-y^\ell)\|_X^2 + \|\Phi(y-y^\ell)\|_X\cdot\sqrt{2\hat J^\ell(u)}.$$
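The bound (28) is cheap to evaluate once a bound on $\|\Phi(y-y^\ell)\|_X$ is available, e.g. from (26); a direct transcription (ours) reads:

```python
import numpy as np

def cost_error_bound(phi_err_norm, J_red):
    """Sketch of the cost-function bound (28):
    |J(u) - J^l(u)| <= 0.5*||Phi(y-y^l)||_X^2 + ||Phi(y-y^l)||_X*sqrt(2*J^l(u)).

    phi_err_norm : an (estimated) bound on ||Phi(y - y^l)||_X,
                   e.g. obtained from the state bound (26)
    J_red        : value of the reduced cost J^l(u)
    """
    return 0.5 * phi_err_norm**2 + phi_err_norm * np.sqrt(2.0 * J_red)
```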

In addition to an estimate for the cost function, we require one for the gradient. However, this will depend strongly on the concrete parabolic system and the cost function, which we will cover later.

5 Controlling a convection term

Consider the equation

$$y_t(t,x) - \kappa\Delta y(t,x) + v(u)(t,x)\cdot\nabla y(t,x) = 0 \quad\text{in } Q := (0,T)\times\Omega, \qquad (29a)$$

$$\frac{\partial y}{\partial n}(t,x) = 0 \quad\text{on } \Sigma := (0,T)\times\Gamma, \qquad (29b)$$

$$y(0) = y_0 \quad\text{in } \Omega, \qquad (29c)$$

where $\Omega\subset\mathbb R^d$ is a bounded Lipschitz domain with boundary $\Gamma$. We consider the Gelfand triple with $V := H^1(\Omega)$, $H := L^2(\Omega)$. For the controlled convection term $v$, we demand that $v:\mathcal U\to L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))$. As an application of (29), one can for example think of two different fluids that are supposed to be mixed by steering the rotation velocity of mixers in the domain.

To derive the weak formulation of (29), we assume that $y(t)\in H^1(\Omega)$ and $y_t(t)\in(H^1(\Omega))'$ and then 'test' (29a) with a function $\psi\in H^1(\Omega)$, which yields, using Green's identity:

$$\langle y_t(t),\psi\rangle_{(H^1(\Omega))'\times H^1(\Omega)} + \kappa\int_\Omega\nabla y(t,x)\cdot\nabla\psi(x)\,dx + \int_\Omega\big(v(u)(t,x)\cdot\nabla y(t,x)\big)\psi(x)\,dx = 0.$$

This motivates us to define, for $u\in\mathcal U$ and $t\in(0,T)$:

$$A(u)(t)\in\mathcal L(H^1(\Omega),(H^1(\Omega))'), \qquad \langle A(u)(t)\varphi,\psi\rangle_{V'\times V} := \kappa\int_\Omega\nabla\varphi(x)\cdot\nabla\psi(x)\,dx + \int_\Omega\big(v(u)(t,x)\cdot\nabla\varphi(x)\big)\psi(x)\,dx. \qquad (30)$$

Of course, we still have to show that this is indeed an element of $\mathcal L(V,V')$. Basic estimates reveal that

$$\big|\langle A(u)(t)\varphi,\psi\rangle_{(H^1(\Omega))'\times H^1(\Omega)}\big| \le \kappa\|\varphi\|_{H^1(\Omega)}\cdot\|\psi\|_{H^1(\Omega)} + \|v(u)(t,\cdot)\|_{L^\infty(\Omega;\mathbb R^d)}\cdot\|\varphi\|_{H^1(\Omega)}\cdot\|\psi\|_{H^1(\Omega)}$$

and therefore

$$\|A(u)(t)\|_{\mathcal L(V,V')} \le \kappa + \|v(u)(t,\cdot)\|_{L^\infty(\Omega;\mathbb R^d)}. \qquad (31)$$

We can therefore write (29) in the form of (1):

$$y_t(t) + A(u)(t)y(t) = 0 \quad\text{in } (H^1(\Omega))' \text{ f.a.a. } t\in(0,T), \qquad y(0) = y_0 \quad\text{in } L^2(\Omega). \qquad (32)$$
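For a concrete feeling of the operator (30) (our sketch; the paper works with the weak form on a general Lipschitz domain, whereas this is only a crude one-dimensional finite-difference analogue with homogeneous Neumann ends), one could assemble the matrix of $-\kappa\Delta + v\cdot\nabla$ as follows:

```python
import numpy as np

def assemble_advection_diffusion(n, kappa, v_nodes):
    """1D finite-difference sketch of the operator A(u)(t) from (30) on (0,1):
    diffusion -kappa*y'' plus advection v*y', homogeneous Neumann ends.

    n       : number of grid points
    kappa   : diffusion coefficient
    v_nodes : (n,) velocity field v(u)(t, x_i) at the grid points
    """
    h = 1.0 / (n - 1)
    # diffusion part: -kappa * second difference with mirrored (Neumann) ends
    K = kappa / h**2 * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
    K[0, 1] = K[-1, -2] = -2 * kappa / h**2
    # advection part: v * central first difference (one-sided at the ends)
    C = np.zeros((n, n))
    for i in range(1, n - 1):
        C[i, i - 1], C[i, i + 1] = -v_nodes[i] / (2 * h), v_nodes[i] / (2 * h)
    C[0, 0], C[0, 1] = -v_nodes[0] / h, v_nodes[0] / h
    C[-1, -2], C[-1, -1] = -v_nodes[-1] / h, v_nodes[-1] / h
    return K + C
```

With finite differences the mass matrix is simply the identity, so the returned matrix can be plugged into a time stepper like the `solve_state` sketch from Section 2 with `M = np.eye(n)`.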


5.1 Properties of the bilinear form

We start by showing some elementary properties of the bilinear form $A$:

5.1 Lemma. Consider the operator $A$ as defined by (30). Then it is

$$A(u)\in L^\infty(0,T;\mathcal L(V,V')) \quad\text{for all } u\in\mathcal U, \qquad (33)$$

and condition (10) holds for $A$ with the coercivity constants

$$\alpha_u = \frac{\kappa}{2}, \qquad \beta_u = \frac{\kappa}{2} + \frac{\|v(u)\|_{L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))}^2}{2\kappa}. \qquad (34)$$

Proof. The fact that (33) holds can be seen from (31), which implies

$$\|A(u)\|_{L^\infty(0,T;\mathcal L(V,V'))} \le \kappa + \|v(u)\|_{L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))} \quad\text{for all } u\in\mathcal U.$$

To prove that (10) holds, let $u\in\mathcal U$ be arbitrary but fixed. We then observe

$$\langle A(u)(t)\varphi,\varphi\rangle_{V'\times V} = \underbrace{\kappa\int_\Omega\nabla\varphi(x)\cdot\nabla\varphi(x)\,dx}_{=:(\mathrm I)} + \underbrace{\int_\Omega\big(v(u)(t,x)\cdot\nabla\varphi(x)\big)\varphi(x)\,dx}_{=:(\mathrm{II})}.$$

For the first term, we simply obtain

$$(\mathrm I) = \kappa\big(\|\varphi\|_{H^1(\Omega)}^2 - \|\varphi\|_{L^2(\Omega)}^2\big).$$

The second term can be estimated using Young's inequality as follows (we abbreviate $\|v(u)\| := \|v(u)\|_{L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))}$):

$$(\mathrm{II}) \ge -\|v(u)\|\cdot\|\nabla\varphi\|_{L^2(\Omega;\mathbb R^d)}\cdot\|\varphi\|_{L^2(\Omega)} \ge -\|v(u)\|\left(\frac{\kappa}{2\|v(u)\|}\|\nabla\varphi\|_{L^2(\Omega;\mathbb R^d)}^2 + \frac{\|v(u)\|}{2\kappa}\|\varphi\|_{L^2(\Omega)}^2\right)$$

$$= -\frac{\kappa}{2}\big(\|\varphi\|_{H^1(\Omega)}^2 - \|\varphi\|_{L^2(\Omega)}^2\big) - \frac{\|v(u)\|^2}{2\kappa}\|\varphi\|_{L^2(\Omega)}^2.$$

Adding (I) and (II) again, we end up with

$$\langle A(u)(t)\varphi,\varphi\rangle_{V'\times V} \ge \frac{\kappa}{2}\|\varphi\|_{H^1(\Omega)}^2 - \left(\frac{\kappa}{2} + \frac{\|v(u)\|^2}{2\kappa}\right)\|\varphi\|_{L^2(\Omega)}^2,$$

so condition (10) is satisfied with the coercivity constants proposed in (34).

5.2 Lemma. Let the mapping $v$ be Fréchet differentiable from $\mathcal U$ to $L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))$. Then the operator $A$ is Fréchet differentiable from $\mathcal U$ to $L^\infty(0,T;\mathcal L(V,V'))$ with derivative

$$\langle(A'(u)h)(t)\varphi,\psi\rangle_{V'\times V} = \int_\Omega\big[(v'(u)h)(t,x)\cdot\nabla\varphi(x)\big]\psi(x)\,dx \quad\text{for all }\varphi,\psi\in V. \qquad (35)$$

Proof. The proof is straightforward and will not be carried out here.

5.3 Example. Assume that $U=\mathbb R^p$ and that there exist continuously differentiable coefficients $\eta_i:\mathbb R\to\mathbb R$ and shape functions $v_i\in L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))$ such that $v(u)(t,\cdot) := \sum_{i=1}^p\eta_i(u_i(t))\,v_i(t,\cdot)$ for $u\in\mathcal U$. Then it can be shown that $v$ is differentiable with derivative

$$(v'(u)h)(t,x) = \sum_{i=1}^p\eta_i'(u_i(t))\,h_i(t)\,v_i(t,x) \quad\text{for all } u,h\in\mathcal U,\ t\in(0,T),\ x\in\Omega. \qquad (36)$$

Next, we consider error estimation and start with Corollary 4.4: the estimator depends on the quantities $\alpha_u$, $\beta_u$ and $R^\ell$, which we have to clarify for the current situation:


5.4 Corollary. Let $y=G(u)$ and $y^\ell=G^\ell(u)$ be the full and reduced solutions of (32). Then we obtain the error estimate

$$\|y(T)-y^\ell(T)\|_{L^2(\Omega)}^2 + \|y-y^\ell\|_{L^2(0,T;V)}^2 \le \frac{2}{\kappa}\exp\!\left(\Big(\kappa+\frac{\|v(u)\|_{L^\infty(0,T;L^\infty(\Omega;\mathbb R^d))}^2}{\kappa}\Big)T\right)\left(\|(1-\mathcal P^\ell)y_0\|_H^2 + \frac{2}{\kappa}\|R^\ell\|_{L^2(0,T;V')}^2\right), \qquad (37)$$

where the residual $R^\ell$ is given by (27).

Proof. This is a direct consequence of (26) with the coercivity constants $\alpha_u,\beta_u$ from (34).
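Numerically, (37) amounts to evaluating the generic bound (26) with the constants (34); a minimal sketch (ours, reusing the hypothetical `error_bound` helper from the sketch after Theorem 4.3 and an assumed bound `v_inf` on $\|v(u)\|_{L^\infty}$) reads:

```python
# constants (34) for the advection-diffusion operator (30); v_inf stands for
# an (assumed known) bound on ||v(u)||_{L^inf(0,T;L^inf(Omega;R^d))}
kappa, v_inf = 0.05, 1.0
alpha_u = kappa / 2.0
beta_u = kappa / 2.0 + v_inf**2 / (2.0 * kappa)
# bound = error_bound(Phi, a, M, V_ip, A_full, f_full, y0, u, alpha_u, beta_u, T)
```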

For error estimation of derivatives, we have to turn to analyzing the concrete cost functions:

5.2 Gradients in $L^2(Q)$

Following Section 3, we would like to analyze a specific cost function for the system (29):

$$\hat J_1(u) := J_1(G(u)) := \frac12\int_0^T\int_\Omega\big(y(t,x)-y_Q(t,x)\big)^2\,dx\,dt$$

with a desired function $y_Q\in L^2(0,T;L^2(\Omega))$. In order to accurately describe $\hat J_1$ as in (18), we introduce the operator

$$\Phi_Q:W(0,T)\to L^2(0,T;L^2(\Omega)), \qquad (\Phi_Q y)(t,x) := y(t,x) \quad\text{f.a.a. } t\in(0,T),\ x\in\Omega,$$

which is trivially continuous. Thereby, we can write

$$\hat J_1(u) = \tfrac12\|\Phi_Q G(u)-y_Q\|_{L^2(0,T;L^2(\Omega))}^2 \quad\text{for all } u\in\mathcal U.$$

We can therefore accurately define a gradient $\nabla\hat J_1:\mathcal U\to\mathcal U$ by use of Corollary 3.3 if we know what the adjoint operator $(\Phi_Q G'(u))^\star$ looks like. For this, let us define the adjoint equation for an arbitrary $z\in L^2(0,T;L^2(\Omega))$ and $u\in\mathcal U$:

$$-\langle p_t(t),\varphi\rangle_{V'\times V} + \langle A(u)(t)\varphi,p(t)\rangle_{V'\times V} = \langle z(t),\varphi\rangle_H \quad\text{for all }\varphi\in V,\ \text{f.a.a. } t\in(0,T), \qquad p(T)=0 \quad\text{in } H. \qquad (38)$$

Because of the negative sign in front of the time derivative in (38), we also call this a backwards equation. With identical arguments as for the forward equation, (38) admits a unique solution $p\in W(0,T)$.

5.5 Lemma. Let $v$ be Fréchet differentiable. Then for arbitrary $u\in\mathcal U$, the adjoint operator is given by

$$(\Phi_Q G'(u))^\star\in\mathcal L(L^2(0,T;L^2(\Omega)),\mathcal U), \qquad [(\Phi_Q G'(u))^\star z](t) = [\mathcal B(u)^\star\Phi_Q p](t) \quad\text{in } U,\ \text{f.a.a. } t\in(0,T), \qquad (39)$$

where $p\in W(0,T)$ is the solution to the adjoint equation (38) and $\mathcal B$ is given by

$$\mathcal B:\mathcal U\to\mathcal L(\mathcal U,L^2(0,T;L^2(\Omega))), \qquad (\mathcal B(u)h)(t,x) := -(v'(u)h)(t,x)\cdot(\nabla G(u))(t,x) \quad\text{f.a.a. } t\in(0,T),\ x\in\Omega.$$

Proof. Let $u,h\in\mathcal U$ and $z\in L^2(0,T;L^2(\Omega))$. Then it is

$$\langle(\Phi_Q G'(u))h,z\rangle_{L^2(0,T;L^2(\Omega))} = \int_0^T\int_\Omega(G'(u)h)(t,x)\,z(t,x)\,dx\,dt.$$

We write $y^h := G'(u)h$ and choose $y^h(t)\in V$ as a test function $\varphi$ in (38):

$$\ldots = \int_0^T\Big[-\langle p_t(t),y^h(t)\rangle_{V'\times V} + \langle A(u)(t)y^h(t),p(t)\rangle_{V'\times V}\Big]\,dt$$

$$= \langle p(0),y^h(0)\rangle_H - \langle p(T),y^h(T)\rangle_H + \int_0^T\Big[\langle y_t^h(t),p(t)\rangle_{V'\times V} + \langle A(u)(t)y^h(t),p(t)\rangle_{V'\times V}\Big]\,dt,$$


where we have utilized the formula of partial integration for functions in $W(0,T)$, compare Zeidler (1990). Next, using $y^h(0) = p(T) = 0$ along with using $p(t)$ as a test function $\varphi$ in (13), we obtain, writing $\bar y := G(u)$:

$$\ldots = -\int_0^T\langle(A'(u)h)(t)\bar y(t),p(t)\rangle_{V'\times V}\,dt \overset{(35)}{=} -\int_0^T\langle(v'(u)h)(t)\cdot\nabla\bar y(t),p(t)\rangle_H\,dt$$

$$= \langle\mathcal B(u)h,\Phi_Q p\rangle_{L^2(0,T;L^2(\Omega))} = \langle h,\mathcal B(u)^\star\Phi_Q p\rangle_{\mathcal U}.$$

Lastly, it is obvious that $\mathcal B(u)\in\mathcal L(\mathcal U,L^2(0,T;L^2(\Omega)))$ and so the Hilbert adjoint $\mathcal B(u)^\star$ is well-defined.

5.6 Example. Assume that we are in the situation of Example 5.3. Then the operator from Lemma 5.5 has, for every $u\in\mathcal U$, the adjoint

$$(\mathcal B(u)^\star z)_i(t) = -\eta_i'(u_i(t))\int_\Omega v_i(t,x)\cdot(\nabla G(u))(t,x)\,z(t,x)\,dx \quad\text{for all } i=1,\dots,p,\ \text{f.a.a. } t\in(0,T).$$

Therefore, the functional $\hat J_1$ has the gradient

$$(\nabla\hat J_1(u))_i(t) = -\eta_i'(u_i(t))\int_\Omega v_i(t,x)\cdot(\nabla G(u))(t,x)\,p(t,x)\,dx \quad\text{for all } i=1,\dots,p,\ \text{f.a.a. } t\in(0,T), \qquad (40)$$

where $p\in W(0,T)$ is the solution to the adjoint equation (38) with $z = \Phi_Q G(u)-y_Q$.
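The gradient formula (40) translates directly into an adjoint-based computation: solve (38) backwards in time with $z=\Phi_Q G(u)-y_Q$, then integrate $v_i\cdot\nabla y\,p$ over the spatial domain for each time instance. The sketch below (ours; all discretization choices, in particular the transposed matrix as the discrete adjoint and the rectangle-rule quadratures, are assumptions) illustrates this in a nodal, one-dimensional setting.

```python
import numpy as np

def solve_adjoint(M, A_full, Z, u, T=1.0):
    """Backward implicit-Euler sketch of the adjoint equation (38):
    -p_t + A(u)^* p = z,  p(T) = 0, using the transposed Galerkin matrix.

    Z : (n_t+1, n) right-hand side z(t_k) on the grid (here z = y - y_Q)
    """
    n_t, n = Z.shape[0] - 1, Z.shape[1]
    dt = T / n_t
    P = np.zeros((n_t + 1, n))
    for k in range(n_t, 0, -1):
        t = k * dt
        lhs = M + dt * A_full(u, t).T          # adjoint: transposed operator
        P[k - 1] = np.linalg.solve(lhs, M @ P[k] + dt * M @ Z[k - 1])
    return P

def gradient_J1(Y, P, u, eta_prime, v_shapes, h, dt):
    """Sketch of the gradient formula (40) for Example 5.3/5.6 in 1D.

    Y, P      : (n_t+1, n) state and adjoint trajectories on the grid
    u         : (p, n_t+1) control coefficients u_i(t_k)
    eta_prime : list of p callables eta_i'
    v_shapes  : (p, n_t+1, n) shape velocities v_i(t_k, x_j)
    h, dt     : grid spacing and time step (rectangle-rule quadrature)
    """
    p_ctrl, n_t1, n = v_shapes.shape
    grad = np.zeros((p_ctrl, n_t1))
    gradY = np.gradient(Y, h, axis=1)          # approximate d/dx of y(t, x)
    for i in range(p_ctrl):
        # (grad J1)_i(t) = -eta_i'(u_i(t)) * int v_i * dy/dx * p dx
        integrand = v_shapes[i] * gradY * P
        coeff = np.array([eta_prime[i](u[i, k]) for k in range(n_t1)])
        grad[i] = -coeff * h * integrand.sum(axis=1)
    return grad
```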

5.7 Remark. Assume that the control space was given by $U$ rather than $\mathcal U = L^2(0,T;U)$, meaning the controls would be constant in time. In this case we define the mapping

$$\Psi:U\to\mathcal U, \qquad (\Psi u)(t) := u \quad\text{f.a.a. } t\in(0,T).$$

It is quite obvious that $\Psi\in\mathcal L(U,\mathcal U)$. Let us now consider a cost function defined on $U$:

$$\hat K_1:U\to\mathbb R, \qquad \hat K_1(u) := \hat J_1(\Psi u) = \tfrac12\|\Phi_Q G(\Psi u)-y_Q\|_{L^2(0,T;L^2(\Omega))}^2.$$

By the chain rule, $\hat K_1$ is Fréchet differentiable if $\hat J_1$ is Fréchet differentiable, which in turn is the case if $v$ is differentiable. The derivative is then given by $\hat K_1'(u)h = \hat J_1'(\Psi u)(\Psi h)$. This implies the existence of a gradient $\nabla\hat K_1(u) = \Psi^\star\nabla\hat J_1(\Psi u)$ for all $u\in U$, where $\Psi^\star$ is the Hilbert adjoint of $\Psi$. In order to compute this, let $u\in U$ and $v\in\mathcal U$:

$$\langle\Psi u,v\rangle_{\mathcal U} = \int_0^T\langle u,v(t)\rangle_U\,dt = \Big\langle u,\int_0^T v(t)\,dt\Big\rangle_U = \langle u,\Psi^\star v\rangle_U.$$

Therefore, the gradient is given by $\nabla\hat K_1(u) = \int_0^T\nabla\hat J_1(\Psi u)(t)\,dt$ for all $u\in U$. Inserting this into (40) in the situation of Example 5.3, we obtain the representation

$$(\nabla\hat K_1(u))_i = -\eta_i'(u_i)\int_0^T\int_\Omega v_i(t,x)\cdot(\nabla G(\Psi u))(t,x)\,p(t,x)\,dx\,dt \quad\text{for all } i=1,\dots,p.$$
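For the time-constant controls of Remark 5.7, the only change is the final time integration through $\Psi^\star$; a one-line sketch (ours) on top of the hypothetical `gradient_J1` helper:

```python
import numpy as np

def gradient_K1(grad_J1_traj, dt):
    """Sketch of Remark 5.7: for time-constant controls the gradient is the
    time integral of the time-dependent gradient,
        (grad K1(u))_i = int_0^T (grad J1(Psi u))_i(t) dt,
    approximated here by a simple rectangle rule.

    grad_J1_traj : (p, n_t+1) values of (grad J1(Psi u))_i(t_k)
    """
    return dt * grad_J1_traj[:, :-1].sum(axis=1)
```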

We return to the task of error estimation and start by introducing three preliminary estimators for the following terms:

(i) $\|(\Phi_Q (G^\ell)'(u))^\star\|_{\mathcal L(L^2(0,T;L^2(\Omega)),\mathcal U)}$ (Lemma 5.8).

(ii) $\|p-p^\ell\|_{L^2(0,T;V)}$, where $p,p^\ell\in W(0,T)$ are the full and reduced solutions to the adjoint equation (38) for a right-hand side $z\in L^2(0,T;L^2(\Omega))$ (Lemma 5.9).

(iii) $\|(\Phi_Q(G'(u)-(G^\ell)'(u)))^\star z\|_{\mathcal U}$ for an element $z\in L^2(0,T;L^2(\Omega))$ (Lemma 5.10).

Using these three estimates, we will be able to present an error estimator for the reduced gradient (Theorem 5.11).
