Optimality System POD for Time-Variant, Linear-Quadratic Control Problems

Diploma Thesis

submitted by Simone Metzdorf

at the

Faculty of Sciences

Department of Mathematics and Statistics

Supervisor and Reviewer: Prof. Dr. Stefan Volkwein

Second Reviewer: Prof. Dr. Karl Kunisch

Konstanz, December 2015

Konstanzer Online-Publikations-System (KOPS) URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-329322

List of Figures

8.1 Initial state and desired final state of $(\bar P)$

8.2 Desired state $y_Q = y^{(0)}$ of $(P^{(0)})$ and adjoint state $p_1^{(0)}$

8.3 FE solution $y_{FE}$ and uncontrolled solution $y$

8.4 Optimal POD, OS-POD and FE control

8.5 POD solution $y_{POD}$ and OS-POD solution $y_{OS\text{-}POD}$ for two gradient steps

8.6 First four POD basis functions associated with the uncontrolled solution

8.7 First four POD basis functions generated by an OS-POD initialization step with two gradient steps

8.8 First four POD basis functions associated with the optimal FE solution

8.9 POD a-posteriori error estimate of the control for an increasing number of basis functions, generated by an OS-POD initialization step

8.10 Comparison of normalized eigenvalues computed by eigs and svd

8.11 Optimal OS-POD control on $k = 40$ boundary segments

8.12 POD solution $y_{POD}$ and OS-POD solution $y_{OS\text{-}POD}$

8.13 Optimal OS-POD controls in the constrained and unconstrained case

8.14 OS-POD solution $y_{OS\text{-}POD}$ based on a constrained control

8.15 OS-POD, POD, TR-POD and TR-FE controls

List of Tables

8.1 Results of POD, OS-POD and FE solver for $k = 4$

8.2 Results of POD, OS-POD and FE solver for $k = 40$

8.3 Results of OS-POD based SQP, POD based SQP, TR-POD and TR-FE

Abstract

The optimization of processes is an omnipresent task in industry and science. In many cases those processes are characterized by partial differential equations (PDEs), which describe how the state of a considered system can be regulated by a control. The aim is to find a control which induces a desired state or a close approximation of it. A task of this type is called an optimal control problem.

In this thesis we study thermal processes. As an initial setting we assume a temperature distribution $y_0$ on a domain $\Omega$ and a desired final distribution $y(T)$. The temperature can be controlled by a heat source $u$ acting on the boundary of the domain. In the cases considered in this work, the relation between control and state is determined by a (non-)linear heat equation. In this way every control $u$ is assigned a unique state $y(u)$. The purpose is to find a control such that the associated state approximates the desired state as precisely as possible at the final time $T$. Additional restrictions can be imposed on the control, for example to meet technical limitations. We state this task mathematically as a quadratic minimization problem governed by a (non-)linear heat equation and inequality constraints on the control.

We start this thesis with a short chapter providing fundamental facts and definitions. To motivate the relevance of the considered problem, we introduce a general optimal control problem governed by a nonlinear parabolic equation at the beginning of the second chapter. One possible approach to this problem is the application of a sequential quadratic programming (SQP) method. In this procedure each iteration yields a quadratic minimization problem with linearized constraints. This is the kind of problem we study in the course of the present thesis. In Chapter 3 we specify the parameters of the problem, show the existence of a unique solution, and state the corresponding optimality conditions. We perform a spatial discretization of the domain in Chapter 4 using the finite element (FE) method and develop a reduction of the problem by the FE-Galerkin approach. This leads to high-dimensional systems of ordinary differential equations. The discretization is completed by the application of the implicit Euler method. Furthermore, we present the primal-dual active set strategy (PDASS) as a possible way to solve the completely discretized problem with inequality constraints. To assess the quality of an approximate solution we make use of an a-posteriori error estimate, which we present in Chapter 5. In Chapter 6 we introduce the proper orthogonal decomposition (POD) method for model reduction and adjust the PDASS algorithm of the previous chapter to the POD-Galerkin reduced problem. By this method we can lower the dimension of the considered systems of ordinary differential equations significantly. In order to improve the approximation quality of the reduced-order model we introduce optimality system proper orthogonal decomposition (OS-POD) in Chapter 7. This approach results in a POD-based algorithm with an OS-POD initialization step and a-posteriori error estimation. We apply the presented algorithms in several numerical test runs and analyze the results in the last chapter. Additionally, we compare our findings with those of [Rog14] and [Grä14]. Finally, we draw a short conclusion in the last section.

Zusammenfassung (German summary, translated into English)

The optimization of processes is a ubiquitous task in industry and research. These processes are frequently governed by partial differential equations, which describe, for example, how the state of a considered system can be influenced by a control. The aim is to find a suitable control such that the resulting state matches the desired state as closely as possible. In this thesis thermal processes are studied. Given are a temperature distribution $y_0$ on a domain $\Omega$ and a desired distribution $y(T)$ at the final time of observation. Heat sources on the boundary of the domain serve as the control. In the considered cases, the relation between control $u$ and state $y$ is described by a (non-)linear heat equation, which assigns a unique state $y(u)$ to each control $u$. We seek a control whose associated state matches the targeted temperature distribution as closely as possible at the final time. Additional conditions can be imposed on the control, for example to account for technical constraints. Mathematically, we formulate this task as a quadratic minimization problem with a (non-)linear heat equation as a constraint and, where applicable, additional inequality constraints on the control.

This thesis begins with an introductory chapter in which fundamental definitions and properties are introduced. The relevance of the considered problem is illustrated in the second chapter, where it is derived from a more general problem: by applying sequential quadratic programming (SQP), a nonlinear parabolic optimal control problem is reduced to the solution of a linear parabolic problem in each iteration. In this way we obtain the problem studied in the course of this thesis. In Chapter 3 the problem is defined explicitly and the requirements on its parameters are specified. Subsequently, the existence of a unique solution to the problem is shown, and the optimality conditions are stated. For the spatial discretization we use the finite element (FE) method in Chapter 4. The so-called FE-Galerkin approach yields a finite- but high-dimensional reduced problem; the partial differential equations thus turn into large systems of ordinary differential equations. To complete the discretization, the implicit Euler method is used for the discretization in time. The reduced problem, including the inequality constraints, can then be solved by a primal-dual method. In order to assess the quality of such an approximate solution, we provide an a-posteriori error estimator in Chapter 5. Since the reduced problem is still too complex due to its size, model reduction based on proper orthogonal decomposition (POD) is presented in Chapter 6. By incorporating the error estimator, a POD-based solution algorithm is then formulated. For the implementation of the algorithm, the complete discretization and the adaptation of the primal-dual method to the new reduced problem of comparatively low dimension are carried out. The extension of the reduction approach in the form of optimality system proper orthogonal decomposition (OS-POD) offers the possibility of improving the approximation quality of the reduced model. This approach is developed in Chapter 7. It is implemented by adding a preparatory step to the existing POD algorithm. In the last chapter the developed algorithms are applied explicitly to an optimal control problem and their results are analyzed. In addition, a comparison with results from [Rog14] and [Grä14] is made. Finally, the results of the thesis are summarized in a short conclusion.

Chapter 1

Fundamentals

This chapter provides some fundamental facts and useful definitions for the further course of the present work. First of all we need some theory to become familiar with the state space $W(0, T)$. Afterwards we list several definitions which are required for the discretization of the considered problem.

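Definition 1.1. The Banach space (of equivalence classes) of Bochner measurable functions $y$ mapping from an interval $[0, T] \subset \mathbb{R}$ to a Banach space $X$, whose norms $\|y(\cdot)\|_X$ lie in the standard $L^p(0, T)$ space ($1 \le p \le \infty$), is denoted by

\[
L^p(0, T; X) := \Bigl\{ y : [0, T] \to X \text{ Bochner measurable} : \|y\|_{L^p(0,T;X)} := \Bigl( \int_0^T \|y(t)\|_X^p \, dt \Bigr)^{1/p} < \infty \Bigr\},
\]

where for $p = \infty$ the integral expression is replaced by the essential supremum of $\|y(t)\|_X$ over $[0, T]$.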

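Definition 1.2. For $\Omega \subset \mathbb{R}^m$ ($m \in \mathbb{N}$) the first-order Sobolev space over $L^2(\Omega)$ is given by

\[
H^1(\Omega) := \{ f \in L^2(\Omega) : D_i f \in L^2(\Omega) \text{ for } 1 \le i \le m \},
\]

where $D_i f$ denotes the distributional derivative with respect to $x_i$. We denote its dual space by $H^1(\Omega)'$. Then we define the function space $W(0, T)$ for $[0, T] \subset \mathbb{R}$ by

\[
W(0, T) := \{ y \in L^2(0, T; H^1(\Omega)) : y_t \in L^2(0, T; H^1(\Omega)') \},
\]

where $y_t$ denotes the distributional derivative of $y$ in $L^2(0, T; H^1(\Omega)')$.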

Section 3.4 in [Trö10] gives a sound introduction to the topic of vector-valued functions (functions with values in Banach spaces) and vector-valued distributions. In particular it provides the following facts:

Remark 1.3. $W(0, T)$ is a Banach space with the related norm

\[
\|y\|_{W(0,T)} := \Bigl( \|y\|^2_{L^2(0,T;H^1(\Omega))} + \|y_t\|^2_{L^2(0,T;H^1(\Omega)')} \Bigr)^{1/2}.
\]

Endowed with the inner product stated below, $W(0, T)$ is a Hilbert space:

\[
\langle u, v \rangle_{W(0,T)} := \int_0^T \langle u, v \rangle_{H^1(\Omega)} \, dt + \int_0^T \langle u_t, v_t \rangle_{H^1(\Omega)'} \, dt.
\]

The notation $\langle F, G \rangle_{H^1(\Omega)'}$ is an abbreviation for $\langle I(F), I(G) \rangle_{H^1(\Omega)}$, where the Riesz isomorphism $I : H^1(\Omega)' \to H^1(\Omega)$ is given by the Riesz representation theorem [Wer07, Theorem V.3.6].

Eventually, we will interpret an element $y$ of the space $W(0, T)$ as a continuous function from $[0, T]$ to $L^2(\Omega)$. For this purpose we need the remark below.

Remark 1.4. $H^1(\Omega) \hookrightarrow L^2(\Omega) = L^2(\Omega)' \hookrightarrow H^1(\Omega)'$ represents a chain of dense, continuous embeddings, which is called a Gelfand triple.

Theorem 1.5. There exists a continuous embedding from $W(0, T)$ into the Banach space

\[
C([0, T], L^2(\Omega)) := \{ y : [0, T] \times \Omega \to \mathbb{R} : y \text{ continuous in } t \text{ with respect to the norm of } L^2(\Omega) \}.
\]

The corresponding norm is defined by $\|y\|_{C([0,T],L^2(\Omega))} := \max_{t \in [0,T]} \|y(t)\|_{L^2(\Omega)}$.

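Remark 1.6. The following properties of $W(0, T)$ are provided in [DL00, pp. 473-477]:

(i) The formula of integration by parts holds true for all $y, \varphi \in W(0, T)$:

\[
\int_0^T \langle y_t(t), \varphi(t) \rangle_{H^1(\Omega)',H^1(\Omega)} \, dt = \langle y(T), \varphi(T) \rangle_{L^2(\Omega)} - \langle y(0), \varphi(0) \rangle_{L^2(\Omega)} - \int_0^T \langle \varphi_t(t), y(t) \rangle_{H^1(\Omega)',H^1(\Omega)} \, dt,
\]

where the notation $\langle F, \psi \rangle_{H^1(\Omega)',H^1(\Omega)}$ stands for $F(\psi)$ for $F \in H^1(\Omega)'$, $\psi \in H^1(\Omega)$.

(ii) For all $y \in W(0, T)$ and $\varphi \in H^1(\Omega)$ (the test function has to lie in $H^1(\Omega)$ for the duality pairing to be defined) we can state

\[
\langle y_t(t), \varphi \rangle_{H^1(\Omega)',H^1(\Omega)} = \frac{d}{dt} \langle y(t), \varphi \rangle_{L^2(\Omega)}.
\]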

In the course of this thesis we are confronted with functions mapping, for instance, from the Banach space $W(0, T)$ to $\mathbb{R}$. Concerning differentiability in Banach spaces we refer to [Trö10], which provides an introduction to this topic in Section 2.6, including definitions of Gâteaux and Fréchet derivatives.

We introduce several algorithms to solve the considered problems in this thesis. For implementation, we need methods of numerical integration. In order to approximate integrals over time, we make use of the trapezoidal rule.

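Definition 1.7 (Trapezoidal rule).

Let $f : [0, T] \subset \mathbb{R} \to \mathbb{R}$ be integrable. For an equidistant grid of $n-1$ segments ($n \in \mathbb{N}$, $n \ge 2$) defined by $0 = t_1 < t_2 < \dots < t_n = T$ with $t_j := (j-1) \cdot \Delta t$ for $1 \le j \le n$ and $\Delta t := \frac{T}{n-1}$, the integral of $f$ can be approximated by

\[
\int_0^T f(t) \, dt \approx \frac{\Delta t}{2} \cdot \bigl( f(t_1) + 2 f(t_2) + \dots + 2 f(t_{n-1}) + f(t_n) \bigr) = \sum_{j=1}^n \alpha_j \cdot f(t_j)
\]

with so-called trapezoidal weights

\[
\alpha_1 := \alpha_n := \frac{\Delta t}{2} \quad \text{and} \quad \alpha_j := \Delta t \ \text{ for } 2 \le j \le n-1.
\]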
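The weights of the trapezoidal rule can be sketched in a few lines of Python. This is only an illustration under the assumptions of Definition 1.7 (equidistant grid, at least two grid points); the function names `trapezoidal_weights` and `trapezoidal_integral` are hypothetical and not taken from the thesis.

```python
def trapezoidal_weights(T, n):
    """Return the grid t_1..t_n and the weights alpha_1..alpha_n of Definition 1.7.

    Python lists are 0-based, so grid[j] corresponds to t_{j+1}.
    """
    if n < 2:
        raise ValueError("need at least two grid points")
    dt = T / (n - 1)
    grid = [j * dt for j in range(n)]
    weights = [dt] * n
    weights[0] = weights[-1] = dt / 2.0   # end points carry half weight
    return grid, weights


def trapezoidal_integral(f, T, n):
    """Approximate the integral of f over [0, T] by sum_j alpha_j * f(t_j)."""
    grid, weights = trapezoidal_weights(T, n)
    return sum(a * f(t) for a, t in zip(weights, grid))
```

Since the rule integrates affine functions exactly, a call such as `trapezoidal_integral(lambda t: 2.0 * t + 1.0, 1.0, 5)` reproduces the exact integral up to rounding, and the weights always sum to $T$.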

Numerical computations yield values of functions at discrete instants of time rather than continuous solutions. Hence, we need a method to approximate the norm of these functions.

Definition 1.8 (Time-averaged norm).

Let $[0, T] \subset \mathbb{R}$ be a time interval and $X$ a Banach space. For a discrete function $f$ mapping time instances $t_j \in [0, T]$ to $f(t_j) \in X$ for $1 \le j \le n$ ($n \in \mathbb{N}$), we introduce the time-averaged norm

\[
\|f\|_{X,\mathrm{timeav}} := \sum_{j=1}^n \alpha_j \cdot \|f(t_j)\|_X,
\]

where $\alpha_1, \dots, \alpha_n$ are the trapezoidal weights introduced in Definition 1.7.
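As a minimal sketch of how the time-averaged norm could be evaluated, the following Python function assumes an equidistant grid on $[0, T]$ as in Definition 1.7 and, by default, takes $X = \mathbb{R}^d$ with the Euclidean norm. Both the function name `time_averaged_norm` and the default choice of $X$ are assumptions made for illustration only.

```python
import math


def euclidean_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def time_averaged_norm(values, T, norm=euclidean_norm):
    """Compute sum_j alpha_j * ||f(t_j)||_X with trapezoidal weights alpha_j.

    values[j] holds f(t_{j+1}) on the equidistant grid of [0, T];
    `norm` evaluates the norm of X for a single snapshot.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two time instances")
    dt = T / (n - 1)
    total = 0.0
    for j, v in enumerate(values):
        alpha = dt / 2.0 if j in (0, n - 1) else dt
        total += alpha * norm(v)
    return total
```

A different space $X$ can be used by passing another callable as `norm`, for example `norm=abs` for scalar snapshots.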

Chapter 2

Motivation for the considered optimal control problem

At the beginning we consider a common nonlinear optimal control problem. The task is to minimize the difference between a desired final state $y_\Omega$ and the actual final state $y(T)$ on a domain $\Omega$. The final state should be generated with minimal control costs. These requirements characterize the quadratic cost function $\bar J$, which depends on the state $y$ and the control $u$.

Since we consider a thermal process, state and control are related by a nonlinear heat equation. The set of admissible controls is characterized by a box restriction. We state this initial problem, referred to as $(\bar P)$, as follows, where f.a.a. abbreviates "for almost all":

Minimize

\[
\bar J(y, u) := \frac{1}{2} \int_\Omega |y(T, x) - y_\Omega(x)|^2 \, dx + \frac{1}{2} \sum_{i=1}^k \gamma_i \int_0^T |u_i(t)|^2 \, dt
\]

subject to (s.t.)

\[
\left.
\begin{aligned}
c_p y_t(t, x) - \Delta y(t, x) + (y(t, x))^3 &= \bar f(t, x) && \text{f.a.a. } (t, x) \text{ in } Q, \\
\partial_\nu y(t, x) + q y(t, x) &= \sum_{i=1}^k u_i(t) \chi_i(x) && \text{f.a.a. } (t, x) \text{ in } \Sigma, \\
y(0, x) &= y_0(x) && \text{f.a.a. } x \text{ in } \Omega
\end{aligned}
\right\} \tag{SE}
\]

and

\[
u_a(t) \le u(t) \le u_b(t) \quad \text{f.a.a. } t \text{ in } [0, T]. \tag{$\bar P$}
\]

The spatial domain $\Omega$ is a bounded subset of $\mathbb{R}^m$ ($m \in \mathbb{N}$) with a Lipschitz boundary $\Gamma$ [Trö10, Section 2.2.2], which consists of disjoint segments $\Gamma_i$ ($1 \le i \le k$, $k \in \mathbb{N}$) with nonzero Lebesgue measure in $\mathbb{R}^{m-1}$. The period under consideration is $[0, T]$ with an end time $T > 0$. The sets $Q$ and $\Sigma$ are defined by $Q := (0, T) \times \Omega$ and $\Sigma := (0, T) \times \Gamma$.

The scalars $\gamma_i > 0$ ($1 \le i \le k$) in the cost function are regularization parameters, which weight the respective control costs.

The PDE contains constants $c_p > 0$ and $q \ge 0$. The vector $\nu \in \mathbb{R}^m$ denotes the outward unit normal vector to the boundary. The control shape function $\chi_i$ ($1 \le i \le k$) denotes the characteristic function of the corresponding boundary segment $\Gamma_i$. The independent controls $u_i$ ($1 \le i \le k$) on the boundary segments are gathered in the vector $u := (u_1, \dots, u_k)^T$. Inequalities between vectors, such as the one in the box restriction, are to be read componentwise in this thesis.

Control problems of this type are the subject of various research works. We just mention some examples:

• J. Raymond and H. Zidani [RZ99] provide fundamental existence results and necessary optimality conditions for a more general version of $(\bar P)$.

• F. Tröltzsch [Trö10] includes theory and several approaches as well as applications concerning this kind of problem.

• K. Kunisch and S. Volkwein [KV08] discuss POD as a technique of model reduction in this context and introduce Optimality System POD for improved results.

• S. Rogg [Rog14] deals with a Trust Region POD approach to problem $(\bar P)$ with a more general nonlinearity in the heat equation.

Concerning the solvability of $(\bar P)$ we can summarize that there exists at least one control $u^* \in U := L^2(0, T; \mathbb{R}^k)$ and a unique associated state $y^* \in Y := W(0, T) \cap C(\bar Q)$ which solve the problem for prescribed functions $y_\Omega, y_0 \in C(\bar\Omega)$, $\bar f \in L^r(Q)$ ($r > m/2 + 1$) and $u_a, u_b \in L^\infty(0, T; \mathbb{R}^k)$ (see e.g. [Trö10, Theorems 5.5 and 5.7] or [RZ99]).

A common approach for solving a nonlinear parabolic optimal control problem like $(\bar P)$ is to apply an SQP method, as C. Gräßle [Grä14] does. In this thesis we only outline the SQP method for the considered problem. For more details we refer to the previously mentioned work or to [Trö10].

Advantages of the SQP method are fast convergence rates and the fact that we only need to solve a linear-quadratic problem in each iteration. Furthermore, it is easy to incorporate inequality constraints like box restrictions.

Due to the nonlinearity in the PDE, $(\bar P)$ is a non-convex problem. Hence, one needs to consider first- and second-order optimality conditions to determine a locally optimal solution. Under certain assumptions about the second-order optimality conditions, SQP algorithms are locally quadratically convergent (see [Trö10, Chapter 5]). In addition to solving first-order conditions, an implementation of an SQP algorithm requires a globalization strategy. In the present work we study the kind of problem that has to be solved in every single SQP step. Therefore we only consider first-order optimality conditions.

We start with the consideration of the equality constrained problem, neglecting the control constraints, to outline the SQP method for $(\bar P)$. With $\omega := (y, u)$ that reads

\[
\min_{\omega \in Y \times U} \bar J(\omega) \quad \text{s.t. (SE)}.
\]

In this context the SQP approach can be derived by solving the optimality system of this problem using Newton's method. The inequality constraints can be incorporated afterwards.

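According to [Rog14, Remark 2.15] the equality constraints fulfill a so-called regular point condition for all $\omega$ in $Y \times U$. Hence, an optimal solution $\omega^* \in Y \times U$ to the equality constrained problem satisfies the Karush-Kuhn-Tucker (KKT) condition

\[
\nabla \bar L(\omega^*, p) = 0 \quad \text{in } (Y \times U)' \times (L^2(0, T; V') \times H), \tag{2.1}
\]

with a unique Lagrange multiplier $p = (p_1, p_2) \in Y$.

Here, the Lagrange function $\bar L : (Y \times U) \times (L^2(0, T; V) \times H) \to \mathbb{R}$ with $(\omega, p) = (y, u, p) \mapsto \bar L(\omega, p)$ is given by

\[
\begin{aligned}
\bar L(\omega, p) := {} & \bar J(\omega) + \int_0^T c_p \langle y_t, p_1 \rangle_{H^1(\Omega)',H^1(\Omega)} \, dt + \int_0^T \int_\Omega \bigl( \nabla y^T \nabla p_1 + (y^3 - \bar f) p_1 \bigr) \, dx \, dt \\
& + \int_0^T \int_\Gamma \Bigl( q y - \sum_{i=1}^k u_i \chi_i \Bigr) p_1 \, ds \, dt + \int_\Omega (y(0) - y_0) p_2 \, dx.
\end{aligned}
\]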

We have omitted the arguments of the functions for the sake of clarity. The abbreviation $ds$ is used for the surface measure $ds(x)$.

Now we apply Newton's method to the KKT system (2.1). Hence, in one iteration of the SQP method, the linearization of (2.1) at an already determined pair of iterates $\omega^{(j)} := (y^{(j)}, u^{(j)}) \in Y \times U$ has to be solved in order to obtain the new iterate $\omega = (y, u)$:

\[
\nabla^2 \bar L(\omega^{(j)}, p^{(j)}) (\omega - \omega^{(j)}) = -\nabla \bar L(\omega^{(j)}, p^{(j)}).
\]
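To illustrate the idea of applying Newton's method to a KKT system, the following Python sketch treats a scalar toy analogue, not the parabolic problem itself: minimize $\frac12 (y - y_d)^2 + \frac{\gamma}{2} u^2$ subject to the algebraic "state equation" $y + y^3 = u$. The toy problem, the function names and the parameter values are assumptions chosen for illustration. Note that the second derivative of the Lagrangian with respect to $y$ contains the term $6 y p$, mirroring the term $6 y^{(j)} p_1^{(j)}$ that appears in the SQP subproblem.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def kkt_gradient(y, u, p, y_d, gamma):
    """Gradient of L(y,u,p) = 0.5*(y-y_d)^2 + 0.5*gamma*u^2 + p*(y + y^3 - u)."""
    return [
        (y - y_d) + p * (1.0 + 3.0 * y * y),  # dL/dy
        gamma * u - p,                         # dL/du
        y + y ** 3 - u,                        # dL/dp (toy state equation)
    ]


def kkt_hessian(y, u, p, gamma):
    """Second derivative of L with respect to (y, u, p)."""
    a = 1.0 + 3.0 * y * y
    return [
        [1.0 + 6.0 * y * p, 0.0, a],
        [0.0, gamma, -1.0],
        [a, -1.0, 0.0],
    ]


def newton_kkt(y_d=1.0, gamma=0.1, iterations=20):
    """Plain Newton iteration on the KKT system, started at (0, 0, 0)."""
    y = u = p = 0.0
    for _ in range(iterations):
        g = kkt_gradient(y, u, p, y_d, gamma)
        step = solve_linear(kkt_hessian(y, u, p, gamma), [-gi for gi in g])
        y, u, p = y + step[0], u + step[1], p + step[2]
    return y, u, p
```

The sketch uses no globalization strategy, so convergence is only to be expected for suitable starting points; for the default parameters the iteration settles near $y \approx 0.72$.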

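This linearized subproblem coincides with the optimality system of a linear-quadratic optimal control problem, which is why the method is called 'sequential quadratic programming'. Combined with the control constraints from $(\bar P)$ this yields:

Minimize

\[
\begin{aligned}
& \bigl\langle \bar J'(\omega^{(j)}), \omega - \omega^{(j)} \bigr\rangle_{(Y \times U)',(Y \times U)} + \frac{1}{2} \bigl\langle \bar L_{\omega\omega}(\omega^{(j)}, p^{(j)}) (\omega - \omega^{(j)}), \omega - \omega^{(j)} \bigr\rangle_{(Y \times U)',(Y \times U)} \\
& = \int_\Omega (y^{(j)}(T) - y_\Omega)(y(T) - y^{(j)}(T)) \, dx + \sum_{i=1}^k \gamma_i \int_0^T u_i^{(j)} (u_i - u_i^{(j)}) \, dt \\
& \quad + \frac{1}{2} \Bigl( \int_\Omega (y(T) - y^{(j)}(T))^2 \, dx + \sum_{i=1}^k \gamma_i \int_0^T (u_i - u_i^{(j)})^2 \, dt + \int_0^T \int_\Omega 6 y^{(j)} p_1^{(j)} (y - y^{(j)})^2 \, dx \, dt \Bigr) \\
& = \int_\Omega \Bigl( \frac{1}{2} (y(T))^2 - \frac{1}{2} (y^{(j)}(T))^2 - y_\Omega y(T) + y_\Omega y^{(j)}(T) \Bigr) dx + \int_0^T \int_\Omega 3 y^{(j)} p_1^{(j)} (y - y^{(j)})^2 \, dx \, dt \\
& \quad + \frac{1}{2} \sum_{i=1}^k \gamma_i \int_0^T \bigl( u_i^2 - (u_i^{(j)})^2 \bigr) \, dt
\end{aligned}
\]

s.t.

\[
\begin{aligned}
c_p y_t(t, x) - \Delta y(t, x) + 3 (y^{(j)}(t, x))^2 y(t, x) &= 2 (y^{(j)}(t, x))^3 + \bar f(t, x) && \text{f.a.a. } (t, x) \text{ in } Q, \\
\partial_\nu y(t, x) + q y(t, x) &= \sum_{i=1}^k u_i(t) \chi_i(x) && \text{f.a.a. } (t, x) \text{ in } \Sigma, \\
y(0, x) &= y_0(x) && \text{f.a.a. } x \text{ in } \Omega
\end{aligned}
\]

and

\[
u_a(t) \le u(t) \le u_b(t) \quad \text{f.a.a. } t \text{ in } [0, T].
\]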
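As a short check of the constraint above: only the cubic term of (SE) is nonlinear, so the linearized state equation follows from a first-order Taylor expansion of $y^3$ about the current iterate $y^{(j)}$, while the curvature term in the objective stems from the second derivative of $y^3 p_1$ with respect to $y$. The following worked step uses only symbols that already appear in the text.

```latex
\begin{align*}
y^3 &\approx (y^{(j)})^3 + 3 (y^{(j)})^2 \, (y - y^{(j)})
     = 3 (y^{(j)})^2 \, y - 2 (y^{(j)})^3, \\
c_p y_t - \Delta y + 3 (y^{(j)})^2 \, y - 2 (y^{(j)})^3 &= \bar f
  \quad\Longleftrightarrow\quad
  c_p y_t - \Delta y + 3 (y^{(j)})^2 \, y = 2 (y^{(j)})^3 + \bar f, \\
\frac{\partial^2}{\partial y^2} \bigl( y^3 p_1 \bigr) \Big|_{(y^{(j)},\, p_1^{(j)})} &= 6 \, y^{(j)} p_1^{(j)} .
\end{align*}
```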

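Eliminating all constants from the objective function which are independent of $y$ and $u$, and adding the constant summand

\[
\frac{1}{2} \int_\Omega y_\Omega^2 \, dx + \int_0^T \int_\Omega 6 (y^{(j)})^3 p_1^{(j)} \, dx \, dt,
\]

does not change the result of the minimization problem. This leads to an equivalent linear-quadratic minimization problem, referred to as $(P^{(j)})$, which we consider throughout this thesis:

Minimize

\[
\begin{aligned}
J(y, u) := {} & \frac{1}{2} \int_0^T \int_\Omega \overbrace{6 y^{(j)}(t, x) p_1^{(j)}(t, x)}^{=: \alpha_Q(t, x)} \, \bigl| y(t, x) - \overbrace{y^{(j)}(t, x)}^{=: y_Q(t, x)} \bigr|^2 \, dx \, dt + \frac{1}{2} \int_\Omega |y(T, x) - y_\Omega(x)|^2 \, dx \\
& + \frac{1}{2} \int_0^T \sum_{i=1}^k \gamma_i (u_i(t))^2 \, dt
\end{aligned}
\]

s.t.

\[
\begin{aligned}
c_p y_t(t, x) - \Delta y(t, x) + \overbrace{3 (y^{(j)}(t, x))^2}^{=: c(t, x)} \, y(t, x) &= \overbrace{2 (y^{(j)}(t, x))^3 + \bar f(t, x)}^{=: f(t, x)} && \text{f.a.a. } (t, x) \text{ in } Q, \\
\partial_\nu y(t, x) + q y(t, x) &= \sum_{i=1}^k u_i(t) \chi_i(x) && \text{f.a.a. } (t, x) \text{ in } \Sigma, \\
y(0, x) &= y_0(x) && \text{f.a.a. } x \text{ in } \Omega
\end{aligned}
\]

and

\[
u_a(t) \le u(t) \le u_b(t) \quad \text{f.a.a. } t \text{ in } [0, T]. \tag{$P^{(j)}$}
\]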

In the further course of this thesis we consider $(P^{(j)})$ with more general functions $\alpha_Q, y_Q, c, f$ and add another weighting function $\alpha_\Omega$ in the cost function. We then refer to the problem as $(P)$. Eventually, we derive an OS-POD approach to solve $(P)$. In this context the present work can be seen as an extension of [Stu11], [Gri14] and [Grä14]. In the first thesis, S. Studinger develops a POD approach with a-posteriori error estimation to solve a variation of $(P)$ with a more general cost function and control space $L^2(\Sigma)$. Based on this, E. Grimm examines a special case of problem $(P)$ with a more restricted cost function in [Gri14] and adds an OS-POD initialization step to the existing algorithm to achieve better approximation quality. The third thesis, by C. Gräßle, includes a POD approach to solve $(P^{(j)})$ as an iteration of an SQP algorithm for a nonlinear optimal control problem. As a control space it covers the choice of $L^2(\Sigma)$ as well as the choice of piecewise spatially constant controls on disjoint boundary segments. A combination of the results from [Grä14] with the present work provides the opportunity of an OS-POD based inexact SQP method.

Chapter 3

The linear-quadratic parabolic optimal control problem

In the first section of this chapter we generalize the optimal control problem derived in the previous chapter and summarize the required assumptions. The unique solvability of the associated PDE is treated in the second section. We derive the existence of a unique optimal control in the third section and finally state sufficient optimality conditions in Section 3.4.

3.1 Statement of the problem

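We consider an optimal control problem, referred to as $(P)$, which is specified by:

Minimize

\[
\begin{aligned}
J(y, u) := {} & \frac{1}{2} \int_0^T \int_\Omega \alpha_Q(t, x) |y(t, x) - y_Q(t, x)|^2 \, dx \, dt + \frac{1}{2} \int_\Omega \alpha_\Omega(x) |y(T, x) - y_\Omega(x)|^2 \, dx \\
& + \frac{1}{2} \int_0^T \sum_{i=1}^k \gamma_i |u_i(t)|^2 \, dt
\end{aligned}
\]

s.t.

\[
\left.
\begin{aligned}
c_p y_t(t, x) - \Delta y(t, x) + c(t, x) y(t, x) &= f(t, x) && \text{f.a.a. } (t, x) \text{ in } Q, \\
\partial_\nu y(t, x) + q y(t, x) &= \sum_{i=1}^k u_i(t) \chi_i(x) && \text{f.a.a. } (t, x) \text{ in } \Sigma, \\
y(0, x) &= y_0(x) && \text{f.a.a. } x \text{ in } \Omega
\end{aligned}
\right\} \tag{SE}
\]

and

\[
u_a(t) \le u(t) \le u_b(t) \quad \text{f.a.a. } t \text{ in } [0, T]. \tag{BR}
\]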

The aim is to find a control $u := (u_1, \dots, u_k)^T$ such that the corresponding state $y$, as a solution of the state equation (SE), differs as little as possible from the desired states $y_Q$ on the domain $Q$ and $y_\Omega$ on $\Omega$. In addition, the control costs should be minimized. These costs, represented by the last summand of the cost function $J$, are weighted by the regularization parameters $\gamma_i$ ($1 \le i \le k$).

We summarize the requirements on the domains and the data which hold throughout this thesis:

Assumption 3.1.

• The spatial domain $\Omega$ is a bounded subset of $\mathbb{R}^m$ ($m \in \mathbb{N}$) with a Lipschitz boundary $\Gamma$, which consists of disjoint segments $\Gamma_i$ ($1 \le i \le k$, $k \in \mathbb{N}$) with nonzero Lebesgue measure in $\mathbb{R}^{m-1}$. The control $u$ operates on the time interval $[0, T]$ with an end time $T > 0$ and the state $y$ on the time-space cylinder $Q = (0, T) \times \Omega$. The set $\Sigma = (0, T) \times \Gamma$ denotes the considered spatial boundary.

• For $1 \le i \le k$ the scalars $\gamma_i > 0$ are regularization parameters. We choose $\gamma_i := \gamma |\Gamma_i|$, where $|\Gamma_i|$ denotes the Lebesgue measure of the boundary segment $\Gamma_i$ and $\gamma$ is a positive constant.

• The desired states are $y_Q \in L^2(Q)$ and $y_\Omega \in L^2(\Omega)$.

• The weighting functions $\alpha_Q \in L^\infty(Q)$ and $\alpha_\Omega \in L^\infty(\Omega)$ have to satisfy $\alpha_Q \ge 0$ almost everywhere (a.e.) in $Q$ and $\alpha_\Omega \ge 0$ a.e. in $\Omega$.

• The functions occurring in the state equation (SE) are $c \in L^\infty(Q)$, the characteristic functions $\chi_i$ of the boundary segments $\Gamma_i$ ($1 \le i \le k$), $f \in L^2(Q)$ and the initial state $y_0 \in L^2(\Omega)$.

Furthermore, the PDE contains the constants $c_p > 0$ and $q \ge 0$.

• The bounds in the box restriction, $u_a, u_b \in L^\infty(0, T; \mathbb{R}^k)$, satisfy $u_a \le u_b$ a.e. in $[0, T]$ componentwise.

The choice of a state space is discussed in [Trö10]. Since the time derivative of $y$ occurs in the state equation, the Hilbert space $W(0, T)$ turns out to be an adequate choice. By Theorem 1.5 we can interpret an element $y$ of the space $W(0, T)$ as a continuous function from $[0, T]$ to $L^2(\Omega)$.

An appropriate control space is the Hilbert space $U := L^2(0, T; \mathbb{R}^k)$ with the inner product

\[
\langle u, v \rangle_U := \int_0^T \langle u(t), v(t) \rangle_{\mathbb{R}^k} \, dt.
\]

Due to the box restriction (BR) we define the set of admissible controls by

\[
U_{ad} := \{ u \in U : u_a(t) \le u(t) \le u_b(t) \text{ f.a.a. } t \text{ in } [0, T] \}.
\]

The component $u_i$ of a control $u$ signifies the control intensity on the associated boundary segment $\Gamma_i$ ($1 \le i \le k$). We work with independent, piecewise spatially constant controls on disjoint boundary segments instead of a control space $L^2(\Sigma)$, since this seems to be a reasonable requirement in engineering.

To shorten notation we define $V := H^1(\Omega)$ and $H := L^2(\Omega)$.

3.2 Existence of a unique solution to the state equation

This section addresses the solvability of the state equation.

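Definition 3.2. We call $y \in W(0, T)$ a weak solution to (SE) if it satisfies the initial condition $y(0) = y_0$ in $H$ and the variational formulation of (SE), that is,

\[
\begin{aligned}
& \int_0^T c_p \langle y_t(t), \varphi(t) \rangle_{V',V} \, dt + \int_0^T \int_\Omega \bigl( \nabla y^T \cdot \nabla \varphi + c y \varphi \bigr) \, dx \, dt + \int_0^T \int_\Gamma q y \varphi \, ds \, dt \\
& \quad = \int_0^T \int_\Omega f \varphi \, dx \, dt + \int_0^T \int_\Gamma \Bigl( \sum_{i=1}^k u_i \chi_i \Bigr) \varphi \, ds \, dt
\end{aligned}
\]

for all $\varphi \in L^2(0, T; V)$. The evaluation of $y$ at a single point in time within the initial condition is meaningful because of the continuous embedding $W(0, T) \hookrightarrow C([0, T], H)$ from Theorem 1.5.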

We intend to apply the results of [Trö10, Section 3.6.4]. For that purpose we verify that the state equation (SE) can be transformed to the form considered there, namely (cf. (3.58) in [Trö10]):

\[
\begin{aligned}
y_t + A y + c_0 y &= f && \text{in } Q, \\
\partial_{\nu_A} y + q y &= u && \text{in } \Sigma, \\
y(0) &= y_0 && \text{in } \Omega,
\end{aligned}
\tag{3.1}
\]

where $A$ is an elliptic differential operator (cf. equations (2.19)-(2.21) in [Trö10]) of the form

\[
A y(x) = -\sum_{i,j=1}^m D_i \bigl( a_{i,j}(x) D_j y(x) \bigr) \quad \text{for } x \in \Omega,
\]

with $a_{i,j} \in L^\infty(\Omega)$ and $a_{i,j}(x) = a_{j,i}(x)$ for $i, j \in \{1, \dots, m\}$ and $x \in \Omega$. Furthermore, the coefficients satisfy the condition of uniform ellipticity for a constant $\gamma_0 > 0$:

\[
\sum_{i,j=1}^m a_{i,j}(x) \xi_i \xi_j \ge \gamma_0 |\xi|^2 \quad \text{for all } \xi \in \mathbb{R}^m \text{ and f.a.a. } x \in \Omega.
\]

$\partial_{\nu_A}$ denotes the derivative in the direction of $\nu_A$, which is defined by

\[
(\nu_A)_i(x) = \sum_{j=1}^m a_{i,j}(x) \nu_j(x) \quad \text{for } 1 \le i \le m.
\]

In order to transform the state equation to the desired form (3.1), we divide the first equation of (SE) by $c_p > 0$. For $i, j \in \{1, \dots, m\}$ we set $a_{i,j}(x) = \delta_{ij}/c_p$ in (3.1), which meets the assumptions from above. Furthermore, we choose $\frac{1}{c_p} \cdot c$ in place of $c_0$, $\frac{1}{c_p} \cdot f$ in place of $f$ and $\sum_{i=1}^k u_i \chi_i$ in place of $u$. Hence, we can deduce that (SE) is a special case of (3.1).
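The interior equation and the ellipticity condition of this transformation can be checked directly. The following worked step is only a verification sketch for the choice $a_{i,j} = \delta_{ij}/c_p$; it shows that the uniform ellipticity condition (with $|\xi|^2$ on the right-hand side) holds with $\gamma_0 = 1/c_p$, a value that is computed here and not stated in the text.

```latex
\begin{align*}
c_p y_t - \Delta y + c\,y = f
  \quad &\Longleftrightarrow \quad
  y_t - \frac{1}{c_p} \Delta y + \frac{c}{c_p}\, y = \frac{f}{c_p}, \\
A y = -\sum_{i,j=1}^m D_i \Bigl( \frac{\delta_{ij}}{c_p} D_j y \Bigr)
  &= -\frac{1}{c_p} \sum_{i=1}^m D_i D_i y = -\frac{1}{c_p} \Delta y, \\
\sum_{i,j=1}^m \frac{\delta_{ij}}{c_p}\, \xi_i \xi_j
  &= \frac{1}{c_p} |\xi|^2
  \quad \text{for all } \xi \in \mathbb{R}^m .
\end{align*}
```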

To adopt the results from [Trö10] we additionally need the following remark.

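Definition and Remark 3.3. We define the operator $B : U \to L^2(\Sigma)$ by

\[
(Bu)(t, x) := \sum_{i=1}^k u_i(t) \chi_i(x) \quad \text{f.a.a. } (t, x) \text{ in } \Sigma. \tag{3.2}
\]

The mapping $B$ is linear and continuous since

\[
\begin{aligned}
\|Bu\|_{L^2(\Sigma)} &= \Bigl( \int_0^T \int_\Gamma \sum_{i=1}^k u_i^2 \chi_i \, ds \, dt \Bigr)^{1/2} = \Bigl( \int_0^T \sum_{i=1}^k \bigl( |\Gamma_i|^{1/2} u_i \bigr)^2 \, dt \Bigr)^{1/2} \\
&= \|D_\Gamma u\|_{L^2(0,T;\mathbb{R}^k)} \le c_\Gamma \cdot \|u\|_{L^2(0,T;\mathbb{R}^k)}
\end{aligned}
\]

holds with $D_\Gamma := \mathrm{diag}\bigl( |\Gamma_1|^{1/2}, \dots, |\Gamma_k|^{1/2} \bigr)$ and a constant $c_\Gamma$ which does not depend on $u$.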
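The estimate can be made explicit. The first equality uses that the boundary segments are disjoint, so that the characteristic functions satisfy $\chi_i \chi_j = 0$ for $i \ne j$ and $\chi_i^2 = \chi_i$. One admissible constant is $c_\Gamma = \max_i |\Gamma_i|^{1/2}$; this particular value is a choice made here for illustration, the text only requires that some constant independent of $u$ exists.

```latex
\begin{align*}
|(Bu)(t,x)|^2 &= \sum_{i,j=1}^k u_i(t)\,u_j(t)\,\chi_i(x)\,\chi_j(x)
               = \sum_{i=1}^k u_i(t)^2\,\chi_i(x), \\
\|D_\Gamma u\|^2_{L^2(0,T;\mathbb{R}^k)}
  &= \int_0^T \sum_{i=1}^k |\Gamma_i|\, u_i(t)^2 \, dt
   \le \Bigl( \max_{1 \le i \le k} |\Gamma_i| \Bigr) \int_0^T \sum_{i=1}^k u_i(t)^2 \, dt
   = \Bigl( \max_{1 \le i \le k} |\Gamma_i|^{1/2} \Bigr)^2 \|u\|^2_{L^2(0,T;\mathbb{R}^k)} .
\end{align*}
```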

Theorem 3.4. Given Assumption 3.1, the state equation (SE) has a unique weak solution $y$ that lies in $W(0, T)$ (possibly after a modification on a set of measure zero). It satisfies the estimate

\[
\|y\|_{W(0,T)} \le \alpha \cdot \bigl( \|f\|_{L^2(Q)} + \|u\|_{L^2(0,T;\mathbb{R}^k)} + \|y_0\|_H \bigr) \tag{3.3}
\]

with a constant $\alpha > 0$ independent of $f$, $u$ and $y_0$. Hence, the data-solution mapping $S : (f, u, y_0) \mapsto y$ is a linear, continuous operator from $L^2(Q) \times L^2(0, T; \mathbb{R}^k) \times H$ into $W(0, T)$.

Proof. This result is given on page 165 in [Trö10] and is proven there in Section 7.3. Since it is stated for the control space $U = L^2(\Sigma)$, we need Remark 3.3 to obtain estimate (3.3) and thus the continuity of the data-solution mapping $S$ in the case $U = L^2(0, T; \mathbb{R}^k)$.

3.3 Existence of an optimal control

In this section we show the existence of an optimal control for $(P)$, that is, a function $u^* \in U_{ad}$ which satisfies $J(y^*, u^*) \le J(y, u)$ for all $u \in U_{ad}$, where $y^* = S(f, u^*, y_0)$ is the associated optimal state and $y = S(f, u, y_0)$. To this end we reduce the cost function and consider a minimization problem over the set $U_{ad}$ of admissible controls. We start with a more precise study of the data-solution mapping $S$ from Theorem 3.4.

Corollary 3.5. (cf. [Trö10, p. 165])

The data-solution mapping of the PDE (SE) from Theorem 3.4 has the structure $y = S(f, u, y_0) = G_Q f + G_\Sigma u + G_0 y_0$ with continuous, linear operators $G_Q : L^2(Q) \to W(0, T)$, $G_\Sigma : L^2(0, T; \mathbb{R}^k) \to W(0, T)$ and $G_0 : H \to W(0, T)$ defined by

\[
G_Q(f) := S(f, 0, 0), \quad G_\Sigma(u) := S(0, u, 0), \quad G_0(y_0) := S(0, 0, y_0).
\]

According to [Stu11] we interpret $L^2(0, T; V)$ as a subset of $L^2(Q)$. This is possible since $L^2(0, T; V)$ is isometrically isomorphic to $W_2^{1,0}(Q) := \{ y \in L^2(Q) \mid \partial_{x_i} y \in L^2(Q) \text{ for all } i \in \{1, \dots, m\} \}$ (see [Trö10, p. 144]). Hence, we can interpret functions in $W(0, T)$ as functions in $L^2(Q)$. Furthermore, we introduce the trace operator $E_T : W(0, T) \to H$, which maps $y$ linearly and continuously to $y(T)$ (see Theorem 1.5).

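Now we can write the optimal control problem $(P)$ in the form

\[
\min_{u \in U_{ad}} \hat J(u) \tag{$\hat P$}
\]

where

\[
\begin{aligned}
\hat J(u) := {} & \frac{1}{2} \int_0^T \int_\Omega \alpha_Q \bigl| G_\Sigma u + \overbrace{G_Q f + G_0 y_0 - y_Q}^{=: -\bar y_Q, \ \bar y_Q \in L^2(Q)} \bigr|^2 \, dx \, dt + \frac{1}{2} \int_\Omega \alpha_\Omega \bigl| E_T G_\Sigma u + \overbrace{E_T G_Q f + E_T G_0 y_0 - y_\Omega}^{=: -\bar y_\Omega, \ \bar y_\Omega \in H} \bigr|^2 \, dx \\
& + \frac{1}{2} \int_0^T \sum_{i=1}^k \gamma_i u_i^2 \, dt \\
= {} & \frac{1}{2} \int_0^T \int_\Omega \alpha_Q |G_\Sigma u - \bar y_Q|^2 \, dx \, dt + \frac{1}{2} \int_\Omega \alpha_\Omega |E_T G_\Sigma u - \bar y_\Omega|^2 \, dx + \frac{\gamma}{2} \int_0^T \int_\Gamma |Bu|^2 \, ds \, dt \\
= {} & \frac{1}{2} \bigl\| \sqrt{\alpha_Q} (G_\Sigma u - \bar y_Q) \bigr\|^2_{L^2(Q)} + \frac{1}{2} \bigl\| \sqrt{\alpha_\Omega} (E_T G_\Sigma u - \bar y_\Omega) \bigr\|^2_{L^2(\Omega)} + \frac{\gamma}{2} \|Bu\|^2_{L^2(\Sigma)},
\end{aligned}
\]

with

\[
\begin{aligned}
\frac{\gamma}{2} \int_0^T \int_\Gamma |Bu(t, x)|^2 \, ds(x) \, dt &= \frac{\gamma}{2} \int_0^T \Bigl( \sum_{j=1}^k \int_{\Gamma_j} \Bigl| \sum_{i=1}^k u_i(t) \chi_i(x) \Bigr|^2 ds(x) \Bigr) dt \\
&= \frac{\gamma}{2} \int_0^T \Bigl( \sum_{i=1}^k \int_{\Gamma_i} |u_i(t) \chi_i(x)|^2 \, ds(x) \Bigr) dt \\
&= \frac{\gamma}{2} \int_0^T \Bigl( \sum_{i=1}^k |u_i(t)|^2 \int_{\Gamma_i} |\chi_i(x)|^2 \, ds(x) \Bigr) dt \\
&= \frac{\gamma}{2} \int_0^T \sum_{i=1}^k |u_i(t)|^2 |\Gamma_i| \, dt \\
&= \frac{1}{2} \int_0^T \sum_{i=1}^k \gamma_i \cdot |u_i(t)|^2 \, dt.
\end{aligned}
\]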

We can state the following properties of $\hat J$ and the set of admissible controls:

Remark 3.6.

1. $\hat J$ is continuous on $U = L^2(0, T; \mathbb{R}^k)$.

2. $\hat J$ is strictly convex in $U$.

3. $U_{ad}$ is a non-empty, convex, bounded and closed subset of $U$.

Now we can formulate an existence result for the optimal control:

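Theorem 3.7. The optimal control problem $(\hat P)$, and hence the equivalent problem $(P)$, admits a unique optimal control $u^*$ with a corresponding optimal state $y^*$.

Proof. This proof is carried out according to the proof of Theorem 2.14 in [Trö10].

The infimum $\hat J^* := \inf_{u \in U_{ad}} \hat J(u) \ge 0$ exists since $\hat J(u) \ge 0$ holds for all $u \in U_{ad}$. Let $(u_n)_n \subset U_{ad}$ be a minimizing sequence such that $\hat J(u_n) \to \hat J^*$ ($n \to \infty$). The Hilbert space $U$ is reflexive. Since $U_{ad}$ is a convex, bounded and closed subset of $U$, the set $U_{ad}$ is weakly sequentially compact. Thus, there is a subsequence $(u_{n_k})_k$ which converges weakly to some $u^* \in U_{ad}$ for $k \to \infty$. The cost $\hat J$ is convex and continuous and thereby weakly lower semicontinuous. Hence, we have

\[
\hat J(u^*) \le \liminf_{k \to \infty} \hat J(u_{n_k}) \le \lim_{k \to \infty} \hat J(u_{n_k}) = \lim_{n \to \infty} \hat J(u_n) = \hat J^* \le \hat J(u^*).
\]

Consequently, $u^*$ is an optimal control with associated optimal state $y^* = S(f, u^*, y_0)$. Uniqueness follows from the strict convexity of $\hat J$ (Remark 3.6): for two distinct minimizers $u_1 \ne u_2$ the midpoint $\frac{1}{2}(u_1 + u_2) \in U_{ad}$ would satisfy $\hat J\bigl(\frac{1}{2}(u_1 + u_2)\bigr) < \hat J^*$, a contradiction.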

3.4 Optimality conditions

Since $(P)$ is a convex programming problem, necessary optimality conditions are also sufficient. Hence, it suffices to consider first-order optimality conditions. They can be deduced using the formal Lagrange method. Since this method is not mathematically rigorous, the obtained (in)equalities have to be proven. The approach is founded on the KKT theory, also called the exact Lagrange method, which is a mathematically rigorous theory. The basics are provided e.g. in [Trö10, Chapter 6].

For a proof of the optimality conditions, we first adapt Theorem 3.18 in [Trö10] to our problem:

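Theorem 3.8. Let $\tilde y \in W(0, T)$ be the solution to the PDE

\[
\left.
\begin{aligned}
c_p \tilde y_t - \Delta \tilde y + c \tilde y &= 0 && \text{in } Q, \\
\partial_\nu \tilde y + q \tilde y &= g && \text{in } \Sigma, \\
\tilde y(0) &= 0 && \text{in } \Omega,
\end{aligned}
\right\} \tag{3.4}
\]

with constants $c_p > 0$, $q \ge 0$, a coefficient function $c \in L^\infty(Q)$ and $g \in L^2(\Sigma)$. Furthermore, let $p \in W(0, T)$ solve

\[
\left.
\begin{aligned}
-c_p p_t - \Delta p + c p &= a_Q && \text{in } Q, \\
\partial_\nu p + q p &= 0 && \text{in } \Sigma, \\
p(T) &= \tfrac{1}{c_p} a_\Omega && \text{in } \Omega,
\end{aligned}
\right\} \tag{3.5}
\]

for prescribed functions $a_Q \in L^2(Q)$ and $a_\Omega \in H$. Then the following equality holds:

\[
\int_0^T \int_\Omega a_Q \tilde y \, dx \, dt + \int_\Omega a_\Omega \tilde y(T) \, dx = \int_0^T \int_\Gamma g p \, ds \, dt. \tag{3.6}
\]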


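Proof. We compare the variational formulations of (3.4) with test function $p$ and (3.5) with test function $\tilde y$. Integration by parts yields

\[
\begin{aligned}
&- \int_0^T c_p \langle p_t(t), \tilde y(t) \rangle_{V', V} \, dt + \int_0^T \int_\Omega \nabla \tilde y \cdot \nabla p + c \tilde y p \, dx \, dt + \int_0^T \int_\Gamma q \tilde y p \, ds \, dt \\
&\qquad = \int_0^T \int_\Gamma g p \, ds \, dt - \int_\Omega c_p \, p(T) \, \tilde y(T) \, dx
\end{aligned}
\tag{3.7}
\]

and

\[
- \int_0^T c_p \langle p_t(t), \tilde y(t) \rangle_{V', V} \, dt + \int_0^T \int_\Omega \nabla \tilde y \cdot \nabla p + c \tilde y p \, dx \, dt + \int_0^T \int_\Gamma q \tilde y p \, ds \, dt = \int_0^T \int_\Omega a_Q \tilde y \, dx \, dt \tag{3.8}
\]

with $p(T) = \tfrac{1}{c_p} \cdot a_\Omega$. The left-hand sides of (3.7) and (3.8) are the same; thus, the right-hand sides have to coincide as well. As a result we get the claim (3.6).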

Now we can state and prove sufficient first-order optimality conditions for $(P)$:

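Theorem 3.9. A control $u^* \in U_{ad}$ with associated state $y^* \in W(0, T)$ is optimal for $(P)$ if and only if the so-called adjoint state $p \in W(0, T)$, determined by the adjoint equation

\[
\left.
\begin{aligned}
-c_p p_t - \Delta p + c p &= \alpha_Q (y_Q - y^*) && \text{in } Q, \\
\partial_\nu p + q p &= 0 && \text{in } \Sigma, \\
p(T) &= \tfrac{1}{c_p} \cdot \alpha_\Omega (y_\Omega - y^*(T)) && \text{in } \Omega,
\end{aligned}
\right\} \tag{AE}
\]

satisfies the variational inequality

\[
\int_0^T \int_\Gamma (\gamma \cdot B u^* - p) \, (B(u - u^*)) \, ds \, dt \ge 0 \quad \text{for all } u \in U_{ad}. \tag{VI}
\]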

Proof.

1. Well-posedness of (AE):

By a transformation in time (substitution of $t$ by $\tau = T - t$ for $t \in [0, T]$) the 'backward in time PDE' (AE) turns into a PDE that evolves forwards in time. As for (SE), its unique solvability in $W(0, T)$ is given by [Trö10, p.165]. Transforming the solution back in time yields the unique weak solution $p \in W(0, T)$ of (AE). The proof is carried out in [Stu11, Lemma 2.3.2], for instance.

2. Variational inequality:

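Since the cost function $\hat J$ of the reduced problem $(\hat P)$ is convex in $U$, [Trö10, Lemma 3.21] provides the equivalence:

\[
u^* \text{ solves } (\hat P) \quad \Leftrightarrow \quad \hat J'(u^*)(u - u^*) \ge 0 \quad \text{for all } u \in U_{ad}. \tag{3.9}
\]

The Gâteaux derivative of $\hat J$ in $u^*$ reads as follows for $h \in U$:

\[
\begin{aligned}
\hat J'(u^*) h = {} & \int_0^T \int_\Omega \alpha_Q (G_\Sigma u^* - \bar y_Q) \, G_\Sigma h \, dx \, dt + \int_\Omega \alpha_\Omega (E_T G_\Sigma u^* - \bar y_\Omega) \, E_T G_\Sigma h \, dx \\
& + \int_0^T \int_\Gamma \gamma \, (B u^*)(B h) \, ds \, dt.
\end{aligned}
\tag{3.10}
\]

We reverse the substitutions from Corollary 3.5 and set $h = u - u^*$ in (3.10). Then (3.9) yields the optimality condition

\[
\begin{aligned}
0 \le {} & \int_0^T \int_\Omega \alpha_Q (y^* - y_Q)(y - y^*) \, dx \, dt + \int_\Omega \alpha_\Omega (y^*(T) - y_\Omega)(y(T) - y^*(T)) \, dx \\
& + \int_0^T \int_\Gamma \gamma \, (B u^*)(B(u - u^*)) \, ds \, dt \quad \text{for all } u \in U_{ad}.
\end{aligned}
\tag{3.11}
\]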

Now we use Theorem 3.8 to deduce (VI) from this inequality:

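Let $y = y(u)$ and $y^* = y(u^*)$ be the states associated with $u$ and $u^*$, determined by (SE). This means that $\tilde y = y - y^*$ solves PDE (3.4) for $g = B(u - u^*)$. Let $p$ be the solution of (AE). Hence, the adjoint state $p$ solves (3.5) for $a_Q = \alpha_Q (y_Q - y^*)$ and $a_\Omega = \alpha_\Omega (y_\Omega - y^*(T))$. As a result, (3.6) yields

\[
\int_0^T \int_\Omega \alpha_Q (y_Q - y^*)(y - y^*) \, dx \, dt + \int_\Omega \alpha_\Omega (y_\Omega - y^*(T))(y(T) - y^*(T)) \, dx = \int_0^T \int_\Gamma (B(u - u^*)) \, p \, ds \, dt.
\]

Inserting this in (3.11) results in the variational inequality

\[
\int_0^T \int_\Gamma (\gamma \cdot B u^* - p) \, (B(u - u^*)) \, ds \, dt \ge 0 \quad \text{for all } u \in U_{ad}.
\]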

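In the following we rearrange the variational inequality for better applicability. We get for all $u \in U_{ad}$

\[
\begin{aligned}
0 &\le \int_0^T \int_\Gamma (\gamma \cdot B u^* - p) \, (B(u - u^*)) \, ds \, dt \\
&= \int_0^T \sum_{j=1}^k \left[ \int_{\Gamma_j} \left( \gamma \sum_{l=1}^k u^*_l \chi_l - p \right) \left( \sum_{i=1}^k (u_i - u^*_i) \chi_i \right) ds \right] dt \\
&= \int_0^T \sum_{j=1}^k \left[ \int_{\Gamma_j} \left( \sum_{i=1}^k (\gamma u^*_i - p)(u_i - u^*_i) \chi_i \right) ds \right] dt \\
&= \int_0^T \sum_{i=1}^k \left( \int_{\Gamma_i} (\gamma u^*_i - p)(u_i - u^*_i) \chi_i \, ds \right) dt \\
&= \int_0^T \sum_{i=1}^k \left( \gamma_i u^*_i - \int_{\Gamma_i} p \, ds \right) (u_i - u^*_i) \, dt.
\end{aligned}
\tag{3.12}
\]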

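Using this we obtain the equivalences

\[
\begin{aligned}
\text{(VI)} \quad &\Leftrightarrow \quad \int_0^T \sum_{i=1}^k \left( \gamma_i u^*_i - \int_{\Gamma_i} p \, ds \right) u^*_i \, dt \le \int_0^T \sum_{i=1}^k \left( \gamma_i u^*_i - \int_{\Gamma_i} p \, ds \right) u_i \, dt \quad \text{for all } u \in U_{ad} \\
&\Leftrightarrow \quad \int_0^T \sum_{i=1}^k \left( \gamma_i u^*_i - \int_{\Gamma_i} p \, ds \right) u^*_i \, dt = \min_{u \in U_{ad}} \int_0^T \sum_{i=1}^k \left( \gamma_i u^*_i - \int_{\Gamma_i} p \, ds \right) u_i \, dt.
\end{aligned}
\]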

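Hence, regarded pointwise, (VI) is equivalent to

\[
u^*_i(t) =
\begin{cases}
u_{a,i}(t) & \text{if } \gamma_i u^*_i(t) - \int_{\Gamma_i} p(t, x) \, ds(x) > 0, \\
u_{b,i}(t) & \text{if } \gamma_i u^*_i(t) - \int_{\Gamma_i} p(t, x) \, ds(x) < 0
\end{cases}
\]

and

\[
u^*_i(t) \in [u_{a,i}(t), u_{b,i}(t)] \quad \text{if } \gamma_i u^*_i(t) - \int_{\Gamma_i} p(t, x) \, ds(x) = 0
\]

for $1 \le i \le k$, a.e. in $[0, T]$. The last case particularly states

\[
u^*_i(t) = \frac{1}{\gamma_i} \int_{\Gamma_i} p(t, x) \, ds(x) \in [u_{a,i}(t), u_{b,i}(t)] \quad \text{for } 1 \le i \le k.
\]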


Eventually, we can conclude that the variational inequality is equivalent to the projection formula

\[
u^*_i(t) = P_{[u_{a,i}(t), u_{b,i}(t)]} \left( \frac{1}{\gamma_i} \int_{\Gamma_i} p(t, x) \, ds(x) \right) \tag{PF}
\]

for almost all $t$ in $[0, T]$ and for all $i \in \{1, \dots, k\}$. $P_{[a,b]}$ denotes the projection from $\mathbb{R}$ to $[a, b]$ for real values $a \le b$.
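The projection formula (PF) can be evaluated componentwise once the boundary integrals of the adjoint state are available. The following is a minimal sketch at a single time instance, assuming that the integrals $\int_{\Gamma_i} p(t, x)\,ds(x)$ have already been computed by some quadrature rule; the function name `project_control` and all numerical values are hypothetical and serve only as illustration.

```python
import numpy as np

def project_control(p_boundary_integrals, gamma_i, u_a, u_b):
    """Evaluate (PF): project (1/gamma_i) * int_{Gamma_i} p ds onto [u_a, u_b]."""
    unconstrained = np.asarray(p_boundary_integrals, dtype=float) / np.asarray(gamma_i, dtype=float)
    # The projection onto an interval is a componentwise clipping.
    return np.clip(unconstrained, u_a, u_b)

# Hypothetical data for k = 3 boundary segments at one time instance.
p_int = np.array([0.5, -2.0, 4.0])    # assumed values of int_{Gamma_i} p(t, x) ds(x)
gamma_i = np.array([1.0, 1.0, 2.0])   # assumed weights gamma_i
u = project_control(p_int, gamma_i, u_a=-1.0, u_b=1.0)
```

In this example the first component stays at its unconstrained value, while the other two are moved to the lower and upper bound, which corresponds to the three pointwise cases of (VI).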

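We can summarize the optimality conditions for $(P)$ in an optimality system. The triple $(y^*, u^*, p) \in W(0, T) \times U_{ad} \times W(0, T)$ is optimal for $(P)$ if and only if it satisfies

\[
\left.
\begin{aligned}
c_p y^*_t - \Delta y^* + c y^* &= f && \text{in } Q, \\
\partial_\nu y^* + q y^* &= \sum_{i=1}^k u^*_i \chi_i && \text{in } \Sigma, && \text{(SE)} \\
y^*(0) &= y_0 && \text{in } \Omega, \\[1ex]
-c_p p_t - \Delta p + c p &= \alpha_Q (y_Q - y^*) && \text{in } Q, \\
\partial_\nu p + q p &= 0 && \text{in } \Sigma, && \text{(AE)} \\
p(T) &= \tfrac{1}{c_p} \alpha_\Omega (y_\Omega - y^*(T)) && \text{in } \Omega, \\[1ex]
u^*_i(t) &= P_{[u_{a,i}(t), u_{b,i}(t)]} \left( \frac{1}{\gamma_i} \int_{\Gamma_i} p(t, x) \, ds(x) \right) && \text{f.a.a. } t \in [0, T] \text{ and } i \in \{1, \dots, k\}. && \text{(PF)}
\end{aligned}
\right\} \tag{OS}
\]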

The optimality system can be solved using PDASS. The algorithm is presented in [Stu11, Section 2.4] and can be adapted to our case of independent controls on disjoint boundary segments. In [KR02], Kunisch and Rösch discuss PDASS for a general class of optimal control problems and provide convergence analysis. Hintermüller, Ito and Kunisch prove superlinear convergence of the algorithm for specific linearly constrained quadratic problems in [HIK02]. Mesh-independence results for PDASS are presented in [HU04]. We outline the primal-dual active set method in Section 4.3 for the FE-Galerkin approximation of the problem and adapt it to the POD-Galerkin reduced problem in Section 6.6.
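To illustrate the idea behind a primal-dual active set strategy, the following sketch treats a generic finite-dimensional model problem, namely $\min \tfrac{1}{2} u^\top A u - b^\top u$ subject to $u_a \le u \le u_b$ with a symmetric positive definite matrix $A$. It is not the algorithm of [Stu11] or of Section 4.3; the function name `pdas_box_qp`, the parameter `c` and the test data are assumptions made for illustration only.

```python
import numpy as np

def pdas_box_qp(A, b, lo, hi, c=1.0, max_iter=50):
    """Primal-dual active set sketch for min 0.5*u'Au - b'u s.t. lo <= u <= hi."""
    b = np.asarray(b, dtype=float)
    lo = np.broadcast_to(np.asarray(lo, dtype=float), b.shape)
    hi = np.broadcast_to(np.asarray(hi, dtype=float), b.shape)
    u = np.clip(np.linalg.solve(A, b), lo, hi)   # feasible initial guess
    mu = b - A @ u                               # multiplier from A u - b + mu = 0
    previous_sets = None
    for _ in range(max_iter):
        # Predict the active sets from the current primal-dual pair.
        upper = mu + c * (u - hi) > 0
        lower = mu + c * (u - lo) < 0
        inactive = ~(upper | lower)
        if previous_sets is not None and np.array_equal(upper, previous_sets[0]) \
                and np.array_equal(lower, previous_sets[1]):
            break                                # active sets unchanged: stop
        # Fix the control on the active sets and solve on the inactive set.
        u = np.where(upper, hi, np.where(lower, lo, 0.0))
        if inactive.any():
            rhs = b[inactive] - A[np.ix_(inactive, ~inactive)] @ u[~inactive]
            u[inactive] = np.linalg.solve(A[np.ix_(inactive, inactive)], rhs)
        # The multiplier vanishes on the inactive set.
        mu = np.where(inactive, 0.0, b - A @ u)
        previous_sets = (upper, lower)
    return u, mu

# Hypothetical test problem with three unknowns and bounds [-1, 1].
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
b = np.array([3.0, 0.0, -3.0])
u_opt, mu_opt = pdas_box_qp(A, b, lo=-1.0, hi=1.0)
```

The iteration stops as soon as two consecutive predictions of the active sets coincide; the returned pair then satisfies the stationarity condition $A u - b + \mu = 0$ with $\mu = 0$ on the inactive set.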


Chapter 4

Discretization of the optimal control problem

In the first section of this chapter we derive an approximation of $(P)$ by using the FE-Galerkin approach. We just outline the main features of the method and refer to [Dzi10] and [BL13] for more details. Besides an introduction to the finite element method and the required analysis, the latter also provides some of the most important applications and specific computer code using the numerical software Matlab¹. In the second section we complete the discretization process by applying the implicit Euler method. Then we introduce the primal-dual active set method in the third section as a possible strategy to solve the completely discretized problem. Henceforth, we consider $\Omega$ to be a subset of $\mathbb{R}^2$ by choosing $m = 2$.

4.1 The Finite Element Galerkin Method

The Finite Element approach for approximately solving a PDE principally consists of the following steps: First of all the domain $\Omega \subset \mathbb{R}^2$ is approximated by a union of triangles. The next step is to choose a function space whose elements are preferably simple on each triangle. For instance, one could choose the space of piecewise linear ansatz functions on the triangles. The aim is to find an approximate solution to the PDE in this reduced FE space. For this purpose the problem has to be transferred to the FE space. This can be done by applying the Galerkin method. Eventually, this yields a system of ordinary differential equations which can be solved with common methods.
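The steps above can be sketched for piecewise linear ansatz functions on a tiny triangulation. The code below is only an illustration under stated assumptions: the unit square split into two triangles stands in for $\Omega$, the helper names `p1_local_matrices` and `assemble` are hypothetical, and boundary terms as well as the coefficient functions of the problem are left out. It assembles the mass matrix $M$ and the stiffness matrix $K$, which are the building blocks of the resulting system of ordinary differential equations.

```python
import numpy as np

def p1_local_matrices(vertices):
    """Local mass and stiffness matrices of one linear (P1) triangle element."""
    v = np.asarray(vertices, dtype=float)                 # shape (3, 2)
    jac = np.column_stack((v[1] - v[0], v[2] - v[0]))     # affine map from the reference triangle
    area = 0.5 * abs(np.linalg.det(jac))
    # Gradients of the reference basis functions (one row per basis function).
    grads_ref = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
    grads = grads_ref @ np.linalg.inv(jac)                # constant gradients on the element
    stiffness = area * grads @ grads.T
    mass = area / 12.0 * (np.ones((3, 3)) + np.eye(3))
    return mass, stiffness

def assemble(points, triangles):
    """Sum the local element matrices into global mass and stiffness matrices."""
    n = len(points)
    M = np.zeros((n, n))
    K = np.zeros((n, n))
    for tri in triangles:
        m_loc, k_loc = p1_local_matrices(points[tri])
        idx = np.ix_(tri, tri)
        M[idx] += m_loc
        K[idx] += k_loc
    return M, K

# Assumed toy mesh: the unit square split into two triangles.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangles = np.array([[0, 1, 2], [0, 2, 3]])
M, K = assemble(points, triangles)
```

Two simple sanity checks apply to such matrices: the entries of $M$ sum up to the area of the triangulated domain, and $K$ maps constant vectors to zero because constants have vanishing gradient.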

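Definition 4.1.

1. For $\varphi, \psi \in V$ we use the notation

\[
\langle \nabla \varphi, \nabla \psi \rangle_{(H)^2} := \langle \nabla \varphi, \nabla \psi \rangle_{L^2(\Omega; \mathbb{R}^2)} = \sum_{i=1}^2 \left\langle \frac{\partial \varphi}{\partial x_i}, \frac{\partial \psi}{\partial x_i} \right\rangle_{L^2(\Omega)}.
\]

2. For a shorter notation of the state equation (SE) and the adjoint equation (AE) we introduce the time-dependent, symmetric bilinear form $d(t, \cdot, \cdot) : V \times V \to \mathbb{R}$ by

\[
(\varphi, \psi) \mapsto d(t, \varphi, \psi) := \langle \nabla \varphi, \nabla \psi \rangle_{(H)^2} + \langle c(t) \varphi, \psi \rangle_H + \langle q \varphi, \psi \rangle_{L^2(\Sigma)} \quad \text{for } t \in [0, T],
\]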

¹ MATLAB is a registered trademark of The MathWorks Inc.
