

2.2.2 Proper Orthogonal Decomposition

Proper orthogonal decomposition (POD) is a model reduction method that constructs an optimal low-dimensional projection subspace based on given data. The idea of POD may appear under different names, such as Karhunen-Loève decomposition or principal component analysis (PCA), and in fields other than MOR. According to [22], the idea of POD originated from publications in the early 1940s [96, 114, 93]. It was first used as a model reduction tool in [116] for the investigation of inhomogeneous turbulence. Since then, in addition to being applied to the study of coherent structures and turbulence [85, 162, 21], POD has also been exploited to solve numerous types of problems: data compression [9], image processing [144], fluid flows [148, 147], elliptic systems [92], and control and inverse problems [98, 175].

The theoretical presentation of this method in this part is based on [86, 174].

We will start with discrete data. Let $X = [x_1, \cdots, x_n] \in \mathbb{R}^{N\times n}$, $n \le N$, be of rank $d$. In practice, $X$ is generated by experiments or simulations of a given system. One can think of each column of $X$ as the state of the system, discretized into values at nodes, taken at a time instant. These columns are the so-called snapshots.

It is always desirable to find a smaller group of vectors, preferably orthonormal, $\{\nu_i\}_{i=1}^{k}$, $k \le d$, such that this group is the best representative of $X$. The task can be expressed as an optimization problem

$$\operatorname*{argmax}_{\nu_i \in \mathbb{R}^N} \sum_{i=1}^{k}\sum_{j=1}^{n} |\langle x_j, \nu_i\rangle|^2 \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k. \tag{2.24}$$

The SVD of matrices is an ideal tool to solve this problem. Let

$$X = U\Sigma V^T \tag{2.25}$$

be the SVD of $X$. That is, $U \in \mathbb{R}^{N\times N}$ and $V \in \mathbb{R}^{n\times n}$ are orthogonal, and $\Sigma \in \mathbb{R}^{N\times n}$ is a diagonal matrix whose diagonal entries are² $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_d > 0 = \cdots = 0$. By (2.25), the columns of $U$ and $V$ satisfy

$$Xv_i = \sigma_i u_i, \qquad X^Tu_i = \sigma_i v_i, \qquad i = 1, \cdots, n. \tag{2.26}$$

It follows that the columns of $U$ and those of $V$ are eigenvectors of the symmetric positive semi-definite matrices $XX^T$ and $X^TX$, respectively:

$$XX^Tu_i = \sigma_i^2 u_i, \quad i = 1, \cdots, N, \tag{2.27}$$
$$X^TXv_i = \sigma_i^2 v_i, \quad i = 1, \cdots, n. \tag{2.28}$$

²The last $N - n$ columns of $U$ can be chosen freely such that they, together with the first $n$ columns, form an orthonormal basis.

Now, we turn back to the problem (2.24). For the case k = 1, let us define the associated Lagrange functional

$$L(\nu, \lambda) = \sum_{j=1}^{n} |\langle x_j, \nu\rangle|^2 + \lambda(1 - \|\nu\|^2), \qquad (\nu, \lambda) \in \mathbb{R}^N \times \mathbb{R}.$$

The partial derivative of $L(\nu, \lambda)$ with respect to $\nu$ is
$$\frac{\partial L}{\partial \nu}(\nu, \lambda) = \frac{\partial}{\partial \nu}\left(\nu^TXX^T\nu + \lambda(1 - \nu^T\nu)\right) = 2XX^T\nu - 2\lambda\nu.$$

The first-order necessary optimality condition leads to $XX^T\nu = \lambda\nu$.

Taking (2.27) into account, any column vector of $U$ satisfies the necessary condition. It remains to find the one amongst them which solves (2.24), i.e., satisfies the sufficient condition. Suppose that $\tilde{\nu} \in \mathbb{R}^N$ is any vector of length one. Since the columns of $U$ form an orthonormal basis of $\mathbb{R}^N$, $\tilde{\nu}$ can be represented as

$$\tilde{\nu} = UU^T\tilde{\nu}.$$

As a consequence,

$$\begin{aligned}
\sum_{j=1}^{n} |\langle x_j, \tilde{\nu}\rangle|^2 &= \tilde{\nu}^TXX^T\tilde{\nu} \\
&= \tilde{\nu}^TUU^TXX^TUU^T\tilde{\nu} \\
&= \tilde{\nu}^TUU^TU\Sigma V^TV\Sigma^TU^TUU^T\tilde{\nu} \\
&= \tilde{\nu}^TU\Sigma\Sigma^TU^T\tilde{\nu} \\
&\le \sigma_1^2\,\tilde{\nu}^TUIU^T\tilde{\nu} \\
&= \sigma_1^2 \\
&= u_1^TXX^Tu_1 = \sum_{j=1}^{n} |\langle x_j, u_1\rangle|^2.
\end{aligned}$$

In the above argument, we have made use of the fact that $\Sigma\Sigma^T = \operatorname{diag}(\sigma_1^2, \sigma_2^2, \cdots, \sigma_d^2, 0, \cdots, 0) \in \mathbb{R}^{N\times N}$, as well as $\|\tilde{\nu}\| = 1$. This shows that the first column of $U$, $u_1$, is a solution to the problem (2.24) for the case $k = 1$, and the maximal value is $\sigma_1^2$.

With the same argument, one can show that the solution to the problem
$$\operatorname*{argmax}_{\nu \in \mathbb{R}^N} \sum_{j=1}^{n} |\langle x_j, \nu\rangle|^2 \quad \text{such that } \|\nu\|_2 = 1 \text{ and } \langle\nu, u_1\rangle = 0$$
is $u_2$. This fact leads to the following statement.

Theorem 2.7 (e.g., [174], Theorem 1.1) With the above notations, for any $k = 1, \cdots, d$, the solution to the problem (2.24) is the set of the first $k$ left singular vectors $\{u_i, i = 1, \cdots, k\}$, and the corresponding maximal value is $\sum_{i=1}^{k}\sigma_i^2$.

By the result of this theorem, we define

Definition 2.13 The first $k$, $k \le d$, left singular vectors $u_i$, $i = 1, \cdots, k$, are called the POD basis of rank $k$.

For any set of orthonormal vectors $\{\nu_j, j = 1, \cdots, k\}$, we have
$$\begin{aligned}
\sum_{i=1}^{n}\Big\|x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j\Big\|^2 &= \sum_{i=1}^{n}\Big\langle x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j,\; x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j\Big\rangle \\
&= \sum_{i=1}^{n}\langle x_i, x_i\rangle - \sum_{i=1}^{n}\sum_{j=1}^{k}|\langle x_i, \nu_j\rangle|^2.
\end{aligned}$$

This suggests that the maximization problem (2.24) is equivalent to the following minimization problem
$$\operatorname*{argmin}_{\nu_i \in \mathbb{R}^N} \sum_{i=1}^{n}\Big\|x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j\Big\|^2 \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k.$$

Moreover, denote by $\Upsilon$ a matrix consisting of the orthonormal column vectors $\nu_j$, $j = 1, \cdots, k$. It follows that
$$\begin{aligned}
\sum_{i=1}^{n}\langle x_i, x_i\rangle - \sum_{i=1}^{n}\sum_{j=1}^{k}|\langle x_i, \nu_j\rangle|^2 &= \operatorname{trace}(X^TX - X^T\Upsilon\Upsilon^TX) \\
&= \operatorname{trace}(X^T(I - \Upsilon\Upsilon^T)X) \\
&= \operatorname{trace}(X^T(I - \Upsilon\Upsilon^T)(I - \Upsilon\Upsilon^T)X) \\
&= \operatorname{trace}(((I - \Upsilon\Upsilon^T)X)^T(I - \Upsilon\Upsilon^T)X) \\
&= \|(I - \Upsilon\Upsilon^T)X\|_F^2 \\
&= \|X - \Upsilon\Upsilon^TX\|_F^2,
\end{aligned}$$

where $\|\cdot\|_F$ denotes the Frobenius norm. A consequence of Theorem 2.7 is

Corollary 2.8 With the aforementioned notations, we have

$$\|X - U(1{:}k)U(1{:}k)^TX\|_F^2 \le \|X - \Upsilon\Upsilon^TX\|_F^2, \tag{2.29}$$
where $U(1{:}k)$ denotes the matrix formed by the first $k$ columns of $U$.

In words, inequality (2.29) says that, among all subspaces of the same dimension, the subspace spanned by the POD basis minimizes the Frobenius norm of the difference between $X$ and its projection.
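To make the statement concrete, the following NumPy sketch (the sizes, the random snapshot data, and the comparison basis are illustrative assumptions, not from the source) computes the POD basis from an SVD of the snapshot matrix and compares the Frobenius projection error of (2.29) against an arbitrary orthonormal basis of the same dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, k = 200, 30, 5              # state dimension, number of snapshots, basis size
X = rng.standard_normal((N, n))   # stand-in snapshot matrix

# POD basis of rank k: first k left singular vectors of X
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uk = U[:, :k]

# An arbitrary orthonormal basis of the same dimension, for comparison
Upsilon, _ = np.linalg.qr(rng.standard_normal((N, k)))

err_pod   = np.linalg.norm(X - Uk @ (Uk.T @ X), "fro") ** 2
err_other = np.linalg.norm(X - Upsilon @ (Upsilon.T @ X), "fro") ** 2

print(err_pod <= err_other)                       # True, cf. (2.29)
print(np.isclose(err_pod, np.sum(s[k:] ** 2)))    # error equals the sum of the discarded sigma_i^2
```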

Remark In the POD model reduction framework, the dimension of the state space $N$ is usually much larger than the number of snapshots $n$. Hence, one would not compute $u_i$ by solving the $N$-dimensional eigenvalue problem (2.27). Based on (2.26), one first solves the $n$-dimensional eigenvalue problem (2.28) and then computes $u_i$ as
$$u_i = \frac{1}{\sigma_i}Xv_i, \quad i = 1, \cdots, k.$$
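A minimal sketch of this remark, assuming random stand-in data with $N \gg n$: the POD basis is recovered from the small $n \times n$ eigenvalue problem (2.28) and agrees, up to sign, with the left singular vectors of a direct SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, k = 5000, 40, 6
X = rng.standard_normal((N, n))   # stand-in snapshot matrix, N >> n

# n-dimensional eigenvalue problem (2.28): X^T X v_i = sigma_i^2 v_i
lam, V = np.linalg.eigh(X.T @ X)  # eigenvalues in ascending order
idx = np.argsort(lam)[::-1][:k]   # indices of the k largest eigenvalues
sigma = np.sqrt(lam[idx])
U_snap = (X @ V[:, idx]) / sigma  # u_i = X v_i / sigma_i

# Agrees (up to sign) with the left singular vectors from a direct SVD of X
U_svd = np.linalg.svd(X, full_matrices=False)[0][:, :k]
print(np.allclose(np.abs(U_snap.T @ U_svd), np.eye(k), atol=1e-6))
```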

To answer the question of how large the size of the POD basis should be in order to approximate the given data $X$ well enough, there is so far no a priori criterion. One clue on which the decision can be based is the ratio
$$\frac{\sum_{i=1}^{k}\sigma_i}{\sum_{i=1}^{d}\sigma_i}.$$
One can choose $k$ such that this ratio is close to 1.
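A small sketch of this truncation criterion, following the ratio of singular values used above (the threshold 0.99 and the example spectrum are illustrative choices):

```python
import numpy as np

def choose_k(singular_values, tol=0.99):
    """Smallest k whose singular-value ratio reaches the threshold tol."""
    ratios = np.cumsum(singular_values) / np.sum(singular_values)
    return int(np.searchsorted(ratios, tol)) + 1

# Example with a rapidly decaying spectrum
s = np.array([10.0, 5.0, 1.0, 0.1, 0.01])
print(choose_k(s))   # 3: (10 + 5 + 1) / 16.11 is already about 0.993
```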

The inner product used in the above presentation is the usual Euclidean one.

In many cases where the system is governed by a partial differential equation, it is natural to use, rather than the original Euclidean product, another inner product derived from the spatial discretization of the underlying equation,
$$\langle x, y\rangle_W = x^TWy,$$
where $W \in \mathbb{R}^{N\times N}$ is a positive definite matrix. More details are provided in [174].
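The sketch below shows one common way to compute a POD basis that is orthonormal in this $W$-inner product: factor $W = LL^T$ by a Cholesky decomposition, take an ordinary SVD of $L^TX$, and map the left singular vectors back. The weight matrix and data here are illustrative stand-ins, and [174] may organize the computation differently.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n, k = 100, 20, 4
X = rng.standard_normal((N, n))                 # stand-in snapshot matrix
M = rng.standard_normal((N, N))
W = M @ M.T + N * np.eye(N)                     # an illustrative SPD weight matrix

L = np.linalg.cholesky(W)                       # W = L L^T
Uc, s, Vt = np.linalg.svd(L.T @ X, full_matrices=False)
U_W = np.linalg.solve(L.T, Uc[:, :k])           # map back: basis orthonormal in <.,.>_W

print(np.allclose(U_W.T @ W @ U_W, np.eye(k), atol=1e-8))   # W-orthonormality check
```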

Now, we turn our attention to the case of continuous data. Instead of a matrix, we are given a trajectory $\{x(t), t \in [0, T]\} \subset \mathbb{R}^N$ and asked to find a set of $k$ orthonormal vectors $\nu_i$, $i = 1, \cdots, k$, which approximates the trajectory as well as possible. In other words, we solve the optimization problem

$$\operatorname*{argmin}_{\nu_i \in \mathbb{R}^N} \int_0^T \Big\|x(t) - \sum_{i=1}^{k}\langle x(t), \nu_i\rangle\nu_i\Big\|^2 dt \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k. \tag{2.30}$$

As in the discrete data case, this problem is equivalent to

$$\operatorname*{argmax}_{\nu_i \in \mathbb{R}^N} \sum_{i=1}^{k}\int_0^T |\langle x(t), \nu_i\rangle|^2\, dt \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k.$$

In order to clarify the first-order necessary optimality condition, we define
$$R: \mathbb{R}^N \longrightarrow \mathbb{R}^N, \qquad \nu \longmapsto R\nu = \int_0^T \langle x(t), \nu\rangle\, x(t)\, dt.$$

It is shown in [174] that $R$ is linear, bounded, non-negative, and symmetric. Thus $R$ has a set of non-negative eigenvalues,
$$Ru_i = \lambda_i u_i, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d > 0 = \cdots = 0, \tag{2.31}$$
where $d$ is the rank of $R$. One can observe that $R$ plays the same role as $XX^T$ in the discrete data case. And as in the previous case, the eigenvectors of $R$ form the POD basis, as stated in the following theorem, whose proof is given in [174].

Theorem 2.9 ([174], Theorem 1.12) Suppose that $x(t) \in C([0, T], \mathbb{R}^N)$ is the unique solution of the state equation with a given initial condition. Then the solution to problem (2.30) is given by the first $k$ eigenvectors of $R$, corresponding to the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$.

We show how to avoid solving the large eigenvalue problem (2.31) by the method of snapshots [161]. The matrix representing the operator $R$ in $\mathbb{R}^N$ is
$$R = \int_0^T x(t)x^T(t)\, dt. \tag{2.32}$$
Now, instead of continuous data $\{x(t), t \in [0, T]\} \subset \mathbb{R}^N$, we take some snapshots of that trajectory,
$$x(t_j), \qquad 0 = t_0 < t_1 < t_2 < \cdots < t_n = T.$$

Matrix (2.32) can be approximated as
$$R = \sum_{j=1}^{n} x(t_j)x^T(t_j)\,\Delta_j,$$
where $\Delta_j$ is the step size $t_j - t_{j-1}$. If we write
$$X = \begin{bmatrix} x_1(t_1)\sqrt{\Delta_1} & \cdots & x_1(t_n)\sqrt{\Delta_n} \\ \vdots & \ddots & \vdots \\ x_N(t_1)\sqrt{\Delta_1} & \cdots & x_N(t_n)\sqrt{\Delta_n} \end{bmatrix} \in \mathbb{R}^{N\times n},$$

then matrix $R$ can be written as $R = XX^T$. As in the discrete data case, we solve the $n$-dimensional eigenvalue problem
$$X^TXv_i = \lambda_i v_i$$
and compute the first $k$ eigenvectors of $R$ as
$$u_i = \frac{1}{\sqrt{\lambda_i}}Xv_i, \quad i = 1, \cdots, k.$$

This argument shows, on the one hand, that the discrete and continuous data cases can be treated in a unified manner; on the other hand, it is a crucial point in formulating the so-called balanced POD [147], which will be presented later as a remark.

Now, given a POD basis $\{u_i, i = 1, \cdots, r\}$ constructed from data taken from a dynamical system
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \tag{2.33}$$

where $A \in \mathbb{R}^{N\times N}$, $B \in \mathbb{R}^{N\times m}$, $C \in \mathbb{R}^{l\times N}$, we demonstrate how to use this basis to produce a reduced system. Since the given data usually contain the most typical states [98], and moreover the POD basis is their representative, the state vector $x(t)$ of dimension $N$ is approximated by $U\hat{x}(t)$, $U = [u_1, \cdots, u_r]$, where the new state vector $\hat{x}(t)$ is of dimension $r \ll N$. That is, $\hat{x}(t)$ is the coordinate vector of the projection of the vector with coordinates $x(t)$ onto the subspace spanned by $\{u_i, i = 1, \cdots, r\}$. System (2.33) becomes

$$U\dot{\hat{x}}(t) = AU\hat{x}(t) + Bu(t), \qquad y(t) = CU\hat{x}(t). \tag{2.34}$$

To avoid the overdetermination of (2.34), one forces its residual to be orthogonal to an $r$-dimensional subspace of $\mathbb{R}^N$. The POD method chooses a Galerkin projection framework, i.e., the chosen subspace is also the subspace spanned by $\{u_i, i = 1, \cdots, r\}$. The reduced system is therefore formulated as

$$\dot{\hat{x}}(t) = \hat{A}\hat{x}(t) + \hat{B}u(t), \qquad \hat{y}(t) = \hat{C}\hat{x}(t),$$
where $\hat{A} = U^TAU$, $\hat{B} = U^TB$, $\hat{C} = CU$.
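The projection itself is a few matrix products; the sketch below (random system matrices and a stand-in POD basis, for illustration only) forms the reduced matrices exactly as above.

```python
import numpy as np

rng = np.random.default_rng(3)
N, m, l, r = 100, 2, 3, 10
A = rng.standard_normal((N, N))                     # illustrative system matrices
B = rng.standard_normal((N, m))
C = rng.standard_normal((l, N))
U = np.linalg.qr(rng.standard_normal((N, r)))[0]    # stand-in for a POD basis U = [u_1, ..., u_r]

A_hat = U.T @ A @ U    # r x r
B_hat = U.T @ B        # r x m
C_hat = C @ U          # l x r
print(A_hat.shape, B_hat.shape, C_hat.shape)
```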

Remark Note that the application of POD to MOR is not restricted to linear systems. In fact, it is a favorite reduction method for non-linear systems. For a general model of the form (2.1), the associated reduced-order model is
$$\dot{\hat{x}}(t) = U^Tf(t, U\hat{x}(t), u(t)), \quad t \in T, \qquad \hat{y}(t) = \eta(U\hat{x}(t), u(t)).$$
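A hedged sketch of this non-linear reduced model: the particular right-hand side $f$, the input, and the explicit Euler time stepping are illustrative assumptions, not part of the source; any integrator could be used instead.

```python
import numpy as np

def f(t, x, u):                    # an illustrative non-linear right-hand side
    return -x ** 3 + np.sin(t) * u

rng = np.random.default_rng(4)
N, r = 50, 4
U = np.linalg.qr(rng.standard_normal((N, r)))[0]   # stand-in for a POD basis
u_in = np.ones(N)                                  # illustrative (constant) input term

x_hat = U.T @ rng.standard_normal(N)               # reduced initial condition
dt, T = 1e-2, 1.0
for step in range(int(T / dt)):
    t = step * dt
    # reduced dynamics: note that f is still evaluated in the full N-dimensional space
    x_hat = x_hat + dt * (U.T @ f(t, U @ x_hat, u_in))
print(x_hat)
```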

Recall the definitions (2.9) and (2.10) of the reachability and observability gramians.

If we denote the columns of the input matrix $B$ by $b_1, \cdots, b_m$, then the impulse response $e^{At}B$ can be treated as the collection of response state vectors $x_i(t) = e^{At}b_i$ to the $i$-th unit impulse $\delta(t)e_i$, where $e_i$ is the $i$-th unit vector of $\mathbb{R}^m$. Accordingly, the reachability gramian can be written as
$$R = \int_0^\infty \sum_{i=1}^{m} x_i(t)x_i(t)^T\, dt.$$

Likewise, the observability gramian is
$$Q = \int_0^\infty \sum_{i=1}^{l} z_i(t)z_i(t)^T\, dt,$$
where $z_i(t) = e^{A^Tt}c_i^T$ and $c_i$ is the $i$-th row of $C$. In balanced truncation one has to solve the Lyapunov equations (2.11) and (2.12), which is expensive. In practice, the impulse response state vectors $x_i(t)$, $z_i(t)$ are given at time instants $t_1, \cdots, t_n$. The two gramians can be approximated by

$$R = \sum_{j=1}^{n}\sum_{i=1}^{m} x_i(t_j)x_i(t_j)^T\,\Delta_j, \qquad Q = \sum_{j=1}^{n}\sum_{i=1}^{l} z_i(t_j)z_i(t_j)^T\,\Delta_j.$$

Let us set
$$X = \begin{bmatrix} x_1^1(t_1)\sqrt{\Delta_1} & \cdots & x_1^1(t_n)\sqrt{\Delta_n} & \cdots & x_1^m(t_1)\sqrt{\Delta_1} & \cdots & x_1^m(t_n)\sqrt{\Delta_n} \\ \vdots & & \vdots & & \vdots & & \vdots \\ x_N^1(t_1)\sqrt{\Delta_1} & \cdots & x_N^1(t_n)\sqrt{\Delta_n} & \cdots & x_N^m(t_1)\sqrt{\Delta_1} & \cdots & x_N^m(t_n)\sqrt{\Delta_n} \end{bmatrix} \in \mathbb{R}^{N\times(mn)},$$
where $x_k^i(t_j)$ denotes the $k$-th component of $x_i(t_j)$.

Accordingly, $R = XX^T$. Likewise, with $Y \in \mathbb{R}^{N\times(ln)}$ assembled from the snapshots $z_i(t_j)$ in the same way, $Q = YY^T$. Let
$$Y^TX = U\Sigma V^T = \begin{bmatrix} U_1 & U_2 \end{bmatrix}\begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}\begin{bmatrix} V_1 & V_2 \end{bmatrix}^T$$
be the SVD of $Y^TX$, with $\Sigma_1 \in \mathbb{R}^{r\times r}$, $r < \operatorname{rank}(Y^TX)$, and set
$$\Phi_1 = XV_1\Sigma_1^{-1/2}, \qquad \Psi_1 = \Sigma_1^{-1/2}U_1^TY^T.$$

Then $\Phi_1$ is composed of the first $r$ columns of the approximate balancing transformation, and $\Psi_1$ is the set of the first $r$ rows of its inverse. That is, the new system
$$\dot{\hat{x}}(t) = \Psi_1A\Phi_1\hat{x}(t) + \Psi_1Bu(t), \qquad \hat{y}(t) = C\Phi_1\hat{x}(t)$$

will be the reduced system of system (2.33) produced by approximate balanced truncation. A proof of this can be found in [147]. One can observe that the main advantage of balanced POD is that one need not compute the two gramians; instead, only the two matrices $X$, $Y$, which can be determined from simulations or experiments, are needed. This actually shares the same idea with the original balanced truncation method proposed in [121]. Therefore, the balanced POD method is an approximation of the balanced truncation method.
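A hedged sketch of the balanced POD construction just described: the snapshot matrices $X$ and $Y$, which in practice come from simulations of (2.33) and its adjoint, are replaced here by random stand-ins, and the system matrices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
N, m, l, n, r = 80, 2, 2, 25, 6
A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))   # illustrative system matrices
B = rng.standard_normal((N, m))
C = rng.standard_normal((l, N))

# Stand-ins for the weighted impulse-response snapshot matrices of the text
X = rng.standard_normal((N, m * n))   # columns ~ sqrt(Delta_j) x_i(t_j)
Y = rng.standard_normal((N, l * n))   # columns ~ sqrt(Delta_j) z_i(t_j)

# SVD of Y^T X and the rank-r approximate balancing transformation
U, s, Vt = np.linalg.svd(Y.T @ X, full_matrices=False)
S1_inv_sqrt = np.diag(1.0 / np.sqrt(s[:r]))
Phi1 = X @ Vt[:r].T @ S1_inv_sqrt       # first r columns of the transformation
Psi1 = S1_inv_sqrt @ U[:, :r].T @ Y.T   # first r rows of its inverse

A_hat, B_hat, C_hat = Psi1 @ A @ Phi1, Psi1 @ B, C @ Phi1
print(np.allclose(Psi1 @ Phi1, np.eye(r), atol=1e-8))   # bi-orthogonality: Psi1 Phi1 = I_r
```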

There have been various improvements of POD beyond its primary version. The optimality of snapshot locations was addressed in [99], while many studies aimed at preserving some property of the original models: stability was preserved during the POD reduction in [137], the Lagrangian structure in [100]. In [140], POD was applied to non-linear ODE initial value problems, and the error and the effect of perturbations in the data were analyzed. Others focused on dealing with PDEs [92, 111, 7, 29, 6, 45].