

2.2.2 Proper Orthogonal Decomposition

Proper orthogonal decomposition (POD) is a model reduction method that constructs an optimal low-dimensional projection subspace based on given data. The idea of POD may appear under different names, such as Karhunen-Loève decomposition or principal component analysis (PCA), and in fields other than MOR. According to [22], the idea of POD originated from publications in the early 1940s [96, 114, 93]. It was first used as a model reduction tool in [116] for the investigation of inhomogeneous turbulence. Since then, in addition to being applied to the study of coherent structures and turbulence [85, 162, 21], POD has also been exploited to solve numerous types of problems: data compression [9], image processing [144], fluid flows [148, 147], elliptic systems [92], and control and inverse problems [98, 175].

The theoretical presentation of this method in this part is based on [86, 174].

We will start with discrete data. Let $X = [x_1, \cdots, x_n] \in \mathbb{R}^{N\times n}$, $n \le N$, be of rank $d$. In practice, $X$ is generated by experiments or simulations of a given system. One can think of each column of $X$ as the state of the system, discretized into values at nodes, taken at a time instant. These columns are the so-called snapshots.

It is always desirable to find a smaller group of vectors, preferably orthonormal, $\{\nu_i\}_{i=1}^{k}$, $k \le d$, such that this group is the best representative of $X$. The task can be expressed as an optimization problem

$$\operatorname*{argmax}_{\nu_i \in \mathbb{R}^N} \sum_{i=1}^{k}\sum_{j=1}^{n} |\langle x_j, \nu_i\rangle|^2 \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k. \tag{2.24}$$

The SVD of matrices is an ideal tool to solve this problem. Let

$$X = U\Sigma V^T \tag{2.25}$$

be the SVD of $X$. That is, $U \in \mathbb{R}^{N\times N}$ and $V \in \mathbb{R}^{n\times n}$ are orthogonal, and $\Sigma \in \mathbb{R}^{N\times n}$ is a diagonal matrix whose diagonal entries are² $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_d > 0 = \cdots = 0$. By (2.25), the columns of $U$ and $V$ satisfy

$$Xv_i = \sigma_i u_i, \qquad X^Tu_i = \sigma_i v_i, \qquad i = 1, \cdots, n. \tag{2.26}$$

It follows that the columns of $U$ and those of $V$ are eigenvectors of the symmetric positive semi-definite matrices $XX^T$ and $X^TX$, respectively:

$$XX^Tu_i = \sigma_i^2 u_i, \quad i = 1, \cdots, N, \tag{2.27}$$
$$X^TXv_i = \sigma_i^2 v_i, \quad i = 1, \cdots, n. \tag{2.28}$$

²The last $N - n$ columns of $U$ can be chosen freely such that they, together with the first $n$ columns, form an orthonormal basis.

Now, we turn back to the problem (2.24). For the case k = 1, let us define the associated Lagrange functional

$$L(\nu, \lambda) = \sum_{j=1}^{n} |\langle x_j, \nu\rangle|^2 + \lambda(1 - \|\nu\|^2), \qquad (\nu, \lambda) \in \mathbb{R}^N \times \mathbb{R}.$$

The partial derivative of $L(\nu, \lambda)$ with respect to $\nu$ is
$$\frac{\partial L}{\partial \nu}(\nu, \lambda) = \frac{\partial}{\partial \nu}\left(\nu^TXX^T\nu + \lambda(1 - \nu^T\nu)\right) = 2XX^T\nu - 2\lambda\nu.$$

The first-order necessary optimality condition leads to $XX^T\nu = \lambda\nu$.

Taking (2.27) into account, any column vector of $U$ satisfies the necessary condition. It remains to find the one amongst them which solves (2.24), i.e., satisfies the sufficient condition. Suppose that $\tilde{\nu} \in \mathbb{R}^N$ is any vector of length one. Since the columns of $U$ form an orthonormal basis of $\mathbb{R}^N$, $\tilde{\nu}$ can be represented as

$$\tilde{\nu} = UU^T\tilde{\nu}.$$

As a consequence,

$$\begin{aligned}
\sum_{j=1}^{n} |\langle x_j, \tilde{\nu}\rangle|^2 &= \tilde{\nu}^TXX^T\tilde{\nu} \\
&= \tilde{\nu}^TUU^TXX^TUU^T\tilde{\nu} \\
&= \tilde{\nu}^TUU^TU\Sigma V^TV\Sigma^TU^TUU^T\tilde{\nu} \\
&= \tilde{\nu}^TU\Sigma\Sigma^TU^T\tilde{\nu} \\
&\le \sigma_1^2\,\tilde{\nu}^TUIU^T\tilde{\nu} \\
&= \sigma_1^2 \\
&= u_1^TXX^Tu_1 = \sum_{j=1}^{n} |\langle x_j, u_1\rangle|^2.
\end{aligned}$$

In the above argument, we have made use of the fact that $\Sigma\Sigma^T = \operatorname{diag}(\sigma_1^2, \sigma_2^2, \cdots, \sigma_d^2, 0, \cdots, 0) \in \mathbb{R}^{N\times N}$, as well as $\|\tilde{\nu}\| = 1$. This shows that the first column of $U$, $u_1$, is a solution to the problem (2.24) for the case $k = 1$, and the maximal value is $\sigma_1^2$.

With the same argument, one can show that the solution to the problem
$$\operatorname*{argmax}_{\nu \in \mathbb{R}^N} \sum_{j=1}^{n} |\langle x_j, \nu\rangle|^2 \quad \text{such that } \|\nu\|_2 = 1 \text{ and } \langle\nu, u_1\rangle = 0$$
is $u_2$. This fact leads to the following statement.

Theorem 2.7 (e.g., [174], Theorem 1.1) With the above notations, for any $k = 1, \cdots, d$, the solution to the problem (2.24) is the set of the first $k$ left singular vectors $\{u_i, i = 1, \cdots, k\}$, and the corresponding maximal value is $\sum_{i=1}^{k}\sigma_i^2$.

By the result of this theorem, we define

Definition 2.13 The first $k$, $k \le d$, left singular vectors $u_i$, $i = 1, \cdots, k$, are called the POD basis of rank $k$.

For any set of orthonormal vectors $\{\nu_j, j = 1, \cdots, k\}$, we have
$$\begin{aligned}
\sum_{i=1}^{n}\Big\|x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j\Big\|^2 &= \sum_{i=1}^{n}\Big\langle x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j,\; x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j\Big\rangle \\
&= \sum_{i=1}^{n}\langle x_i, x_i\rangle - \sum_{i=1}^{n}\sum_{j=1}^{k}|\langle x_i, \nu_j\rangle|^2.
\end{aligned}$$

This suggests that the maximization problem (2.24) is equivalent to the following minimization problem
$$\operatorname*{argmin}_{\nu_i \in \mathbb{R}^N} \sum_{i=1}^{n}\Big\|x_i - \sum_{j=1}^{k}\langle x_i, \nu_j\rangle\nu_j\Big\|^2 \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k.$$

Moreover, denote by $\Upsilon$ a matrix consisting of the orthonormal column vectors $\nu_j$, $j = 1, \cdots, k$. It follows that
$$\begin{aligned}
\sum_{i=1}^{n}\langle x_i, x_i\rangle - \sum_{i=1}^{n}\sum_{j=1}^{k}|\langle x_i, \nu_j\rangle|^2 &= \operatorname{trace}(X^TX - X^T\Upsilon\Upsilon^TX) \\
&= \operatorname{trace}(X^T(I - \Upsilon\Upsilon^T)X) \\
&= \operatorname{trace}(X^T(I - \Upsilon\Upsilon^T)(I - \Upsilon\Upsilon^T)X) \\
&= \operatorname{trace}(((I - \Upsilon\Upsilon^T)X)^T(I - \Upsilon\Upsilon^T)X) \\
&= \|(I - \Upsilon\Upsilon^T)X\|_F^2 \\
&= \|X - \Upsilon\Upsilon^TX\|_F^2,
\end{aligned}$$

where $\|\cdot\|_F$ denotes the Frobenius norm. A consequence of Theorem 2.7 is

Corollary 2.8 With the aforementioned notations, we have

$$\|X - U(1{:}k)U(1{:}k)^TX\|_F^2 \le \|X - \Upsilon\Upsilon^TX\|_F^2, \tag{2.29}$$
where $U(1{:}k)$ denotes the matrix formed by the first $k$ columns of $U$.

In words, inequality (2.29) says that, among all subspaces of the same dimension, the subspace spanned by the POD basis minimizes the Frobenius norm of the difference between $X$ and its projection.
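To make the statement concrete, the following NumPy sketch (the sizes, the random snapshot data, and the comparison basis are illustrative assumptions, not from the source) computes the POD basis from an SVD of the snapshot matrix and compares the Frobenius projection error of (2.29) against an arbitrary orthonormal basis of the same dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, k = 200, 30, 5              # state dimension, number of snapshots, basis size
X = rng.standard_normal((N, n))   # stand-in snapshot matrix

# POD basis of rank k: first k left singular vectors of X
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uk = U[:, :k]

# An arbitrary orthonormal basis of the same dimension, for comparison
Upsilon, _ = np.linalg.qr(rng.standard_normal((N, k)))

err_pod   = np.linalg.norm(X - Uk @ (Uk.T @ X), "fro") ** 2
err_other = np.linalg.norm(X - Upsilon @ (Upsilon.T @ X), "fro") ** 2

print(err_pod <= err_other)                       # True, cf. (2.29)
print(np.isclose(err_pod, np.sum(s[k:] ** 2)))    # error equals the sum of the discarded sigma_i^2
```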

Remark In the POD model reduction framework, the dimension of the state space $N$ is usually much larger than the number of snapshots $n$. Hence, one would not compute $u_i$ by solving the $N$-dimensional eigenvalue problem (2.27). Based on (2.26), one first solves the $n$-dimensional eigenvalue problem (2.28) and then computes $u_i$ as
$$u_i = \frac{1}{\sigma_i}Xv_i, \quad i = 1, \cdots, k.$$
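A minimal sketch of this remark, assuming random stand-in data with $N \gg n$: the POD basis is recovered from the small $n \times n$ eigenvalue problem (2.28) and agrees, up to sign, with the left singular vectors of a direct SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, k = 5000, 40, 6
X = rng.standard_normal((N, n))   # stand-in snapshot matrix, N >> n

# n-dimensional eigenvalue problem (2.28): X^T X v_i = sigma_i^2 v_i
lam, V = np.linalg.eigh(X.T @ X)  # eigenvalues in ascending order
idx = np.argsort(lam)[::-1][:k]   # indices of the k largest eigenvalues
sigma = np.sqrt(lam[idx])
U_snap = (X @ V[:, idx]) / sigma  # u_i = X v_i / sigma_i

# Agrees (up to sign) with the left singular vectors from a direct SVD of X
U_svd = np.linalg.svd(X, full_matrices=False)[0][:, :k]
print(np.allclose(np.abs(U_snap.T @ U_svd), np.eye(k), atol=1e-6))
```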

To answer the question of how large the size of the POD basis should be in order to approximate the given data $X$ well enough, there is so far no a priori criterion. One clue on which the decision can be based is the ratio
$$\frac{\sum_{i=1}^{k}\sigma_i}{\sum_{i=1}^{d}\sigma_i}.$$
One can choose $k$ such that this ratio is close to 1.
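A small sketch of this truncation criterion, following the ratio of singular values used above (the threshold 0.99 and the example spectrum are illustrative choices):

```python
import numpy as np

def choose_k(singular_values, tol=0.99):
    """Smallest k whose singular-value ratio reaches the threshold tol."""
    ratios = np.cumsum(singular_values) / np.sum(singular_values)
    return int(np.searchsorted(ratios, tol)) + 1

# Example with a rapidly decaying spectrum
s = np.array([10.0, 5.0, 1.0, 0.1, 0.01])
print(choose_k(s))   # 3: (10 + 5 + 1) / 16.11 is already about 0.993
```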

The inner product used in the above presentation is the usual Euclidean one.

In many cases where the system is governed by a partial differential equation, it is natural to use, rather than the original Euclidean product, another inner product derived from the spatial discretization of the underlying equation,
$$\langle x, y\rangle_W = x^TWy,$$
where $W \in \mathbb{R}^{N\times N}$ is a positive definite matrix. More details are provided in [174].
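The sketch below shows one common way to compute a POD basis that is orthonormal in this $W$-inner product: factor $W = LL^T$ by a Cholesky decomposition, take an ordinary SVD of $L^TX$, and map the left singular vectors back. The weight matrix and data here are illustrative stand-ins, and [174] may organize the computation differently.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n, k = 100, 20, 4
X = rng.standard_normal((N, n))                 # stand-in snapshot matrix
M = rng.standard_normal((N, N))
W = M @ M.T + N * np.eye(N)                     # an illustrative SPD weight matrix

L = np.linalg.cholesky(W)                       # W = L L^T
Uc, s, Vt = np.linalg.svd(L.T @ X, full_matrices=False)
U_W = np.linalg.solve(L.T, Uc[:, :k])           # map back: basis orthonormal in <.,.>_W

print(np.allclose(U_W.T @ W @ U_W, np.eye(k), atol=1e-8))   # W-orthonormality check
```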

Now, we turn our attention to the case of continuous data. Instead of a matrix, we are given a trajectory $\{x(t), t \in [0, T]\} \subset \mathbb{R}^N$ and asked to find a set of $k$ orthonormal vectors $\nu_i$, $i = 1, \cdots, k$, which approximates the trajectory as well as possible. In other words, we solve the optimization problem

$$\operatorname*{argmin}_{\nu_i \in \mathbb{R}^N} \int_0^T \Big\|x(t) - \sum_{i=1}^{k}\langle x(t), \nu_i\rangle\nu_i\Big\|^2 dt \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k. \tag{2.30}$$

As in the discrete data case, this problem is equivalent to

$$\operatorname*{argmax}_{\nu_i \in \mathbb{R}^N} \sum_{i=1}^{k}\int_0^T |\langle x(t), \nu_i\rangle|^2\, dt \quad \text{such that } \langle\nu_i, \nu_j\rangle = \delta_{ij},\ 1 \le i, j \le k.$$

In order to clarify the first-order necessary optimality condition, we define
$$R: \mathbb{R}^N \longrightarrow \mathbb{R}^N, \qquad \nu \longmapsto R\nu = \int_0^T \langle x(t), \nu\rangle\, x(t)\, dt.$$

It is shown in [174] that $R$ is linear, bounded, non-negative, and symmetric. Thus $R$ has a set of non-negative eigenvalues,
$$Ru_i = \lambda_i u_i, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d > 0 = \cdots = 0, \tag{2.31}$$
where $d$ is the rank of $R$. One can observe that $R$ plays the same role as $XX^T$ in the discrete data case. And as in the previous case, the eigenvectors of $R$ form the POD basis, as stated in the following theorem, whose proof is given in [174].

Theorem 2.9 ([174], Theorem 1.12) Suppose that $x(t) \in C([0, T], \mathbb{R}^N)$ is the unique solution of the state equation with a given initial condition. Then the solution to problem (2.30) is given by the first $k$ eigenvectors of $R$, corresponding to the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$.

We show how to avoid solving the large eigenvalue problem (2.31) by the method of snapshots [161]. The matrix representing the operator $R$ in $\mathbb{R}^N$ is
$$R = \int_0^T x(t)x^T(t)\, dt. \tag{2.32}$$
Now, instead of continuous data $\{x(t), t \in [0, T]\} \subset \mathbb{R}^N$, we take some snapshots of that trajectory,
$$x(t_j), \qquad 0 = t_0 < t_1 < t_2 < \cdots < t_n = T.$$

Matrix (2.32) can be approximated as
$$R = \sum_{j=1}^{n} x(t_j)x^T(t_j)\,\Delta_j,$$
where $\Delta_j$ is the step size $t_j - t_{j-1}$. If we write
$$X = \begin{bmatrix} x_1(t_1)\sqrt{\Delta_1} & \cdots & x_1(t_n)\sqrt{\Delta_n} \\ \vdots & \ddots & \vdots \\ x_N(t_1)\sqrt{\Delta_1} & \cdots & x_N(t_n)\sqrt{\Delta_n} \end{bmatrix} \in \mathbb{R}^{N\times n},$$

then matrix $R$ can be written as $R = XX^T$. As in the discrete data case, we solve the $n$-dimensional eigenvalue problem
$$X^TXv_i = \lambda_i v_i$$
and compute the first $k$ eigenvectors of $R$ as
$$u_i = \frac{1}{\sqrt{\lambda_i}}Xv_i, \quad i = 1, \cdots, k.$$

This argument shows, on the one hand, that the discrete and continuous data cases can be treated in a unified manner; on the other hand, it is a crucial point in formulating the so-called balanced POD [147], which will be presented later as a remark.

Now, given a POD basis $\{u_i, i = 1, \cdots, r\}$ constructed from data taken from a dynamical system
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \tag{2.33}$$

where $A \in \mathbb{R}^{N\times N}$, $B \in \mathbb{R}^{N\times m}$, $C \in \mathbb{R}^{l\times N}$, we demonstrate how to use this basis to produce a reduced system. Since the given data usually contain the most typical states [98], and moreover the POD basis is their representative, the state vector $x(t)$ of dimension $N$ is approximated by $U\hat{x}(t)$, $U = [u_1, \cdots, u_r]$, where the new state vector $\hat{x}(t)$ is of dimension $r \ll N$. That is, $\hat{x}(t)$ is the coordinate vector of the projection of the vector with coordinates $x(t)$ onto the subspace spanned by $\{u_i, i = 1, \cdots, r\}$. System (2.33) becomes

$$U\dot{\hat{x}}(t) = AU\hat{x}(t) + Bu(t), \qquad y(t) = CU\hat{x}(t). \tag{2.34}$$

To avoid the overdetermination of (2.34), one forces its residual to be orthogonal to an $r$-dimensional subspace of $\mathbb{R}^N$. The POD method chooses a Galerkin projection framework, i.e., the chosen subspace is also the subspace spanned by $\{u_i, i = 1, \cdots, r\}$. The reduced system is therefore formulated as

$$\dot{\hat{x}}(t) = \hat{A}\hat{x}(t) + \hat{B}u(t), \qquad \hat{y}(t) = \hat{C}\hat{x}(t),$$
where $\hat{A} = U^TAU$, $\hat{B} = U^TB$, $\hat{C} = CU$.
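The projection itself is a few matrix products; the sketch below (random system matrices and a stand-in POD basis, for illustration only) forms the reduced matrices exactly as above.

```python
import numpy as np

rng = np.random.default_rng(3)
N, m, l, r = 100, 2, 3, 10
A = rng.standard_normal((N, N))                     # illustrative system matrices
B = rng.standard_normal((N, m))
C = rng.standard_normal((l, N))
U = np.linalg.qr(rng.standard_normal((N, r)))[0]    # stand-in for a POD basis U = [u_1, ..., u_r]

A_hat = U.T @ A @ U    # r x r
B_hat = U.T @ B        # r x m
C_hat = C @ U          # l x r
print(A_hat.shape, B_hat.shape, C_hat.shape)
```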

Remark Note that the application of POD to MOR is not restricted to linear systems. In fact, it is a favorite reduction method for non-linear systems. For a general model of the form (2.1), the associated reduced-order model is
$$\dot{\hat{x}}(t) = U^Tf(t, U\hat{x}(t), u(t)), \quad t \in T, \qquad \hat{y}(t) = \eta(U\hat{x}(t), u(t)).$$
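A hedged sketch of this non-linear reduced model: the particular right-hand side $f$, the input, and the explicit Euler time stepping are illustrative assumptions, not part of the source; any integrator could be used instead.

```python
import numpy as np

def f(t, x, u):                    # an illustrative non-linear right-hand side
    return -x ** 3 + np.sin(t) * u

rng = np.random.default_rng(4)
N, r = 50, 4
U = np.linalg.qr(rng.standard_normal((N, r)))[0]   # stand-in for a POD basis
u_in = np.ones(N)                                  # illustrative (constant) input term

x_hat = U.T @ rng.standard_normal(N)               # reduced initial condition
dt, T = 1e-2, 1.0
for step in range(int(T / dt)):
    t = step * dt
    # reduced dynamics: note that f is still evaluated in the full N-dimensional space
    x_hat = x_hat + dt * (U.T @ f(t, U @ x_hat, u_in))
print(x_hat)
```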

Recall the definitions (2.9) and (2.10) of the reachability and observability gramians.

If we denote the columns of the input matrix $B$ by $b_1, \cdots, b_m$, then the impulse response $e^{At}B$ can be treated as the collection of response state vectors $x_i(t) = e^{At}b_i$ to the $i$-th unit impulse $\delta(t)e_i$, where $e_i$ is the $i$-th unit vector of $\mathbb{R}^m$. Accordingly, the reachability gramian can be written as
$$R = \int_0^\infty \sum_{i=1}^{m} x_i(t)x_i(t)^T\, dt.$$

Likewise, the observability gramian is
$$Q = \int_0^\infty \sum_{i=1}^{l} z_i(t)z_i(t)^T\, dt,$$
where $z_i(t) = e^{A^Tt}c_i^T$ and $c_i$ is the $i$-th row of $C$. In balanced truncation one has to solve the Lyapunov equations (2.11) and (2.12), which is expensive. In practice, the impulse response state vectors $x_i(t)$, $z_i(t)$ are given at time instants $t_1, \cdots, t_n$. The two gramians can be approximated by

$$R = \sum_{j=1}^{n}\sum_{i=1}^{m} x_i(t_j)x_i(t_j)^T\,\Delta_j, \qquad Q = \sum_{j=1}^{n}\sum_{i=1}^{l} z_i(t_j)z_i(t_j)^T\,\Delta_j.$$

Let us set
$$X = \begin{bmatrix} x_1^1(t_1)\sqrt{\Delta_1} & \cdots & x_1^1(t_n)\sqrt{\Delta_n} & \cdots & x_1^m(t_1)\sqrt{\Delta_1} & \cdots & x_1^m(t_n)\sqrt{\Delta_n} \\ \vdots & & \vdots & & \vdots & & \vdots \\ x_N^1(t_1)\sqrt{\Delta_1} & \cdots & x_N^1(t_n)\sqrt{\Delta_n} & \cdots & x_N^m(t_1)\sqrt{\Delta_1} & \cdots & x_N^m(t_n)\sqrt{\Delta_n} \end{bmatrix} \in \mathbb{R}^{N\times(mn)},$$
where $x_k^i(t_j)$ denotes the $k$-th component of $x_i(t_j)$.

Accordingly, $R = XX^T$. Likewise, with $Y \in \mathbb{R}^{N\times(ln)}$ assembled from the snapshots $z_i(t_j)$ in the same way, $Q = YY^T$. Let
$$Y^TX = U\Sigma V^T = \begin{bmatrix} U_1 & U_2 \end{bmatrix}\begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}\begin{bmatrix} V_1 & V_2 \end{bmatrix}^T$$
be the SVD of $Y^TX$, with $\Sigma_1 \in \mathbb{R}^{r\times r}$, $r < \operatorname{rank}(Y^TX)$, and set
$$\Phi_1 = XV_1\Sigma_1^{-1/2}, \qquad \Psi_1 = \Sigma_1^{-1/2}U_1^TY^T.$$

Then $\Phi_1$ is composed of the first $r$ columns of the approximate balancing transformation, and $\Psi_1$ is the set of the first $r$ rows of its inverse. That is, the new system
$$\dot{\hat{x}}(t) = \Psi_1A\Phi_1\hat{x}(t) + \Psi_1Bu(t), \qquad \hat{y}(t) = C\Phi_1\hat{x}(t)$$

will be the reduced system of system (2.33) produced by approximate balanced truncation. A proof of this can be found in [147]. One can observe that the main advantage of balanced POD is that one need not compute the two gramians; instead, only the two matrices $X$, $Y$, which can be determined from simulations or experiments, are needed. This actually shares the same idea with the original balanced truncation method proposed in [121]. Therefore, the balanced POD method is an approximation of the balanced truncation method.
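A hedged sketch of the balanced POD construction just described: the snapshot matrices $X$ and $Y$, which in practice come from simulations of (2.33) and its adjoint, are replaced here by random stand-ins, and the system matrices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
N, m, l, n, r = 80, 2, 2, 25, 6
A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))   # illustrative system matrices
B = rng.standard_normal((N, m))
C = rng.standard_normal((l, N))

# Stand-ins for the weighted impulse-response snapshot matrices of the text
X = rng.standard_normal((N, m * n))   # columns ~ sqrt(Delta_j) x_i(t_j)
Y = rng.standard_normal((N, l * n))   # columns ~ sqrt(Delta_j) z_i(t_j)

# SVD of Y^T X and the rank-r approximate balancing transformation
U, s, Vt = np.linalg.svd(Y.T @ X, full_matrices=False)
S1_inv_sqrt = np.diag(1.0 / np.sqrt(s[:r]))
Phi1 = X @ Vt[:r].T @ S1_inv_sqrt       # first r columns of the transformation
Psi1 = S1_inv_sqrt @ U[:, :r].T @ Y.T   # first r rows of its inverse

A_hat, B_hat, C_hat = Psi1 @ A @ Phi1, Psi1 @ B, C @ Phi1
print(np.allclose(Psi1 @ Phi1, np.eye(r), atol=1e-8))   # bi-orthogonality: Psi1 Phi1 = I_r
```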

There have been various improvements of POD beyond its primary version. The optimality of snapshot locations was addressed in [99], while many studies aimed at preserving some property of the original models: stability was preserved during the POD reduction in [137], the Lagrangian structure in [100]. In [140], POD was applied to non-linear ODE initial value problems, and the error and the effect of perturbations in the data were analyzed. Others focused on dealing with PDEs [92, 111, 7, 29, 6, 45].