Optimal interpolation-based model reduction

by Dorota Kubalińska

Dissertation

for the award of the degree of Doctor of Natural Sciences (Dr. rer. nat.)

Submitted to Faculty 3 (Mathematics & Computer Science) of the Universität Bremen


Reviewer: Prof. Dr. Angelika Bunse-Gerstner, Universität Bremen


Abstract

This dissertation is devoted to the development and study of new techniques in the field of model reduction for large-scale linear time-invariant (LTI) dynamical systems. The behavior of processes in electrical networks, mechanics, aeronautics, civil engineering, micro-electro-mechanical systems, weather prediction and many other areas can be described by mathematical models. Such models are usually determined by suitable systems of partial differential equations. Their linearization and discretization by means of finite element or finite difference methods lead to high-dimensional systems of linear ordinary differential or difference equations. The number of resulting equations depends on the quality of the discretization and is typically very large; it can easily reach a few million.

Model reduction methods can then be helpful as they provide automatic processes which construct a reduced order system of the same form but of lower complexity, whose input-output behavior approximates the behavior of the original system. Most of the current methods are designed for approximating asymptotically stable systems. However, some processes, such as weather development, are unstable. A standard approach to reduce the order of an unstable system is to divide the system into a completely stable and a completely unstable subsystem and then apply model reduction only to the stable part. This approach does not always give a satisfactory result. In contrast, in this thesis new interpolation-based methods are proposed that aim to compute an optimal reduced-order model for stable as well as for unstable systems. For these optimization problems, tangential interpolation-based first order necessary optimality conditions are derived. On the basis of the established theory, a MIMO (multiple-input-multiple-output) Iterative Rational Interpolation Algorithm (MIRIAm) is proposed which, if it converges, provides a reduced-order system that satisfies the aforementioned first order necessary conditions.

For asymptotically stable systems, several different sets of first order necessary optimality conditions have already been developed. Among them, the gramian-based conditions given by Wilson and by Hyland and Bernstein are of special interest. In this thesis, a generalization of these conditions to unstable continuous-time and discrete-time LTI dynamical systems is introduced and shown to be equivalent to the tangential interpolation-based first order necessary conditions. The major drawback of the gramian-based approach is its high computational cost, caused by the need to solve two large Lyapunov equations for continuous-time systems, or two Stein equations for discrete-time systems, in each iteration. The complexity of the tangential interpolation-based approach (MIRIAm), introduced in this thesis, is much smaller.

In numerical experiments with the CD player example, the accuracy of the new method is illustrated and compared with other existing methods. The benefit of the new approach to model order reduction of unstable systems is also demonstrated in several numerical experiments with shallow water test models. In the case of systems with a large number of small unstable poles, the method is shown to be significantly better than the currently used approach based on the decomposition of the system into its stable and unstable parts.


Acknowledgement

First of all my sincere gratitude goes to Prof. Dr. Angelika Bunse-Gerstner for her inspirational supervision, courteous support and incredible openness throughout the duration of this doctoral thesis.

I also wish to thank Prof. Dr. Heike Faßbender, who kindly agreed at the last minute to proofread and review my thesis at very short notice.

I am deeply grateful to Prof. Dr. Wojciech Okrasiński for his good advice and endless support all the way through my Master and Ph.D. work.

I am also thankful to all academic, administrative and technical staff of the 'Zentrum für Technomathematik' for their unrelenting support. My fellow colleagues at the graduate school Scientific Computing in Engineering (SCiE) have been fantastic. I would like to thank: my Tandem partner Shyam Praveen Vudathu, for the successful cooperation, Dr. Yakub Tijani, for his friendship and unequaled language support, Dr. Jonathan Montalvo Urquizo, Dr. Ludger Prünte, Zafer Akbay, Yuhang Wang, Dr. Sebastian Meier and Dr. Dirk Lorenz for their productive discussions and good working relationship. I will always remember the nice time I had with them.

I also thank my colleagues and friends: Dr. Caroline Böß, Dr. Iwona Piotrowska, Daniel Wilczek, Justyna Przybysz-Jarnut, Agnieszka Suchocka, Dr. Georg Vossen and Kamil Kazimierski for endless fruitful discussions, both mathematical and personal.

I am grateful to my parents, who are always proud of me and to my brothers with whom I share several unforgettable moments in our home in Poland. I especially thank my mother and my father for their dedication to me. I hope this Ph.D. thesis shows them that everything they have gone through is not in vain.

Last but not least, my deepest gratitude goes to someone special, my beloved husband Michal Kubaliński, for the company, love and encouragement which helped me to never give up.

Finally, thank you Lord for the goodness I experience every single day.

Dorota Kubalińska


Contents

1 Introduction 1

2 Preliminaries 9

2.1 Basics on Control Theory . . . 9

2.1.1 Stability . . . 12

2.1.2 Reachability . . . 12

2.1.3 Observability . . . 16

2.2 Input-Output Behavior . . . 20

2.2.1 Transfer functions . . . 20

2.2.2 Properties of transfer functions . . . 22

2.2.3 Impulse response, moments and Markov parameters of LTI systems . . . 24

2.2.4 Hankel singular values and the Hankel operator . . . 27

2.3 Lebesgue and Hardy spaces . . . 29

3 Current Approaches to Model Reduction 33

3.1 Problem formulation . . . 33

3.2 Model reduction by state-space projection . . . 35

3.3 Gramian-based methods . . . 37

3.3.1 Balanced truncation . . . 37

3.3.2 Hankel norm approximation . . . 40

3.3.3 Singular perturbation approximation . . . 42

3.3.4 H2- and h2-optimal gramian-based model order reduction . . . 42

3.4 Interpolation-based methods . . . 47


3.4.1 Explicit moment matching . . . 48

3.4.2 Implicit moment matching . . . 50

3.4.3 Hermite interpolation . . . 52

3.4.4 Tangential Hermite interpolation . . . 54

3.4.5 H2-optimal interpolation-based model reduction for SISO systems . . . 56

4 H2,α-optimal model reduction 59

4.1 Introduction . . . 59

4.2 H2,α-norm . . . 61

4.3 H2,α-optimal model reduction . . . 66

4.3.1 Problem statement . . . 66

4.3.2 First order necessary conditions . . . 66

4.3.3 MIRIAm, a first approach to a numerical method . . . . 70

4.4 Equivalence between necessary conditions . . . 72

4.4.1 Equivalence between Wilson and Interpolation conditions . . . 72

4.4.2 Equivalence between Hyland-Bernstein and Wilson conditions . . . 75

4.4.3 H2,α-optimal gramian-based model reduction . . . 75

4.5 Numerical results . . . 79

5 h2,α-optimal model reduction 85

5.1 h2,α-Norm . . . 85

5.2 h2,α-optimal model reduction . . . 89

5.2.1 Problem statement . . . 89

5.2.2 First order necessary conditions . . . 89

5.2.3 MIRIAm for discrete-time LTI systems . . . 94

5.3 Equivalence between necessary conditions . . . 95

5.3.1 Equivalence between Interpolation and Wilson conditions . . . 95

5.3.2 Equivalence between Hyland-Bernstein and Wilson conditions . . . 97

5.3.3 h2,α-optimal gramian-based model reduction . . . 99



6 Summary 113

Errata 115


Chapter 1

Introduction

The behavior of processes in electrical networks, mechanics, aeronautics, civil engineering, micro-electro-mechanical-systems (MEMS), weather prediction and many others can be described by mathematical models. Such models are usually determined by suitable systems of partial differential equations (PDEs). Their linearization and discretization by means of finite element or finite difference methods lead to high-dimensional systems of linear ordinary differential or difference equations. The number of resulting equations depends on the quality of the discretization and is typically very large; it can easily reach a few million.

An area of research where such large-dimensional systems often arise is, for instance, the field of MEMS. In recent years, notable success in developing state-of-the-art MEMS products with innovative designs and processes has been achieved, see e.g. [15, 20, 31] and references therein. MEMS are finding applications in many parts of the engineering world such as automotive, communication, mechanical, biomedical and even aerospace engineering. While a single component can be easily simulated on a regular desktop computer, modeling and simulation of the behavior of a system consisting of many single devices still presents a challenge in the development of microsystem applications. These processes demand advanced simulation techniques for control and optimization. It is a difficult task if complex structures and coupled field effects have to be considered. Usually the accurate modeling of such a system leads to high order models. Even though modern supercomputers that might be able to solve such huge problems exist, the computational cost and time needed for simulation would be extremely high if the full order models were used directly. Several inspiring examples of large-dimensional models of MEMS devices are presented in the Oberwolfach benchmark collection [2].

Large-dimensional problems also arise in the field of data assimilation, for instance, in numerical weather prediction [22, 70]. For a global forecast, the atmosphere is discretized by a three-dimensional mesh. In each of the grid points, we are interested in several parameters such as temperature, pressure, three wind velocities, density and other physical parameters. All these quantities from each grid point form a large-dimensional vector describing the true state of the atmosphere. The size of such a vector is typically of the order 10^7–10^8. Observational data for the true state of the atmosphere are provided by weather satellites, radiosondes and other instruments. Due to difficulties in measuring the atmospheric data over the oceans and parts of the continents, the number of available observations is significantly smaller than the order of the state of the atmosphere. Data assimilation methods are thus necessary to estimate the true state of the atmosphere. The linear system underlying the data assimilation problem is not a control system of standard form. In [22], it is shown how the statistical information of the data assimilation problem can be used to formulate this system as a linear control system of standard form. However, the numerical treatment of this system is very expensive for the large-scale models that appear in weather prediction and data assimilation.

In many cases, it is possible to reduce the size of a model and, consequently, the computational cost. One possible way to deal with the problem is to decrease the size of the system by choosing a discretization grid of lower resolution. In this approach, based mainly on practical considerations, some important features of the process might not be captured. In consequence, the need for efficient model reduction techniques that create accurate lower order dynamical models preserving the essential behavior of the original one is greater than ever. It has recently become a major goal of simulation and modeling research.

In this thesis, a special class of dynamical systems is considered, namely the class of linear time-invariant (LTI) dynamical systems. They commonly arise in practical applications when physical processes have to be modeled and simulated. These LTI models are described by systems of differential equations (continuous-time LTI systems) and difference equations (discrete-time LTI systems) of the form

ẋ(t) = A x(t) + B u(t),  y(t) = C x(t)   and   x_{k+1} = A x_k + B u_k,  y_k = C x_k,   (1.0.1)

respectively, or, in the frequency domain, represented by their transfer function

H(s) = C(sI_N − A)^{−1}B,   (1.0.2)

where A ∈ C^{N×N} is the state matrix, and B ∈ C^{N×m} and C ∈ C^{p×N} are the input and output distribution arrays, respectively. The vectors x(t), x_k ∈ C^N, y(t), y_k ∈ C^p, and u(t), u_k ∈ C^m represent the state, the output and the input of the system, respectively.


The purpose of model order reduction is to construct a reduced-order system

dx̂(t)/dt = Â x̂(t) + B̂ u(t),  ŷ(t) = Ĉ x̂(t)   and   x̂_{k+1} = Â x̂_k + B̂ u_k,  ŷ_k = Ĉ x̂_k,   (1.0.3)

with transfer function

Ĥ(s) = Ĉ(sI_n − Â)^{−1}B̂,   (1.0.4)

where Â ∈ C^{n×n}, B̂ ∈ C^{n×m}, Ĉ ∈ C^{p×n} and n ≪ N, which has an input-output behavior that approximates the input-output behavior of the large system. The quality of this approximation can be measured by the closeness of the transfer functions, i.e.

‖H(s) − Ĥ(s)‖ < ε

for a given accuracy ε and a suitable norm, for instance:

• for continuous-time LTI systems:

  ‖H(·)‖_{H_{2,α}} = ( (1/2π) ∫_{−∞}^{+∞} trace([H(α + iω)]^* [H(α + iω)]) dω )^{1/2}

  and

  ‖H(·)‖_{H_{∞,α}} = sup_{ω∈R} σ_max(H(α + iω));

• for discrete-time LTI systems:

  ‖H(·)‖_{h_{2,α}} = ( (1/2π) ∫_0^{2π} trace([H(αe^{iθ})]^* [H(αe^{iθ})]) dθ )^{1/2}

  and

  ‖H(·)‖_{h_{∞,α}} = sup_{θ∈[0,2π]} σ_max(H(αe^{iθ})).

In this thesis, we introduce the use of these norms in model reduction. In control theory and current approaches to model reduction, only special cases of these norms are commonly used, namely the H_{2,0}- and H_{∞,0}-norms for continuous-time systems and the h_{2,1}- and h_{∞,1}-norms for discrete-time systems. For simplicity, they are usually denoted by H_2, H_∞, h_2 and h_∞, respectively.
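As a purely illustrative aside (not part of the thesis), for a stable system the H_2-norm above (the α = 0 case) can also be computed without any frequency integration, via the controllability gramian P solving A P + P Aᵀ + B Bᵀ = 0, since ‖H‖²_{H_2} = trace(C P Cᵀ). A minimal Python/NumPy sketch with made-up matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Small asymptotically stable system (illustrative values).
A = np.array([[-1.0, 0.5],
              [ 0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Controllability gramian P solves A P + P A^T + B B^T = 0; then
# ||H||_{H2}^2 = trace(C P C^T)  (the alpha = 0 case of the norm above).
P = solve_continuous_lyapunov(A, -B @ B.T)
h2_lyap = np.sqrt(np.trace(C @ P @ C.T))

# Cross-check against the frequency-domain definition by quadrature.
omega = np.linspace(-200.0, 200.0, 20001)
Hw = np.array([(C @ np.linalg.solve(1j * w * np.eye(2) - A, B))[0, 0] for w in omega])
integrand = np.abs(Hw) ** 2  # trace(H^* H) reduces to |H|^2 for a SISO system
h2_quad = np.sqrt(np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(omega)) / (2 * np.pi))

print(h2_lyap, h2_quad)
```

The quadrature only serves as a sanity check; the truncated integration range accounts for the small remaining discrepancy between the two values.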

Current reduction methods can be divided into two groups. On the one hand, there are truncation methods (see Section 3.3) using singular value decomposition (SVD)-based techniques to select the important part of the system and neglect the rest. A well-grounded method in this group is balanced truncation, first proposed by Moore [80]. The basic idea of this technique is to compute a special realization of a given system, called a balanced realization. It is achieved by representing the system in a basis where the states which are least reachable are simultaneously least observable. Then the difficult-to-reach and difficult-to-observe states are truncated, and the easily reachable and observable ones are retained. The advantage of this technique is that it preserves stability and global error bounds can be derived. The complexity of the computation, however, is of order N^3. Therefore the method is by far too expensive for very large systems. New approaches try to compute the transformation matrix approximately at lower cost (see for example [13, 14, 71, 85, 99] and references given there), but then the global error bounds can be lost. To maintain the bound in this case, an adaptation would be needed that takes the approximation error into account. Frequently used SVD-based methods also include proper orthogonal decomposition [7, 75], Hankel norm approximation [52] and singular perturbation approximation [19].

On the other hand, we have (Krylov) interpolation-based methods using large-scale techniques, which can handle large dimensions in the computation of the reduced system (see Section 3.4). The weakness of these methods is their lack of computable global error bounds. Moreover, the preservation of asymptotic stability often cannot be guaranteed. The transfer function of the reduced-order model constructed by these methods matches a certain number of moments, proportional to the values and derivatives of the transfer function of the original model at a chosen point. Therefore, these techniques are often referred to as model order reduction by moment matching. A reduced-order system which matches the Markov parameters of the original model, i.e. moments around infinity, is known as a partial realization, see e.g. [25, 66]. This gives a good approximation at high frequencies but is often inaccurate at low frequencies. Analogously, matching many moments around zero or any other point provides a reduced-order model with a good local approximation around the a priori chosen interpolation point and a potentially bad one elsewhere. Overall accurate results are obtained when the reduced-order system is constructed via rational Hermite interpolation, sometimes called multipoint moment matching.
For a given transfer function of the system (1.0.1) and given interpolation points σ_1, …, σ_n, model reduction by rational Hermite interpolation involves finding a system (1.0.3) with transfer function Ĥ, such that Ĥ(s) interpolates H(s) and certain numbers μ_k of its derivatives at all points σ_k, i.e.

H(σ_k) = Ĥ(σ_k)   and   d^ℓH/ds^ℓ(σ_k) = d^ℓĤ/ds^ℓ(σ_k),   for ℓ = 1, 2, …, μ_k − 1,   k = 1, 2, …, n.   (1.0.5)
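To make the conditions (1.0.5) concrete, here is a small sketch (illustrative matrices, one hypothetically chosen point σ with μ₁ = 2, not the thesis's algorithm): a one-sided rational Krylov projection whose basis spans (σI − A)^{-1}b and (σI − A)^{-2}b matches the value and the first derivative of H at σ, without forming any moments explicitly:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
A = rng.standard_normal((N, N)) - 5.0 * np.eye(N)  # illustrative system matrix
b = rng.standard_normal((N, 1))
c = rng.standard_normal((1, N))
sigma = 1.0  # hypothetical interpolation point

# Basis of the rational Krylov space span{(sigma I - A)^{-1} b, (sigma I - A)^{-2} b}.
M = sigma * np.eye(N) - A
v1 = np.linalg.solve(M, b)
v2 = np.linalg.solve(M, v1)
V, _ = np.linalg.qr(np.hstack([v1, v2]))  # orthonormal basis, n = 2

# One-sided (Galerkin) projection: matches two moments of H at sigma,
# i.e. the value and the first derivative (Hermite interpolation).
Ar, br, cr = V.T @ A @ V, V.T @ b, c @ V

def H(Am, Bm, Cm, s):
    return (Cm @ np.linalg.solve(s * np.eye(Am.shape[0]) - Am, Bm))[0, 0]

def dH(Am, Bm, Cm, s):  # H'(s) = -C (sI - A)^{-2} B
    M_ = s * np.eye(Am.shape[0]) - Am
    return -(Cm @ np.linalg.solve(M_, np.linalg.solve(M_, Bm)))[0, 0]

print(abs(H(A, b, c, sigma) - H(Ar, br, cr, sigma)))    # value matches
print(abs(dH(A, b, c, sigma) - dH(Ar, br, cr, sigma)))  # first derivative matches
```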

It is well known that computing these moments explicitly leads to numerical instabilities, especially for large dimensions. The remedy for this problem are Krylov subspace based techniques that impose the moment matching conditions without computing the moments explicitly (e.g. [54, 110, 111]). Consequently, the crucial problem is to find a choice of interpolation points such that Ĥ is a good overall approximation of H. One way to deal with this problem is to aim to construct the reduced-order model that minimizes the H_2-norm for continuous-time, or the h_2-norm for discrete-time LTI systems, of the error between the transfer functions H and Ĥ. In 1967, Meier and Luenberger [78] showed that for SISO (single-input-single-output) continuous-time LTI systems, matching the value and the first derivative of H at the mirror images of the eigenvalues of the reduced-order system is a necessary condition for minimizing ‖H − Ĥ‖_{H_2}. Unfortunately, these points are not known a priori. Recently, Gugercin et al. [57] proposed an iterative algorithm that, if it converges, constructs a reduced-order model satisfying the first order necessary H_2-norm optimality conditions. Analogous results for discrete-time LTI systems are also presented.

In the case of MIMO (multiple-input-multiple-output) systems, where Ĥ is a p × m matrix-valued function, satisfying the equalities (1.0.5) is too strong a condition. Therefore a generalization, called tangential interpolation, was developed [48]. Instead of matching moments of H at various points of the complex plane, one requires only that the transfer function Ĥ matches moments of H at different points in various left and right tangential directions. This technique is called tangential rational Hermite interpolation. These conditions are weaker than the moment matching conditions. But now, in addition to a good choice of interpolation points, we need suitable tangential directions in order to obtain a reduced system that approximates the original one well. Such methods are also widely considered as providing reduced systems with only a good local approximation around the a priori chosen interpolation points.

In this thesis, it is shown that by the correct choice of interpolation data one can construct a reduced-order approximant that satisfies the first order necessary H_2- and h_2-norm optimality conditions for MIMO continuous-time and discrete-time LTI systems, respectively. Similar results for the continuous-time case were obtained simultaneously and independently by Gugercin et al. [58] and Van Dooren et al. [36]. Here, we also prove that the derived tangential interpolation-based first order necessary conditions are equivalent to the well-known gramian-based first order necessary optimality conditions for MIMO systems established by Wilson [106, 107] and by Hyland and Bernstein [64]. There are also methods which combine these two approximation approaches. A recent and rather complete study of model reduction techniques with an emphasis on Krylov and SVD-Krylov methods can be found in [4] and the references therein. We also refer to [18, 82, 88, 100] as sources of valuable information about model order reduction.
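A hedged sketch of how one right tangential interpolation condition can be enforced by projection (illustrative system, hypothetical points σ_k and directions r_k): including (σ_k I − A)^{-1} B r_k in the projection basis forces Ĥ(σ_k) r_k = H(σ_k) r_k, without matching the full matrix H(σ_k):

```python
import numpy as np

rng = np.random.default_rng(2)
N, m, p = 10, 2, 2
A = rng.standard_normal((N, N)) - 6.0 * np.eye(N)  # illustrative stable matrix
B = rng.standard_normal((N, m))
C = rng.standard_normal((p, N))

# Hypothetical interpolation points and right tangential directions.
sigmas = [1.0, 2.0]
rs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

# The basis contains (sigma_k I - A)^{-1} B r_k for each data pair.
cols = [np.linalg.solve(s * np.eye(N) - A, B @ r) for s, r in zip(sigmas, rs)]
V, _ = np.linalg.qr(np.column_stack(cols))

# Galerkin projection onto range(V), reduced order n = 2.
Ar, Br, Cr = V.T @ A @ V, V.T @ B, C @ V

H = lambda Am, Bm, Cm, s: Cm @ np.linalg.solve(s * np.eye(Am.shape[0]) - Am, Bm)

# Right tangential interpolation: H(sigma_k) r_k = H_hat(sigma_k) r_k.
for s, r in zip(sigmas, rs):
    print(np.linalg.norm(H(A, B, C, s) @ r - H(Ar, Br, Cr, s) @ r))
```

Note that only the action of H(σ_k) on r_k is matched, which is exactly why these conditions are weaker than full moment matching.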

Most of the existing model reduction techniques are designed for approximating asymptotically stable LTI systems, i.e. systems with all poles lying in the open left half plane for continuous-time systems, or inside the unit circle for discrete-time systems. Some processes, such as weather development, are unstable in their nature, and the unstable part plays a very important role. A common approach here is to divide the system into stable and unstable parts. Then the unstable part is retained and reduction techniques are applied only to the stable part. There are two main disadvantages of this approach. On the one hand, the process of dividing the system into stable and unstable parts is computationally expensive for large-dimensional systems. On the other hand, reducing only the stable part is not very beneficial when the dimension of the unstable part of the system is large. The main contribution of this thesis is the development of new interpolation-based model reduction techniques that aim to compute an optimal reduced-order model for unstable systems while avoiding the difficulties mentioned earlier. In numerical experiments, we demonstrate the quality of our methods and compare them with the standard approach.

The method which we propose is based on the crucial observation that for every nonsingular LTI system, both stable and unstable, there exists an α such that the system is α-bounded. This means that all poles of the considered system lie in C_α := {s ∈ C : ℜ(s) < α} for continuous-time LTI models, or inside D_α := {s ∈ C : |s| < α}, α > 0, in the discrete-time case. We stress the fact that α-boundedness of a system ensures that its transfer function is either an element of the Hardy space H_{2,α} (continuous-time) or h_{2,α} (discrete-time). It follows that in these spaces the norm of the error between the original and reduced-order systems is well defined and can be computed for both stable and unstable systems. The construction of a reduced-order α-bounded model that aims to minimize the H_{2,α}-norm (continuous-time) or h_{2,α}-norm (discrete-time) of this error is the goal of the model reduction techniques introduced in this thesis. For this optimization problem, we derive the first order necessary optimality conditions. These involve satisfying certain tangential interpolation conditions for the transfer functions of the original and the reduced system. The proof is based on a reformulation of the H_{2,α}- and h_{2,α}-norm involving the poles and residues of the system, which is established by means of basic results from the theory of complex functions. These conditions also imply an H_{2,α}- and h_{2,α}-error expression for an optimal reduced-order continuous-time and discrete-time LTI system, respectively. Out of this core theory, a MIMO Iterative Rational Interpolation Algorithm (MIRIAm) is presented. This algorithm, if it converges, provides a reduced system that satisfies the interpolation-based first order necessary conditions for the above stated optimization problems. The efficiency of the proposed algorithm is demonstrated on several examples. It is also shown that an analogue of the previously mentioned gramian-based first order H_2- and h_2-optimality conditions can be derived for α-bounded systems and that all three sets of conditions are equivalent. Moreover, we show how standard model reduction techniques for stable systems can be generalized to α-bounded ones.
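A small numerical illustration of α-boundedness in the continuous-time case (the matrix below is made up): any α exceeding the largest real part of the poles places all poles in C_α, and shifting A by α yields a stable matrix, so stability-based computations become available for the shifted system:

```python
import numpy as np

# An unstable continuous-time system: one pole in the right half plane.
A = np.array([[0.3, 1.0, 0.0],
              [0.0, -1.0, 2.0],
              [0.0, 0.0, -2.0]])

poles = np.linalg.eigvals(A)
print(max(poles.real))  # 0.3 > 0: not asymptotically stable

# Any alpha larger than the spectral abscissa gives an alpha-bounded system:
# all poles lie in C_alpha = { s in C : Re(s) < alpha }.
alpha = max(poles.real) + 0.5
assert all(poles.real < alpha)

# Shifting by alpha yields a Hurwitz matrix, so gramian-type computations
# for the shifted system are well defined.
A_shifted = A - alpha * np.eye(3)
print(max(np.linalg.eigvals(A_shifted).real))  # -0.5 < 0
```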


In the following, we give the outline of this thesis.

In Chapter 2, some of the standard facts on the control theory of linear time-invariant dynamical systems are reviewed, including a brief exposition of their crucial properties such as stability, observability and reachability. Special attention is also given to the study of input-output theory with emphasis on essential properties of transfer functions. A section is also devoted to norms, particularly for the Hardy spaces H_{2,α} and h_{2,α}.

Chapter 3 contains a discussion of current model reduction techniques. Almost all methods presented here, except explicit moment matching, construct the reduced-order model by projection. In addition to balanced truncation, other gramian-based reduction techniques such as singular perturbation and Hankel norm approximation are introduced. This is followed by an overview of interpolation- and tangential interpolation-based model reduction procedures. A more detailed exposition of what is known to date about H_2- and h_2-optimal reduction methods, which are of special interest in this thesis, is presented.

In the next chapter, the H_{2,α}-optimal model reduction problem for continuous-time LTI systems is treated. First, we derive a new expression for the H_{2,α}-norm of the system. Based on this expression, the first order necessary conditions for minimizing the H_{2,α}-norm of the error between the original model and the reduced-order approximant are determined. Moreover, an iterative algorithm, MIRIAm, computing a reduced-order system which satisfies the aforementioned conditions, is presented in Section 4.3.3. The potential of the new method is illustrated by several examples, and a comparison with other existing methods is presented. Concluding this chapter, we extend the gramian-based first order necessary H_2-optimality conditions to the case of α-bounded systems and show that they are equivalent to those developed in this thesis.

Chapter 5 is devoted to the study of model reduction techniques minimizing the h_{2,α}-norm of the error system. The achievements of Chapter 4 for continuous-time LTI systems are carried over to discrete-time models. In numerical experiments with a shallow water test model, a discrete version of the MIRIAm algorithm is applied to a newly developed technique [22] for approximating incremental 4D-Var data assimilation by means of model reduction.

Some conclusions and a brief discussion of open problems, possibilities for improvements and future research are offered in Chapter 6.


Chapter 2

Preliminaries

In this chapter, we review some of the basic facts on the control theory of linear dynamical systems which are essential for model reduction and for understanding the new approach presented in this thesis. Most of the material presented here is standard and can be found in many textbooks; we mainly relied on [4, 26, 62, 91, 97]. In Section 2.1 we introduce the general concept of a linear time-invariant dynamical system and give a brief exposition of some crucial features of these systems. A short overview of input-output theory, with special focus on essential properties of transfer functions, is presented in Section 2.2. Finally, Section 2.3 is devoted to the study of the most commonly used norms.

We note that we will use MATLAB notation throughout the thesis.

2.1 Basics on Control Theory

In this section we have compiled some basic results concerning a very important class of models which is used, for example, to analyze and design systems arising from electrical networks, micro-electro-mechanical systems (MEMS), mechanics, aeronautics, civil engineering, etc., namely linear time-invariant (LTI), continuous-time and discrete-time first order dynamical systems.

Consider the following three linear spaces: the state space X = {x : R → K^N}, the input space U = {u : R → K^m} and the output space Y = {y : R → K^p}. The letter K will denote the real (K = R) or complex (K = C) field throughout this thesis.

A linear system described by the following set of algebraic and differential equations

Σ :  ẋ(t) = A x(t) + B u(t),  y(t) = C x(t)   (2.1.1)


is called a linear time-invariant, continuous-time first order dynamical system. Here t ∈ R^+ is the time variable, the vector x(t) ∈ K^N denotes the state of the system, u(t) ∈ K^m is the input vector and y(t) ∈ K^p is the output measurement vector.

Analogously one can describe the discrete-time LTI system.

Consider three linear spaces:

• the state space X = (K^N)^Z = {x : Z → K^N},
• the input space U = (K^m)^Z = {u : Z → K^m},
• the output space Y = (K^p)^Z = {y : Z → K^p}.

A linear system described by the following set of algebraic and difference equations

Σ :  x_{k+1} = A x_k + B u_k,  y_k = C x_k   (2.1.2)

is called a linear time-invariant, discrete-time first order dynamical system. The vectors x_k = x(t_k) ∈ K^N, u_k = u(t_k) ∈ K^m and y_k = y(t_k) ∈ K^p are the state, the input and the output functions at time t_k ∈ R^+, respectively.

In both cases, A ∈ K^{N×N} is the system matrix, and B ∈ K^{N×m} and C ∈ K^{p×N} are the input and output distribution arrays, respectively. In most practical cases, we can assume that m and p are much smaller than N. Note that "time-invariant" in "LTI" refers to the fact that the entries of the state matrix A, the input matrix B and the output matrix C do not depend on the time variable t.

For m = p = 1, the systems (2.1.1) and (2.1.2) are called single-input-single-output (SISO) LTI systems. Otherwise, when m > 1 and p > 1, these systems are known as multiple-input-multiple-output (MIMO) LTI dynamical systems.

In electrical engineering, both types of linear time-invariant dynamical systems are often described by the symbol

Σ = [ A  B
      C    ].

Let T ∈ K^{N×N} be a regular matrix, and let x(t) = T^{−1} x̃(t) and x_k = T^{−1} x̃_k. By inserting these formulas into equations (2.1.1) and (2.1.2) and multiplying from the left with T we get

1. the continuous-time system

   dx̃(t)/dt = T A T^{−1} x̃(t) + T B u(t),  y(t) = C T^{−1} x̃(t);   (2.1.3)

2. the discrete-time system

   x̃_{k+1} = T A T^{−1} x̃_k + T B u_k,  y_k = C T^{−1} x̃_k.   (2.1.4)
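A quick numerical confirmation of this equivalence (illustrative random matrices, Python/NumPy assumed): the transformed realization (TAT^{−1}, TB, CT^{−1}) has the same eigenvalues and the same transfer function as (A, B, C):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, 1))
C = rng.standard_normal((1, N))
T = rng.standard_normal((N, N))  # regular (invertible) with probability 1

# Algebraically equivalent realization as in (2.1.3)/(2.1.4).
Tinv = np.linalg.inv(T)
At, Bt, Ct = T @ A @ Tinv, T @ B, C @ Tinv

# Both realizations have the same eigenvalues ...
print(np.sort(np.linalg.eigvals(A)))
print(np.sort(np.linalg.eigvals(At)))

# ... and the same transfer function H(s) = C (sI - A)^{-1} B.
s = 1.7j
H = (C @ np.linalg.solve(s * np.eye(N) - A, B))[0, 0]
Ht = (Ct @ np.linalg.solve(s * np.eye(N) - At, Bt))[0, 0]
print(abs(H - Ht))  # zero up to rounding
```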



Equations (2.1.1) and (2.1.3) [respectively (2.1.2) and (2.1.4)] are just two different realizations describing the same continuous [discrete] LTI system; in short notation:

Σ̃ = [ Ã  B̃     = [ T A T^{−1}  T B
      C̃    ]        C T^{−1}       ].

These state-space representations of the systems are referred to as algebraically equivalent. There are infinitely many equivalent realizations of the same LTI system. From linear algebra, we know that the matrices A and T A T^{−1} have the same eigenvalues.

Definition 2.1.1 (Order of Σ). The dimension N of the state matrix A is called the order of the corresponding system Σ.

One can also observe that there exist realizations of the systems (2.1.1) and (2.1.2) of arbitrarily high order. However, there is a lower limit on the order of the system. This number is called the McMillan degree of the system.

Definition 2.1.2 (Minimal realization). Let N be the McMillan degree of the LTI system Σ. Then a realization of order N of this system is called a minimal realization.

Definition 2.1.3 (State transition function). A function ϕ, defined for all pairs (t_1, t_0) ∈ (T × T)^+ (where (T × T)^+ := {(t_1, t_0) : t_1 ∈ T, t_0 ∈ T, t_1 ≥ t_0}), for all initial states x_0 ∈ X and all inputs u ∈ U, such that x(t_1) := ϕ(t_1, t_0, x_0, u), is called the state transition function.

The function ϕ delivers the state at time t_1 reached from the state x_0 at an earlier time t_0 as a result of the input u.

It follows directly from the theory of ordinary differential and difference equations that for any given x(0) = x_0 and any given admissible input u ∈ U, the solutions x(t) and x_k of the first equations of (2.1.1) and (2.1.2), respectively, are uniquely determined by

x(t) = ϕ(t, 0, x_0, u) = e^{At} x_0 + ∫_0^t e^{A(t−s)} B u(s) ds,   for t ≥ 0,   (2.1.5)

and

x_k = ϕ(t_k, 0, x_0, u) = A^k x_0 + Σ_{j=0}^{k−1} A^{k−j−1} B u_j,   for t_k ≥ t_0.   (2.1.6)


After inserting these solutions into the second equations of (2.1.1) and (2.1.2), we conclude that the outputs y(t) and y_k can be represented in the following form:

y(t) = C ϕ(t, 0, x_0, u) = C e^{At} x_0 + C ∫_0^t e^{A(t−s)} B u(s) ds,   for t ≥ 0,   (2.1.7)

and

y_k = C ϕ(t_k, 0, x_0, u) = C A^k x_0 + Σ_{j=0}^{k−1} C A^{k−j−1} B u_j,   for t_k ≥ t_0.   (2.1.8)

2.1.1 Stability

Here, we shall discuss one of the essential properties of dynamical systems, namely their stability.

Theorem 2.1.1. An LTI continuous-time dynamical system (2.1.1) is asymptotically stable if, and only if, all eigenvalues of A have negative real parts, i.e., A is Hurwitz.

An exact analogue of Theorem 2.1.1 for discrete-time systems can be stated as follows:

Theorem 2.1.2. An LTI discrete-time dynamical system (2.1.2) is asymptotically stable if, and only if, all eigenvalues of A lie inside the unit disc, i.e., |λ(A)| < 1.

In the sequel, if it is not stated otherwise, we identify stability with asymptotic stability. The interested reader is referred to [4] and [62] for further information on the stability problem.
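The two eigenvalue tests above translate directly into a numerical check. A minimal sketch in Python (numpy assumed; the matrices are illustrative, not taken from the thesis):

```python
import numpy as np

def is_stable_continuous(A):
    # Hurwitz test (Theorem 2.1.1): all eigenvalues in the open left half-plane
    return bool(np.all(np.linalg.eigvals(A).real < 0))

def is_stable_discrete(A):
    # Schur test (Theorem 2.1.2): all eigenvalues strictly inside the unit disc
    return bool(np.all(np.abs(np.linalg.eigvals(A)) < 1))

A_c = np.array([[-1.0, 2.0], [0.0, -3.0]])   # eigenvalues -1, -3
A_d = np.array([[0.5, 0.1], [0.0, -0.2]])    # eigenvalues 0.5, -0.2

print(is_stable_continuous(A_c))  # True
print(is_stable_discrete(A_d))    # True
```

For large N one would of course not compute the full spectrum this way; this is only the definitional check.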

2.1.2 Reachability

We shall now introduce two fundamental concepts in control theory, namely the reachability and observability (in Section 2.1.3). They describe the interaction in a system between the external (inputs and outputs) and the internal variables (states). Loosely speaking, reachability indicates the extent to which the state of the system x can be manipulated through the input u, while observability allows us to identify if the internal behavior can be detected at its outputs.



Definition 2.1.4 (Reachability).

i) A state x̄ ∈ X is reachable from the zero state if there exist

– (continuous-time systems) an input function ū(t) ∈ U of finite energy and a finite time T̄ such that

x̄ = ϕ(T̄, 0, 0, ū) = ∫_0^{T̄} e^{A(T̄−s)} B ū(s) ds.

– (discrete-time systems) an input function ū = {ū_i} ∈ U of finite energy and a finite time t_k̄ with k̄ < ∞ such that

x̄ = ϕ(t_k̄, 0, 0, ū) = Σ_{j=0}^{k̄−1} A^{k̄−j−1} B ū_j.

ii) The reachable subspace X_reach ⊂ X of Σ is the set containing all reachable states of Σ.

iii) The system Σ is (completely) reachable if X_reach = X, i.e. starting from the zero initial state, any state can be reached via a suitable control.

iv) Moreover,

R(A, B) = [B  AB  A^2B  · · ·  A^{N−1}B  · · · ] (2.1.9)

is called the (infinite) reachability matrix of Σ.

One can observe that the rank of the (infinite) reachability matrix R and the span of its columns are determined at most by the first N terms, i.e., A^k B, k = 0, 1, . . . , N − 1. Thus, for computational purposes, the following (finite) reachability matrix

R_N(A, B) = [B  AB  A^2B  · · ·  A^{N−1}B] (2.1.10)

is of interest.

Let us now introduce the concept of the reachability gramian. It is an important tool used in model order reduction (see Section 3.3).

Definition 2.1.5 (Finite reachability gramian). Let Σ = ( A  B ; C ).

i) The finite reachability gramian P_c(T) at time T < ∞ of the continuous-time system (2.1.1) is defined as

P_c(T) = ∫_0^T e^{Aτ} BB* e^{A*τ} dτ, T ∈ R^+. (2.1.11)

ii) The finite reachability gramian P_d(T) on the interval [t_0, t_T], for T < ∞, of the discrete-time system (2.1.2) is defined as

P_d(T) = R_T(A, B) R*_T(A, B) = Σ_{j=0}^{T−1} A^j BB* (A*)^j, T ∈ Z^+. (2.1.12)

If the system Σ is stable, then both (2.1.11) and (2.1.12) are also defined for T → ∞. In this case,

P_c = lim_{T→∞} P_c(T) = ∫_0^∞ e^{Aτ} BB* e^{A*τ} dτ (2.1.13)

and

P_d = lim_{T→∞} P_d(T) = Σ_{j=0}^{∞} A^j BB* (A*)^j (2.1.14)

are called the infinite reachability gramians of the continuous-time system (2.1.1) and the discrete-time system (2.1.2), respectively.

The infinite reachability gramians P_c, P_d and the infinite observability gramians Q_c, Q_d (see (2.1.26) and (2.1.27)) are possibly the most important matrices in the field of model order reduction.

Proposition 2.1.1. [4]

• Given the stable, continuous-time system (2.1.1). The associated infinite reachability gramian (2.1.13) is the unique symmetric positive definite solution of the Lyapunov equation

A P_c + P_c A* + BB* = 0. (2.1.15)

• Given the stable, discrete-time system (2.1.2). The associated infinite reachability gramian (2.1.14) is the unique symmetric positive definite solution of the Stein equation

A P_d A* + BB* = P_d. (2.1.16)

By an abuse of notation, we denote the infinite reachability gramians P_c and P_d of the continuous-time and discrete-time LTI systems, respectively, by the same symbol P. For simplicity of notation, we will also write P(T) instead of P_c(T) and P_d(T).
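For small N, the Lyapunov and Stein equations of Proposition 2.1.1 can be solved directly by vectorization with Kronecker products. A sketch under the assumption of real matrices (numpy only; the test matrices are illustrative):

```python
import numpy as np

def lyap_gramian(A, B):
    # Solve A P + P A^T + B B^T = 0, cf. (2.1.15), via
    # (I kron A + A kron I) vec(P) = -vec(B B^T), with column-major vec.
    N = A.shape[0]
    M = np.kron(np.eye(N), A) + np.kron(A, np.eye(N))
    p = np.linalg.solve(M, -(B @ B.T).flatten(order='F'))
    return p.reshape(N, N, order='F')

def stein_gramian(A, B):
    # Solve P = A P A^T + B B^T, cf. (2.1.16), via (I - A kron A) vec(P) = vec(B B^T).
    N = A.shape[0]
    p = np.linalg.solve(np.eye(N * N) - np.kron(A, A),
                        (B @ B.T).flatten(order='F'))
    return p.reshape(N, N, order='F')

A = np.array([[-1.0, 0.5], [0.0, -2.0]])   # Hurwitz
B = np.array([[1.0], [1.0]])
P = lyap_gramian(A, B)
print(np.allclose(A @ P + P @ A.T + B @ B.T, 0))   # residual of (2.1.15) vanishes

Ad = np.array([[0.5, 0.0], [0.2, -0.3]])   # eigenvalues inside the unit disc
Pd = stein_gramian(Ad, B)
# Pd should match the (truncated) sum (2.1.14)
S = sum(np.linalg.matrix_power(Ad, j) @ B @ B.T @ np.linalg.matrix_power(Ad.T, j)
        for j in range(200))
print(np.allclose(Pd, S))
```

The Kronecker system has dimension N², so for anything beyond toy sizes a dedicated Lyapunov solver (e.g. of Bartels–Stewart type) would be used instead.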

Various conditions for reachability are summarized in the following theorem:

Theorem 2.1.3 (Reachability conditions). The following are equivalent [4]:

1. The pair (A, B), A ∈ R^{N×N}, B ∈ R^{N×m}, is reachable.

2. The rank of the reachability matrix is full: rank R(A, B) = N.

3. The reachability gramian is positive definite: P(t) > 0, for some t > 0.

4. No left eigenvector v of A is in the left kernel of B: v*A = λv* ⇒ v*B ≠ 0.

5. rank(μI_N − A, B) = N, for all μ ∈ C.

6. The polynomial matrices sI_N − A and B are left coprime.
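The equivalences in Theorem 2.1.3 can be illustrated numerically: for a toy pair (A, B) one can compare the rank test with the eigenvector test. A sketch (numpy; the matrices are illustrative):

```python
import numpy as np

def reach_matrix(A, B):
    # finite reachability matrix R_N(A, B) = [B  AB  ...  A^{N-1}B], cf. (2.1.10)
    N = A.shape[0]
    blocks, AkB = [], B.copy()
    for _ in range(N):
        blocks.append(AkB)
        AkB = A @ AkB
    return np.hstack(blocks)

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
print(np.linalg.matrix_rank(reach_matrix(A, B)))      # 2: (A, B) is reachable

# condition on eigenvectors: every left eigenvector v of A must satisfy v* B != 0
lam, V = np.linalg.eig(A.T)     # columns of V are left eigenvectors of A
print(bool(np.all(np.abs(V.T @ B) > 1e-12)))          # True

# a pair that is NOT reachable: B_bad is annihilated by the left eigenvector [1, 1]
B_bad = np.array([[1.0], [-1.0]])
print(np.linalg.matrix_rank(reach_matrix(A, B_bad)))  # 1 < N
```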

An important notion, from the model reduction point of view, is the concept of the input energy, i.e. the energy that is required to reach a certain state.

Definition 2.1.6 (Input with the smallest amount of energy). Let ū ∈ U be an input that steers the continuous-time LTI system (2.1.1) from the zero state x(0) = 0 to the desired state x(T̄) = x̄ at a given time T̄. That means

x̄ = ϕ(T̄, 0, 0, ū) = ∫_0^{T̄} e^{A(T̄−s)} B ū(s) ds. (2.1.17)

The minimal input energy ε_T̄^{(reach)}(x̄) required to reach the state x̄ from the zero state after a time T̄ is defined as

ε_T̄^{(reach)}(x̄) := min{ ‖ū‖_2^2 : ū : [0, T̄] → K^m satisfies (2.1.17) }. (2.1.18)

For discrete-time systems, it can be stated as follows:

Definition 2.1.7 (Input with the smallest amount of energy). Suppose that

ū = [ū_{T̄−1}; ū_{T̄−2}; . . . ; ū_0] (stacked as one column vector), where ū_k ∈ K^m for all k = 0, . . . , T̄ − 1,

is an input that steers the discrete-time LTI system (2.1.2) from the zero state x_0 = 0 to a state x_{T̄} = x̄ in T̄ time steps. That means

x̄ = ϕ(t_{T̄}, 0, 0, ū) = Σ_{j=0}^{T̄−1} A^{T̄−j−1} B ū_j. (2.1.19)

The minimal input energy ε_T̄^{(reach)}(x̄) required to reach the state x̄ from the zero state after T̄ time steps is defined as

ε_T̄^{(reach)}(x̄) := min{ ‖ū‖_2^2 : ū = [ū_{T̄−1}; ū_{T̄−2}; . . . ; ū_0] satisfies (2.1.19) }. (2.1.20)


The following proposition provides a tool for computing the minimal energy of an input which drives 0 into x̄.

Proposition 2.1.2. [4] Let Σ be a reachable system with initial condition x(0) = 0.

i) The minimal input energy ε_T̄^{(reach)}(x̄) required to reach the state x̄ from the zero state after a time T̄ is given by

ε_T̄^{(reach)}(x̄) = x̄* P(T̄)^{−1} x̄, (2.1.21)

where P is the reachability gramian.

ii) If T̄_2 ≥ T̄_1, then P(T̄_2) ≥ P(T̄_1). This implies that P(T̄_2)^{−1} ≤ P(T̄_1)^{−1}. Therefore

ε_{T̄_2}^{(reach)}(x̄) ≤ ε_{T̄_1}^{(reach)}(x̄),

i.e. if we allow for more time, then reaching the state x̄ from the state x(0) = 0 is cheaper.

Here, P_1 ≥ P_2, for symmetric positive semi-definite matrices P_1 and P_2, means that P_1 − P_2 is also symmetric positive semi-definite.

Hence, the inverse reachability gramian reflects the minimal cost to reach a state x̄ from the zero state x_0 = 0 by applying a suitable input function. If P is nearly singular, then there exist states x̄ that are difficult to reach, as the minimal control energy ε_T̄^{(reach)}(x̄) required to reach these states may become extremely large.

2.1.3 Observability

We shall now come to the concept which was alluded to in the introduction of the previous section, namely the observability.

Definition 2.1.8. The states x̄ ∈ X and x̃ ∈ X are distinguishable if there exist a time t and an input u(t) [a time t_k and an input u_k] such that x_0 = x̄ and x_0 = x̃ in (2.1.7) [(2.1.8)] result in different y(t)'s [y_k's].

Proposition 2.1.3. [97] For LTI dynamical systems, x̄ and x̃ are distinguishable if, and only if,

• x̄ − x̃ is distinguishable from 0.



Definition 2.1.9 (Observability).

i) The matrix

O(C, A) = [C*  A*C*  (A^2)*C*  · · ·  (A^{N−1})*C*  · · · ] (2.1.22)

is called the (infinite) observability matrix of Σ.

ii) A nonzero state x̄ ∈ X is called observable if x̄ is distinguishable from the zero state, i.e.

– (continuous-time systems) for u(t) ≡ 0 there exists t ≥ 0 such that

y(t) = Cϕ(t, 0, x̄, 0) = C e^{At} x̄ ≠ 0.

– (discrete-time systems) for u_k ≡ 0 there exists a time step t_k, k ≥ 0, such that

y_k = Cϕ(t_k, 0, x̄, 0) = C A^k x̄ ≠ 0.

iii) The observable subspace X_obs ⊂ X is the set of all observable states of Σ.

iv) The system Σ is (completely) observable if X_obs = X, i.e. if any two distinct states are distinguishable.

The rank of the (infinite) observability matrix O and the span of its rows are determined at most by the first N terms, i.e., CA^k, k = 0, 1, . . . , N − 1. Thus, for computational purposes, the following (finite) observability matrix

O_N(C, A) = [C*  A*C*  (A^2)*C*  · · ·  (A^{N−1})*C*] (2.1.23)

is of interest.

Now we shall introduce another important tool used in model order reduction (see Section 3.3), namely the observability gramian.

Definition 2.1.10 (Finite observability gramian). Let Σ = ( A  B ; C ).

i) The finite observability gramian Q_c(T) at time T < ∞ of the continuous-time system (2.1.1) is defined as

Q_c(T) = ∫_0^T e^{A*τ} C*C e^{Aτ} dτ, T ∈ R^+. (2.1.24)

ii) The finite observability gramian Q_d(T) on the interval [t_0, t_T], for T < ∞, of the discrete-time system (2.1.2) is defined as

Q_d(T) = O_T(C, A) O*_T(C, A) = Σ_{j=0}^{T−1} (A^j)* C*C A^j, T ∈ Z^+. (2.1.25)

If the system Σ is stable, then both (2.1.24) and (2.1.25) are also defined for T → ∞. In this case,

Q_c = lim_{T→∞} Q_c(T) = ∫_0^∞ e^{A*τ} C*C e^{Aτ} dτ (2.1.26)

and

Q_d = lim_{T→∞} Q_d(T) = Σ_{j=0}^{∞} (A^j)* C*C A^j (2.1.27)

are called the infinite observability gramians of the continuous-time LTI system (2.1.1) and the discrete-time LTI system (2.1.2), respectively.

Proposition 2.1.4. [4]

• Given the stable, continuous-time system (2.1.1). The associated infinite observability gramian (2.1.26) is the unique symmetric positive definite solution of the Lyapunov equation

A* Q_c + Q_c A + C*C = 0. (2.1.28)

• Given the stable, discrete-time system (2.1.2). The associated infinite observability gramian (2.1.27) is the unique symmetric positive definite solution of the Stein equation

A* Q_d A + C*C = Q_d. (2.1.29)

By an abuse of notation, we denote the infinite observability gramians Q_c and Q_d of the continuous-time and discrete-time LTI systems, respectively, by the same symbol Q. For simplicity of notation, we will also write Q(T) instead of Q_c(T) and Q_d(T).

Theorem 2.1.4 (Observability conditions). The following are equivalent [4]:

1. The pair (C, A), C ∈ R^{p×N}, A ∈ R^{N×N}, is observable.

2. The rank of the observability matrix is full: rank O(C, A) = N.

3. The observability gramian is positive definite: Q(T) > 0, for some T > 0.

4. No right eigenvector v of A is in the right kernel of C: Av = λv ⇒ Cv ≠ 0.

5. rank(μI_N − A*, C*) = N, for all μ ∈ C.



An important notion, from the model reduction point of view, is the concept of the observation energy, i.e. the energy of the output function y at a certain time caused by the initial state x̄.

Definition 2.1.11 (Observation energy). The energy of the output function y ∈ Y at time T̄ caused by the initial state x(0) = x̄ with zero input function, u(t) ≡ 0, over the interval [0, T̄], denoted by ε_T̄^{(observ)}(x̄) or by ‖y‖_2^2, is called the energy produced by x̄ over [0, T̄]. This energy can be expressed as

ε_T̄^{(observ)}(x̄) := ‖y‖_2^2 = ∫_0^{T̄} y*(t) y(t) dt = ∫_0^{T̄} x̄* e^{A*t} C*C e^{At} x̄ dt. (2.1.30)

In the same form one can define ε_T̄^{(observ)}(x̄) for discrete-time systems.

Definition 2.1.12 (Observation energy). The energy ε_T̄^{(observ)}(x̄) produced by the initial state x(0) = x̄ with zero input function, u_k ≡ 0, over the interval [t_0, t_{T̄}] can be expressed as

ε_T̄^{(observ)}(x̄) := ‖y‖_2^2 = Σ_{k=0}^{T̄} y_k* y_k = Σ_{k=0}^{T̄} x̄* (A^k)* C*C A^k x̄. (2.1.31)

Proposition 2.1.5. [4] Let Σ be an observable continuous-time or discrete-time LTI dynamical system with initial condition x(0) = x̄.

i) In terms of the observability gramian, the energy ε_T̄^{(observ)}(x̄) produced by the initial state x̄ with zero input function after a time T̄ is given by

ε_T̄^{(observ)}(x̄) = x̄* Q(T̄) x̄. (2.1.32)

ii) If T̄_2 ≥ T̄_1, then Q(T̄_2) ≥ Q(T̄_1). Therefore

ε_{T̄_2}^{(observ)}(x̄) ≥ ε_{T̄_1}^{(observ)}(x̄),

i.e. if we allow for more time, then the energy produced by the (observable) initial state x̄ with zero input function u(t) ≡ 0 increases with increasing time.

Thus, the observability gramian reflects the effect of initial states on the output of the system when the input is set to 0. If Q is nearly singular, then there exist states which are difficult to observe in the sense that their observation energy ε_T̄^{(observ)}(x̄) may be very small.

Proposition 2.1.6. [97] Let (2.1.1) [(2.1.2)] be a realization of the continuous-time [discrete-time] LTI system Σ. Then (2.1.1) [(2.1.2)] is a minimal realization if, and only if, it is both reachable and observable.


It is easy to check that the observability gramian Q and the reachability gramian P are not input-output invariant. For this purpose, consider a realization (2.1.3) of the continuous-time system Σ. The gramians P̃ and Q̃ of this realization are the unique symmetric positive definite solutions of the following Lyapunov equations:

A T^{−1} P̃ T^{−*} + T^{−1} P̃ T^{−*} A* + BB* = 0 and A* T* Q̃ T + T* Q̃ T A + C*C = 0.

By the uniqueness of gramians, it may be concluded that

P̃ = T P T* and Q̃ = T^{−*} Q T^{−1}.

However, it can be observed that the product of the two gramians,

P̃ Q̃ = T P Q T^{−1}, (2.1.33)

undergoes a similarity transformation, and from this it follows that the eigenvalues of PQ are invariant under state-space transformations. These eigenvalues turn out to be the squares of the very important Hankel singular values (see Section 2.2.4). Similar considerations apply to discrete-time systems.

2.2 Input-Output Behavior

In many applications, only the effect which the applied inputs have on observed outputs is of interest. Therefore physical systems are often described not by their state-space representation but rather through their input-output behavior. In this section, we introduce the concept of the transfer function, which completely describes, in the frequency domain, the input-output relation for LTI systems. The relationship between transfer functions and other descriptions of the system dynamics is also presented.

2.2.1 Transfer functions

Transforming the equation from the time domain to the frequency domain by means of the Laplace transform is a common way to solve a linear differential equation.

Definition 2.2.1 (Laplace transform). For a function f(·) ∈ K^N and s ∈ C,

F(s) = L[f(t)](s) = ∫_0^∞ f(t) e^{−st} dt.

2.2. INPUT-OUTPUT BEHAVIOR 21

This transform, if applied to the system (2.1.1) with initial state x(0) = 0, leads to a purely algebraic system of equations in the frequency domain:

sX(s) = AX(s) + BU(s), Y(s) = CX(s).

Here, X(s) = L[x(t)](s), U(s) = L[u(t)](s) and Y(s) = L[y(t)](s) denote the Laplace transformed variables x(t), u(t) and y(t), respectively. Therefore X(s) = (sI_N − A)^{−1} B U(s), and so the input-output behavior of the LTI continuous-time system in the Laplace domain is described by

Y(s) = C(sI_N − A)^{−1} B U(s) = H(s) U(s).

The matrix-valued function

H(s) = C(sI_N − A)^{−1} B ∈ C^{p×m}

is called the transfer function of the system (2.1.1). It can be easily seen that the function H(s) maps the Laplace transform of the input onto the Laplace transform of the output; hence it completely describes the system. Note that transfer functions of linear time-invariant systems are rational matrix-valued functions which are defined everywhere except the set λ(A) of eigenvalues of A. A transfer function H(s) is called proper if lim_{|ω|→∞} |H(iω)| < ∞ and strictly proper if lim_{|ω|→∞} H(iω) = 0. Throughout the thesis, we only consider systems with strictly proper transfer functions.

Definition 2.2.2 (z-transform). Consider a sequence f(·) ∈ K^N. The z-transform of f is the formal power series in s^{−1}:

Z[f(k)](s) := Σ_{k=0}^{∞} f(k) s^{−k}.

By applying the algebraic z-transform to the discrete-time LTI system of the form (2.1.2), for arbitrary input sequences u(·) : N → K^m, we obtain

sX(s) = AX(s) + BU(s), Y(s) = CX(s),

where X(s) = Z[x(t_k)](s), U(s) = Z[u(t_k)](s) and Y(s) = Z[y(t_k)](s) denote the z-transformed variables x(t_k), u(t_k) and y(t_k), respectively. Therefore X(s) = (sI_N − A)^{−1} B U(s), and so the input-output behavior of the LTI discrete-time system in the z-domain is described by

Y(s) = C(sI_N − A)^{−1} B U(s) = H(s) U(s).

It follows immediately that the formula for the transfer function of a discrete-time LTI system (2.1.2) as well as of a continuous-time system (2.1.1) is the same, namely

H(s) = C(sI_N − A)^{−1} B. (2.2.1)

2.2.2 Properties of transfer functions

We now have the necessary prerequisites to present some important properties of transfer functions.

Definition 2.2.3 (Poles). Eigenvalues of the state matrix A are called poles of the corresponding dynamical system Σ.

We note that throughout the thesis we only consider systems with simple, i.e. pairwise distinct, poles. We make this assumption for original as well as for reduced-order models.

Definition 2.2.4 (Residue). Suppose that H(s) is the transfer function of the SISO system Σ. Let H(s) have a nonremovable isolated singularity at the point λ. Then H(s) has the unique Laurent series representation, for all s in some ring D_R(λ) := {s ∈ C : 0 < |s − λ| < R}, given by

H(s) = Σ_{k=−∞}^{∞} γ_k (s − λ)^k.

The coefficient γ_{−1} of 1/(s − λ) is called the residue of H(s) at s = λ, and we use the notation

Res(H(s), s = λ) = γ_{−1}.

The calculation of a Laurent series expansion is tedious in most circumstances. Therefore we have to find a method for calculating the residue from special information about the nature of the singularity at λ. Theorem 2.2.1 gives methods for evaluating residues at poles.

Theorem 2.2.1. [77]

1. If H(s) has a simple pole at λ, then

Res(H(s), s = λ) = lim_{s→λ} (s − λ) H(s).

2. If H(s) has a pole of order k at λ, then

Res(H(s), s = λ) = (1/(k−1)!) lim_{s→λ} (d^{k−1}/ds^{k−1}) ((s − λ)^k H(s)).

Let λ_k, k = 1, . . . , N, be the pairwise distinct poles of the system and let φ_k be the corresponding residues of H(s) at λ_k, φ_k = Res(H(s), s = λ_k). By partial fraction expansion of such systems we obtain

H(s) = Σ_{k=1}^{N} φ_k / (s − λ_k). (2.2.2)
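Expansion (2.2.2) can be verified numerically for a small SISO system in diagonal form, where the residues are simply φ_k = c_k b_k (cf. (2.2.6) with p = m = 1). A sketch with illustrative data:

```python
import numpy as np

# diagonal realization: poles on the diagonal, residues phi_k = c_k * b_k
lam = np.array([-1.0, -3.0, -5.0])
A = np.diag(lam)
B = np.array([[1.0], [2.0], [-1.0]])
C = np.array([[2.0, 0.5, 1.0]])
phi = C.flatten() * B.flatten()          # residues

def H(s):
    # evaluate the transfer function H(s) = C (sI - A)^{-1} B
    return (C @ np.linalg.solve(s * np.eye(3) - A, B))[0, 0]

s0 = 1.5 + 0.7j
print(np.isclose(H(s0), np.sum(phi / (s0 - lam))))   # True: (2.2.2) holds
```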



We now turn our attention to MIMO systems. It may be worth recalling that, in this thesis, we only consider LTI systems with simple (pairwise distinct) poles. Therefore, without loss of generality, we assume that A is in diagonal form, i.e. A = diag(λ_1, . . . , λ_N), where λ_i ≠ λ_j for i ≠ j. Otherwise, a state-space transformation x = S x̃, where the columns of S are the eigenvectors of A, yields an equivalent system with Ã = S^{−1}AS, B̃ = S^{−1}B and C̃ = CS, where Ã is diagonal. We denote

B = (b_{ij})_{i,j} = [b_1* . . . b_N*]*, C = (c_{ij})_{i,j} = [c_1 . . . c_N], (2.2.3)

with row vectors b_k = [b_{k1} . . . b_{km}] and column vectors c_k = [c_{1k} . . . c_{pk}]^T, k = 1, . . . , N. Note that the column vector Be_l ∈ C^N represents the l-th input and the row vector e_q^T C ∈ C^{1×N} represents the q-th output of the system, for l = 1, . . . , m and q = 1, . . . , p. Here e_j denotes the j-th column unit vector of appropriate dimension. The transfer function H is a (p × m)-dimensional matrix-valued function with components H_ql, l = 1, . . . , m, q = 1, . . . , p:

H(s) = [ H_11(s) · · · H_1m(s) ; . . . ; H_p1(s) · · · H_pm(s) ], H_ql(s) = e_q^T C (sI_N − A)^{−1} B e_l. (2.2.4)

Each H_ql can be interpreted as a SISO transfer function with input Be_l and output e_q^T C. Applying (2.2.2) to each H_ql yields

H_ql(s) = Σ_{k=1}^{N} φ_k^{ql} / (s − λ_k), l = 1, . . . , m, q = 1, . . . , p. (2.2.5)

Comparison of coefficients in (2.2.4) with those in (2.2.5) yields

φ_k^{ql} = c_{qk} b_{kl}, l = 1, . . . , m, q = 1, . . . , p, k = 1, . . . , N. (2.2.6)

Remark 2.2.1. We note that for minimal MIMO systems with p > 1 or m > 1, some φ_k^{ql} can be zero in each component H_ql, whereas for minimal SISO systems all φ_k := φ_k^{11}, k = 1, . . . , N, are nonzero. In this case, φ_k is the residue of H in the pole λ_k, denoted by Res(H(s), s = λ_k). It is well known that a SISO system can be written in a canonical form such that A = diag(λ_1, . . . , λ_N) is diagonal and B = [1, . . . , 1]*. In this canonical form, we have C = [φ_1, . . . , φ_N].

Each element H_ql(s) is a rational scalar-valued function with numerator and denominator polynomials Ξ_{N−1}^{ql} and Ψ_N^{ql} of degree at most N − 1 and N, respectively. Thus H_ql(s) can now be represented in the following form:

H_ql(s) = Ξ_{N−1}^{ql}(s) / Ψ_N^{ql}(s) = ( Σ_{k=0}^{N−1} χ_k^{ql} s^k ) / ( Σ_{k=0}^{N} ψ_k^{ql} s^k ). (2.2.7)


Here and throughout the thesis, we assume that the transfer function H(s) is strictly proper. Therefore the degree of the polynomial Ξ_{N−1}^{ql} must be smaller than the degree of Ψ_N^{ql}, for all q = 1, . . . , p and l = 1, . . . , m.

Remark 2.2.2. [62] Let

C_α^+ := {s ∈ C : Re(s) > α}, D_α^+ := {s ∈ C : |s| > α},
C_α^− := {s ∈ C : Re(s) < α}, D_α^− := {s ∈ C : |s| < α}.

The transfer function H(s) of a continuous [discrete] LTI system with all its poles lying in C_α^− [D_α^−] is analytic on C_α^+ [D_α^+], and bounded and continuous on the closure of C_α^+ [D_α^+].

Remark 2.2.3. Two different realizations of the same system Σ have the same transfer function, since

H̃(s) = CT(sST − SAT)^{−1}SB = CT[S(sI_N − A)T]^{−1}SB = C(sI_N − A)^{−1}B = H(s).

That means the transfer function does not depend on the realization of the system.
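Remark 2.2.3 is easy to confirm numerically: evaluating the transfer function of an equivalent realization (2.1.3) at a test point gives the same value. A sketch (illustrative matrices):

```python
import numpy as np

A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

T = np.array([[1.0, 2.0], [0.0, 1.0]])   # an invertible transformation
Ti = np.linalg.inv(T)
At, Bt, Ct = T @ A @ Ti, T @ B, C @ Ti   # equivalent realization, cf. (2.1.3)

def H(A, B, C, s):
    # H(s) = C (sI - A)^{-1} B for a SISO realization
    return (C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B))[0, 0]

s0 = 0.3 + 1.1j
print(np.isclose(H(A, B, C, s0), H(At, Bt, Ct, s0)))  # True
```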

2.2.3 Impulse response, moments and Markov parameters of LTI systems

Impulse response plays a very important role in control theory of LTI dynamical systems. It turns out (see Subsection 2.2.4) that for all inputs u ∈ U the output function can be calculated in terms of the input and the impulse response. This indicates that every LTI system is completely characterized by its impulse response.

Definition 2.2.5 (Impulse response).

• (continuous-time system). Let δ denote the so-called Dirac delta (impulse) function:

δ(x) = 0 for x ≠ 0, undefined at x = 0,

with the requirement that the function has unit area, i.e.

∫_{−∞}^{∞} δ(x) dx = 1.

The (p × m)-matrix function

h(t) := 0, if t < 0; C e^{At} B, if t ≥ 0, (2.2.8)

being the response of the continuous-time system (2.1.1) with initial condition x(0) = 0 to the unit impulse δ, is called the impulse response of the system (2.1.1).

• (discrete-time system). Let δ ∈ U, defined by

δ(t) := 1 for t = 0; 0 for t ≠ 0,

be the single impulse. The sequence h = {h_k} ∈ Y, where the j-th column of each entry

h_k := 0, if k ≤ 0; C A^{k−1} B, if k > 0, (2.2.9)

represents the response of the discrete-time system (2.1.2) with initial condition x(0) = 0 to the j-th unit impulse u_j = e_j δ(t), is called the impulse response of the system (2.1.2). Here e_j denotes the j-th canonical unit vector.

The transfer function H(s) = C(sI − A)^{−1}B is a p × m matrix-valued function. By assuming that A is nonsingular and, for suitable η, making use of the Neumann expansion

(I − ηA)^{−1} = Σ_{k=0}^{∞} (ηA)^k,

we obtain

(sI − A)^{−1} = (−A(I − sA^{−1}))^{−1} = −Σ_{k=0}^{∞} (sA^{−1})^k A^{−1}.

Hence, the Taylor series expansion of the transfer matrix H(s) = C(sI − A)^{−1}B about zero is

H(s) = Σ_{k=0}^{∞} −C(A^{−1})^k A^{−1} B s^k.

Definition 2.2.6 (Low frequency moments). Let Σ be the system (2.1.1) or (2.1.2) and suppose that A is nonsingular. The k-th coefficient

h_k(0) = −C(A^{−1})^k A^{−1} B ∈ K^{p×m}, k = 0, 1, . . . (2.2.10)

of the Taylor series expansion of

H(s) = C(sI − A)^{−1} B = Σ_{k=0}^{∞} h_k(0) s^k

about zero is called the k-th moment (about zero). The coefficients h_k(0), k = 0, 1, . . ., are also called low frequency moments.

By choosing s_0 ∈ C such that Ã := A − s_0 I is invertible and rewriting the transfer matrix as

H(s) = C((s − s_0)I − Ã)^{−1} B,

one can also define moments about points s_0 ≠ 0.

Definition 2.2.7 (Moments about s_0). Let Σ be the system (2.1.1) or (2.1.2) and let Ã := A − s_0 I be nonsingular. The k-th coefficient

h_k(s_0) = −C(Ã^{−1})^k Ã^{−1} B ∈ K^{p×m}, k = 0, 1, . . . (2.2.11)

of the Taylor series expansion of

H(s) = C((s − s_0)I − Ã)^{−1} B = Σ_{k=0}^{∞} h_k(s_0)(s − s_0)^k

about s_0 is called the k-th moment of the system Σ about s_0.

Remark 2.2.4. Up to a constant, the moments h_k(s_0) are values of the transfer function H(s) and its derivatives at the point s_0, i.e.

h_0(s_0) = H(s_0), h_k(s_0) = (1/k!) (d^k H / ds^k)(s_0), k = 1, 2, . . .

An expansion of H(s) when s_0 → ∞ takes the form

H(s) = Σ_{k=0}^{∞} h_k s^{−k−1}, h_k = C A^k B ∈ K^{p×m}.

Definition 2.2.8 (High frequency moments). Let Σ be the system (2.1.1) or (2.1.2). The k-th Markov parameter is defined as

h_k = C A^k B ∈ K^{p×m}, k = 0, 1, . . . . (2.2.12)

Markov parameters are also called high frequency moments.

The k-th Markov parameter is also known [65] to be the value of the k-th derivative of the impulse response at time zero. So, the first Markov parameter, h_0, is the impulse response at t = 0.
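Remark 2.2.4 and Definition 2.2.8 suggest simple numerical sanity checks: the zeroth moment equals H(0), the first moment equals H'(0) (approximated here by a central difference), and the Markov parameters are C A^k B. A sketch with illustrative data:

```python
import numpy as np

A = np.array([[-2.0, 1.0], [0.0, -1.0]])
B = np.array([[1.0], [2.0]])
C = np.array([[1.0, 1.0]])
Ainv = np.linalg.inv(A)

def H(s):
    # H(s) = C (sI - A)^{-1} B for this SISO example
    return (C @ np.linalg.solve(s * np.eye(2) - A, B))[0, 0]

# zeroth moment h_0(0) = -C A^{-1} B equals H(0), cf. (2.2.10) and Remark 2.2.4
h0 = (-C @ Ainv @ B)[0, 0]
print(np.isclose(h0, H(0.0)))                     # True

# first moment h_1(0) = -C A^{-2} B equals H'(0), checked by a central difference
h1 = (-C @ Ainv @ Ainv @ B)[0, 0]
eps = 1e-5
print(np.isclose(h1, (H(eps) - H(-eps)) / (2 * eps)))  # True (up to O(eps^2))

# Markov parameters h_k = C A^k B, cf. (2.2.12)
markov = [(C @ np.linalg.matrix_power(A, k) @ B)[0, 0] for k in range(3)]
print(markov[0] == (C @ B)[0, 0])                 # True
```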

Definition 2.2.9 (Hankel matrix). The matrix H of Markov parameters,

H = [ h_0 h_1 h_2 h_3 · · ·
      h_1 h_2 h_3 h_4 · · ·
      h_2 h_3 h_4 h_5 · · ·
      h_3 h_4 h_5 h_6 · · ·
      . . . ] (2.2.13)

is called the Hankel matrix. It has block Hankel structure, i.e. (H)_{i,j} = h_{i+j−2}, for i, j > 0, with infinitely many rows and columns.
