Dynamic Programming, Optimal Control and Model Predictive Control

Lars Grüne

Mathematical Institute, University of Bayreuth, 95440 Bayreuth, Germany, e-mail: lars.gruene@uni-bayreuth.de

Abstract In this chapter, we give a survey of recent results on approximate optimality and stability of closed-loop trajectories generated by model predictive control (MPC). Both stabilizing and economic MPC are considered, and both schemes with and without terminal conditions are analyzed. A particular focus of the chapter is to highlight the role dynamic programming plays in this analysis. As we will see, dynamic programming arguments are ubiquitous in the analysis of MPC schemes.

1 Introduction

Model Predictive Control (MPC), also known as Receding Horizon Control, is one of the most successful modern control techniques, both regarding its popularity in academics and its use in industrial applications [6, 10, 14, 28]. In MPC, the control input is synthesized via the repeated solution of finite horizon optimal control problems on overlapping horizons. Among the most fundamental properties to be investigated when analyzing MPC schemes are the stability and (approximate) optimality properties of the closed-loop solutions generated by MPC. One interpretation of MPC is that an infinite horizon optimal control problem is split up into the repeated solution of auxiliary finite horizon problems [12].

Dynamic Programming (DP) is one of the fundamental mathematical techniques for dealing with optimal control problems [4, 5]. It provides a rule to split up a high (possibly infinite) dimensional optimization problem over a long (possibly infinite) time horizon into auxiliary optimization problems on shorter horizons, which are much easier to solve. While at first glance this appears similar to the procedure just described for MPC, the approach is different, in the sense that in DP the exact information about the future of the optimal trajectories — by means of the corresponding optimal value function — is included in the auxiliary problem.

Thus, it provides a characterization of the exact solution, at the expense that the auxiliary problems are typically difficult to formulate and the number of auxiliary problems becomes huge — the (in)famous “curse of dimensionality”. In MPC, the future information is only approximated (for schemes with terminal conditions) or even completely disregarded (for schemes without terminal conditions). This makes the auxiliary problems easy to formulate and to solve and keeps the number of these problems low, but now at the expense that it does not yield an exact optimal solution of the original problem anymore.

However, it may still be possible that the solution trajectories generated by MPC are stable and approximately optimal, and the key for proving such statements is to make sure that the neglected future information only slightly affects the solution. The present chapter presents a survey of a selection of results in this direction and in particular shows that ideas from dynamic programming are essential for this purpose. As we will show, dynamic programming methods can be used for estimating near optimal performance under suitable conditions on the future information (Proposition 6 and Theorem 15 are examples of such statements) but also for ensuring that the future information satisfies these conditions (as, e.g., in Proposition 8 or Lemma 14(ii)). Moreover, dynamic programming naturally provides ways to derive stability or convergence from optimality via Lyapunov function arguments, as in Proposition 3.

The chapter is organized as follows. In Section 2 we describe the setting and the MPC algorithm we consider in this chapter. Section 3 collects the results from dynamic programming we will need in the sequel. Section 4 then presents results for stabilizing MPC, in which the stage cost penalizes the distance to a desired equilibrium. Both schemes with and without terminal conditions are discussed. Section 5 extends this analysis to MPC schemes with more general stage costs, which is usually referred to as economic MPC. Section 6 concludes the chapter.

2 Setting, definitions and notation

In this chapter we consider discrete time optimal control problems of the form

\[ \text{minimize } J_N(x_0,u) \text{ with respect to the control sequence } u, \tag{1} \]

where $N\in\overline{\mathbb{N}} := \mathbb{N}\cup\{\infty\}$ and

\[ J_N(x_0,u) = \sum_{k=0}^{N-1} \ell(x(k),u(k)), \]

subject to the dynamics and the initial condition

\[ x(k+1) = f(x(k),u(k)), \quad x(0) = x_0 \tag{2} \]


and the combined state and input constraints

\[ (x(k),u(k))\in\mathbb{Y}\ \ \forall k=0,\ldots,N-1 \quad\text{and}\quad x(N)\in\mathbb{X} \tag{3} \]

for all $k\in\mathbb{N}$ for which the respective values are defined. Here $\mathbb{Y}\subset X\times U$ is the constraint set, $X$ and $U$ are the state and input value sets, respectively, and $\mathbb{X}:=\{x\in X \mid \exists u\in U \text{ with } (x,u)\in\mathbb{Y}\}$ is the state constraint set. The sets $X$ and $U$ are metric spaces with metrics $d_X(\cdot,\cdot)$ and $d_U(\cdot,\cdot)$. Because there is no danger of confusion we usually omit the indices $X$ and $U$ in the metrics. We denote the solution of (2) by $x_u(k,x_0)$. Moreover, for the distance of a point $x\in X$ to another point $y\in X$ we use the short notation $|x|_y := d(x,y)$.

For $x_0\in\mathbb{X}$ and $N\in\mathbb{N}$ we define the set of admissible control sequences as

\[ \mathbb{U}^N(x_0) := \{u\in U^N \mid (x_u(k,x_0),u(k))\in\mathbb{Y}\ \forall k=0,\ldots,N-1 \text{ and } x_u(N,x_0)\in\mathbb{X}\} \]

and

\[ \mathbb{U}^\infty(x_0) := \{u\in U^\infty \mid (x_u(k,x_0),u(k))\in\mathbb{Y}\ \forall k\in\mathbb{N}\}. \]

Since feasibility issues are not the topic of this chapter, we make the simplifying assumption that $\mathbb{U}^N(x_0)\neq\emptyset$ for all $x_0\in\mathbb{X}$ and all $N\in\overline{\mathbb{N}}$. If desired, this assumption can be avoided using the techniques from, e.g., [9], [14, Chapter 7], [20, Chapter 5], or [27].

Corresponding to the optimal control problem (1) we define the optimal value function

\[ V_N(x_0) := \inf_{u\in\mathbb{U}^N(x_0)} J_N(x_0,u) \]

and we say that a control sequence $u^\star_N\in\mathbb{U}^N(x_0)$ is optimal for initial value $x_0\in\mathbb{X}$ if $J_N(x_0,u^\star_N) = V_N(x_0)$ holds.

It is often desirable to solve optimal control problems with infinite horizon $N=\infty$, for instance because the control objective under consideration naturally leads to an infinite horizon problem (like stabilization or tracking problems) or because an optimal control is needed for an indefinite amount of time (as in many regulation problems). For such problems the optimal control is usually desired in feedback form, i.e., in the form $u^\star_N(k) = \mu(x(k))$ for a feedback map $\mu:\mathbb{X}\to U$. Except for special cases like linear quadratic problems without constraints, computing infinite horizon optimal feedback laws is in general a very difficult task. On the other hand, very accurate approximations to optimal control sequences $u^\star_N$ for finite horizon problems, particularly with moderate $N$, can be computed easily and fast (sometimes within a few milliseconds), and often also reliably with state-of-the-art numerical optimization routines, even for problems in which the dynamics (2) are governed by partial differential equations. The following Receding Horizon or Model Predictive Control algorithm (henceforth abbreviated as MPC) is therefore an attractive alternative to solving an infinite horizon optimal control problem.


Algorithm 1 (Basic Model Predictive Control Algorithm)

(Step 0) Fix a (finite) optimization horizon $N\in\mathbb{N}$ and set $k:=0$; let an initial value $x_{MPC}(0)$ be given.

(Step 1) Compute an optimal control sequence $u^\star_N$ of problem (1) for $x_0 = x_{MPC}(k)$.

(Step 2) Define the MPC feedback law value $\mu_N(x_{MPC}(k)) := u^\star_N(0)$.

(Step 3) Set $x_{MPC}(k+1) := f(x_{MPC}(k),\mu_N(x_{MPC}(k)))$, $k := k+1$ and go to (Step 1).
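The following sketch illustrates Algorithm 1 in code. It is a minimal illustration only: the one-dimensional dynamics $f(x,u)=x+u$, the quadratic stage cost and the use of scipy.optimize.minimize as the finite horizon solver are assumptions made for this example and are not part of the chapter.

```python
# Minimal sketch of Algorithm 1 for the illustrative system x(k+1) = x(k) + u(k)
# with stage cost l(x,u) = x^2 + u^2 (so x* = 0); the solver choice is an assumption.
import numpy as np
from scipy.optimize import minimize

def f(x, u):
    return x + u                      # dynamics (2), hypothetical example

def stage_cost(x, u):
    return x**2 + u**2                # stage cost l

def J_N(u_seq, x0):
    """Finite horizon cost J_N(x0, u) from (1)."""
    x, cost = x0, 0.0
    for u in u_seq:
        cost += stage_cost(x, u)
        x = f(x, u)
    return cost

def mpc_feedback(x, N):
    """Steps 1 and 2 of Algorithm 1: solve the OCP, return mu_N(x) = u_N*(0)."""
    res = minimize(J_N, np.zeros(N), args=(x,))
    return res.x[0]

# Step 3: closed loop with x_MPC(0) = 2
x, N = 2.0, 5
for k in range(10):
    u = mpc_feedback(x, N)
    x = f(x, u)
    print(k, round(x, 6))             # x_MPC(k+1) approaches x* = 0
```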

We note that although derived from an open loop optimal control sequence $u^\star_N$, the map $\mu_N$ is indeed a map from $\mathbb{X}$ to $U$; however, it will in general not be given in the form of an explicit formula. Rather, given $x_{MPC}(k)$, the value $\mu_N(x_{MPC}(k))$ is obtained by solving the optimal control problem in Step 1 of Algorithm 1, which is usually done numerically.

In MPC, one often introduces additional terminal conditions, consisting of a terminal constraint set $\mathbb{X}_0\subseteq\mathbb{X}$ and a terminal cost $F:\mathbb{X}_0\to\mathbb{R}$. To this end, the optimization objective $J_N$ is modified to

\[ J_N^{tc}(x,u) = \sum_{k=0}^{N-1} \ell(x(k),u(k)) + F(x(N)) \]

and the last constraint in (3) is tightened to $x(N)\in\mathbb{X}_0$.

Moreover, we denote the corresponding space of admissible control sequences by

\[ \mathbb{U}^N_0(x_0) := \{u\in\mathbb{U}^N(x_0) \mid x_u(N,x_0)\in\mathbb{X}_0\} \]

and the optimal value function by

\[ V_N^{tc}(x_0) := \inf_{u\in\mathbb{U}^N_0(x_0)} J_N^{tc}(x_0,u). \]

Observe that the problem without terminal conditions is obtained for $F\equiv 0$ and $\mathbb{X}_0 = \mathbb{X}$.

Again, a control $u^{tc\star}_N\in\mathbb{U}^N_0(x_0)$ is called optimal if $V_N^{tc}(x_0) = J_N^{tc}(x_0,u^{tc\star}_N)$. Due to the terminal constraints it is in general not guaranteed that $\mathbb{U}^N_0(x_0)\neq\emptyset$ for all $x_0\in\mathbb{X}$. We therefore define $\mathbb{X}_N := \{x_0\in\mathbb{X} \mid \mathbb{U}^N_0(x_0)\neq\emptyset\}$. For MPC in which $J_N^{tc}$ is minimized in Step 1 we denote the resulting feedback law by $\mu_N^{tc}$. Note that $\mu_N^{tc}$ is defined on $\mathbb{X}_N$.

A priori it is not at all clear whether the trajectory $x_{MPC}$ generated by the MPC algorithm enjoys approximate optimality properties or qualitative properties like stability. In the remainder of this chapter, we will give conditions under which such properties can be guaranteed. In order to measure the optimality of the closed-loop trajectory, we introduce its closed-loop finite and infinite horizon values

\[ J_K^{cl}(x,\mu_N) := \sum_{k=0}^{K-1} \ell(x_{MPC}(k),\mu_N(x_{MPC}(k))) \]

and

\[ J_\infty^{cl}(x,\mu_N) := \limsup_{K\to\infty} J_K^{cl}(x_{MPC}(0),\mu_N), \]

where in both cases the initial value $x_{MPC}(0) = x$ is used.
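Numerically, these closed-loop values can be accumulated alongside the simulation. The helper below is a small self-contained sketch; the feedback map mu is assumed to be supplied by the caller (for instance the mpc_feedback function from the sketch after Algorithm 1).

```python
# Accumulate the closed-loop cost J^cl_K(x, mu_N) along the MPC trajectory.
# f, stage_cost and mu are assumed to be given by the caller.
def closed_loop_cost(f, stage_cost, mu, x0, K):
    x, cost = x0, 0.0
    for _ in range(K):
        u = mu(x)                      # mu_N(x_MPC(k))
        cost += stage_cost(x, u)       # l(x_MPC(k), mu_N(x_MPC(k)))
        x = f(x, u)                    # x_MPC(k+1)
    return cost
```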

3 Dynamic programming

Dynamic programming is a name for a set of relations between optimal value functions and optimal trajectories at different time instants. In what follows we state those relations which are important for the remainder of this chapter. For their proofs we refer to [14, Chapters 3 and 4].

For the finite horizon problem without terminal conditions the following equations and statements hold for all $N\in\mathbb{N}$ and all $K\in\mathbb{N}$ with $K\le N$ (using $V_0(x)\equiv 0$ in case $K=N$):

\[ V_N(x) = \inf_{u\in\mathbb{U}^K(x)} \{J_K(x,u) + V_{N-K}(x_u(K,x))\}. \tag{4} \]

If $u^\star_N\in\mathbb{U}^N(x)$ is an optimal control for initial value $x$ and horizon $N$, then

\[ V_N(x) = J_K(x,u^\star_N) + V_{N-K}(x_{u^\star_N}(K,x)) \tag{5} \]

and

the sequence $u_K := (u^\star_N(K),\ldots,u^\star_N(N-1))\in\mathbb{U}^{N-K}(x_{u^\star_N}(K,x))$ is an optimal control for initial value $x_{u^\star_N}(K,x)$ and horizon $N-K$. (6)

Moreover, for all $x\in\mathbb{X}$ the MPC feedback law $\mu_N$ satisfies

\[ V_N(x) = \ell(x,\mu_N(x)) + V_{N-1}(f(x,\mu_N(x))). \tag{7} \]

For the finite horizon problem with terminal conditions the following holds for all $N\in\mathbb{N}$ and all $K\in\mathbb{N}$ with $K\le N$ (using $V_0^{tc}(x) = F(x)$ in case $K=N$):

\[ V_N^{tc}(x) = \inf_{u\in\mathbb{U}^K_{N-K}(x)} \{J_K(x,u) + V_{N-K}^{tc}(x_u(K,x))\}, \tag{8} \]

where $\mathbb{U}^K_{N-K}(x_0) := \{u\in\mathbb{U}^K(x_0) \mid x_u(K,x_0)\in\mathbb{X}_{N-K}\}$. If $u^{tc\star}_N\in\mathbb{U}^N_0(x)$ is an optimal control for initial value $x$ and horizon $N$, then

\[ V_N^{tc}(x) = J_K(x,u^{tc\star}_N) + V_{N-K}^{tc}(x_{u^{tc\star}_N}(K,x)) \tag{9} \]

and

the sequence $u^{tc}_K := (u^{tc\star}_N(K),\ldots,u^{tc\star}_N(N-1))\in\mathbb{U}^{N-K}_0(x_{u^{tc\star}_N}(K,x))$ is an optimal control for initial value $x_{u^{tc\star}_N}(K,x)$ and horizon $N-K$. (10)

Moreover, for all $x\in\mathbb{X}$ the MPC feedback law $\mu_N^{tc}$ satisfies

\[ V_N^{tc}(x) = \ell(x,\mu_N^{tc}(x)) + V_{N-1}^{tc}(f(x,\mu_N^{tc}(x))). \tag{11} \]

Finally, for the infinite horizon problem the following equations and statements hold for all $K\in\mathbb{N}$:

\[ V_\infty(x) = \inf_{u\in\mathbb{U}^K(x)} \{J_K(x,u) + V_\infty(x_u(K,x))\}. \tag{12} \]

If $u^\star_\infty$ is an optimal control for initial value $x$, then

\[ V_\infty(x) = J_K(x,u^\star_\infty) + V_\infty(x_{u^\star_\infty}(K,x)) \tag{13} \]

and

the sequence $u_K := (u^\star_\infty(K),u^\star_\infty(K+1),\ldots)\in\mathbb{U}^\infty(x_{u^\star_\infty}(K,x))$ is an optimal control for initial value $x_{u^\star_\infty}(K,x)$. (14)

The equations just stated can be used as the basis of numerical algorithms, see, e.g., [5, 17] and the references therein. Here, however, we rather use them as tools for the analysis of the performance of the MPC algorithm. Besides the equalities above, which refer to the optimal trajectories, we will also need corresponding inequalities. These will be used in order to estimate $J_K^{cl}$ and $J_\infty^{cl}$, as shown in the following proposition.

Proposition 2 Assume there is a function $\varepsilon:\mathbb{X}\to\mathbb{R}$ such that the approximate dynamic programming inequality

\[ V_N(x) + \varepsilon(x) \ge \ell(x,\mu_N(x)) + V_N(f(x,\mu_N(x))) \tag{15} \]

holds for all $x\in\mathbb{X}$. Then for each MPC closed-loop solution $x_{MPC}$ and all $K\in\mathbb{N}$ the inequality

\[ J_K^{cl}(x_{MPC}(0),\mu_N) \le V_N(x_{MPC}(0)) - V_N(x_{MPC}(K)) + \sum_{k=0}^{K-1}\varepsilon_k \tag{16} \]

holds for $\varepsilon_k = \varepsilon(x_{MPC}(k))$. If, in addition, $\hat\varepsilon := \limsup_{K\to\infty}\sum_{k=0}^{K-1}\varepsilon_k < \infty$ and $\liminf_{K\to\infty} V_N(x_{MPC}(K)) \ge 0$ hold, then also

\[ J_\infty^{cl}(x_{MPC}(0),\mu_N) \le V_N(x_{MPC}(0)) + \hat\varepsilon \]

holds. The same statements are true when $V_N$ and $\mu_N$ are replaced by their terminal conditioned counterparts $V_N^{tc}$ and $\mu_N^{tc}$, respectively.

Proof. Observing that $x_{MPC}(k+1) = f(x,\mu_N(x))$ for $x = x_{MPC}(k)$ and using (15) with this $x$ we have

\begin{align*} J_K^{cl}(x_{MPC}(0),\mu_N) &= \sum_{k=0}^{K-1}\ell(x_{MPC}(k),\mu_N(x_{MPC}(k)))\\ &\le \sum_{k=0}^{K-1}\bigl[V_N(x_{MPC}(k)) - V_N(x_{MPC}(k+1)) + \varepsilon_k\bigr]\\ &= V_N(x_{MPC}(0)) - V_N(x_{MPC}(K)) + \sum_{k=0}^{K-1}\varepsilon_k, \end{align*}

which shows the first claim. The second claim follows from the first by taking the upper limit for $K\to\infty$. The proof for the terminal conditioned case is identical. $\square$

4 Stabilizing MPC

Using the dynamic programming results just stated, we will now derive estimates for $J_\infty^{cl}$ in the case of stabilizing MPC. Stabilizing MPC refers to the case in which the stage cost $\ell$ penalizes the distance to a desired equilibrium. More precisely, let $(x^\ast,u^\ast)\in\mathbb{Y}$ be an equilibrium, i.e., $f(x^\ast,u^\ast) = x^\ast$. Then throughout this section we assume that there is $\alpha_1\in\mathcal{K}_\infty$ (the space $\mathcal{K}_\infty$ consists of all continuous, strictly increasing and unbounded functions $\alpha:[0,\infty)\to[0,\infty)$ with $\alpha(0)=0$) such that $\ell$ satisfies

\[ \ell(x^\ast,u^\ast) = 0 \quad\text{and}\quad \ell(x,u) \ge \alpha_1(|x|_{x^\ast}) \tag{17} \]

for all $x\in\mathbb{X}$; for example, $\ell(x,u) = |x|_{x^\ast}^2 + |u|_{u^\ast}^2$ satisfies (17) with $\alpha_1(r) = r^2$. Moreover, for the terminal cost $F$ we assume

\[ F(x) \ge 0 \quad\text{for all } x\in\mathbb{X}_0. \tag{18} \]

We note that (18) trivially holds in case no terminal cost is used, i.e., if $F\equiv 0$.

The purpose of this choice of $\ell$ is to force the optimal trajectories — and thus hopefully also the MPC trajectories — to converge to $x^\ast$. The following proposition shows that this hope is justified under suitable conditions, where the approximate dynamic programming inequality (15) plays a pivotal role.

Proposition 3 Let the assumptions of Proposition 2, (17) and (18) (in case of terminal conditions) hold with $\varepsilon(x) \le \eta\,\alpha_1(|x|_{x^\ast})$ for all $x\in\mathbb{X}$ and some $\eta<1$. Then $x_{MPC}(k)\to x^\ast$ as $k\to\infty$.

Proof. We first observe that the assumptions imply $V_N(x)\ge 0$ or $V_N^{tc}(x)\ge 0$, respectively. We continue the proof for $V_N$; the proof for $V_N^{tc}$ is identical. Assume $x_{MPC}(k)\not\to x^\ast$, i.e., there are $\delta>0$ and a sequence $k_p\to\infty$ with $|x_{MPC}(k_p)|_{x^\ast}\ge\delta$ for all $p\in\mathbb{N}$. Then by induction over (15) with $x = x_{MPC}(k)$ we get

\begin{align*} V_N(x_{MPC}(K)) &\le V_N(x_{MPC}(0)) - \sum_{k=0}^{K-1}\bigl[\ell(x_{MPC}(k),\mu_N(x_{MPC}(k))) - \varepsilon(x_{MPC}(k))\bigr]\\ &\le V_N(x_{MPC}(0)) - \sum_{k=0}^{K-1}(1-\eta)\,\alpha_1(|x_{MPC}(k)|_{x^\ast})\\ &\le V_N(x_{MPC}(0)) - \sum_{\substack{p\in\mathbb{N}\\ k_p\le K-1}}(1-\eta)\,\alpha_1(|x_{MPC}(k_p)|_{x^\ast})\\ &\le V_N(x_{MPC}(0)) - \#\{p\in\mathbb{N} \mid k_p\le K-1\}\,(1-\eta)\,\alpha_1(\delta). \end{align*}

Now as $K\to\infty$ the number $\#\{p\in\mathbb{N} \mid k_p\le K-1\}$ grows unboundedly, which implies that $V_N(x_{MPC}(K)) < 0$ for sufficiently large $K$, contradicting the non-negativity of $V_N$. $\square$

We remark that under additional conditions (essentially appropriate upper bounds on $V_N$ or $V_N^{tc}$, respectively), asymptotic stability of $x^\ast$ can also be established, see, e.g., [14, Theorem 4.11] or [28, Theorem 2.22].

4.1 Terminal conditions

In this section we use the terminal conditions in order to ensure that the approximate dynamic programming inequality (15) holds with $\varepsilon(x)\le 0$ and $V_N^{tc}(x)\ge 0$. Then Proposition 2 applies and yields $J_\infty^{cl}(x_{MPC}(0),\mu_N^{tc}) \le V_N^{tc}(x_{MPC}(0))$, while Proposition 3 implies $x_{MPC}(k)\to x^\ast$. The key for making this approach work is the following assumption.

Assumption 4 For each $x\in\mathbb{X}_0$ there is $u_x\in U$ with $(x,u_x)\in\mathbb{Y}$, $f(x,u_x)\in\mathbb{X}_0$ and

\[ \ell(x,u_x) + F(f(x,u_x)) \le F(x). \]

While conditions like Assumption 4 were already developed in the 1990s, e.g., in [7, 22, 25], it was the paper [23] published in 2000 which established this condition as the standard assumption for stabilizing MPC with terminal conditions. The particular case $\mathbb{X}_0 = \{x^\ast\}$ was investigated in detail already in the 1980s in the seminal paper [19].

Theorem 5. Consider the MPC scheme with terminal conditions satisfying (17), (18) and Assumption 4. Then the inequality $J_\infty^{cl}(x,\mu_N^{tc}) \le V_N^{tc}(x)$ and the convergence $x_{MPC}(k)\to x^\ast$ for $k\to\infty$ hold for all $x\in\mathbb{X}_N$ and the closed-loop solution $x_{MPC}(k)$ with $x_{MPC}(0) = x$.

Proof. As explained before the theorem, it is sufficient to prove (15) with $\varepsilon(x)\le 0$ and $V_N^{tc}(x)\ge 0$; then Propositions 2 and 3 yield the assertions. The inequality $V_N^{tc}(x)\ge 0$ is immediate from (17) and (18). For proving (15) with $\varepsilon(x)\le 0$, using $u_x$ from Assumption 4 with $x = x_u(N-1,x_0)$ we get

\begin{align*} V_{N-1}^{tc}(x_0) &= \inf_{u\in\mathbb{U}^{N-1}_0(x_0)} \sum_{k=0}^{N-2}\ell(x_u(k,x_0),u(k)) + F(x_u(N-1,x_0))\\ &\ge \inf_{u\in\mathbb{U}^{N-1}_0(x_0)} \sum_{k=0}^{N-2}\ell(x_u(k,x_0),u(k)) + \ell(x,u_x) + F(f(x,u_x))\\ &\ge \inf_{u\in\mathbb{U}^{N}_0(x_0)} \sum_{k=0}^{N-1}\ell(x_u(k,x_0),u(k)) + F(x_u(N,x_0)) \;=\; V_N^{tc}(x_0). \end{align*}

Inserting this inequality for $x_0 = f(x,\mu_N^{tc}(x))$ into (11) we obtain

\[ V_N^{tc}(x) = \ell(x,\mu_N^{tc}(x)) + V_{N-1}^{tc}(f(x,\mu_N^{tc}(x))) \ge \ell(x,\mu_N^{tc}(x)) + V_N^{tc}(f(x,\mu_N^{tc}(x))) \]

and thus (15) with $\varepsilon\equiv 0$. $\square$

A drawback of the inequality $J_\infty^{cl}(x,\mu_N^{tc}) \le V_N^{tc}(x)$ is that it is in general quite difficult to give estimates for $V_N^{tc}(x)$. Under reasonable assumptions it can be shown that $V_N^{tc}(x)\to V_\infty(x)$ for $N\to\infty$ [14, Section 5.4]. This implies that the MPC solution is near optimal for the infinite horizon problem for $N$ sufficiently large. However, it is in general difficult to make statements about the speed of the convergence $V_N^{tc}(x)\to V_\infty(x)$ as $N\to\infty$, and thus to estimate the length of the horizon $N$ which is needed for a desired degree of suboptimality.

4.2 No terminal conditions

The decisive property induced by Assumption 4 and exploited in the proof of Theorem 5 is the fact that $V_{N-1}^{tc}(x_0)\ge V_N^{tc}(x_0)$. Without this inequality, (11) implies that (15) with $\varepsilon\equiv 0$ cannot in general be satisfied. Without terminal conditions and under the condition (17) it is, however, straightforward to see that the opposite inequality $V_{N-1}(x_0)\le V_N(x_0)$ holds, where in most cases this inequality is strict. This means that without terminal conditions we need to work with positive $\varepsilon$. The following proposition, which was motivated by a similar “relaxed dynamic programming” inequality used in [21], introduces a variant of Proposition 2 which we will use for this purpose.

Proposition 6 Assume there is a constant $\alpha\in(0,1]$ such that the relaxed dynamic programming inequality

\[ V_N(x) \ge \alpha\,\ell(x,\mu_N(x)) + V_N(f(x,\mu_N(x))) \tag{19} \]

holds for all $x\in\mathbb{X}$. Then for each MPC closed-loop solution $x_{MPC}$ the inequality

\[ J_\infty^{cl}(x_{MPC}(0),\mu_N) \le V_\infty(x_{MPC}(0))/\alpha \]

and, if additionally (17) holds, the convergence $x_{MPC}(k)\to x^\ast$ for $k\to\infty$ hold.

Proof. Applying Proposition 2 with $\varepsilon(x) = (1-\alpha)\,\ell(x,\mu_N(x))$ yields

\[ J_K^{cl}(x_{MPC}(0),\mu_N) \le V_N(x_{MPC}(0)) - V_N(x_{MPC}(K)) + (1-\alpha)\underbrace{\sum_{k=0}^{K-1}\ell(x_{MPC}(k),\mu_N(x_{MPC}(k)))}_{=\,J_K^{cl}(x_{MPC}(0),\mu_N)}. \]

Using $V_N\ge 0$ this implies $\alpha\,J_K^{cl}(x_{MPC}(0),\mu_N) \le V_N(x_{MPC}(0)) \le V_\infty(x_{MPC}(0))$ (the last inequality holds since $\ell\ge 0$ by (17) implies $V_N\le V_\infty$), which implies the first assertion by letting $K\to\infty$ and dividing by $\alpha$. The convergence $x_{MPC}(k)\to x^\ast$ follows from Proposition 3. $\square$

A simple condition under which we can guarantee that (19) holds is given in the following assumption.

Assumption 7 There are constants $\gamma_k > 0$, $k\in\mathbb{N}$, with $\sup_{k\in\mathbb{N}}\gamma_k < \infty$ and

\[ V_k(x) \le \gamma_k \inf_{u\in U,\,(x,u)\in\mathbb{Y}} \ell(x,u). \]

A sufficient condition for Assumption 7 to hold is that $\ell$ is a polynomial satisfying (17) and the system can be controlled to $x^\ast$ exponentially fast. However, via an appropriate choice of $\ell$, Assumption 7 can also be satisfied if the system is not exponentially controllable, see, e.g., [14, Example 6.7].
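For simple low-dimensional examples, the constants $\gamma_k$ can be estimated numerically. The sketch below does this by value iteration on a state grid for the illustrative system $f(x,u) = x+u$ with $\ell(x,u) = x^2+u^2$ from the earlier sketch; the grid sizes, the state constraint $\mathbb{X} = [-1,1]$ and the exclusion of small $|x|$ (where the grid ratio $V_k(x)/x^2$ is unreliable) are assumptions of this illustration.

```python
# Hedged numerical estimate of gamma_k in Assumption 7 via value iteration,
# V_k(x) = min_u [ l(x,u) + V_{k-1}(f(x,u)) ] (cf. (4) with K = 1), for the
# illustrative system f(x,u) = x + u, l(x,u) = x^2 + u^2 on X = [-1, 1].
import numpy as np

xs = np.linspace(-1.0, 1.0, 401)
us = np.linspace(-2.0, 2.0, 801)
V = np.zeros_like(xs)                          # V_0 = 0
ell_star = xs**2                               # inf_u l(x,u) = x^2 here
mask = np.abs(xs) >= 0.1                       # avoid 0/0 and grid noise near x* = 0
for k in range(1, 7):
    xn = xs[:, None] + us[None, :]             # successor states f(x,u)
    Q = xs[:, None]**2 + us[None, :]**2 + np.interp(xn, xs, V)
    Q = np.where(np.abs(xn) <= 1.0, Q, np.inf) # enforce x(k+1) in X
    V = Q.min(axis=1)
    print(k, float(np.max(V[mask] / ell_star[mask])))  # approx. gamma_k
# for this linear quadratic example the printed values increase towards
# the exact limit (1 + sqrt(5))/2 ~ 1.618
```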

The following proposition, taken with modifications from [29], shows that Assumption 7 implies (19).

Proposition 8 Consider the MPC scheme without terminal conditions satisfying Assumption 7. Then (19) holds with

\[ \alpha = 1 - (\gamma_2-1)(\gamma_N-1)\prod_{k=2}^{N-1}\frac{\gamma_k-1}{\gamma_k}. \]

Proof. First note that for $x = x^\ast$ (19) always holds because all expressions vanish. For $x\neq x^\ast$, we consider the MPC solution $x_{MPC}(\cdot)$ with $x_{MPC}(0) = x$, abbreviate $\lambda_k = \ell(x_{u^\star_N}(k,x),u^\star_N(k))$ with $u^\star_N$ denoting the optimal control for initial value $x_0 = x$, and $\nu = V_N(f(x,\mu_N(x))) = V_N(x_{MPC}(1))$. Then (19) becomes

\[ \sum_{k=0}^{N-1}\lambda_k - \nu \ge \alpha\,\lambda_0. \tag{20} \]

We prove the proposition by showing the inequality

\[ \lambda_{N-1} \le (\gamma_N-1)\prod_{k=2}^{N-1}\frac{\gamma_k-1}{\gamma_k}\,\lambda_0 \tag{21} \]

for all feasible $\lambda_0,\ldots,\lambda_{N-1}$. From this (20) follows since the dynamic programming equation (4) with $x = x_{MPC}(1)$ and $K = N-2$ implies

\[ \nu \le \sum_{n=1}^{N-2}\ell(x_{u^\star_N}(n,x),u^\star_N(n)) + V_2(x_{u^\star_N}(N-1,x)) \le \sum_{n=1}^{N-2}\lambda_n + \gamma_2\,\lambda_{N-1}, \]

and thus (21) together with $\gamma_2\ge 1$ and $\lambda_{N-1}\ge 0$ yields

\[ \sum_{n=0}^{N-1}\lambda_n - \nu \ge \lambda_0 + (1-\gamma_2)\,\lambda_{N-1} \ge \lambda_0 - (\gamma_2-1)(\gamma_N-1)\prod_{k=2}^{N-1}\frac{\gamma_k-1}{\gamma_k}\,\lambda_0 = \alpha\,\lambda_0, \]

i.e., (20). In order to prove (21), we start by observing that since $u_K := (u^\star_N(K),\ldots,u^\star_N(N-1))$ is an optimal control for initial value $x_{u^\star_N}(K,x)$ and horizon $N-K$, we obtain $\sum_{k=p}^{N-1}\lambda_k = V_{N-p}(x_{u^\star_N}(p,x)) \le \gamma_{N-p}\,\lambda_p$, which implies

\[ \sum_{k=p+1}^{N-1}\lambda_k \le (\gamma_{N-p}-1)\,\lambda_p \tag{22} \]

for $p = 0,\ldots,N-2$. From this we can conclude

\[ \lambda_p + \sum_{k=p+1}^{N-1}\lambda_k \;\ge\; \frac{\sum_{k=p+1}^{N-1}\lambda_k}{\gamma_{N-p}-1} + \sum_{k=p+1}^{N-1}\lambda_k \;=\; \frac{\gamma_{N-p}}{\gamma_{N-p}-1}\sum_{k=p+1}^{N-1}\lambda_k. \]

Using this inequality inductively for $p = 1,\ldots,N-2$ yields

\[ \sum_{k=1}^{N-1}\lambda_k \;\ge\; \prod_{k=1}^{N-2}\frac{\gamma_{N-k}}{\gamma_{N-k}-1}\,\lambda_{N-1} \;=\; \prod_{k=2}^{N-1}\frac{\gamma_k}{\gamma_k-1}\,\lambda_{N-1}. \]

Using (22) for $p = 0$ we then obtain

\[ (\gamma_N-1)\,\lambda_0 \;\ge\; \sum_{k=1}^{N-1}\lambda_k \;\ge\; \prod_{k=2}^{N-1}\frac{\gamma_k}{\gamma_k-1}\,\lambda_{N-1}, \]

which implies (21). $\square$

This proposition immediately leads to the following theorem.

Theorem 9. Consider the MPC scheme without terminal conditions satisfying Assumption 7. Then for all sufficiently large $N\in\mathbb{N}$, the inequality $J_\infty^{cl}(x,\mu_N) \le V_\infty(x)/\alpha$ and the convergence $x_{MPC}(k)\to x^\ast$ for $k\to\infty$ hold for all $x\in\mathbb{X}$ and the closed-loop solution $x_{MPC}(k)$ with $x_{MPC}(0) = x$, with $\alpha$ from Proposition 8.

Proof. Since $\gamma := \sup_{k\in\mathbb{N}}\gamma_k < \infty$, it follows that $(\gamma_k-1)/\gamma_k \le (\gamma-1)/\gamma < 1$ for all $k\in\mathbb{N}$, implying that $\alpha$ from Proposition 8 satisfies $\alpha\in(0,1]$ for sufficiently large $N$. For these $N$ the assertion follows from Proposition 6. $\square$


We note that $\alpha$ from Proposition 8 is not optimal. In [15] (see also [30] and [14, Chapter 6]) the optimal bound

\[ \alpha = 1 - \frac{(\gamma_N-1)\prod_{k=2}^{N}(\gamma_k-1)}{\prod_{k=2}^{N}\gamma_k - \prod_{k=2}^{N}(\gamma_k-1)} \tag{23} \]

is derived, however, at the expense of a much more involved proof than that of Proposition 8. The difference between the two bounds can be illustrated if we assume $\gamma_k = \gamma$ for all $k\in\mathbb{N}$ and compute the minimal $N\in\mathbb{N}$ such that $\alpha > 0$ holds, i.e., the minimal $N$ for which Theorem 9 ensures the convergence $x_{MPC}(k)\to x^\ast$. For $\alpha$ from Proposition 8 we obtain the condition $N > 2 + 2\ln(\gamma-1)/(\ln\gamma - \ln(\gamma-1))$, while for $\alpha$ from (23) we obtain $N > 2 + \ln(\gamma-1)/(\ln\gamma - \ln(\gamma-1))$. The optimal $\alpha$ hence reduces the estimate for $N$ roughly by a factor of 2.
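The two horizon conditions are easy to evaluate numerically. The following sketch computes the smallest integer $N$ satisfying each condition for a few values of $\gamma$ (the chosen values of $\gamma$ are arbitrary illustrations):

```python
# Compare the minimal horizon N guaranteeing alpha > 0 for the bound from
# Proposition 8 versus the optimal bound (23), assuming gamma_k = gamma > 1.
import math

def n_min(gamma, factor):
    # smallest integer N with N > 2 + factor * ln(gamma-1)/(ln gamma - ln(gamma-1))
    return math.floor(2 + factor * math.log(gamma - 1)
                      / (math.log(gamma) - math.log(gamma - 1))) + 1

for gamma in (2.0, 5.0, 10.0):
    print(gamma, n_min(gamma, 2.0), n_min(gamma, 1.0))  # Prop. 8 vs. (23)
```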

The analysis can be extended to the situation in which $\alpha$ in (19) cannot be found for all $x\in\mathbb{X}$. In this case, one can proceed similarly as in the discussion after Theorem 15, below, in order to obtain practical asymptotic stability, i.e., inequality (34), on bounded subsets of $\mathbb{X}$.

5 Economic MPC

Economic MPC has become the common name for MPC schemes in which the stage cost $\ell$ does not penalize the distance to an equilibrium $x^\ast$ which was determined a priori. Rather, $\ell$ models economic objectives, like high output, low energy consumption etc., or a combination thereof.

For such general $\ell$, many of the arguments from the previous section do not work, for several reasons. First, the cost $J_N$ and thus the optimal value function $V_N$ is not necessarily nonnegative, a fact which was exploited in several places in the proofs in the last section. Second, the infinite sum in the infinite horizon objective need not converge, and thus it may not make sense to talk about infinite horizon performance.

Finally, optimal trajectories need not stay close to or converge to an equilibrium, again a fact that was used in various places in the last section.

A systems theoretic property which effectively serves as a remedy for all these difficulties is contained in the following definition.

Definition 10 (Strict Dissipativity and Dissipativity) We say that an optimal control problem with stage cost $\ell$ is strictly dissipative at an equilibrium $(x^e,u^e)\in\mathbb{Y}$ if there exists a storage function $\lambda:\mathbb{X}\to\mathbb{R}$, bounded from below and satisfying $\lambda(x^e) = 0$, and a function $\rho\in\mathcal{K}_\infty$ such that for all $(x,u)\in\mathbb{Y}$ the inequality

\[ \ell(x,u) - \ell(x^e,u^e) + \lambda(x) - \lambda(f(x,u)) \ge \rho(|x|_{x^e}) \tag{24} \]

holds. We say that an optimal control problem with stage cost $\ell$ is dissipative at $(x^e,u^e)$ if the same conditions hold with $\rho\equiv 0$.


We note that the assumption $\lambda(x^e) = 0$ can be made without loss of generality because adding a constant to $\lambda$ does not invalidate (24).
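To make (24) concrete, the following sketch checks the inequality numerically on a grid for a simple scalar example: $f(x,u) = 2x+u$ with economic stage cost $\ell(x,u) = u^2$. The candidate storage function $\lambda(x) = -x^2/2$ and the function $\rho(r) = r^2/2$ are assumptions of this illustration, not taken from the chapter.

```python
# Hedged numerical check of the strict dissipativity inequality (24) for the
# illustrative example f(x,u) = 2x + u, l(x,u) = u^2, (x^e,u^e) = (0,0),
# candidate storage lambda(x) = -x^2/2 and rho(r) = r^2/2.
import numpy as np

f = lambda x, u: 2 * x + u
ell = lambda x, u: u**2
lam = lambda x: -x**2 / 2
rho = lambda r: r**2 / 2

xs = np.linspace(-1.0, 1.0, 201)          # bounded X keeps lambda bounded below
us = np.linspace(-3.0, 3.0, 201)
X, U = np.meshgrid(xs, us)
# left-hand side of (24) minus its right-hand side; nonnegative iff (24) holds
gap = ell(X, U) - ell(0, 0) + lam(X) - lam(f(X, U)) - rho(np.abs(X))
print("min gap on the grid:", gap.min())  # 0.0, attained at (x,u) = (0,0)
```

Algebraically, the gap equals $\tfrac{3}{2}\bigl(u+\tfrac{2}{3}x\bigr)^2 + \tfrac{1}{3}x^2 \ge 0$, so the check succeeds exactly.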

The observation that strict dissipativity is the “right” property in order to analyze economic MPC schemes was first made by Diehl, Amrit and Rawlings in [8], where strict duality, i.e., strict dissipativity with a linear storage function, was used. The extension to the nonlinear notion of strict dissipativity was then made by Angeli and Rawlings in [3]. Although recent studies show that for certain classes of systems this property can be further (slightly) relaxed (see [26]), here we work with this condition because it provides a mathematically elegant way for dealing with economic MPC.

Remark 11 Strict dissipativity implies several important properties:

(i) The equilibrium $(x^e,u^e)\in\mathbb{Y}$ from Definition 10 is a strict optimal equilibrium in the sense that $\ell(x^e,u^e) < \ell(x,u)$ for all other admissible equilibria of $f$, i.e., all other $(x,u)\in\mathbb{Y}$ with $f(x,u) = x$. This follows immediately from (24).

(ii) The optimal equilibrium $x^e$ has the turnpike property, i.e., the following holds: for each $\delta>0$ there exists $\sigma_\delta\in\mathcal{L}$ (the space $\mathcal{L}$ contains all continuous and strictly decreasing functions $\sigma:[0,\infty)\to[0,\infty)$ with $\lim_{t\to\infty}\sigma(t) = 0$) such that for all $N,P\in\mathbb{N}$, $x\in\mathbb{X}$ and $u\in\mathbb{U}^N(x)$ with $J_N(x,u)\le N\ell(x^e,u^e)+\delta$, the set $\mathcal{Q}(x,u,P,N) := \{k\in\{0,\ldots,N-1\} \mid |x_u(k,x)|_{x^e}\ge\sigma_\delta(P)\}$ has at most $P$ elements. A proof of this fact can be found, e.g., in [14, Proposition 8.15]. The same property holds for all near optimal trajectories of the infinite horizon problem, provided it is well defined, cf. [14, Proposition 8.18].

(iii) If we define the modified or rotated cost $\tilde\ell(x,u) := \ell(x,u) - \ell(x^e,u^e) + \lambda(x) - \lambda(f(x,u))$, then this modified cost satisfies (17), i.e., the basic property we exploited in the previous section.

The third property enables us to use the optimal control problem with modified cost $\tilde\ell$ as an auxiliary problem in our analysis. The way this auxiliary problem is used crucially depends on whether we use terminal conditions or not. We start with the case with terminal conditions. Throughout this section, we assume that all functions under consideration are continuous in $x^e$.

5.1 Terminal conditions

For the economic MPC problem with terminal conditions we make exactly the same assumptions on the terminal constraint set $\mathbb{X}_0$ and the terminal cost $F$ as in the stabilizing case, i.e., we again use Assumption 4. We assume without loss of generality that $F(x^e) = 0$, which implies that $F$ may attain negative values, because $\ell$ may be negative, too.



Now the main trick — taken from [1] — lies in the fact that we introduce an adapted terminal cost for the problem with the modified cost $\tilde\ell$. To this end, we define the terminal cost $\widetilde F(x) := F(x) + \lambda(x)$. We denote the cost functionals for the modified problems without and with terminal conditions by $\widetilde J_N$ and $\widetilde J_N^{tc}$, respectively, and the corresponding optimal value functions by $\widetilde V_N$ and $\widetilde V_N^{tc}$. Then a straightforward computation reveals that

\[ \widetilde J_N^{tc}(x,u) = J_N^{tc}(x,u) + \lambda(x) - N\ell(x^e,u^e), \tag{25} \]

which means that the original and the modified optimization objectives only differ in terms which do not depend on $u$. Hence, the optimal trajectories corresponding to $\widetilde V_N^{tc}$ and $V_N^{tc}$ coincide, and the MPC scheme using the modified costs $\tilde\ell$ and $\widetilde F$ yields exactly the same closed-loop trajectories as the scheme using $\ell$ and $F$.
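For completeness, the computation behind (25) is the telescoping sum (using the definitions of $\tilde\ell$ and $\widetilde F$ above):

\[ \widetilde J_N^{tc}(x,u) = \sum_{k=0}^{N-1}\bigl[\ell(x(k),u(k)) - \ell(x^e,u^e) + \lambda(x(k)) - \lambda(x(k+1))\bigr] + F(x(N)) + \lambda(x(N)) = J_N^{tc}(x,u) + \lambda(x(0)) - N\ell(x^e,u^e), \]

since the $\lambda$-terms telescope and $x(0) = x$.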

One easily sees that $\widetilde F$ and $\tilde\ell$ also satisfy Assumption 4, i.e., that for each $x\in\mathbb{X}_0$ there is $u_x\in U$ with $(x,u_x)\in\mathbb{Y}$, $f(x,u_x)\in\mathbb{X}_0$ and

\[ \tilde\ell(x,u_x) + \widetilde F(f(x,u_x)) \le \widetilde F(x), \tag{26} \]

if $\ell$ and $F$ satisfy this property.

Moreover, if $\widetilde F$ is bounded on $\mathbb{X}_0$, then (26) implies $\widetilde F(x)\ge 0$ for all $x\in\mathbb{X}_0$. In order to see this, assume $\widetilde F(x_0) < 0$ for some $x_0\in\mathbb{X}_0$ and consider the control sequence defined by $u(k) = u_x$ with $u_x$ from (26) for $x = x_u(k,x_0)$. Then $\tilde\ell\ge 0$ implies $\widetilde F(x_u(k,x_0))\le\widetilde F(x_0) < 0$ for all $k\in\mathbb{N}$. Moreover, similarly as in the proof of Proposition 3, the fact that $\tilde\ell$ satisfies (17) implies that $x_u(k,x_0)\to x^e$, because otherwise $\widetilde F(x_u(k,x_0))\to-\infty$, which contradicts the boundedness of $\widetilde F$. But then continuity of $\widetilde F$ in $x^e$ implies

\[ \widetilde F(x^e) = \lim_{k\to\infty}\widetilde F(x_u(k,x_0)) \le \widetilde F(x_0) < 0, \]

which contradicts $\widetilde F(x^e) = 0$. Hence $\widetilde F(x)\ge 0$ follows for all $x\in\mathbb{X}_0$ (for a more detailed proof see [14, Proof of Theorem 8.13]).

As a consequence, the problem with the modified costs $\tilde\ell$ and $\widetilde F$ satisfies all the properties we assumed for the results in Section 4.1. Hence, Theorem 5 applies and yields the convergence $x_{MPC}(k)\to x^e$ and the performance estimate

\[ \widetilde J_\infty^{cl}(x,\mu_N^{tc}) \le \widetilde V_N^{tc}(x). \]

As in the stabilizing case, under suitable conditions we obtain $\widetilde V_N^{tc}(x)\to\widetilde V_\infty(x)$ as $N\to\infty$. However, this only gives an estimate for the modified objective $\widetilde J_\infty^{cl}$ with stage cost $\tilde\ell$, but not for the original objective $J_\infty^{cl}$ with stage cost $\ell$.

In order to obtain an estimate for $J_\infty^{cl}$, one can proceed in two different ways. Either one assumes $\ell(x^e,u^e) = 0$ (which can always be achieved by subtracting $\ell(x^e,u^e)$ from $\ell$) and that the infinite horizon problem is well defined, which in particular means that $|V_\infty(x)|$ is finite. Then, from the definition of the problems, one sees that the relations

\[ \widetilde J_\infty^{cl}(x,\mu_N^{tc}) = J_\infty^{cl}(x,\mu_N^{tc}) - \lim_{k\to\infty}\lambda(x_{MPC}(k)) \]

and

\[ \widetilde V_\infty(x) \le V_\infty(x) - \lim_{k\to\infty}\lambda(x_{u^\star_\infty}(k,x)) \quad\text{and}\quad V_\infty(x) \le \widetilde V_\infty(x) + \lim_{k\to\infty}\lambda(x_{\tilde u^\star_\infty}(k,x)) \]

hold for $x_{MPC}(0) = x$ and $\tilde u^\star_\infty$ and $u^\star_\infty$ denoting the optimal controls corresponding to $\widetilde V_\infty(x)$ and $V_\infty(x)$, respectively.

Now strict dissipativity implies $x_{\tilde u^\star_\infty}(k,x)\to x^e$ and $x_{u^\star_\infty}(k,x)\to x^e$ as $k\to\infty$ (for details see [14, Proposition 8.18]); moreover, we already know that $x_{MPC}(k)\to x^e$ as $k\to\infty$. Since $\lambda(x^e) = 0$ and $\lambda$ is continuous in $x^e$, this implies

\[ J_\infty^{cl}(x,\mu_N^{tc}) \to V_\infty(x) \]

as $N\to\infty$, i.e., near optimal infinite horizon performance of the MPC closed loop for sufficiently large $N$.

The second way to obtain an estimate is to look at $J_K^{cl}(x,\mu_N^{tc})$, which avoids setting $\ell(x^e,u^e) = 0$ and making assumptions on $|V_\infty|$. However, while $x_{MPC}(k)\to x^e$, in the economic MPC context — even in the presence of strict dissipativity — the optimal trajectory $x_{u^\star_N}(k,x)$ will in general not end near $x^e$, see, e.g., the examples in [11, 12] or [14, Chapter 8]. Hence, comparing $J_K^{cl}(x,\mu_N^{tc})$ and $V_K(x)$ will in general not be meaningful. However, if for $x = x_{MPC}(0)$ we set $\delta(k) := |x_{MPC}(k)|_{x^e}$ and define the class of controls

\[ \mathbb{U}^K_{\delta(K)}(x) := \{u\in\mathbb{U}^K(x) \mid |x_u(K,x)|_{x^e}\le\delta(K)\}, \tag{27} \]

then it makes sense to compare $J_K^{cl}(x,\mu_N^{tc})$ and $\inf_{u\in\mathbb{U}^K_{\delta(K)}(x)} J_K(x,u)$. More precisely, in [13] (see also [14, Section 8.4]) it was shown that there are error terms $\delta_1(N)$ and $\delta_2(K)$, converging to 0 as $N\to\infty$ or $K\to\infty$, respectively, such that the estimate

\[ J_K^{cl}(x,\mu_N^{tc}) \le \inf_{u\in\mathbb{U}^K_{\delta(K)}(x)} J_K(x,u) + \delta_1(N) + \delta_2(K) \tag{28} \]

holds. In other words, among all solutions steering $x$ into the $\delta(K)$-neighborhood of the optimal equilibrium $x^e$, MPC yields the cheapest one, up to error terms vanishing as $K$ and $N$ become large.

In summary, except for inequality (28) which requires additional arguments, by using terminal conditions the analysis of economic MPC schemes is not much more difficult than the analysis of stabilizing MPC schemes. However, in contrast to the stabilizing case, so far no systematic procedure for the construction of terminal costs and constraint sets satisfying (26) is known. Hence, it appears attractive to avoid the use of terminal conditions.


5.2 No terminal conditions

If we want to avoid the use of terminal conditions, the analysis becomes considerably more involved. The reason is that without terminal conditions the relation (25) changes to

\[ \widetilde J_N(x,u) = J_N(x,u) + \lambda(x) - \lambda(x_u(N,x)) - N\ell(x^e,u^e). \tag{29} \]

This means that the difference between $J_N$ and $\widetilde J_N$ now depends on $u$, and consequently the optimal trajectories no longer coincide. Moreover, the central property exploited in the proof of Proposition 8, whose counterpart in the setting of this section would be that $\lambda_{N-1} = \ell(x_{u^\star_N}(N-1,x),u^\star_N(N-1))$ is close to $\ell(x^e,u^e)$, is in general not true for economic MPC, not even for simple examples, see [11, 12] or [14, Chapter 8]. Hence, we cannot expect the arguments from the stabilizing case to work.

For these reasons, we have to use different arguments, which are combinations of arguments found in [11, 12, 18]. To this end we make the following assumptions.

Assumption 12 (i) The optimal control problem is strictly dissipative in the sense of Definition 10.

(ii) There exist functions $\gamma_V,\gamma_{\widetilde V},\gamma_\lambda\in\mathcal{K}_\infty$ as well as $\omega,\tilde\omega\in\mathcal{L}$ such that the following inequalities hold for all $x\in\mathbb{X}$ and all $N\in\mathbb{N}$:

(a) $|V_N(x) - V_N(x^e)| \le \gamma_V(|x|_{x^e}) + \omega(N)$
(b) $|\widetilde V_N(x) - \widetilde V_N(x^e)| \le \gamma_{\widetilde V}(|x|_{x^e}) + \tilde\omega(N)$
(c) $|\lambda(x) - \lambda(x^e)| \le \gamma_\lambda(|x|_{x^e})$

Part (ii) of this assumption is a uniform continuity assumption in $x^e$. For the optimal value functions $V_N$ and $\widetilde V_N$ it can, e.g., be guaranteed by local controllability around $x^e$, see [11, Theorem 6.4]. We note that this assumption together with the obvious inequality $V_N(x^e)\le N\ell(x^e,u^e)$ and boundedness of $\mathbb{X}$ implies $V_N(x)\le N\ell(x^e,u^e)+\delta$ with $\delta = \sup_{x\in\mathbb{X}}\gamma_V(|x|_{x^e}) + \omega(0)$. Hence, the optimal trajectories have the turnpike property according to Remark 11(ii).

For writing (in)equalities that hold up to an error term, we use the following convenient notation: for a sequence of functions $a_J:\mathbb{X}\to\mathbb{R}$, $J\in\mathbb{N}$, and another function $b:\mathbb{X}\to\mathbb{R}$, we write $a_J(x)\approx_J b(x)$ if $\lim_{J\to\infty}\sup_{x\in\mathbb{X}}|a_J(x)-b(x)| = 0$, and we write $a_J(x)\lesssim_J b(x)$ if $\limsup_{J\to\infty}\sup_{x\in\mathbb{X}}\bigl(a_J(x)-b(x)\bigr)\le 0$. In words, $\approx_J$ means “$=$ up to terms which are independent of $x$ and vanish as $J\to\infty$”, and $\lesssim_J$ means the same for “$\le$”.

With these assumptions and this notation we can now prove the following relations. For simplicity of exposition, in what follows we limit ourselves to a bounded state space $\mathbb{X}$. If this is not satisfied, the following considerations can be made for bounded subsets of $\mathbb{X}$. As we will see, dynamic programming arguments are ubiquitous in the following considerations.


Lemma 13 Let $\mathbb{X}$ be bounded. Then under Assumption 12 the following approximate equalities hold:

(i) $V_N(x) \approx_S J_M(x,u^\star_N) + V_{N-M}(x^e)$ for all $M\notin\mathcal{Q}(x,u^\star_N,P,N)$
(ii) $V_N(x^e) \approx_S M\ell(x^e,u^e) + V_{N-M}(x^e)$ for all $M\notin\mathcal{Q}(x^e,u^{e\star}_N,P,N)$
(iii) $\widetilde V_N(x) \approx_N V_N(x) + \lambda(x) - V_N(x^e)$

Here $P\in\mathbb{N}$ is an arbitrary number, $S := \min\{P,N-M\}$, $u^\star_N$ is the control minimizing $J_N(x,u)$, $u^{e\star}_N$ is the control minimizing $J_N(x^e,u)$, and $\mathcal{Q}$ is the set from Remark 11(ii). Moreover, (i) and (ii) also apply to the optimal control problem with stage cost $\tilde\ell$.

Proof. (i) Observe that using the constant control $u\equiv u^e$ we can estimate $V_N(x^e)\le J_N(x^e,u) = N\ell(x^e,u^e)$. Thus, using Assumption 12 we get $J_N(x,u^\star_N)\le N\ell(x^e,u^e) + \gamma_V(|x|_{x^e}) + \omega(N)$, hence the turnpike property from Remark 11(ii) applies to the optimal trajectory with $\delta = \gamma_V(|x|_{x^e}) + \omega(N)$. This in particular ensures $|x_{u^\star_N}(M,x)|_{x^e}\le\sigma_\delta(P)$ for all $M\notin\mathcal{Q}(x,u^\star_N,P,N)$.

Now the dynamic programming equation (5) yields

\[ V_N(x) = J_M(x,u^\star_N) + V_{N-M}(x_{u^\star_N}(M,x)). \]

Hence, (i) holds with remainder term $R_1(x,M,N) = V_{N-M}(x_{u^\star_N}(M,x)) - V_{N-M}(x^e)$. For any $P\in\mathbb{N}$ and any $M\notin\mathcal{Q}(x,u^\star_N,P,N)$ we have $|R_1(x,M,N)|\le\gamma_V(|x_{u^\star_N}(M,x)|_{x^e}) + \omega(N-M)\le\gamma_V(\sigma_\delta(P)) + \omega(N-M)$, and thus (i).

(ii) From the dynamic programming equation (4) and $u\equiv u^e$ we obtain

\[ V_N(x^e) \le M\ell(x^e,u^e) + V_{N-M}(x^e). \]

On the other hand, from (5) we have

\begin{align*} V_N(x^e) &= J_M(x^e,u^{e\star}_N) + V_{N-M}(x_{u^{e\star}_N}(M,x^e))\\ &= \underbrace{\widetilde J_M(x^e,u^{e\star}_N)}_{\ge 0} - \lambda(x^e) + \lambda(x_{u^{e\star}_N}(M,x^e)) + M\ell(x^e,u^e) + V_{N-M}(x_{u^{e\star}_N}(M,x^e))\\ &\ge V_{N-M}(x^e) + M\ell(x^e,u^e) + \bigl[V_{N-M}(x_{u^{e\star}_N}(M,x^e)) - V_{N-M}(x^e)\bigr] + \bigl[\lambda(x_{u^{e\star}_N}(M,x^e)) - \lambda(x^e)\bigr]. \end{align*}

Now since $V_{N-M}$ and $\lambda$ satisfy Assumption 12(ii) and $|x_{u^{e\star}_N}(M,x^e)|_{x^e}\le\sigma_\delta(P)$ for all $M\notin\mathcal{Q}(x^e,u^{e\star}_N,P,N)$, we can conclude that the differences in the square brackets have values $\approx_S 0$, which shows the assertion.

(iii) Fix $x\in\mathbb{X}$ and let $u^\star_N$ and $\tilde u^\star_N\in\mathbb{U}^N(x)$ denote the optimal controls minimizing $J_N(x,u)$ and $\widetilde J_N(x,u)$, respectively. We note that if the optimal control problem with cost $\ell$ is strictly dissipative, then the problem with cost $\tilde\ell$ is strictly dissipative, too, with bounded storage function $\lambda\equiv 0$ and the same $\rho\in\mathcal{K}_\infty$. Moreover, $V_N(x)\le N\ell(x^e,u^e) + \gamma_V(|x|_{x^e}) + \omega(N)$ and $\widetilde V_N(x)\le N\tilde\ell(x^e,u^e) + \gamma_{\widetilde V}(|x|_{x^e}) + \tilde\omega(N)$, since $V_N(x^e)\le N\ell(x^e,u^e)$ and $\widetilde V_N(x^e) = 0$. Hence, the turnpike property from Remark 11(ii) applies to the optimal trajectories of both problems, yielding $\sigma_\delta\in\mathcal{L}$ and $\mathcal{Q}(x,u^\star_N,P,N)$ for $x_{u^\star_N}$, and $\tilde\sigma_{\tilde\delta}$ and $\widetilde{\mathcal{Q}}(x,\tilde u^\star_N,P,N)$ for $x_{\tilde u^\star_N}$. For all $M\notin\widetilde{\mathcal{Q}}(x,\tilde u^\star_N,P,N)\cup\mathcal{Q}(x^e,u^{e\star}_N,P,N)$ we can estimate

\begin{align*} V_N(x) &\le J_M(x,\tilde u^\star_N) + V_{N-M}(x_{\tilde u^\star_N}(M,x))\\ &\le J_M(x,\tilde u^\star_N) + V_{N-M}(x^e) + \gamma_V(\tilde\sigma_{\tilde\delta}(P)) + \omega(N-M)\\ &\le \widetilde J_M(x,\tilde u^\star_N) - \lambda(x) + \lambda(x^e) + M\ell(x^e,u^e) + V_{N-M}(x^e) + \gamma_V(\tilde\sigma_{\tilde\delta}(P)) + \gamma_\lambda(\tilde\sigma_{\tilde\delta}(P)) + \omega(N-M)\\ &\lesssim_S \widetilde V_N(x) - \lambda(x) + V_N(x^e) \end{align*}

for $S = \min\{P,N-M\}$, where we have applied the dynamic programming equation (4) in the first inequality, the turnpike property for $x_{\tilde u^\star_N}$ and Assumption 12 and (29) in the second and third inequalities, and (i) applied to $\widetilde V_N$ and (ii) applied to $\ell$ in the last step. Moreover, $\lambda(x^e) = 0$ and $\widetilde V_N(x^e) = 0$ were used.

By exchanging the roles of the two optimal control problems and using the same inequalities as above, we get

\[ \widetilde V_N(x) \lesssim_S V_N(x) + \lambda(x) - V_N(x^e) \]

for all $M\notin\mathcal{Q}(x,u^\star_N,P,N)\cup\widetilde{\mathcal{Q}}(x^e,\tilde u^{e\star}_N,P,N)$. Together this implies

\[ \widetilde V_N(x) \approx_S V_N(x) + \lambda(x) - V_N(x^e) \]

for all $M\notin\mathcal{Q}(x,u^\star_N,P,N)\cup\widetilde{\mathcal{Q}}(x,\tilde u^\star_N,P,N)\cup\mathcal{Q}(x^e,u^{e\star}_N,P,N)\cup\widetilde{\mathcal{Q}}(x^e,\tilde u^{e\star}_N,P,N)$ and $S = \min\{P,N-M\}$.

Now, choosing $P = \lfloor N/5\rfloor$, the union of the four $\mathcal{Q}$-sets has at most $4N/5$ elements, hence there exists $M\le 4N/5$ for which this approximate equality holds. This yields $S = \lfloor N/5\rfloor$, and thus $\approx_S$ implies $\approx_N$, which shows (iii). $\square$

We note that precise quantitative statements can be made for the error terms “hiding” in the $\approx_J$-notation. Essentially, these terms depend on the distance of the optimal trajectories to the optimal equilibrium in the turnpike property, as measured by the function $\sigma_\delta$ in Remark 11(ii), and on the functions from Assumption 12. For details we refer to [14, Chapter 8].

Now, as in the previous section, we can proceed in two different ways. Again, the first way consists in assuming $\ell(x^e,u^e) = 0$ and that the infinite horizon problem is well defined, implying that $|V_\infty(x)|$ is finite for all $x\in\mathbb{X}$. In this case, we can derive the following additional relations.

Lemma 14 Let $\mathbb{X}$ be bounded, let Assumption 12 hold and assume $\ell(x^e,u^e) = 0$. Then the following approximate equalities hold:

(i) $V_\infty(x) \approx_P J_M(x,u^\star_\infty) + V_\infty(x^e)$ for all $M\notin\mathcal{Q}(x,u^\star_\infty,P,\infty)$
(ii) $J_M(x,u^\star_\infty) \approx_S J_M(x,u^\star_N)$ for all $M\notin\mathcal{Q}(x,u^\star_N,P,N)\cup\mathcal{Q}(x,u^\star_\infty,P,\infty)$
