Ensemble smoothing
under the influence of nonlinearity
Lars Nerger
Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany
Svenja Schulte and Angelika Bunse-Gerstner
University of Bremen, Germany
University of Reading, July 2, 2013
Lars Nerger – Nonlinearity and smoothing
Outline
Ensemble smoothers
Influence of nonlinearity
Influence of localization
Smoothing in a real model
Ensemble Smoothers
Smoothers
Filters (e.g. Ensemble Kalman filter):
Estimate using observations until analysis time
Smoothers perform retrospective analysis:
Use future observations for estimation in the past
Example applications:
Reanalysis
Parameter estimation
Ensemble smoothing
Smoothing is very simple (ensemble matrix X; see e.g. Evensen, 2003)
Filter: X^a_{k|k} = X^f_{k|k−1} C_k
Smoother: X^a_{k−1|k} = X^a_{k−1|k−1} C_k
In the numerical experiments, the matrix D̃_δ is constructed using a 5th-order polynomial function (Eq. 4.10 of Gaspari and Cohn 1999), which mimics a Gaussian function but has compact support. The distance between the analysis and observation grid points at which the function becomes zero is used here to define the localization length.
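The Gaspari–Cohn function mentioned above can be sketched as follows. This is an illustrative sketch, not code from the talk; `l_loc` denotes the distance at which the weight reaches zero, matching the localization-length definition in the text.

```python
import numpy as np

def gaspari_cohn(r, l_loc):
    """5th-order piecewise polynomial of Gaspari & Cohn (1999, Eq. 4.10).

    r     : distance(s) between analysis and observation grid points
    l_loc : localization length, defined as the distance at which
            the function becomes exactly zero
    """
    # The polynomial is usually written in z = r/c with support 2c,
    # so c = l_loc / 2 with the definition used here.
    z = np.abs(np.asarray(r, dtype=float)) / (l_loc / 2.0)
    w = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi = z[inner]
    w[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                - 5.0 / 3.0 * zi**2 + 1.0)
    zo = z[outer]
    w[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                + 5.0 / 3.0 * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return w
```

The function mimics a Gaussian (weight 1 at distance 0, smooth decay) but is exactly zero beyond `l_loc`, which keeps the localized analysis sparse.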
c. The smoother extension ESTKS
The smoother extension of the ESTKF is formulated analogously to the ensemble Kalman smoother (EnKS, Evensen 2003). The sequential smoother computes a state correction at an earlier time t_i, i < k, utilizing the filter analysis update at time t_k.
For the smoother, the notation is extended according to the notation used in estimation theory (see, e.g., Cosme et al. 2010): A subscript i|j is used, where i refers to the time that is represented by the state vector and j refers to the latest time for which observations are taken into account. Thus, the former analysis state x^a_k is written as x^a_{k|k} and the forecast state x^f_k is denoted as x^f_{k|k−1}. In this notation, the superscripts a and f are redundant.
To formulate the smoother, the transformation equation (14) is first written as a product of the forecast ensemble with a weight matrix as
X^a_{k|k} = X^f_{k|k−1} G_k    (19)
with
G_k = 1^{(m)} + T(W̄_k + W̃_k).    (20)
Here the relation X̄^f_{k|k−1} = X^f_{k|k−1} 1^{(m)} is used, with the matrix 1^{(m)} that contains the value m^{−1} in all entries. The smoothed state ensemble at time t_{k−1}, taking into account all observations …
Ensemble smoothing is cheap
e.g. E. Kalnay: “no-cost smoother”
weight matrix already computed in filter
just recombine previous ensemble states (actually the most costly part of the filter)
but: smoothing is recursive – application of each C_k for all previous times within lag
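The recombination step can be sketched in NumPy. This is a sketch, not PDAF code; the n × N array layout and function name are assumptions.

```python
import numpy as np

def smooth_with_filter_weights(past_ensembles, C_k):
    """Apply the filter weight matrix C_k (N x N) retroactively.

    past_ensembles : list of state ensembles X^a_{i|k-1} (n x N arrays)
                     for all times within the smoother lag
    C_k            : ensemble weight matrix computed by the filter
                     analysis at the current time t_k

    Returns the smoothed ensembles X^a_{i|k} = X^a_{i|k-1} C_k.
    """
    return [X @ C_k for X in past_ensembles]

# Recursion: at every new analysis time this is called again for all
# stored ensembles within the lag, so each of them accumulates the
# product C_{i+1} ... C_k.  The cost per ensemble is one n x N matrix
# product -- the same recombination the filter itself performs.
```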
Smoother with linear model
Smoother is optimal for linear systems:
➜ Forecast of smoothed state = filter analysis at later time
X^a_{i|k} = X^a_{i,i} Π_{j=i+1}^{k} C_j
X^a_{k|k} = M_{k,i} X^a_{i,i} Π_{j=i+1}^{k} C_j
➜ Based on ensemble cross-correlation between two time instances
➜ Each additional lag reduces error (if covariances are correctly estimated, Cohn et al. 1994)
(Ensemble perturbation matrix X' = X − X̄)
x^a_{i|k} = x^a_{i|k−1} + X'^a_{i|k−1} (X'^a_{k|k})^T E
Smoother and Nonlinearity
Smoother and nonlinearity
Optimality doesn’t hold with nonlinear systems!
Cross-correlations influenced by nonlinear model
➜ What is the effect of the nonlinearity?
➜ Do ensembles just decorrelate?
(mentioned e.g. by Cosme et al. 2010)
➜ Consider smoother performance relative to filter (Smoother reduces estimation error from the filter)
x^a_{i|i+1} = x^a_{i|i} + X'^a_{i|i} (X'^a_{i+1|i+1})^T Ẽ
Numerical study with Lorenz-96
Cheap and small model (state dimension 40)
Local and global filters possible
Nonlinearity controlled by forcing parameter F
Up to F=4: periodic waves; perturbations damped
F>4: non-periodic
Nonlinearity of assimilation also influenced by forecast length
Experiments over 20,000 time steps
Use smoother with ESTKF (Nerger et al., 2012)
Tune covariance inflation for minimal RMS errors
Implemented in open source assimilation software PDAF
(http://pdaf.awi.de)
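A minimal Lorenz-96 integration in the standard cyclic formulation; state dimension 40 as in the experiments, while the RK4 step size is an illustrative choice:

```python
import numpy as np

def lorenz96_tendency(x, F):
    """dx_j/dt = (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F, cyclic in j."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def lorenz96_step(x, F, dt=0.05):
    """One 4th-order Runge-Kutta time step."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# State dimension 40; the forcing F controls the nonlinearity.
x = 8.0 * np.ones(40)
x[19] += 0.01            # small perturbation of the steady state
for _ in range(100):
    x = lorenz96_step(x, F=8.0)
```

Note that x = F in every component is a steady state; for F > 4 small perturbations of it grow and the dynamics become non-periodic.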
Square root of covariance matrix (ensemble size N, state dim. n)
T is specific for the filter algorithm:
ETKF: T removes ensemble mean (usually, Z is computed directly); Z has dimension n × N
SEIK: T removes ensemble mean and drops last column; Z has dimension n × (N−1)
Analysis
X^f_k = [x^f_k(1), ..., x^f_k(N)]    (167)
X̄^f_k = [x̄^f_k, ..., x̄^f_k]    (168)
Z^f_k = X^f_k − X̄^f_k    (169)
Z = X^f T    (170)
P̌^f_k = (1/(N−1)) Σ_{l=1}^N (x^f_k(l) − x̄^f_k)(x^f_k(l) − x̄^f_k)^T    (171)
P̌^f_k = (1/(N−1)) Z^f_k (Z^f_k)^T    (172)
P̌^f_k = Z^f_k G (Z^f_k)^T    (173)
G := (1/(N−1)) I    (174)
x̄^a_k = x̄^f_k + Z^f_k w̄_k    (175)
w̄_k = A_k (H_k Z^f_k)^T R_k^{−1} (y^o_k − H_k x̄^f_k)    (176)
A_k^{−1} = G^{−1} + (H_k Z^f_k)^T R_k^{−1} H_k Z^f_k    (177)
A_k^{−1} = (N−1) I + (H_k Z^f_k)^T R_k^{−1} H_k Z^f_k    (178)
P̌^a_k = Z^f_k A_k (Z^f_k)^T    (179)
A^{−1} = I + (H Z^f)^T R^{−1} H Z^f    (180)
P^a = Z A Z^T    (181)
Ensemble transformation
X^a = X̄^a + X^f_k W    (182)
X^a_k = X̄^f_k + Z^f_k (W̄_k + W̃_k)    (183)
W̄_k = [w̄_k, ..., w̄_k]    (184)
P^a_k = (1/(N−1)) Z^a_k (Z^a_k)^T    (185)
Z^a_k = √(N−1) Z^f_k A_k^{1/2}    (186)
Z^a_k = Z^f_k W̃_k    (187)
W̃_k = √(N−1) U_k S_k^{−1/2} U_k^T    (188)
U_k S_k V_k^T = A_k^{−1}    (189)
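A minimal NumPy sketch of the ETKF analysis step following these equations; variable names mirror the symbols (`Zf` for Z^f_k, `Ainv` for A_k^{−1}), and the diagonal observation-error covariance is an assumption for illustration:

```python
import numpy as np

def etkf_analysis(Xf, H, R_inv_diag, y_obs):
    """Symmetric square-root ETKF analysis.

    Xf         : forecast ensemble, n x N
    H          : observation operator, p x n
    R_inv_diag : diagonal of R^{-1}, length p (assumed diagonal R)
    y_obs      : observation vector, length p
    """
    n, N = Xf.shape
    xf_mean = Xf.mean(axis=1)
    Zf = Xf - xf_mean[:, None]           # Z^f_k = X^f_k - Xbar^f_k
    HZ = H @ Zf                           # observed ensemble perturbations
    innov = y_obs - H @ xf_mean           # y^o_k - H_k xbar^f_k

    # A^{-1} = (N-1) I + (H Z^f)^T R^{-1} H Z^f  -- symmetric pos. def.
    Ainv = (N - 1) * np.eye(N) + HZ.T @ (R_inv_diag[:, None] * HZ)
    s, U = np.linalg.eigh(Ainv)           # eigendecomposition of A^{-1}
    A = U @ np.diag(1.0 / s) @ U.T

    w_mean = A @ HZ.T @ (R_inv_diag * innov)         # wbar_k
    W_tilde = np.sqrt(N - 1) * U @ np.diag(s**-0.5) @ U.T  # Wtilde_k

    xa_mean = xf_mean + Zf @ w_mean       # analysis mean
    Za = Zf @ W_tilde                     # analysis perturbations
    return xa_mean[:, None] + Za          # analysis ensemble X^a_k
```

Because H Z^f annihilates the vector of ones, the transformed perturbations still sum to zero, so the ensemble mean equals the analysis mean and P^a = Z^f A (Z^f)^T is reproduced exactly.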
Transformation matrix in ensemble space (small matrix)
ETKF: A has dimension N × N; G = I (identity matrix)
SEIK: A has dimension (N−1) × (N−1); G = (T^T T)^{−1}
Analysis state covariance matrix
The ESTKF: First compare ETKF and SEIK
Ensemble transformation based on square root of A
Very efficient:
Transformation matrix computed in space of dim. N or N-1
L. Nerger et al., Monthly Weather Review 140 (2012) 2335-2345
The T matrix
Matrix T projects onto the error space spanned by the ensemble
SEIK and ETKF use different projections T
For identical forecast ensembles both filters
yield identical analysis state
perform slightly different ensemble transformations
also: SEIK is slightly faster than ETKF
ETKF provides minimum transformation
desirable for least disturbing ensemble states
How to get minimum transformation into SEIK?
Error Subspace Transform Kalman Filter (ESTKF)
Combine advantages of SEIK and ETKF
Redefine T:
1. Remove ensemble mean from all columns
2. Subtract a fraction of the last column from all others
3. Drop the last column
L. Nerger et al., Monthly Weather Review 140 (2012) 2335-2345
Features of the ESTKF:
• Same ensemble transformation as ETKF
• Slightly cheaper computations
• Direct access to ensemble-spanned error space
T-matrix in SEIK and ESTKF
Efficient implementation as subtraction of means & last column
ETKF: improve compute performance using a matrix T
SEIK:
Analysis
X^f_k = [x^f_k(1), ..., x^f_k(N)]    (76)
P̌^f_k = (1/(N−1)) Σ_{l=1}^N (x^f_k(l) − x̄^f_k)(x^f_k(l) − x̄^f_k)^T    (77)
P̌^f_k = (1/(N−1)) X^f_k T (T^T T)^{−1} T^T (X^f_k)^T    (78)
T := [ I_{r×r} ; 0_{1×r} ] − (1/N) 1_{N×r},  r = N−1    (79)
T_{i,j} = 1 − 1/N  for i = j, i < N;  −1/N  for i ≠ j, i < N;  −1/N  for i = N    (80)
P̌^f_k = L_k G L_k^T    (81)
L_k := X^f_k T,  G := (1/(N−1)) (T^T T)^{−1}    (82)
U_k^{−1} = G^{−1} + (H_k L_k)^T R_k^{−1} H_k L_k    (83)
x̄^a_k = x̄^f_k + Ǩ_k (y^o_k − H_k x̄^f_k)    (84, 85)
Ǩ_k = L_k U_k L_k^T H_k^T R_k^{−1}    (86)
P̌^a_k = L_k U_k L_k^T    (87)
Re-Init
P̌^a_k = L_k C_k^{−T} Ω_k^T Ω_k C_k^{−1} L_k^T    (88)
C_k^{−1} (C_k^{−1})^T = U_k^{−1}    (89)
x^{a(l)}_k = x̄^a_k + √(N−1) L_k C_k^{−T} Ω_{k,l}^T    (90)
X^a_k = X̄^a_k + √(N−1) L_k C_k^{−T} Ω_k^T    (91)
ESTKF:
Init
x̄^a_0 ∈ R^n    (200)
P^a_0 := (1/(N−1)) L_0 L_0^T,  L_0 ∈ R^{n×(N−1)}    (201)
{x^{a(l)}_0, l = 1, ..., N}    (202)
X^a_0 = [x^{a(1)}_0, ..., x^{a(N)}_0]    (203)
L^a_k = X^a_k T̂,  T̂ ∈ R^{N×(N−1)}    (204)
T̂_{i,j} = 1 − (1/N)·1/(1/√N + 1)  for i = j, i < N;  −(1/N)·1/(1/√N + 1)  for i ≠ j, i < N;  −1/√N  for i = N    (205)
x̄^a_0: ensemble mean of X^a_0    (206)
P̌^a_0 := (1/(N−1)) Σ_{l=1}^N (x^{a(l)}_0 − x̄^a_0)(x^{a(l)}_0 − x̄^a_0)^T    (207)
P̌^a_0 := (1/(N−1)) (X^a_k − X̄^a_k)(X^a_k − X̄^a_k)^T    (208)
X̄^a_0 = [x̄^a_0, ..., x̄^a_0]    (209, 210)
Forecast
x^{f(l)}_i = M_{i,i−1}[x^{a(l)}_{i−1}] + η^{(l)}_i    (211)
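The element-wise definitions (80) and (205) can be written down directly; a sketch (function names are illustrative) that builds both projection matrices and checks that their columns sum to zero, so multiplying an ensemble by T removes the ensemble mean:

```python
import numpy as np

def t_seik(N):
    """SEIK projection T (N x (N-1)):
    identity over the first N-1 rows, zero last row, minus 1/N everywhere."""
    T = np.vstack([np.eye(N - 1), np.zeros((1, N - 1))])
    return T - 1.0 / N

def t_estkf(N):
    """ESTKF projection That (N x (N-1)), element-wise form of Eq. (205):
    remove the mean, subtract a fraction of the last column, drop it."""
    a = 1.0 / (N * (1.0 / np.sqrt(N) + 1.0))
    T = -a * np.ones((N, N - 1))
    T[:N - 1, :] += np.eye(N - 1)
    T[N - 1, :] = -1.0 / np.sqrt(N)
    return T
```

Beyond the zero column sums, the ESTKF matrix has orthonormal columns, which is what makes its ensemble transformation identical to the (minimum-transformation) ETKF one.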
Effect of forcing on the smoother – optimal lag
Assimilate at each time step
Ensemble size N=34
Global ESTKF
Inflation tuned for minimal RMS errors (account for inflation in smoother)
[Figure: mean RMS error vs. lag [time steps] for different forcings F = 4, 5, 6, 8, 10]
Up to F=4
very small RMS errors
F>4
Strong growth in RMS
Clear impact of smoother
Optimal lag:
minimal RMS error (red lines)
Stronger nonlinearity
F=7
Forecast length: 9 steps
Clear error minimum at lag=2 analysis steps
➜ the optimal lag
Error increase beyond optimal lag (here 50%!)
➜ spurious correlations
[Figure: RMS error relative to lag=0 vs. lag [analysis steps]; minimum marks the optimal lag, ~50% weaker smoother effect at large lags]
Impact of smoothing
[Figures: mean RMS error at optimal lag vs. forcing F (filter and smoother; N=34 and N=20) and optimal lag [time steps] vs. F; optimal lag ≈ 7× error doubling time]
Optimal lag (minimal RMS error)
Behavior similar to error-doubling time
RMS error at optimal lag
Smoother reduces error by 50% for all F>4
Effect of sampling errors visible with smaller ensemble
Vary forecast length (F=7)
Forecast length = time steps over which nonlinearity acts on ensemble
Longer forecasts:
➜ Optimal lag shrinks
➜ RMS errors grow for filter and smoother
➜ Improvement by smoother shrinks (depends on forcing strength)
[Figures: optimal lag vs. forecast length [time steps] (~2× error doubling time) and mean RMS error at optimal lag vs. forecast length (filter and smoother)]
Vary forecast length – different forcing strength
Improvement by smoother depends on forcing strength
Small forcing (F=5)
➜ Approx. constant improvement by smoother
Larger forcing (F=7)
➜ Decreasing smoother effect
[Figures: optimal lag vs. forecast length [time steps] (~2× error doubling time) and mean RMS error at optimal lag vs. forecast length, for filter and smoother at F=5 and F=7]
Impact of Localization
Domain & observation localization
Local Analysis:
Update small regions
(like single vertical columns)
Observation localization:
Observations weighted according to distance
Consider only observations with weight >0
State update and ensemble transformation fully local
Similar to localization in LETKF (e.g. Hunt et al, 2007)
S: Analysis region
D: Corresponding data region
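Observation localization as described can be sketched as follows: for a local analysis region, keep only observations with weight > 0 and down-weight the others by inflating their error variances. The linear taper below is a placeholder (the experiments use the Gaspari–Cohn function), and the function names are illustrative:

```python
import numpy as np

def local_obs_weights(dist_to_obs, l_loc):
    """Distance-based observation weights; any compactly supported
    taper works, a simple linear one is used here as a placeholder."""
    w = 1.0 - dist_to_obs / l_loc
    return np.clip(w, 0.0, None)

def localize_observations(dist_to_obs, obs_var, l_loc):
    """Select observations with weight > 0 for a local analysis region
    and weight them by dividing R^{-1} by the weight, i.e. by
    multiplying the observation error variance by 1/weight."""
    w = local_obs_weights(dist_to_obs, l_loc)
    keep = w > 0.0
    return keep, obs_var[keep] / w[keep]
```

With this, the state update and ensemble transformation for each analysis region use only the observations in its data region, as in the LETKF.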
Influence of Localization on Smoothing
[Figures: mean RMS error (MRMSE) vs. localization radius (F=8, ensemble size N=34; thin lines: global analysis) and optimal lag l_opt [time steps] vs. localization radius]
Reduced RMS errors from filter and smoother by localization
localization is useful even for N=34
Localization increases optimal lag
more observational information useable
[Figure: MRMSE(smoother) − MRMSE(filter) vs. localization radius]
Influence of Localization on Smoothing (2)
Use filter error as baseline
Smoother results in additional reduction
Smoother is more efficient with localization than for global filter
[Figures: MRMSE vs. localization radius and optimal lag l_opt [time steps] vs. localization radius, for N = 34, 20, 15, 10]
Smoothing with localization – smaller ensembles
Larger effect of localization with smaller ensembles
Optimal lag shrinks (impact of sampling errors)
Localization radius for maximum optimal lag slightly larger than for minimum RMS error
[Figure: MRMSE(smoother) − MRMSE(filter) vs. localization radius, for N = 34, 20, 15, 10]
Smoother error reduction – smaller ensembles
Smoother impact grows with ensemble size
Effect of sampling errors
RMS error from smoother decreases faster than from filter
Amplification effect (multiple use of matrix C)
[Figures: MRMSE(smoother) − MRMSE(filter) vs. localization radius (N = 34, 20, 15, 10) and MRMSE vs. localization radius]
Optimal localization radius
Same localization radius for
minimum filter RMS error
largest smoother impact
➜ No re-tuning of localization radius for optimal smoothing!
Smoothing in a Real Model
Global ocean model
FESOM (Finite Element Sea-ice Ocean Model, Danilov et al. 2004)
Global configuration:
1.3° resolution, 40 levels
horizontal refinement at equator
state vector size 10^7
weak nonlinearity (not easy to change)
Drake passage
Twin experiments with sea surface height data
ensemble size 32
assimilate every 10th day over 1 year
ESTKF with smoother extension and localization (using PDAF environment as single program)
inflation tuned for optimal performance (ρ=0.9)
run using 2048 processor cores
(Timings: forecasts 8800 s, filter+smoother 200 s)
Effect of smoothing on global model
Typical behavior:
RMS error reduced by smoother
Error reductions:
~15% at initial time
~8% over the year
Large impact of each lag up to 60 days
Further reduction over full experiment (optimal lag = 350 days)
[Figures: SSH RMS errors over time [days] (forecast & analysis vs. smoothed, lag 50 days) and SSH RMS error for different lags [days] (initial error and mean error)]
Multivariate effect of smoothing – 3D fields
[Figures: RMS error vs. lag [days] for temperature, salinity, meridional and zonal velocity; error reductions −1.0% at lag 40, −2.9% at lag 350, −0.9% at lag 40, −1.3% at lag 250]
3D fields:
Multivariate impact smaller & specific for each field
Optimal lag specific for field
Optimal lag smaller than for SSH (e.g. temperature directly influenced by atmospheric forcing, Brusdal et al. 2003)
Multivariate effect of smoothing – surface fields
[Figures: RMS error vs. lag [days] for surface temperature, salinity, meridional and zonal velocity; error reductions −0.9% at lag 30, −3.7% at lag 350, −0.9% at lag 30, −0.9% at lag 20]
Ocean surface:
Relative smoother impact not larger than for full 3D
Deterioration for meridional velocity at long lags
➜ What is the optimal lag for multivariate assimilation?
Conclusion
Multivariate assimilation:
➜ Lag specific for field
➜ Choose overall optimal lag or separate lags
➜ Best filter configuration also good for smoother
Nonlinearity:
➜ Introduces spurious correlations in smoother
➜ Error increase beyond optimal lag
➜ Optimal lag: few times error doubling time
Localization:
➜ Increases smoother impact
➜ Increases optimal lag
Lars.Nerger@awi.de – Nonlinearity and smoothing
Thank you!
Web-Resources
www.data-assimilation.net