Ensemble Data Assimilation: Algorithms and Software

(1)

Lars Nerger

Alfred Wegener Institute

Helmholtz Center for Polar and Marine Research Bremerhaven, Germany

and

Bremen Supercomputing Competence Center BremHLR Bremen, Germany

Lars.Nerger@awi.de

Seminar at NMEFC, Beijing, China, October 10, 2014

(2)

Lars Nerger – Ensemble Data Assimilation

Outline

•  Ensemble-based Kalman filters

•  Implementation aspects

•  Assimilation software PDAF

(3)

Motivation

Information: Model

[Figure: model surface temperature]

•  Generally correct, but has errors

•  all fields, fluxes, …

Information: Observations

[Figure: satellite surface temperature]

•  Generally correct, but has errors

•  sparse information (only surface, data gaps, one field)

Combine both sources of information quantitatively by computer algorithm

➜  data assimilation

Losa, S.N. et al. J. Marine Syst. 105 (2012) 152-162

(4)

Data Assimilation

•  Combine model with real data

•  Optimal estimation of system state:

   •  initial conditions (for weather/ocean forecasts, …)

   •  state trajectory (temperature, concentrations, …)

   •  parameters (growth of phytoplankton, …)

   •  fluxes (heat, primary production, …)

   •  boundary conditions and ‘forcing’ (wind stress, …)

•  Also: Improvement of model formulation

   •  parameterizations (biogeochemistry, sea-ice, …)

•  Characteristics of system:

   •  high-dimensional numerical model: state dimension $O(10^6)$ to $O(10^9)$

   •  sparse observations

   •  non-linear

(5)

Data Assimilation

Consider some physical system (ocean, atmosphere, …)

[Figure: state versus time, showing truth, model, and observation]

Two main approaches:

•  Variational assimilation

•  Sequential assimilation

Optimal estimate basically by least-squares fitting

(6)

Ensemble-based Kalman Filters

(7)

Ensemble-based Kalman Filter

First formulated by G. Evensen (EnKF, 1994)

Kalman filter: express probability distributions by mean and covariance matrix

EnKF: Use ensembles to represent probability distributions

[Figure: ensemble filter cycle over time 0, time 1, time 2: initial sampling of the state estimate, ensemble forecast, observation, ensemble transformation, analysis ensemble, forecast]

Looks simple!

BUT: There are many possible choices! What is optimal?

(8)

Data Assimilation – Model and Observations

Two components:

1. State: $x \in \mathbb{R}^n$

Dynamical model:

$$x_i = M_{i-1,i}\left[ x_{i-1} \right]$$

2. Observations: $y \in \mathbb{R}^m$

Observation equation (relation of observation to state $x$):

$$y = H[x]$$

Observation error covariance matrix: $R$

(9)

The Ensemble Kalman Filter (EnKF, Evensen 94)

Init

$$x_0^a \in \mathbb{R}^n, \quad P_0^a \in \mathbb{R}^{n \times n}$$

Ensemble:

$$\{ x_0^{a(l)},\ l = 1, \dots, N \}$$

$$x_0^a = \frac{1}{N} \sum_{l=1}^{N} x_0^{a(l)} \approx x_0^t$$

$$\tilde{P}_0^a := \frac{1}{N-1} \sum_{l=1}^{N} \left( x_0^{a(l)} - x_0^a \right) \left( x_0^{a(l)} - x_0^a \right)^T \approx P_0^a$$

$$P_0^a = L L^T, \quad L \in \mathbb{R}^{n \times q}$$

$$x_0^{a(i)} = x_0^a + L b^{(i)}, \quad b^{(i)} \in \mathbb{R}^q, \quad b^{(i)} \sim \mathcal{N}(0, 1)$$

Forecast

$$x_i^{a(l)} = M_{i,i-1}\left[ x_{i-1}^{a(l)} \right] + \eta_i^{(l)}$$

Here $\eta_i^{(l)}$ is the model error term. At an analysis time $k$ the forecast members are denoted $x_k^{f(l)}$ and their mean $x_k^f$.

Analysis

Ensemble of observations:

$$\{ y_k^{o(l)},\ l = 1, \dots, N \}$$

Analysis step: Update each ensemble member

$$x_k^{a(l)} = x_k^{f(l)} + \tilde{K}_k \left( y_k^{o(l)} - H_k\left[ x_k^{f(l)} \right] \right)$$

With a linear observation operator:

$$x_k^{a(l)} = x_k^{f(l)} + \tilde{K}_k \left( y_k^{o(l)} - H_k x_k^{f(l)} \right)$$

Gain computed from the ensemble covariance matrix:

$$\tilde{K}_k = \tilde{P}_k^f H_k^T \left( H_k \tilde{P}_k^f H_k^T + R_k \right)^{-1}$$

Kalman filter:

$$K_k = P_k^f H_k^T \left( H_k P_k^f H_k^T + R_k \right)^{-1}, \quad H_k P_k^f H_k^T + R_k \in \mathbb{R}^{m \times m}$$

Ensemble covariance matrix:

$$\tilde{P}_k^f = \frac{1}{N-1} \sum_{l=1}^{N} \left( x_k^{f(l)} - x_k^f \right) \left( x_k^{f(l)} - x_k^f \right)^T$$

Ensemble mean (state estimate):

$$x_k^a := \frac{1}{N} \sum_{l=1}^{N} x_k^{a(l)}$$

$$\tilde{P}_k^a := \frac{1}{N-1} \sum_{l=1}^{N} \left( x_k^{a(l)} - x_k^a \right) \left( x_k^{a(l)} - x_k^a \right)^T$$
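The analysis step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a linear observation operator given as a matrix `H` and observation perturbations drawn from a Gaussian with covariance `R`. The function name `enkf_analysis` and its arguments are hypothetical; they are not taken from the talk or from PDAF.

```python
import numpy as np

def enkf_analysis(Xf, y, H, R, rng):
    """Perturbed-observation EnKF analysis (sketch, linear H assumed).

    Xf : (n, N) forecast ensemble, one member per column
    y  : (m,)   observation vector
    H  : (m, n) linear observation operator
    R  : (m, m) observation error covariance matrix
    """
    n, N = Xf.shape
    m = y.size
    # Ensemble mean and perturbations
    xf_mean = Xf.mean(axis=1, keepdims=True)
    Xp = Xf - xf_mean
    # Ensemble covariance matrix and Kalman gain (m x m matrix is inverted)
    Pf = Xp @ Xp.T / (N - 1)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    # Ensemble of observations: y plus noise drawn from N(0, R)
    Y = y[:, None] + rng.multivariate_normal(np.zeros(m), R, size=N).T
    # Update each ensemble member
    return Xf + K @ (Y - H @ Xf)

# Toy usage: 3 state variables, 50 members, first variable observed
rng = np.random.default_rng(0)
Xf = rng.normal(size=(3, 50))
Xa = enkf_analysis(Xf, np.array([2.0]), np.array([[1.0, 0.0, 0.0]]),
                   np.array([[0.1]]), rng)
state_estimate = Xa.mean(axis=1)   # ensemble mean
```

Forming `Pf` explicitly is only feasible for small `n`; it is done here to stay close to the equations.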

(10)

Efficient use of ensembles

Kalman gain

$$\tilde{K}_k = \tilde{P}_k^f H_k^T \left( H_k \tilde{P}_k^f H_k^T + R_k \right)^{-1}$$

Alternative form (Sherman-Morrison-Woodbury matrix identity)

$$\tilde{K}_k = \left[ \left( \tilde{P}_k^f \right)^{-1} + H^T R^{-1} H \right]^{-1} H^T R^{-1}$$

Looks worse: $n \times n$ matrices need inversion

However: with ensemble

$$\tilde{P}_k^f = (N-1)^{-1} X' X'^T$$

(Ensemble perturbation matrix $X' = X - \bar{X}$)

$$\tilde{K}_k = X' \left[ (N-1) I + X'^T H^T R^{-1} H X' \right]^{-1} X'^T H^T R^{-1}$$

Inversion of $N \times N$ matrix
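The equivalence of the two gain formulas can be checked numerically. The sketch below assumes a random toy ensemble, a random linear observation operator, and a diagonal `R`; the function names are hypothetical. It computes the gain once with an m × m inverse and once with the N × N system of the ensemble form.

```python
import numpy as np

def gain_observation_space(Xp, H, R):
    """Gain via the (m x m) matrix H Pf H^T + R."""
    N = Xp.shape[1]
    Pf = Xp @ Xp.T / (N - 1)
    return Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)

def gain_ensemble_space(Xp, H, Rinv):
    """Gain via the (N x N) matrix (N-1) I + X'^T H^T R^-1 H X'."""
    N = Xp.shape[1]
    HX = H @ Xp
    G = (N - 1) * np.eye(N) + HX.T @ Rinv @ HX
    return Xp @ np.linalg.solve(G, HX.T @ Rinv)

rng = np.random.default_rng(0)
n, m, N = 40, 15, 6                      # toy sizes (assumption)
X = rng.normal(size=(n, N))
Xp = X - X.mean(axis=1, keepdims=True)   # ensemble perturbation matrix X'
H = rng.normal(size=(m, n))
r_diag = rng.uniform(0.5, 1.5, size=m)
R, Rinv = np.diag(r_diag), np.diag(1.0 / r_diag)

K_obs = gain_observation_space(Xp, H, R)
K_ens = gain_ensemble_space(Xp, H, Rinv)
print(np.allclose(K_obs, K_ens))
```

The ensemble form never builds an n × n matrix, which is what makes it usable for large state dimensions.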

(11)

Ensemble-based/error-subspace Kalman filters

A little “zoo” (not complete):

EnKF(94/98), EnKF(2003), EnKF(2004), SEEK, SEIK, ETKF, ESTKF, EAKF, EnSRF, DEnKF, RRSQRT, ROEK, MLEF, SPKF, ESSE, RHF, anamorphosis

•  Properties and differences not well understood

•  Learn from studying relations and differences

Studied in Nerger et al. (2005): EnKF(94/98), SEEK, SEIK

New study (Nerger et al. 2012): SEIK, ETKF

New filter formulation: ESTKF

Which filter should one use?

L. Nerger et al., Tellus 57A (2005) 715-735

L. Nerger et al., Monthly Weather Review 140 (2012) 2335-2345

(12)

Right-sided ensemble transformation

$$X'^a = X'^f W$$

Very efficient: $W$ is small ($N \times N$ or $(N-1) \times (N-1)$)

Used in:

•  SEIK (Singular Evolutive Interpolated KF, Pham et al. 1998)

•  ETKF (Ensemble Transform KF, Bishop et al. 2001)

•  EnSRF (Ensemble Square-root Filter, Whitaker/Hamill 2001)

•  ESTKF (Error-Subspace Transform KF, Nerger et al. 2012)

(13)

ESTKF (Error-Subspace Transform KF)

Error-subspace basis matrix, size $(n \times N-1)$ ($T$ projects onto error space spanned by ensemble):

$$L := X^f T$$

Analysis covariance matrix, size $(n \times n)$:

$$P^a = L A L^T$$

“Transform matrix” in error subspace, size $(N-1 \times N-1)$:

$$A^{-1} = (N-1) I + (HL)^T R^{-1} HL$$

Ensemble weight matrix, size $(N-1 \times N)$:

$$W^{ESTKF} = \sqrt{N-1}\, C T^T$$

•  $C$ is symmetric square root of $A$

Transformation of ensemble perturbations, size $(n \times N)$:

$$X'^a = L W^{ESTKF}$$

L. Nerger et al., Monthly Weather Review 140 (2012) 2335-2345
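The ESTKF transformation of the perturbations can be sketched as follows. This is an illustration under assumptions: the matrix `T` is built here by a QR factorization as any matrix with orthonormal columns orthogonal to the vector of ones, which is not necessarily the specific T of Nerger et al. (2012), and all function names are hypothetical. The check at the end compares the covariance of the transformed ensemble with L A Lᵀ and with the Kalman filter analysis covariance.

```python
import numpy as np

def projection_matrix(N):
    """N x (N-1) matrix with orthonormal columns that sum to zero (assumed T)."""
    centered = np.eye(N)[:, :N - 1] - 1.0 / N   # columns e_j - (1/N) * ones
    Q, _ = np.linalg.qr(centered)               # reduced QR: Q is N x (N-1)
    return Q

def symmetric_sqrt(A):
    """Symmetric square root C of a symmetric positive definite A (C C = A)."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def estkf_perturbations(Xf, H, Rinv):
    """Return analysis perturbations X'^a, basis L, and transform matrix A."""
    n, N = Xf.shape
    T = projection_matrix(N)
    L = Xf @ T                                   # n x (N-1)
    HL = H @ L
    A = np.linalg.inv((N - 1) * np.eye(N - 1) + HL.T @ Rinv @ HL)
    C = symmetric_sqrt(A)
    W = np.sqrt(N - 1) * C @ T.T                 # (N-1) x N weight matrix
    return L @ W, L, A

rng = np.random.default_rng(0)
n, m, N = 8, 3, 5                                # toy sizes (assumption)
Xf = rng.normal(size=(n, N))
H = rng.normal(size=(m, n))
R = np.diag([0.5, 1.0, 2.0])
Xa_pert, L, A = estkf_perturbations(Xf, H, np.linalg.inv(R))

Pa_ens = Xa_pert @ Xa_pert.T / (N - 1)
print(np.allclose(Pa_ens, L @ A @ L.T))
```

Only matrices of size (N-1) × (N-1) are inverted or decomposed, in line with the efficiency argument of the right-sided transformation.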

(14)

Requirements for applying ensemble Kalman filters

“Pure” ensemble-based Kalman filters usually perform poorly, e.g. due to

•  small ensemble size

•  nonlinearity

•  bias in model or data

Improvements through

•  Covariance inflation

•  Localization

•  Model error simulation

[Figure: Localization. S: Analysis region; D: Corresponding data region]
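Two of these improvements can be illustrated with short helper functions. The slide only names the techniques, so the concrete forms below are assumptions: inflation is implemented as a multiplicative scaling of the ensemble perturbations, and localization as the selection of observations within a cut-off radius around an analysis region. Function and parameter names are hypothetical.

```python
import numpy as np

def inflate(X, factor):
    """Multiplicative inflation: scale perturbations about the ensemble mean.

    The ensemble covariance grows by factor**2; the mean is unchanged.
    """
    mean = X.mean(axis=1, keepdims=True)
    return mean + factor * (X - mean)

def local_observations(obs_coords, region_center, radius):
    """Indices of observations within `radius` of an analysis region S.

    The returned index set plays the role of the data region D.
    """
    distance = np.linalg.norm(obs_coords - region_center, axis=1)
    return np.flatnonzero(distance <= radius)

# Toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 10))
X_inflated = inflate(X, np.sqrt(1.1))        # covariance increased by 10%
coords = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
nearby = local_observations(coords, np.array([0.0, 0.0]), radius=2.0)
```

A localized filter would then run the analysis for each region S using only the observations in its set D.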

(15)

Implementation Aspects

(16)

Large scale data assimilation: Global ocean model

•  Finite-element sea-ice ocean model (FESOM)

•  Global configuration

(~1.3 degree resolution with refinement at equator)

•  State vector size: 10⁷

•  Scales well up to 256 processor cores

•  Ocean state estimation by assimilating satellite data (“ocean topography”)

•  Very costly due to large model size (currently using up to 2048 processor cores)

[Figure: Sea surface elevation]

(17)

Computational and Practical Issues

Data assimilation with ensemble-based Kalman filters is costly!

Memory: Huge amount of memory required (model fields and ensemble matrix)

Computing: Huge requirement of computing time (ensemble integrations)

Parallelism: Natural parallelism of ensemble integration exists (needs to be implemented)

“Fixes”: Filter algorithms do not work in their pure form (“fixes” and tuning are needed) because the Kalman filter is optimal only in the linear case

(18)

Implementing Ensemble Filters & Smoothers

➜ Abstraction of assimilation problem

Ensemble forecast

•  can require model error simulation

•  naturally parallel

Analysis step of filter algorithms operates on abstract state vectors (no specific model fields)

Analysis step requires information on observations

•  which field?

•  location of observations

•  observation error covariance matrix

•  relation of state vector to observation

(19)

PDAF: A tool for data assimilation

PDAF - Parallel Data Assimilation Framework

•  an environment for ensemble assimilation

•  provides support for ensemble forecasts

•  provides fully-implemented filter algorithms

•  for testing algorithms and for real applications

•  easily usable with virtually any numerical model

•  makes good use of supercomputers

Open source:

Code and documentation available at http://pdaf.awi.de

L. Nerger, W. Hiller, Computers & Geosciences 55 (2013) 110-118

(20)

Offline mode – separate programs

Model

[Flowchart: Start; Initialize Model (generate mesh, initialize fields); Do i=1, nsteps: Time stepper (consider BC, consider forcing); Post-processing; Stop]

For each ensemble state

•  Initialize from restart files

•  Integrate

•  Write restart files

Assimilation program ⬅ generic

[Flowchart: Start; read ensemble files; analysis step; write model restart files; Stop]

•  Read restart files (ensemble)

•  Compute analysis step

•  Write new restart files
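The offline workflow can be mimicked with two stand-in functions that exchange data only through files. This is a toy sketch under assumptions: NumPy `.npy` files play the role of restart files, the "model" is a trivial damping step, and the analysis is a placeholder that halves the ensemble spread. None of the names come from PDAF.

```python
import tempfile
from pathlib import Path

import numpy as np

def model_program(restart_file, nsteps):
    """Stand-in for the model executable: read restart, integrate, write restart."""
    state = np.load(restart_file)
    for _ in range(nsteps):
        state = 0.9 * state                      # toy time stepper (assumption)
    np.save(restart_file, state)

def placeholder_analysis(ensemble):
    """Placeholder for a filter analysis: halve the spread about the mean."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + 0.5 * (ensemble - mean)

def assimilation_program(restart_files, analysis):
    """Stand-in for the assimilation executable."""
    ensemble = np.column_stack([np.load(f) for f in restart_files])  # read
    ensemble = analysis(ensemble)                                    # analysis
    for f, member in zip(restart_files, ensemble.T):                 # write
        np.save(f, member)

def run_offline_cycle(workdir, n_members=3, nsteps=1):
    """One forecast/analysis cycle with files as the only interface."""
    files = [Path(workdir) / f"member_{k}.npy" for k in range(1, n_members + 1)]
    for k, f in enumerate(files, start=1):
        np.save(f, np.full(2, float(k)))         # toy initial restart files
    for f in files:                              # for each ensemble state
        model_program(f, nsteps)
    assimilation_program(files, placeholder_analysis)
    return [np.load(f) for f in files]

with tempfile.TemporaryDirectory() as workdir:
    members = run_offline_cycle(workdir)
```

Because the two programs share nothing but files, the model code stays unchanged, at the price of repeated file reading and writing.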

(21)

Logical separation of assimilation system

[Diagram: single program with three components around the core of PDAF]

•  Model: initialization, time integration, post processing

•  Filter: initialization, analysis, re-initialization

•  Observations: obs. vector, obs. operator, obs. error

Exchanged data: state, time; state, observations; mesh data

Connections: explicit interface; indirect exchange (module/common)

Nerger, L., Hiller, W. (2013). Software for Ensemble-based DA Systems – Implementation and Scalability. Computers and Geosciences. 55: 110-118

(22)

Extending a Model for Data Assimilation

Model

[Flowchart: Start; Initialize Model; Do i=1, nsteps: Time stepper; Post-processing; Stop]

Extension for data assimilation

[Flowchart: the model flow above, extended by calls to init_parallel_pdaf, init_pdaf, and assimilate_pdaf]

Implementation uses parallel configuration of ensemble forecast provided by PDAF

For operational forecasting use

[Flowchart: Start; Initialize Model; Time stepper; assimilate_pdaf; Stop]
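The structure of the extended model program can be sketched with stub functions. The routine names `init_parallel_pdaf`, `init_pdaf`, and `assimilate_pdaf` are those shown on the slide, but the stubs below only record when they are called; they are not PDAF's Fortran interface, and the call order is an assumption read off the flowchart layout.

```python
call_log = []

# Stubs standing in for the routines named on the slide (not the real PDAF API)
def init_parallel_pdaf():
    call_log.append("init_parallel_pdaf")

def init_pdaf():
    call_log.append("init_pdaf")

def assimilate_pdaf(step):
    call_log.append(f"assimilate_pdaf {step}")

# Stubs standing in for the existing model code
def initialize_model():
    call_log.append("initialize_model")

def time_stepper(step):
    call_log.append(f"time_stepper {step}")

def post_processing():
    call_log.append("post_processing")

def run_model_with_assimilation(nsteps):
    """Original model flow with the three added calls."""
    call_log.clear()
    init_parallel_pdaf()            # added: set up the parallel configuration
    initialize_model()
    init_pdaf()                     # added: initialize the assimilation
    for step in range(1, nsteps + 1):
        time_stepper(step)
        assimilate_pdaf(step)       # added: called after each time step
    post_processing()
    return list(call_log)

sequence = run_model_with_assimilation(nsteps=2)
```

The model's own routines are untouched; the extension consists only of the added calls.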

(23)

2-level Parallelism

[Diagram: Forecast (Model Task 1, Model Task 2, Model Task 3), Analysis (Filter), Forecast (Model Task 1, Model Task 2, Model Task 3)]

1. Multiple concurrent model tasks

2. Each model task can be parallelized

•  Analysis step is also parallelized
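The two levels can be made concrete by mapping each process to a model task and to a rank within that task. The sketch assumes a block-wise assignment of global ranks to tasks; the function name is hypothetical, and an MPI code would typically realize such groups by splitting a communicator rather than with a Python list.

```python
def split_processes(world_size, n_tasks):
    """Map each global rank to (model task, rank within the task).

    Level 1: n_tasks model tasks run concurrently.
    Level 2: each task is itself parallel over world_size // n_tasks processes.
    """
    if n_tasks <= 0 or world_size % n_tasks != 0:
        raise ValueError("world_size must be a positive multiple of n_tasks")
    per_task = world_size // n_tasks
    return [(rank // per_task, rank % per_task) for rank in range(world_size)]

# Six processes, three model tasks, two processes per task
layout = split_processes(6, 3)
for rank, (task, local_rank) in enumerate(layout):
    print(f"global rank {rank}: model task {task}, local rank {local_rank}")
```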

(24)

User-supplied routines (call-back)

•  Model and observation specific operations

•  Elementary subroutines implemented in model context

•  Called by PDAF routines through a defined interface

•  initialize model fields from state vector

•  initialize state vector from model fields

•  application of observation operator H to some vector

•  initialization of vector of observations

•  multiplication with observation error covariance matrix

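The call-back idea can be sketched as follows: a generic routine works only on abstract vectors and obtains everything model- or observation-specific through small user-supplied functions. All names, fields, and numbers below are hypothetical illustrations, not PDAF's actual routine names or interface.

```python
import numpy as np

# Toy "model": two fields on a three-point mesh (hypothetical)
FIELD_ORDER = ("temperature", "salinity")
model_fields = {
    "temperature": np.array([10.0, 11.0, 12.0]),
    "salinity": np.array([35.0, 35.1, 35.2]),
}

# Toy observations: temperature at mesh points 0 and 2 (hypothetical)
OBS_INDEX = np.array([0, 2])
OBS_VARIANCE = 0.25

def fields_to_state(fields):
    """Initialize state vector from model fields."""
    return np.concatenate([fields[name] for name in FIELD_ORDER])

def state_to_fields(state, fields):
    """Initialize model fields from state vector (in place)."""
    offset = 0
    for name in FIELD_ORDER:
        size = fields[name].size
        fields[name][:] = state[offset:offset + size]
        offset += size

def apply_obs_operator(state):
    """Application of observation operator H to some vector."""
    return state[OBS_INDEX]

def init_observations():
    """Initialization of vector of observations."""
    return np.array([10.4, 11.5])

def multiply_with_obs_error_cov(vectors):
    """Multiplication with (diagonal) observation error covariance matrix."""
    return OBS_VARIANCE * vectors

def innovation(state):
    """Generic code: uses only the call-backs, knows nothing about the fields."""
    return init_observations() - apply_obs_operator(state)

state = fields_to_state(model_fields)
residual = innovation(state)
```

Swapping the model or the observation type then only requires new call-back functions; the generic part stays the same.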

(25)

Features of online program

•  minimal changes to model code when combining model with filter algorithm

•  model not required to be a subroutine

•  no change to model numerics!

•  model-sided control of assimilation program (user-supplied routines in model context)

•  observation handling in model-context

•  filter method encapsulated in subroutine

•  complete parallelism in model, filter, and ensemble integrations

[Flowchart: Start; Initialize Model; Time stepper; assimilate_pdaf; Stop]

(26)

Current algorithms in PDAF

PDAF originated from comparison studies of different filters

Global filters

•  EnKF (Evensen, 1994)

•  ETKF (Bishop et al., 2001)

•  SEIK filter (Pham et al., 1998)

•  SEEK filter (Pham et al., 1998)

•  ESTKF (Nerger et al., 2012)

Localized filters

•  LETKF (Hunt et al., 2007)

•  LSEIK filter (Nerger et al., 2006)

•  LESTKF (Nerger et al., 2012)

Global and local smoothers for

•  ETKF/LETKF

•  ESTKF/LESTKF

•  EnKF

(27)

Parallel Performance of PDAF

(28)

Parallel performance of PDAF

•  Performance tests on SGI Altix ICE at HLRN (German “High performance computer north”)

   nodes: 2 quad-core Intel Xeon Gainestown at 2.93 GHz

   network: 4x DDR Infiniband

   compiler: Intel 10.1, MPI: MVAPICH2

•  Ensemble forecasts

   •  are naturally parallel

   •  dominate computing time

Example:

   parallel forecast over 10 days: 45 s

   SEIK with 16 ensemble members: 0.1 s

   LSEIK with 16 ensemble members: 0.7 s

(29)

Parallel Performance

Use between 64 and 4096 processors of SGI Altix ICE cluster (Intel processors)

94-99% of computing time in model integrations

Speedup: Increase number of processes for each model task, fixed ensemble size

•  factor 6 for 8x processes/model task

•  one reason: time stepping solver needs more iterations

Scalability: Increase ensemble size, fixed number of processes per model task

•  increase by ~7% from 512 to 4096 processes (8x ensemble size)

•  one reason: more communication on the network

[Figures: Speedup and Time increase factor, with curves labeled 64/512 proc., 512 proc., and 4096 proc.]
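The two numbers above translate into parallel efficiencies by simple arithmetic. The sketch uses the standard definitions (speedup divided by the increase in processes, and the inverse of the relative run time at constant work per process); the function names are hypothetical.

```python
def strong_scaling_efficiency(speedup, process_factor):
    """Fixed problem size: achieved speedup relative to the ideal one."""
    return speedup / process_factor

def weak_scaling_efficiency(relative_time_increase):
    """Fixed work per process: ideal run time divided by the measured one."""
    return 1.0 / (1.0 + relative_time_increase)

# Speedup: factor 6 for 8x processes per model task
print(strong_scaling_efficiency(6.0, 8.0))          # 0.75

# Scalability: ~7% longer run time for an 8x larger ensemble
print(round(weak_scaling_efficiency(0.07), 3))      # 0.935
```

So each model task runs at about 75% efficiency on 8 times the processes, while growing the ensemble by a factor of 8 costs only about 7% in run time.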

(30)

Application examples run with PDAF

•  Ocean state improvement by assimilation of satellite altimetry into global model

•  Chlorophyll assimilation into global NASA Ocean Biogeochemical Model (with Watson Gregg, NASA GSFC)

•  Coastal assimilation of ocean surface temperature (S. Losa within project “DeMarine”)

[Figures: Sea surface elevation; RMS error in surface temperature]

+ external users, e.g.

•  NMEFC, China (Q. Yang)

•  IPGP Paris (PARODY, A. Fournier)

•  IFM HAMBURG, Germany (MPI-OM, S. Brune/J. Baehr)

•  U. Frankfurt (J. Tödter/B. Ahrens)

(31)

Summary

•  Ensemble-based Kalman filters:

   •  Current efficient methods suited for large-scale problems

   •  Tuning of filters required

•  Simplification of technical implementation using PDAF

•  Application of the same assimilation software for test problems up to high-dimensional & operational systems

Thank you !
