Assimilation program
state time
state
observations
mesh data
Indirect Exchange of information (Fortran modules) Explicit Interface (Subroutine calls)
Model
initialization time integration post processing
Filter
Initialization analysis re-initialization
Observations
obs. vector obs. operator
obs. error
PDAF-Core
Efficient Ensemble-Based Data Assimilation for High-Dimensional Models with the
Parallel Data Assimilation Framework PDAF
Data Assimilation Program
Lars Nerger, Paul Kirchgessner, Xi Liang
Alfred-Wegener Institute Helmholtz Center for Polar and Marine Research, Bremerhaven, Germany Contact: Lars.Nerger@awi.de http://www.awi.de
Parallel Ensemble Forecasts
We show how we can build a data-assimilative model by augmenting a forecast model by data assimilation functionality for efficient ensemble data assimilation.
The method uses a direct connection between the coupled model and the ensemble data assimilation framework PDAF [1, http://pdaf.awi.de]. Augmenting the model allows us to set up a data assimilation program with high flexibility and parallel scalability with only small changes to the model.
The direct connection is obtained by
1. adapting the source codes of the coupled model so that it is able to run an ensemble of model states
2. adding a filtering step to the source codes.
We discuss this connection for the ocean circulation models NEMO and MITgcm. We insert subroutine calls to the model code, adapt the parallelization, and add routines for the handling of observations.
Aaaaaaaa Aaaaaaaa aaaaaaaaa
Stop
Initialize Model
Post-processing Init_parallel_PDAF
WHILE istp ≤ nitend Init_PDAF
Assimilate_PDAF
Start
Initialize parallelization
Model code Extension for data assimilation Initialize
ensemble Parallel ensemble
forecast Perform filter analysis step Add ensemble
parallelization Additions to program flow
Source code changes
Add 1 line stp (step.F90)
Direct (Online) Coupling of model and PDAF Overview
PDAF provides parallelization support and fully-implemented and parallelized filters
& smoothers. Ensemble assimilation without model restarts is enabled by
adding a few subroutine calls to the model code. The additions are nearly identical for NEMO and MITgcm. In NEMO one has to account for the leap-frog time stepping.
Time stepper
Add 1 line in nemo_init (nemogcm.F90)
Add 1 line in mynode (lib_mpp.F90)
Example: Nonlinear Ensemble Smoother with NEMO
Implementing the Analysis Step
References:
[1] Nerger, L., Hiller, W. Software for Ensemble-based Data Assimilation Systems - Implementation Strategies and Scalability. Comp. & Geosci., (2013) 55: 110-118
[2] Kirchgessner, P., Tödter, J., Ahrens, B., Nerger, L. (2017) The smoother extension of the nonlinear ensemble transform filter. Tellus A, 69, 1327766, 2017
[3] Liang, X., Losch, M., Nerger, L. Mu, L., Yang , Q., Liu, C. (2019) Using sea surface temperature observations to constrain upper ocean properties in an Arctic sea ice-ocean data assimilation system, in preparation
The data assimilation system has three components:
Model, filter algorithm, and observations. The filter algorithms are model-agnostic, while the model and subroutines to handle observations are provided by the user. The observation routines are called by PDAF as call-back routines.
Example of an ensemble integration with two ensemble members using a 2-level parallelization. The models and the filter are parallelized. The ensemble adds one level of parallelization to integrate all members at once. Combining model and assimilation in one program avoids costly disk file operations. Further there is no need for a full model restart after the filter analysis step.
Filter analysis
update ensemble
assimilating observations Analysis operates
on state vectors (all fields in one
vector)
Ensemble of state vectors
X
Vector of observations
y
Observation operator
H(...)
Observation error covariance matrix
R
For localization:
Local ensemble Local
observations
Model
interface Observation
module For the analysis step, we need to write
model fields into the state vectors and back afterwards. Further, the analysis step needs information on the assimilated observations. PDAF uses call-
back routines for this. They are implemented like model routines and utilize model information from Fortran modules. PDAF already provides model- binding routines for MITgcm.
PDAF is open source. The code, documentation, and a tutorial are
available at http://pdaf.awi.de
Example: Sea-ice assimilation with MITgcm
Estimated SSH at last analysis time
Longitude (degree)
Latitide (degree)
−60 −55 −50 −45 −40 −35 −30
24 28 32 36 40
44 −0.6
−0.4
−0.2 0 0.2 0.4 0.6
True sea surface height at last analysis time
Longitude (degree)
Latitide (degree)
−60 −55 −50 −45 −40 −35 −30
24 28 32 36 40
44 −0.6
−0.4
−0.2 0 0.2 0.4
NEMO is used in a box configuration (double-gyre 0.6
SEABASS) with the Nonlinear Ensemble Transform Filter (NETF) and its smoother extension [2].
Assimilated are simulated SSH observations on satellite tracks and temperature profiles over 1 year. NETF and LETKF reach the same accuracy;
the smoother reduces the error by up to 11.5%
0 20 40 60 80 100 120
lag [days]
0 2 4 6 8 10 12 14
Ri [%]
ssh t u v
Smoother: Dependence on lag
0 50 100 150 200 250 300 350
01020304050
(a) for SSH
Time [days]
relative RMSE / CRPS [%]
0 50 100 150 200 250 300 350
01020304050
(a) for T
Time [days]
relative RMSE / CRPS [%]
RMSE_NETF RMSE_LETKF CRPS_NETF CRPS_LETKF
0 50 100 150 200 250 300 350
01020304050
(a) for U
Time [days]
relative RMSE / CRPS [%]
0 50 100 150 200 250 300 350
01020304050
(a) for V
Time [days]
relative RMSE / CRPS [%]
FIG. 9. Comparison of NETF and LETKF in terms of RMSE (black/gray) and CRPS (red/orange). The lines represent the field-averaged relative RMSE and CRPS, respectively, for all prognostic variables, i.e., (a) SSH, (b) T, (c) U and (d)V , which are defined in Sec. 5.b. The legend in (b) is valid for all panels.
841 842 843
49
−60 −55 −50 −45 −40 −35 −30
242832364044
(a) SSH [m]
Longitude [°]
Latitude [°]
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● −0.6
−0.4
−0.2 0.0 0.2 0.4
● ● ● ● ● ● ● ●
500040003000200010000
(b) T [°C]
Latitude [°C]
Depth [m]
24 26 29 32 35 38 41 44
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
0 2 4 6 8 10 12
FIG. 3. Observation characteristics on day 8: (a) The horizontal domain is shown, together with the Argo profiler locations (crosses) and the synthetic SSH observations (colored) on the Envisat tracks (thin lines). (b) The vertical grid of 11 layers is visualized, and embedded are the artificial Argo temperature profiles along the
= 50 longitude line. Note that at = 44 , the true temperature field is zero due to the lateral boundary conditions.
774 775 776 777 778
41
FIG. 2. Observation characteristics on day 8: (a) The horizontal domain is shown, together with the Argo profiler locations (crosses) and the synthetic SSH observations (colored) on the Envisat tracks (thin lines). (b) The vertical grid of 11 layers is visualized, and embedded are the artificialArgo temperature profiles (46 values each) along the = 50 longitude line. At = 44 , the true temperature field is zero due to the lateral boundary conditions.
817
818
819
820
821
42
−60 −55 −50 −45 −40 −35 −30
242832364044
(a) SSH [m]
Longitude [°]
Latitude [°]
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● −0.6
−0.4
−0.2 0.0 0.2 0.4
● ● ● ● ● ● ● ●
500040003000200010000
(b) T [°C]
Latitude [°C]
Depth [m]
24 26 29 32 35 38 41 44
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
●
●●
●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●●
0 2 4 6 8 10 12
FIG. 3. Observation characteristics on day 8: (a) The horizontal domain is shown, together with the Argo profiler locations (crosses) and the synthetic SSH observations (colored) on the Envisat tracks (thin lines). (b) The vertical grid of 11 layers is visualized, and embedded are the artificial Argo temperature profiles along the
= 50 longitude line. Note that at = 44 , the true temperature field is zero due to the lateral boundary conditions.
774
775
776
777
778
41
FIG. 2. Observation characteristics on day 8: (a) The horizontal domain is shown, together with the Argo profiler locations (crosses) and the synthetic SSH observations (colored) on the Envisat tracks (thin lines). (b) The vertical grid of 11 layers is visualized, and embedded are the artificial Argotemperature profiles (46 values each) along the = 50 longitude line. At = 44 , the true temperature field is zero due to the lateral boundary conditions.
817
818
819
820
821
42
The MITgcm model is used with PDAF to assimilate sea ice concentration (SIC) from SSMIS and thickness from SMIS and CryoSat2. The effect of adding daily assimilation of SST data is examined for the year 2012 [3].
The model has a resolution of 18 km. An ensemble of 23 members is used.
Adding the assimilation of SST improves the ice condition at the sea ice edge. The figure shows the effect of sea-ice assimilation (left) and the additional effect of SST (right) in terms of RMS deviation.
day
120 240 360 480 600
RMS error in SSH [m]
0 0.01 0.02 0.03 0.04 0.05 0.06
Filter
Smoother lag 30d
Smoother: RMS errors Surface Observations (SSH and SST)
Temperature profile observations
Filters: Error statistics (RMSE and CRPS)
SSH Temperature
Relative improvement: Error reduction due to smoother
SST SST
SIC SIC
RMSD: NoSST-DA vs. CTRL RMSD: SST-DA vs. NoSST-DA
RMSD: NoSST-DA vs. CTRL RMSD: SST-DA vs. NoSST-DA