

2.4.2 EnKF – The Ensemble Kalman Filter

The EnKF [17, 8] applies a Monte Carlo method to sample and forecast the probability density function. The initial density $p(x_0^t)$ is sampled by a finite random ensemble of state realizations. The density is forecasted by evolving each ensemble member with the full stochastic model. For the analysis, each ensemble state is updated using an observation vector drawn from an ensemble of observations, which has to be generated according to the observation error covariance matrix.

From the viewpoint of statistics, the EnKF solves the Fokker-Planck-Kolmogorov equation (2.3) for the evolution of the probability density $p(x^t)$ by a Monte Carlo method. In contrast to the SEEK algorithm, where the rank reduction directly uses the assumption that the density is Gaussian and thus can be described by a probability ellipsoid, the EnKF samples the density by a random ensemble of $N$ model states $\{x_0^{a(\alpha)}, \alpha = 1, \ldots, N\}$. The probability density is given in terms of the ensemble member density $dN$ in state space:

$$ \frac{dN}{N} \to p(x_0^t)\,dx \quad \text{for } N \to \infty \qquad (2.36) $$

This sampling of $p(x_0^t)$ converges rather slowly (proportional to $N^{-1/2}$), but it is valid for any kind of probability density, not just Gaussian densities. Forecasting each ensemble state with the stochastic-dynamic model (2.1) evolves the sampled density with the nonlinear model until the next analysis time. In the analysis phase the EKF analysis, which assumes that the densities are Gaussian, is applied to each of the ensemble states.
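Purely as a hedged illustration of this $O(N^{-1/2})$ rate (the one-dimensional Gaussian density, the sample sizes, and all variable names below are assumptions, not part of the original derivation), a small Monte Carlo sketch in Python could look like:

```python
import numpy as np

# Minimal sketch: the sampling error of the ensemble mean decays
# like N^(-1/2).  The 1-D Gaussian density here is a hypothetical
# stand-in for p(x_0^t).
rng = np.random.default_rng(seed=0)
true_mean, true_std = 1.0, 2.0

for n in [10, 100, 1000, 10000]:
    # Average the absolute error of the sample mean over many trials
    errors = [abs(rng.normal(true_mean, true_std, size=n).mean() - true_mean)
              for _ in range(500)]
    print(f"N = {n:6d}  mean |error| ~ {np.mean(errors):.4f}")
# The errors should shrink roughly by a factor of sqrt(10) per decade
# of N, consistent with the O(N^(-1/2)) rate.
```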

For the analysis, the covariance matrix $P$ is approximated by the ensemble covariance matrix $\tilde{P}$. Since the rank of $\tilde{P}$ is at most $N-1$, the EnKF also operates in an error subspace. Unlike in the SEEK filter, the directions of this subspace are not provided by the principal vectors of a prescribed covariance matrix but are determined by the random sampling. To ensure that the ensemble analysis represents the combination of two probability densities, the observation error covariance matrix $R$ has to be represented by a random ensemble of observations [8]. Each ensemble state is then updated with a vector from this observation ensemble. This implicitly updates the state covariance matrix.

The EnKF algorithm is given by the following equations:

Initialization:

The initial probability density $p(x_0^t)$ is sampled by a random ensemble of $N$ state realizations. The statistics of this ensemble approximate the initial state estimate and the corresponding covariance matrix, thus

$$ \{x_0^{a(\alpha)}, \alpha = 1, \ldots, N\} \qquad (2.37) $$

with

$$ \bar{x}_0^a = \frac{1}{N} \sum_{\alpha=1}^{N} x_0^{a(\alpha)} \to \langle x_0^t \rangle \quad \text{for } N \to \infty, \qquad (2.38) $$

$$ \tilde{P}_0^a := \frac{1}{N-1} \sum_{\alpha=1}^{N} \left( x_0^{a(\alpha)} - \bar{x}_0^a \right) \left( x_0^{a(\alpha)} - \bar{x}_0^a \right)^T \to P_0^a \quad \text{for } N \to \infty. \qquad (2.39) $$
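As a hedged sketch of this initialization under the additional assumption of a Gaussian initial density (the state dimension, ensemble size, and the names x0 and P0 are illustrative, not from the text), the sampling and the ensemble statistics of equations (2.37)-(2.39) might be coded as:

```python
import numpy as np

# Minimal sketch of the initialization (eqs. 2.37-2.39), assuming the
# initial density is Gaussian with mean x0 and covariance P0.
rng = np.random.default_rng(seed=0)
n, N = 3, 200                      # state dimension, ensemble size
x0 = np.zeros(n)                   # initial state estimate <x_0^t>
P0 = np.diag([1.0, 0.5, 0.25])     # initial covariance P_0^a

# Columns of Xa hold the ensemble members x_0^a(alpha) (eq. 2.37)
Xa = rng.multivariate_normal(x0, P0, size=N).T     # shape (n, N)

# Ensemble mean (eq. 2.38) and covariance (eq. 2.39); both converge
# to x0 and P0 only as N -> infinity, at the O(N^(-1/2)) rate.
x_mean = Xa.mean(axis=1)
anomalies = Xa - x_mean[:, None]
P_tilde = anomalies @ anomalies.T / (N - 1)
```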

Forecast:

Each ensemble member is evolved up to time $t_k$ with the nonlinear stochastic-dynamic model (2.1) as

$$ x_i^{f(\alpha)} = M_{i,i-1}[x_{i-1}^{a(\alpha)}] + \eta_i^{(\alpha)} . \qquad (2.40) $$
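A minimal sketch of this forecast phase, assuming a hypothetical nonlinear model operator model_step and Gaussian model noise with covariance Q (neither is specified by the text; both are placeholders):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def model_step(x):
    # Hypothetical nonlinear model operator M_{i,i-1}; any forecast
    # model could be substituted here.
    return x + 0.1 * np.sin(x)

def forecast_ensemble(Xa, Q):
    """Evolve each member by eq. (2.40): x_i^f = M[x_{i-1}^a] + eta_i."""
    n, N = Xa.shape
    noise = rng.multivariate_normal(np.zeros(n), Q, size=N).T
    # Evolve member by member, then add the stochastic forcing
    return np.column_stack(
        [model_step(Xa[:, alpha]) for alpha in range(N)]) + noise
```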

Analysis:

For the analysis, a random ensemble of observation vectors $\{y_k^{o(\beta)}, \beta = 1, \ldots, N\}$ is generated which represents an approximation of the observation error covariance matrix ($\tilde{R}_k \approx R_k$).

Each of the ensemble members is updated analogously to the EKF analysis by

$$ x_k^{a(\alpha)} = x_k^{f(\alpha)} + \tilde{K}_k \left( y_k^{o(\alpha)} - H_k x_k^{f(\alpha)} \right) , \qquad (2.41) $$

$$ \tilde{K}_k = \tilde{P}_k^f H_k^T \left( H_k \tilde{P}_k^f H_k^T + R_k \right)^{-1} , \qquad (2.42) $$

$$ \tilde{P}_k^f = \frac{1}{N-1} \sum_{\alpha=1}^{N} \left( x_k^{f(\alpha)} - \bar{x}_k^f \right) \left( x_k^{f(\alpha)} - \bar{x}_k^f \right)^T . \qquad (2.43) $$
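Combining equations (2.41)-(2.43) with the observation ensemble described above, a hedged sketch of this analysis step could read as follows; the function name and the explicit inversion are illustrative choices, not the original implementation:

```python
import numpy as np

def enkf_analysis(Xf, y_obs, H, R, rng):
    """Sketch of the EnKF analysis (eqs. 2.41-2.43) with perturbed
    observations; all names are illustrative assumptions.

    Xf : (n, N) forecast ensemble, y_obs : (m,) observation vector,
    H : (m, n) observation operator, R : (m, m) observation error cov.
    """
    n, N = Xf.shape
    m = y_obs.shape[0]

    # Ensemble covariance P~_k^f from forecast anomalies (eq. 2.43)
    x_mean = Xf.mean(axis=1)
    A = Xf - x_mean[:, None]
    Pf = A @ A.T / (N - 1)

    # Kalman gain K~_k (eq. 2.42)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)

    # Observation ensemble sampled from N(y_obs, R), then the
    # member-wise update (eq. 2.41)
    Yo = y_obs[:, None] + rng.multivariate_normal(
        np.zeros(m), R, size=N).T
    return Xf + K @ (Yo - H @ Xf)
```

Forming $\tilde{P}_k^f$ explicitly is done here only for clarity; as discussed below, the representer formulation avoids it.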

The analysis state and corresponding covariance matrix are then defined by the ensemble mean and covariance matrix as

$$ \bar{x}_k^a := \frac{1}{N} \sum_{\alpha=1}^{N} x_k^{a(\alpha)} , \qquad (2.44) $$

$$ \tilde{P}_k^a := \frac{1}{N-1} \sum_{\alpha=1}^{N} \left( x_k^{a(\alpha)} - \bar{x}_k^a \right) \left( x_k^{a(\alpha)} - \bar{x}_k^a \right)^T , \qquad (2.45) $$

which complete the analysis equations of the EnKF.
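Continuing the sketch above, these two statistics reduce to one-liners in NumPy (np.cov uses the $1/(N-1)$ normalization of equation (2.45) by default):

```python
# Given an analysis ensemble Xa of shape (n, N) as returned by the
# sketch above:
x_a_mean = Xa.mean(axis=1)   # ensemble mean, eq. (2.44)
P_a_tilde = np.cov(Xa)       # ensemble covariance, eq. (2.45)
```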

An efficient implementation of this analysis is formulated in terms of "representers" [19]. This formulation also permits handling the situation when $H_k \tilde{P}_k^f H_k^T$ is singular, which will occur if $m_k > N$. The state analysis equation (2.41) is written as

$$ x_k^{a(\alpha)} = x_k^{f(\alpha)} + \tilde{P}_k^f H_k^T b_k^{(\alpha)} . \qquad (2.46) $$

The columns of the matrix $\tilde{P}_k^f H_k^T$ are called representers and constitute influence vectors for each of the measurements. Amplitudes for the influence vectors are given by the vectors $\{b_k^{(\alpha)}\}$, which are obtained as the solution of

$$ \left( H_k \tilde{P}_k^f H_k^T + R_k \right) b_k^{(\alpha)} = y_k^{o(\alpha)} - H_k x_k^{f(\alpha)} . \qquad (2.47) $$

The explicit computation of $\tilde{P}_k^f$ by equation (2.43) is not required in the algorithm.

It suffices to compute (see, for example, Houtekamer and Mitchell [34])

$$ \tilde{P}_k^f H_k^T = \frac{1}{N-1} \sum_{\alpha=1}^{N} \left( x_k^{f(\alpha)} - \bar{x}_k^f \right) \left[ H_k \left( x_k^{f(\alpha)} - \bar{x}_k^f \right) \right]^T , \qquad (2.48) $$

$$ H_k \tilde{P}_k^f H_k^T = \frac{1}{N-1} \sum_{\alpha=1}^{N} H_k \left( x_k^{f(\alpha)} - \bar{x}_k^f \right) \left[ H_k \left( x_k^{f(\alpha)} - \bar{x}_k^f \right) \right]^T . \qquad (2.49) $$
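A hedged sketch of this representer formulation, computing the products (2.48) and (2.49) from the ensemble anomalies so that $\tilde{P}_k^f$ is never formed; the function and variable names are assumptions for illustration:

```python
import numpy as np

def enkf_analysis_representers(Xf, Yo, H, R):
    """Representer form of the analysis (eqs. 2.46-2.49); a sketch.

    Xf : (n, N) forecast ensemble, Yo : (m, N) observation ensemble,
    H : (m, n) observation operator, R : (m, m) observation error cov.
    """
    n, N = Xf.shape
    A = Xf - Xf.mean(axis=1, keepdims=True)   # forecast anomalies
    HA = H @ A                                # observed anomalies

    # Eqs. (2.48) and (2.49): P~ H^T and H P~ H^T from anomalies,
    # without ever forming the (n, n) matrix P~ itself.
    PHt = A @ HA.T / (N - 1)                  # (n, m) representers
    HPHt = HA @ HA.T / (N - 1)                # (m, m)

    # Influence amplitudes from eq. (2.47), solved for all members at
    # once (this is the matrix B_k of eq. 2.50)
    B = np.linalg.solve(HPHt + R, Yo - H @ Xf)

    # Member update, eq. (2.46) / (2.51)
    return Xf + PHt @ B
```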

For later use we also introduce the matrix notation of the EnKF. The initial state ensemble matrix holds the ensemble states in its columns as $X_0^a = \{x_0^{a(1)}, \ldots, x_0^{a(N)}\}$. Introducing the ensemble matrix of the observation vectors $Y_k^o = \{y_k^{o(1)}, \ldots, y_k^{o(N)}\}$, we can rewrite equation (2.47) for the influence amplitudes as

$$ \left( H_k \tilde{P}_k^f H_k^T + R_k \right) B_k = Y_k^o - H_k X_k^f , \qquad (2.50) $$

where $B_k$ is the matrix of influence amplitudes. The ensemble update (equation 2.46) is now given as

$$ X_k^a = X_k^f + \tilde{P}_k^f H_k^T B_k . \qquad (2.51) $$

In addition, the computation of the representers $\tilde{P}_k^f H_k^T$ and the covariance matrix $H_k \tilde{P}_k^f H_k^T$ is written in matrix notation as

$$ \tilde{P}_k^f H_k^T = \frac{1}{N-1} \left( X_k^f - \bar{X}_k^f \right) \left[ H_k \left( X_k^f - \bar{X}_k^f \right) \right]^T , \qquad (2.52) $$

$$ H_k \tilde{P}_k^f H_k^T = \frac{1}{N-1} H_k \left( X_k^f - \bar{X}_k^f \right) \left[ H_k \left( X_k^f - \bar{X}_k^f \right) \right]^T . \qquad (2.53) $$

Here the matrix $\bar{X}_k^f$ contains in all columns the vector $\bar{x}_k^f$.
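A hypothetical usage example of the representer sketch above in this matrix notation, with assumed dimensions and randomly generated stand-ins for $H_k$, $R_k$, $X_k^f$, and $Y_k^o$:

```python
rng = np.random.default_rng(seed=2)
n, m, N = 3, 2, 50
H = rng.standard_normal((m, n))            # hypothetical linear H_k
R = 0.1 * np.eye(m)                        # hypothetical R_k
Xf = rng.standard_normal((n, N))           # forecast ensemble X_k^f
y = rng.standard_normal(m)                 # single observation vector
# Observation ensemble Y_k^o with perturbations drawn from N(0, R)
Yo = y[:, None] + rng.multivariate_normal(np.zeros(m), R, size=N).T
Xa = enkf_analysis_representers(Xf, Yo, H, R)
```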

The EnKF has some particular features due to the use of a Monte Carlo method in all phases of the filter:

Remark 21: The EnKF treats the covariance matrix implicitly in a square root form, as is evident from equations (2.43) and (2.45). With this the covariance matrix remains symmetric in the EnKF. As in the SEEK algorithm it is neither required to store the full covariance matrix nor to compute it explicitly.

Remark 22: The forecast phase evolves all $N$ ensemble states with the nonlinear model. This also allows for non-Gaussian densities. Algorithmically, the ensemble evolution has the benefit that a linearized model operator is not required.

Remark 23: The analysis phase is derived from the EKF. Thus, it only accounts for the two lowest statistical moments of the probability density. Using the mean of the forecast ensemble as the state forecast estimate leads, for sufficiently large ensembles, to a more accurate estimate than in the EKF. From the Taylor expansion, equation (2.20), it is obvious that this takes into account higher-order terms than the EKF does. In contrast to the EKF and SEEK filters, $P$ is only updated implicitly by the analysis of the ensemble states.

Remark 24: The representer analysis method applied in the EnKF operates on the observation space. Hence, the error subspace is not explicitly considered. An algorithm which operates on the error subspace is given by the concept of Error Subspace Statistical Estimation (ESSE) [49].

Remark 25: The analysis increments for the ensemble states are computed as weighted means of the vectors $X_k^f - \bar{X}_k^f$, which belong to the error subspace. Thus, the analysis equation (2.51) for the ensemble update can be written as

$$ X_k^a = X_k^f + \left( X_k^f - \bar{X}_k^f \right) \left( \frac{1}{N-1} \left[ H_k \left( X_k^f - \bar{X}_k^f \right) \right]^T B_k \right) . \qquad (2.54) $$

Evensen [18] noted that the analysis can also be interpreted as a weakly nonlinear combination of the ensemble states. The first interpretation, however, shows that the update increments are computed in the error subspace.
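Continuing the representer usage example above, a hedged sketch of this error-subspace reading of equation (2.54); W below is the $N \times N$ weight matrix in parentheses, and all names are again assumptions:

```python
# Increments as weighted combinations of the forecast anomalies
# (eq. 2.54): the (N, N) weight matrix W lives in the error subspace.
A = Xf - Xf.mean(axis=1, keepdims=True)     # X_k^f - Xbar_k^f
HA = H @ A
B = np.linalg.solve(HA @ HA.T / (N - 1) + R, Yo - H @ Xf)
W = HA.T @ B / (N - 1)                      # subspace weights, (N, N)
Xa_subspace = Xf + A @ W                    # equals Xf + (P~ H^T) B
```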

Remark 26: Using a Monte Carlo sampling of the initial probability density, non-Gaussian densities can also be represented. As the sampling converges slowly with $O(N^{-1/2})$, rather large ensembles ($N \approx 100$) are required [17, 19] to avoid overly large sampling errors.

Remark 27: To enhance the quality of the filter estimate for small ensemble sizes, a variant of the EnKF has been proposed which uses a pair of ensembles [34]. From the mathematical viewpoint it is, however, advisable to use ensembles as large as possible to ensure that the statistics can be estimated correctly. In addition, for a given ensemble size, the state estimate of the EnKF using a single ensemble is better than the state estimate of the double-ensemble EnKF with the same total number of ensemble states [84, 35].

Remark 28: Since the estimated correlations of the EnKF will be noisy for small ensembles, it has been proposed [36] to filter the covariances by a Schur product of correlation functions of local support with the ensemble covariance matrix. This technique filters out noisy long-range correlations. Further, correlations at intermediate distances will be weakened. Hence, the influence of observations at intermediate distances is reduced, see [30]. The localization will, however, introduce imbalances into the ensemble states, as has been studied by Mitchell et al. [56].
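As a hedged sketch of such localization, with a smooth exponential taper standing in for the compactly supported correlation functions of [36] (the taper choice, length scale, and one-dimensional grid are assumptions):

```python
import numpy as np

def localize_covariance(P, coords, length_scale):
    """Damp long-range entries of the ensemble covariance P by a
    Schur (elementwise) product with a correlation taper; a minimal
    sketch, not the compactly supported function used in [36]."""
    # Pairwise distances between the grid points of the state vector
    d = np.abs(coords[:, None] - coords[None, :])
    taper = np.exp(-(d / length_scale) ** 2)   # smooth taper in [0, 1]
    return P * taper                           # Schur product

# Hypothetical usage on a 1-D grid with a small, noisy ensemble:
coords = np.arange(10, dtype=float)
rng = np.random.default_rng(seed=3)
A = rng.standard_normal((10, 5))               # 5-member anomalies
P = A @ A.T / 4                                # noisy rank-4 covariance
P_loc = localize_covariance(P, coords, length_scale=2.0)
```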

Remark 29: The generation of an observation ensemble is required to ensure consistent statistics of the updated state ensemble [8]. With the observation ensemble, the covariance matrix $R_k$ is represented as $\tilde{R}_k$ in equation (2.16), which would be missing otherwise. This, however, introduces additional sampling error to the ensemble, which is largest when the ensemble is small compared with the rank of $R_k$, e.g. if $R_k$ is diagonal. Furthermore, it is likely that the state and observation ensembles have spurious correlations. This introduces an additional error term in equation (2.16).

Remark 30: In equations (2.42) and (2.47) it is possible to use, instead of the prescribed matrix $R_k$, the ensemble error covariance matrix $\tilde{R}_k$ of the observation ensemble $\{y_k^{o(\beta)}, \beta = 1, \ldots, N\}$. As proposed by Evensen [18], this allows for an analysis scheme which is numerically very efficient. However, due to the sampling problems of $R_k$, this can lead to a further degradation of the filter quality.

Remark 31: To avoid the requirement of an ensemble of observations, several algorithms have been proposed which perform the analysis only on the ensemble mean and transform the ensemble after this update [1, 5, 94]. These filter algorithms can be interpreted in a unified way as ensemble square root filters [79].

2.4.3 SEIK – The Singular Evolutive Interpolated Kalman Filter