Weather generator - Modelling phytoplankton, passengers and drivers of lake ecosystems

3.2 Methods

3.2.1 Weather generator

The proposed weather generator employs a single Vector-Autoregressive (VAR) process. Here, the vector is the set of simulated meteorological variables at one time step. VAR processes capture the auto- and cross-correlations in multivariate time series by separating them into a deterministic, linearly dependent part and a random part. They assume that the time series have time-invariant means and standard deviations. Time series generated by stable VAR processes with Gaussian noise have normally distributed marginals.

The simulated variables are listed in Table 3.1. Rain was deemed to be of minor importance for the thermal and volumetric budget of the lake: compared to the large water volume of the model system, direct precipitation on the lake surface is negligible (Bäuerle et al., 1998). Rain was therefore not simulated; instead, long-term averages of daily precipitation measurements were used. The overall structure of VG is shown in Fig. 3.2.

Table 3.1: Meteorological variables generated by the weather generator VG. The last column lists the parametric distributions used for variable transformation (see section 3.2.1). “Empirical” refers to a kernel density estimation.

Variable                        Symbol       Distribution
Air temperature                 θ            Normal
Short-wave radiation            Q_sw         Empirical
Incident long-wave radiation    Q_lw(in.)    Normal
Relative humidity               φ            Truncated Normal
Eastward wind speed             u            Empirical
Northward wind speed            v            Empirical

[Figure 3.2 (flowchart): Measurement data X → Transformation F_doy(X) → Fitting the VAR process (A_i, COV(ε_t)) → Simulation of time series y_t = Σ_{i=1}^{p} A_i y_{t−i} + ε_t + m, with scenario perturbation (m, m_t) → Re-transformation X̂ = F_doy⁻¹(Y) → Disaggregation of Q_sw, u and v → Synthetic time series X̂. F_doy is obtained from KDE or parametric distributions.]

Figure 3.2: Structure of the weather generator VG. Boxes with grey background refer to computations in the standard-normal transformed domain.

Variable transformation

Meteorological variables show diverse marginal distributions and usually exhibit strongly season-dependent means, standard deviations and higher-order moments. To address this non-stationarity, we employ a day-of-year (doy)-specific quantile mapping to obtain stationary, standard-normally distributed variables.

Two approaches to obtain an annual distribution function for quantile mapping are employed here: (1) approximating the annual cycle of the parameters of theoretical distributions by Fourier series (annual distributions) and (2) pseudo-2-dimensional Kernel Density Estimation (annual KDE). Table 3.1 shows which distribution was chosen for each variable. Inserting the measured variables together with their doy into the distribution function F_doy gives estimates of de-seasonalised quantiles. These are then converted to standard-normally distributed variables by inserting them into the inverse distribution function of the normal distribution with zero mean and a standard deviation of one.

The first case (fitting the annual distribution) consists of two steps. First, a set of parameters per doy is calculated by fitting a theoretical distribution to the measurements of that doy and the neighbouring doys, resulting in a series of 366 parameter sets. These parameters closely follow the specifics of the dataset. Second, in order to generalise, the parameter series is smoothed by approximating it with a Fourier series of order 4. In effect, the distribution parameters p_1,doy, ..., p_k,doy are given by a function of the doy:

$$
p_{j,doy} = \frac{a_{j,0}}{2} + \sum_{n=1}^{4} \left[ a_{j,n} \cos\left(\frac{n\,2\pi\,doy}{T}\right) + b_{j,n} \sin\left(\frac{n\,2\pi\,doy}{T}\right) \right] \tag{3.1}
$$

Here, T denotes the length of the annual cycle in days, and a_j,n and b_j,n are coefficients that are obtained by discrete Fourier transform. In the case of air temperature (θ), the distribution parameters are p_1,doy = µ_doy (mean) and p_2,doy = σ_doy (standard deviation), i.e. the parameters of the normal distribution, and the doy-specific distribution function becomes F_doy(X) = Φ(X, µ_doy, σ_doy), where Φ is the distribution function of the normal distribution. This procedure allows for a smooth change of the variables throughout the year without introducing a large number of free parameters.
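The two steps above can be illustrated with a minimal sketch. It assumes that smoothing is done by truncating the discrete Fourier transform of the per-doy parameter series after the fourth harmonic; the function names, the synthetic `raw_mu` series and the constant standard deviation of 3 °C are hypothetical and only serve the illustration.

```python
import numpy as np
from statistics import NormalDist

def fourier_smooth(params, order=4):
    """Smooth a per-doy parameter series by keeping harmonics 0..order only."""
    coeffs = np.fft.rfft(params)
    coeffs[order + 1:] = 0.0  # discard every harmonic above `order`
    return np.fft.irfft(coeffs, n=len(params))

def to_standard_normal(x, mu, sigma):
    """Quantile mapping: q = Phi(x; mu, sigma), then the standard-normal inverse."""
    q = NormalDist(mu, sigma).cdf(x)
    return NormalDist().inv_cdf(q)

# Hypothetical raw per-doy means: an annual cycle plus fitting noise.
rng = np.random.default_rng(0)
doys = np.arange(366)
raw_mu = 10.0 + 8.0 * np.sin(2 * np.pi * doys / 366) + rng.normal(0, 0.5, 366)

mu_smooth = fourier_smooth(raw_mu)  # smoothed mu_doy, as in Equation 3.1
z = to_standard_normal(15.0, mu_smooth[180], 3.0)  # transformed temperature
```

For a normally distributed variable the mapping reduces to (x − µ_doy)/σ_doy; the two-step form is kept here because the same pattern applies to the other distributions in Table 3.1.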

The second case (obtaining a Kernel Density Estimate) is more data-centric and does not assume that a measured variable follows a specific theoretical distribution. Short-wave radiation (Q_sw) and the wind speed components (u and v) exhibited annual cycles that were hard to describe using trigonometric functions as in Equation 3.1. Our variant of KDE gives a one-dimensional estimate of the probability density for each doy, but takes values from neighbouring doys into account:

$$
\hat{f}_{doy}(x) = \frac{\sum_{i \in \{|doy_x - doy_{x_i}| < 15\}} K_x\left(\frac{x - x_i}{h_{doy}}\right) \cdot K_{doy}\left(\frac{doy_x - doy_{x_i}}{15}\right)}{h_{doy} \cdot \#\{|doy_x - doy_{x_i}| < 15\}} \tag{3.2}
$$

K_x is a Gaussian kernel for the dimension of the variable x, K_doy is a triangular kernel for the doy dimension and h_doy is the doy-specific kernel width of the x dimension. The number of x-values less than 15 doys away from measurement x is given by #{|doy_x − doy_x_i| < 15}. The width of the doy dimension, 15, was chosen by hand to give a reasonable number of data points for each doy (usually (14·2 + 1)·n_years). h_doy was optimised by maximum likelihood, using leave-one-out cross-validation. h_doy was further smoothed in the doy domain to allow a greater abstraction from the data set. Otherwise, back-transformed simulated values would follow the distinct short-term fluctuations (noise) inherent in the data set. The distribution function F̂_doy is obtained by numerical integration of f̂_doy(x).
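A minimal sketch of Equation 3.2 follows, term by term. The function name, the circular treatment of the doy distance across the turn of the year and the `period` of 365 days are assumptions made for the illustration, not details taken from the text. Because the triangular weights are not renormalised in Equation 3.2, the result is best read as proportional to a density.

```python
import numpy as np

def annual_kde_density(x, doy, data_x, data_doy, h, window=15, period=365):
    """Evaluate Equation 3.2 at value `x` for day of year `doy`."""
    # Circular doy distance (assumption): late December neighbours early January.
    d = np.abs(data_doy - doy)
    d = np.minimum(d, period - d)
    near = d < window
    if not near.any():
        return 0.0  # no measurements within the doy window
    # Gaussian kernel K_x in the dimension of the variable.
    u = (x - data_x[near]) / h
    k_x = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    # Triangular kernel K_doy in the doy dimension.
    k_doy = 1.0 - d[near] / window
    return float(np.sum(k_x * k_doy) / (h * near.sum()))
```

Evaluating this function on a grid of x-values and integrating numerically would give the corresponding distribution function for one doy.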

VAR fitting

The VAR process is given in the form:

$$
y_t = \sum_{i=1}^{p} A_i y_{t-i} + \varepsilon_t + m \tag{3.3}
$$

y_t is a K-dimensional vector of transformed observations for time step t, the A_i are K×K matrices containing the parameters of the process, p is the order of the process and ε_t is a K-dimensional vector containing the residual for time step t. This means that the weather of the current day (y_t) depends on the weather of the preceding days (y_t−i), plus white noise (ε_t). m is an additional disturbance vector that is used to generate scenarios. The entries in A_i capture the auto- and cross-correlations of the dataset. To estimate the VAR parameters (A_i and the covariance matrix of ε_t), the Least Squares Estimator was used (Lütkepohl, 2005, p. 70).
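As an illustration of the estimation step, the following sketch fits a VAR(p) with intercept by multivariate least squares. It is a generic textbook-style estimator written with NumPy, not the implementation used in VG; the function name `fit_var` and the degrees-of-freedom correction in the covariance estimate are choices made for this example.

```python
import numpy as np

def fit_var(y, p):
    """Least-squares fit of y_t = m + sum_i A_i y_{t-i} + eps_t; y has shape (T, K)."""
    T, K = y.shape
    # One regressor row per usable time step: [1, y_{t-1}, ..., y_{t-p}].
    lags = [y[p - i:T - i] for i in range(1, p + 1)]
    Z = np.hstack([np.ones((T - p, 1))] + lags)
    Y = y[p:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # shape (1 + K * p, K)
    resid = Y - Z @ B
    m = B[0]  # intercept vector
    # Row block i of B holds the transpose of A_i.
    A = [B[1 + (i - 1) * K:1 + i * K].T for i in range(1, p + 1)]
    # Residual covariance with a degrees-of-freedom correction.
    sigma = resid.T @ resid / (T - p - K * p - 1)
    return A, m, sigma
```

Applied to the standard-normal transformed observations, `A` and `sigma` would play the roles of A_i and the covariance matrix of ε_t in Equation 3.3.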

Generation of time series

Time series are generated by replacing the residuals ε_t in Equation 3.3 with vectors drawn from a multivariate normal distribution (for more details see Lütkepohl, 2005, p. 707). Because the simulated values are normally distributed, they are transformed back into the measurement domain by using the inverse of the distribution functions (F_doy⁻¹) obtained in section 3.2.1, “Variable transformation”.
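The generation step can be sketched as follows, assuming that the matrices A_i, the residual covariance and m are already available. The burn-in period, the zero start values and the helper `back_transform_normal` (which covers only normally distributed variables such as air temperature) are assumptions of this example.

```python
import numpy as np

def simulate_var(A, sigma, m, n_steps, rng, burn_in=100):
    """Simulate Equation 3.3 with multivariate-normal innovations."""
    K, p = sigma.shape[0], len(A)
    eps = rng.multivariate_normal(np.zeros(K), sigma, size=n_steps + burn_in)
    y = np.zeros((n_steps + burn_in + p, K))  # zero start values (assumption)
    for t in range(p, len(y)):
        y[t] = m + eps[t - p] + sum(A[i] @ y[t - 1 - i] for i in range(p))
    return y[p + burn_in:]  # drop start values and burn-in

def back_transform_normal(z, mu_doy, sigma_doy):
    """Inverse of F_doy for a normal variable: x = mu_doy + sigma_doy * z."""
    return mu_doy + sigma_doy * z
```

For variables transformed with a KDE or a truncated normal, the back-transformation would instead evaluate the inverse of the respective distribution function numerically.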

Generation of scenarios with a changed mean

The VAR process was further adjusted to allow for manipulation of key output statistics, i.e. for simulating scenarios. The mean of a K-dimensional VAR process of order p, given in the form of Equation 3.3, can be adjusted by setting m to

$$
m = \left( I - \sum_{i=1}^{p} A_i \right) y \tag{3.4}
$$

I is the K×K identity matrix and y is the vector of desired means. A design goal was to let the user define a change in mean air temperature ∆θ and have the weather generator set the means of the other variables accordingly. First, the change given in °C has to be converted to a change in the transformed domain. As θ and θ^trans are both normally distributed, with standard deviations σ_θ,doy (given by Equation 3.1) and σ = 1 respectively, this amounts to a simple division, namely ∆θ^trans = ∆θ/σ_θ,doy. Using the covariances between the transformed air temperature θ^trans and the other transformed variables y_i^trans, the non-air-temperature elements i of y are obtained similarly to a linear regression:

$$
y_i = \Delta\theta^{trans} \, \frac{\sigma_{\theta^{trans},\, y_i^{trans}}}{\sigma^2_{\theta^{trans}}} \tag{3.5}
$$

As the link between the desired temperature change ∆θ and all elements of m is established, all further scenario definitions can be expressed in terms of ∆θ. By changing the theoretical mean of the VAR process and keeping the rest of its parameters unchanged, we assume that the linear dependence structure of the variables remains the same under changed climatic conditions.
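Equations 3.4 and 3.5 can be combined in a few lines. The sketch below assumes a single value of σ_θ,doy (in practice it varies with the doy) and a covariance matrix `cov_y` of the transformed variables; the function name and the index argument `theta_idx` are hypothetical.

```python
import numpy as np

def scenario_intercept(A, cov_y, delta_theta, sigma_theta, theta_idx=0):
    """Return m (Equation 3.4) and the desired means y (Equation 3.5)."""
    K = cov_y.shape[0]
    # Temperature change converted to the transformed domain.
    d_trans = delta_theta / sigma_theta
    # Equation 3.5: regression-like scaling by covariance over variance.
    # The temperature entry itself evaluates to d_trans.
    y_bar = d_trans * cov_y[theta_idx] / cov_y[theta_idx, theta_idx]
    # Equation 3.4: disturbance vector producing the desired process mean.
    m = (np.eye(K) - sum(A)) @ y_bar
    return m, y_bar

# Hypothetical two-variable example with a VAR(1).
A = [0.5 * np.eye(2)]
cov_y = np.array([[1.0, 0.5], [0.5, 1.0]])
m, y_bar = scenario_intercept(A, cov_y, delta_theta=2.0, sigma_theta=4.0)
```

In this example the temperature shift of 2 °C corresponds to 0.5 in the transformed domain, and the second variable is shifted by half of that because of its covariance of 0.5 with temperature.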

Generation of scenarios with higher variability

In order to increase climate variability, the change in air temperature was applied in a non-stationary manner, as a time-dependent ∆θ_t.

In the context of this study, climate variability is quantified by the statistics of periods in which daily average air temperatures deviate from the long-term average for the respective day of year. These episodes can be described by their duration in days and their deviation from the mean in °C. In the measured air temperatures of the reference period, episode duration shows strong similarity to an exponential distribution with an average value of 5.3 days. Apart from a bimodality around 0 °C, the amplitudes are approximately normally distributed with a standard deviation of 2.3 °C. Here, increasing climate variability means increasing the duration and amplitude of the episodes.

In order to increase this episode variability, artificial episodes of ∆θ were generated. This is done by generating random numbers following an exponential distribution to be used as episode lengths and normally distributed random numbers to be used as episode amplitudes. These episodes are used to change the theoretical means of the underlying VAR process.
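A minimal sketch of such an episode generator is given below. Rounding the exponential draws to whole days with a minimum of one day, and holding ∆θ constant within an episode, are assumptions of this example; the parameter values passed in are up to the scenario definition.

```python
import numpy as np

def generate_episodes(n_days, mean_duration, amp_std, rng):
    """Piecewise-constant series of Delta-theta_t values in degrees Celsius."""
    delta = np.empty(n_days)
    t = 0
    while t < n_days:
        # Episode length: exponential draw, rounded to at least one day.
        length = max(1, int(round(rng.exponential(mean_duration))))
        # Episode amplitude: one normal draw held for the whole episode.
        delta[t:t + length] = rng.normal(0.0, amp_std)
        t += length
    return delta
```

The resulting series could then be converted to the transformed domain and fed into Equation 3.4 day by day to obtain a time-dependent m_t.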

Disaggregation

Short-wave radiation and the wind speed components were disaggregated to hourly values because we wanted to run the lake model with sub-daily meteorological input. The scheme to disaggregate short-wave radiation is deterministic and adds a typical daily cycle while maintaining the mean daily short-wave radiation generated by the weather generator. Wind speed components are disaggregated by resampling differences between daily and hourly values in the measured data. These differences are added to the daily simulated wind speed components in chunks of two days. This largely maintains the cross- and autocorrelations in the disaggregated time series. For the non-disaggregated variables, daily means are applied to each hour of the day.
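The text does not specify the shape of the daily cycle used for short-wave radiation. The sketch below therefore assumes a half-sine between fixed, hypothetical sunrise and sunset hours, purely to show how a deterministic cycle can be imposed while preserving the simulated daily mean.

```python
import numpy as np

def disaggregate_shortwave(daily_mean, sunrise=6, sunset=18):
    """Spread a daily mean over 24 hourly values with a half-sine daylight cycle."""
    hours = np.arange(24) + 0.5  # mid-points of the hourly intervals
    shape = np.sin(np.pi * (hours - sunrise) / (sunset - sunrise))
    # Radiation is zero outside daylight hours.
    shape = np.where((hours > sunrise) & (hours < sunset), shape, 0.0)
    # Rescale so that the hourly values average to the daily mean.
    return daily_mean * shape / shape.mean()

hourly = disaggregate_shortwave(150.0)  # hypothetical daily mean in W/m^2
```

A more realistic variant would take sunrise and sunset from the doy and the latitude of the lake instead of fixed hours.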
