
Regardless of the method employed to obtain rainfall series, through observations or through stochastic generation, and regardless of the temporal scale, statistical analysis is required to quantify the probabilities of the rainfall intensities. Probability distributions can be assessed for the mean rainfall intensities at the relevant time scales, but most important for urban drainage applications are the rainfall extremes. These extremes can be extracted from the full rainfall series using the classical approach of annual maxima (Coles, 2001), where the annual maximum within a (hydrological) year is included in the extreme value analysis. Traditionally, this approach has been used for analysing rainfall extremes (e.g. Schaefer, 1990; Alila, 1999; Wallis et al. 2007). Another approach considers events above a threshold level in the extreme value analysis. This approach, referred to as the Partial Duration Series (PDS) or Peak-Over-Threshold (POT) method, has been used for analysing extreme rainfall at fine temporal scales in, for example, Madsen et al. (2002), Beguería and Vicente-Serrano (2006), Willems et al. (2007) and Willems (2009). The pros and cons of the Annual Maxima Series (AMS) approach versus the PDS/POT method have been discussed, amongst others, by Stedinger et al. (1993), Madsen et al. (1997) and WMO (2009c). The AMS method considers only the maximum event within a year, although other events in that year may exceed the annual maxima of other years. The POT approach provides a more consistent definition of the extreme values by considering all events above a threshold. However, as opposed to the AMS approach, which generally assures independent events, independence criteria have to be defined to ensure independence between extreme events in the PDS/POT series. In addition, the PDS/POT method requires the selection of a threshold level, which introduces a degree of subjectivity in the extreme value analysis. Due to its simpler structure, the AMS-based method is more popular in practice. The PDS/POT analysis, however, appears to be preferable for short records, or where return periods shorter than two years are of interest (WMO, 2009c). Since the theory and application of the AMS approach have been well documented in the hydrologic and engineering literature (Stedinger et al. 1993; WMO, 2009c), this section mainly focuses on the PDS/POT method.
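The difference between the two selection approaches can be illustrated with a minimal Python sketch. The event list, the intensity unit and the threshold of 20 are hypothetical values chosen only for illustration, and the events are assumed to be already separated into independent events.

```python
def annual_maxima(events):
    """Largest intensity per (hydrological) year from (year, intensity) pairs."""
    maxima = {}
    for year, intensity in events:
        maxima[year] = max(intensity, maxima.get(year, intensity))
    return maxima


def peaks_over_threshold(events, threshold):
    """All event intensities above the threshold, regardless of the year."""
    return [intensity for _, intensity in events if intensity > threshold]


# Hypothetical independent event intensities (mm/h), tagged with their year
events = [(2001, 30.0), (2001, 28.0), (2002, 12.0), (2002, 10.0), (2003, 25.0)]

ams = annual_maxima(events)                        # one value per year
pot = peaks_over_threshold(events, threshold=20.0)  # all values above 20
```

In this example the AMS retains the 2002 maximum of 12 but drops the second 2001 event of 28, whereas the POT selection keeps 28 and drops both 2002 events.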

Extreme value distributions

Several probability distributions have been applied to describe the distribution of extreme rainfall intensities at a single site (e.g. Chow, 1964; Benjamin & Cornell, 1970). Common distributions that have been applied to the analysis of AMS include the Gumbel (NRCC, 1989), Generalized Extreme Value (GEV) (NERC, 1975), Log-normal (Pilgrim, 1998), and Log-Pearson type 3 (Niemczynowicz, 1982; Pilgrim, 1998) distributions. Among these distributions, the GEV and its special form, the Gumbel distribution, have been the most widely applied in modelling the annual maximum rainfall series. The Gumbel distribution was found, however, to underestimate the extreme precipitation amounts in several cases (Wilks, 1993). Studies using rainfall data from tropical and non-tropical climatic regions (Wilks, 1993; Nguyen et al. 2002; Zalina et al. 2002) also suggest that a three-parameter distribution can provide sufficient flexibility to represent extreme precipitation data. In particular, the GEV distribution has been found to be the most convenient, since it requires a simpler method of parameter estimation and it is more suitable for regional estimation of extreme rainfalls at sites with limited or no data (Nguyen et al. 2002). When the return periods associated with frequency-based rainfall estimates greatly exceed the length of record available, discrepancies between commonly used distributions tend to increase.

For PDS/POT extremes, following the extreme value theory of Pickands (1975), the tail of the distribution of PDS/POT extremes converges asymptotically to a Generalized Pareto Distribution (GPD). The cumulative distribution function F(x) of the GPD is given by:

$$F(x) = 1 - \left(1 + \gamma\,\frac{x - x_t}{\beta}\right)^{-1/\gamma}$$

where x_t is the threshold level above which the distribution is considered, β is the scale parameter and γ the shape parameter (the symbol κ is also often used, where κ = −γ). For γ = 0 the exponential distribution is obtained as a special case. The parameter γ is also called the “extreme value index” and describes the shape of the tail of the distribution (heavy tail when γ > 0, normal (exponential) tail when γ = 0, light tail with an upper bound when γ < 0). According to the sign of the extreme value index, the following three classes are traditionally considered for extreme value distributions: class I (for γ > 0), class II (for γ = 0), and class III (for γ < 0) (Coles, 2001). The event with a return period or average recurrence interval of T years (referred to as the T-year event) is defined as the (1 − 1/(λT))-quantile of the GPD, where λ is the mean annual number of exceedances of the threshold x_t.
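As an illustration, the GPD cumulative distribution function and the T-year event can be evaluated with the following minimal Python sketch. It assumes the parameterization F(x) = 1 − (1 + γ(x − x_t)/β)^(−1/γ) with the exponential limit for γ = 0; the function names and the numerical values in the usage note are hypothetical.

```python
import math


def gpd_cdf(x, x_t, beta, gamma):
    """GPD cumulative distribution function for x >= x_t."""
    z = (x - x_t) / beta
    if abs(gamma) < 1e-12:
        return 1.0 - math.exp(-z)        # exponential special case (gamma = 0)
    base = 1.0 + gamma * z
    if base <= 0.0:
        return 1.0                       # beyond the upper bound (gamma < 0)
    return 1.0 - base ** (-1.0 / gamma)


def t_year_event(T, lam, x_t, beta, gamma):
    """(1 - 1/(lam*T))-quantile of the GPD; lam = mean exceedances per year."""
    if abs(gamma) < 1e-12:
        return x_t + beta * math.log(lam * T)
    return x_t + beta / gamma * ((lam * T) ** gamma - 1.0)
```

For example, `t_year_event(10.0, 3.0, 5.0, 2.0, 0.1)` returns the intensity that is exceeded on average once in 10 years when three exceedances of x_t = 5 occur per year on average; inserting the result into `gpd_cdf` gives back 1 − 1/30.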

The GPD for PDS/POT extremes is equivalent to the GEV distribution for annual maxima. It can be shown mathematically that, when PDS/POT extremes occur randomly distributed in time (as in a Poisson process), and the exceedances follow a GPD, the corresponding annual maxima are GEV distributed (e.g. Madsen et al. 1997).
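A short derivation sketch makes this equivalence explicit. It assumes that the annual number of exceedances is Poisson distributed with mean λ and that exceedances are independent and GPD distributed with threshold x_t, scale β and shape γ. The GEV location μ and scale σ are symbols introduced here for illustration only.

```latex
\begin{aligned}
P(X_{\max} \le x)
  &= \sum_{k=0}^{\infty} \frac{\lambda^{k} e^{-\lambda}}{k!}\, F(x)^{k}
   = \exp\{-\lambda\,[1 - F(x)]\} \\
  &= \exp\left\{-\lambda \left[1 + \gamma\,\frac{x - x_t}{\beta}\right]^{-1/\gamma}\right\}
   = \exp\left\{-\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\},
   \qquad x \ge x_t, \\
\sigma &= \beta\,\lambda^{\gamma}, \qquad
\mu = x_t + \frac{\beta}{\gamma}\left(\lambda^{\gamma} - 1\right).
\end{aligned}
```

The last expression is a GEV distribution function with the same shape parameter γ as the GPD, so the tail behaviour carries over from the exceedances to the annual maxima.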

Following the extreme value theory, the GPD only holds perfectly when considered asymptotically in the tail (towards values of +∞) (Coles, 2001). This means that it may not hold exactly for the lower rainfall extremes. In most practical applications, however, it has been verified that the GPD also fits well for lower rainfall intensities (e.g. Madsen et al. 2002).

Two conditions have to be fulfilled for the asymptotic convergence of the GPD to apply: the PDS/POT extremes need to be independent and identically distributed (Coles, 2001). For the first condition to hold, independence criteria often have to be imposed on the threshold exceedances. For the second condition to be fulfilled, a careful selection of the threshold level is required.

The criteria for independence between events were studied in detail in the literature review by Arnell et al. (1984). They found that the separation criterion should be between 30 minutes and 12 hours, depending on a number of variables. Willems (2000) defined, for temporal scales of less than 12 hours, two successive PDS extremes to be independent if they are separated by at least a 12-hour time interval. For durations longer than 12 hours, independent events should be separated by a time interval larger than the considered duration.

Madsen et al. (2002) also defined independent events according to their duration. Others split the time series into wet periods by means of dry spell identification. Various approaches for this division exist in the literature. An early theoretical approach was proposed by Restrepo-Posada and Eagleson (1982), which essentially aims at identifying the minimum separation at which events become mathematically independent. Another example is Verworn et al. (2008), who defined dry spells as periods for which the mean rainfall intensity is less than 0.02 mm/min for at least 0.75 hours, while wet periods should have at least 0.1 mm of rain over the whole wet spell period.
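A minimum-separation criterion of this kind can be sketched in Python as follows. The greedy "largest peak first" selection is an assumption made for illustration and is not taken from any of the cited studies; the default separation of 12 hours corresponds to the criterion reported above for temporal scales shorter than 12 hours.

```python
def select_independent_peaks(times_h, intensities, threshold, min_sep_h=12.0):
    """Return (time, intensity) pairs of independent threshold exceedances.

    Exceedances are processed from the largest to the smallest intensity; a
    candidate is kept only if it lies at least min_sep_h hours away from
    every peak that has already been kept.
    """
    candidates = [(x, t) for t, x in zip(times_h, intensities) if x > threshold]
    candidates.sort(reverse=True)            # largest intensity first
    kept = []
    for x, t in candidates:
        if all(abs(t - t_kept) >= min_sep_h for t_kept, _ in kept):
            kept.append((t, x))
    return sorted(kept)                      # chronological order
```

With hypothetical peaks at 0, 2, 5, 20, 40 and 45 hours, intensities 3, 9, 7, 6, 8 and 2, and a threshold of 2.5, the peaks at 2, 20 and 40 hours are retained; the peaks at 0 and 5 hours are discarded because they fall within 12 hours of the larger peak at 2 hours.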

Regarding the selection of the threshold level, different methods have been proposed. An overview of these methods is given in Lang et al. (1999), together with recommendations for operational use. One of the proposed techniques tests the stability of the distribution parameters for varying thresholds. Willems (2000) also applied this method, where the optimal threshold was defined as the threshold above which the mean squared error of the estimated distribution is minimal. The mean squared error is calculated from the differences between the estimated quantiles and the corresponding empirical quantiles. It can be shown that if the threshold exceedances follow the GPD for a given threshold level, then for any higher threshold the exceedances will also follow the GPD with the same shape parameter (e.g. Madsen et al. 1997). This important property of the GPD can be used to select an appropriate threshold level. Instead of defining a threshold, the PDS/POT extremes can also be defined by including the n most extreme events, which corresponds to using a fixed average number of events per year (Mikkelsen et al. 1995).
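The last option, selecting the n most extreme events for a fixed average number of events per year, can be sketched as follows. The function name is hypothetical, and taking the smallest retained peak as the implied threshold is an assumption made here for illustration.

```python
def pds_from_event_rate(independent_peaks, n_years, events_per_year):
    """Keep the n largest peaks, with n = events_per_year * n_years.

    Returns the retained peaks in descending order and the implied
    threshold, taken here as the smallest retained peak (an assumption).
    """
    n = min(len(independent_peaks), int(round(events_per_year * n_years)))
    if n < 1:
        raise ValueError("no events selected")
    retained = sorted(independent_peaks, reverse=True)[:n]
    return retained, retained[-1]
```

For a hypothetical 2-year record with peaks 4, 9, 6, 12, 5, 7, 3 and 8 and an average of 2.5 events per year, the five largest peaks (12, 9, 8, 7, 6) are retained and the implied threshold is 6.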

Distribution parameter estimation

Having defined the PDS/POT extreme value series, the next step is to estimate the distribution parameters. The GPD includes different tail behaviours as described by the shape parameter. In general, a light tail distribution is not expected for rainfall, given that a light tail implies that the distribution has an upper limit. Physically, an upper limit exists as defined by the Probable Maximum Precipitation (PMP) (Chow, 1951; Bruce & Clark, 1966; WMO, 2009b), but this limit is not relevant for urban drainage problems. Estimation of PMP is based on methods other than the extreme value methods presented here. Heavy and normal (exponential) tailed GPDs are found in the literature for rainfall extremes. Based on 10-minute rainfall series in Belgium, Willems (2000) concluded that the distribution’s tail is normal, but that after mixing normal tail distributions for convective and stratiform rain events, a heavier tailed distribution is obtained. This tail behaviour was successfully modelled with a two-component exponential distribution (thereby mixing two normal-tailed GPDs). In a regional analysis of Danish rainfall extremes, Madsen et al. (2002) found that intensities with durations between 1 minute and 48 hours can be described by a heavy-tailed GPD. In the analysis of daily rainfall extremes in the Ebro Valley, Spain, Beguería and Vicente-Serrano (2006) also found that the PDS/POT series could be described by a heavy-tailed GPD.

Extensive reviews of GPD parameter estimation methods that partly go beyond applications in hydrology are given by Madsen et al. (1997), Bermudez and Kotz (2010a) and Bermudez and Kotz (2010b). The traditional methods include the maximum likelihood method (ML), the method of moments (MOM), and probability weighted moments (PWM) or L-moments (Stedinger et al. 1993; Madsen et al. 1997). A relatively new method consists of a combination of likelihood and moment estimators (Zhang, 2007). For other advances, discussions and a comparison with the traditional methods, see Huesler et al. (2011), Mackay et al. (2011), Zhang and Stephens (2009) and Martins and Stedinger (2001). Another class of methods is based on regression in quantile-quantile (Q-Q) plots. These methods have smaller variance in the estimation of the extreme value index, but might be biased due to the asymptotic properties of the extreme value theory (Willems et al. 2007). L-moment and ML estimation of the GPD are described in Box 2.3 and Box 2.4.

Box 2.3 Parameter estimation in the GPD: the L-moments method

Assume that we have selected a threshold, x_t. From this, a PDS of n extreme values is defined.

Estimates of β and γ are obtained by the following equations:

$$\gamma = -\frac{l_1 - x_t}{l_2} + 2, \qquad \beta = (l_1 - x_t)(1 - \gamma)$$

where l_1 and l_2 are the first and second L-moment estimates, respectively. These are, in turn, related to the first and second PWM estimates:

The notation x(i) indicates that the extreme values have been sorted in descending order. L-moments are linear combinations of the PWM and thereby linear combinations of the ranked observations. For more details on the estimation procedure and a discussion of the method, the reader is referred to Stedinger et al. (1993) and Hosking (1990).
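The L-moment estimation can be sketched in Python as below. The PWM estimates b0 and b1 use the common unbiased estimator for observations ranked in descending order, with l_1 = b0 and l_2 = 2 b1 − b0; this convention and the names b0 and b1 are assumptions made for illustration, since other PWM estimators (for instance based on plotting positions) also exist.

```python
def gpd_lmoment_fit(pds, x_t):
    """L-moment estimates (beta, gamma) of the GPD for a PDS above x_t."""
    x = sorted(pds, reverse=True)            # x[0] is the largest value
    n = len(x)
    if n < 2:
        raise ValueError("need at least two extreme values")
    # Unbiased PWM estimates for descending ranks i = 1..n (assumed convention)
    b0 = sum(x) / n
    b1 = sum((n - i) * xi for i, xi in enumerate(x, start=1)) / (n * (n - 1))
    l1 = b0                                  # first L-moment (mean)
    l2 = 2.0 * b1 - b0                       # second L-moment (L-scale)
    gamma = 2.0 - (l1 - x_t) / l2
    beta = (l1 - x_t) * (1.0 - gamma)
    return beta, gamma
```

As a simple check, the evenly spaced sample 1, 2, 3, 4, 5 with x_t = 0 gives γ = −1 and β = 6, which is the GPD special case of a uniform distribution on the interval from 0 to 6.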

Computations of the uncertainty in the parameter estimates are more complicated. They can be derived from asymptotic theory, where approximate equations for the variances are obtained from Taylor series expansion (Rosbjerg et al. 1992). The resulting equations can be found in Hosking and Wallis (1987). They have been implemented in many common software tools. Appendix A contains a practical guide on L-moment based GPD parameter estimation, using the open source software R.

Box 2.4 Parameter estimation in the GPD: the ML method

Again we have selected an adequate threshold, x_t, and from this a PDS of n extreme values is defined.

In the ML theory we explicitly utilize that x follows a GPD. As the parameters β and γ are unknown, the density function f(x) is used to predict the likelihood of different parameter values for all x. The parameter set with the highest likelihood will give the best description of the data. The likelihood function reads:

$$L(\beta,\gamma) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{\beta}\left(1 + \gamma\,\frac{x_i - x_t}{\beta}\right)^{-1/\gamma - 1}$$

Estimated T-year rainfall intensities may have large sampling uncertainties, especially when extrapolated far beyond the observation period. To obtain more accurate estimates, additional information should be included in the estimation process. A common approach is the use of regional information where information from several rainfall records that can be assumed to have similar extreme rainfall characteristics is combined. Importantly, regionalization also allows estimation at ungauged locations (WMO, 2009c).

Regional analysis

Regional frequency analysis involves the following basic steps: (i) identification of a homogeneous set of stations with similar extreme value characteristics, (ii) determination of a regional extreme value distribution, and (iii) combination of extreme rainfall records from the different sites in the region for estimation of the regional distribution. The regional frequency analysis approach based on L-moments has been widely applied in hydrology since it was first introduced by Hosking and Wallis (1993). The method is based on an “index flood” approach. The key assumption of this approach is that data from different sites follow the same distribution except for a site-specific scaling factor, the index parameter. Usually the mean of the distribution is taken as the index parameter. Homogeneity is then defined in terms of constant second and higher order moments.

The regional frequency analysis approach proposed by Hosking and Wallis (1993) includes a homogeneity test based on L-moments where the dispersion of L-moment ratio estimates from the group of sites is compared to the expected dispersion in a homogeneous region (i.e. due to sampling uncertainty). Similarly, an L-moment based goodness-of-fit test can be used for choosing an appropriate regional distribution. Regional distribution parameters corresponding to second and higher order moments are estimated by weighting the site-specific L-moment estimates from the group of sites.
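The weighting step and the index-flood scaling can be illustrated with a minimal Python sketch. Weighting the site estimates by record length is an assumption made here for illustration (the text does not specify the weights), and the function names and numbers are hypothetical.

```python
def regional_average(site_values, record_lengths):
    """Record-length-weighted regional average of a site statistic (e.g. L-CV)."""
    total = sum(record_lengths)
    return sum(n * v for v, n in zip(site_values, record_lengths)) / total


def site_quantile(index_value, growth_factor):
    """Index-flood estimate: site index parameter times regional growth factor."""
    return index_value * growth_factor
```

For three hypothetical sites with L-CV estimates 0.20, 0.30 and 0.25 and record lengths of 10, 30 and 20 years, `regional_average([0.20, 0.30, 0.25], [10, 30, 20])` gives a regional L-CV of about 0.267, in which the longest record carries the largest weight.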

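For practical reasons, it is often the log-likelihood function of the GPD that is maximized in the ML method:

$$l(\beta,\gamma) = -n\log(\beta) - \left(\frac{1}{\gamma} + 1\right)\sum_{i=1}^{n}\log\left(1 + \gamma\,\frac{x_i - x_t}{\beta}\right)$$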

When γ = 0 (exponential distribution), the ML estimate of β equals the mean value of the exceedances (similar to the MOM and L-moment estimates). When γ ≠ 0, numerical techniques are required for estimation of the parameters. Methods for numerical maximization are implemented in most statistical software packages (see Appendix A using the open source software R). Convergence problems can occur, leading to implausible parameter values, especially when γ < 0 (Coles, 2001). The ML estimator is asymptotically the most efficient estimator (has the lowest mean squared error). However, for small sample sizes the ML method may result in unreasonable estimates of the shape parameter (e.g. Madsen et al. 2007; Martins & Stedinger, 2001).

Several methods exist for estimation of the uncertainty in the parameters obtained by the ML method. Basically, they all evaluate the second order partial derivatives of the function l(β,γ), as these express how well the maximum of l(β,γ) is defined in terms of the curvature. The matrix of second order partial derivatives is known as the Information matrix (J), the Hessian matrix or the Fisher matrix. A central assumption in standard likelihood theory is that the ML estimator asymptotically follows a multivariate normal distribution with J⁻¹ as variance-covariance matrix. For details on estimation procedures see Coles (2001).
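The ML procedure and the curvature-based uncertainty estimate can be sketched in pure Python as follows. This is an illustrative sketch only: the compass search is a deliberately simple stand-in for the numerical optimizers found in statistical software, the second derivatives are approximated by finite differences, and all function names (including the `simulate_gpd` helper) are hypothetical.

```python
import math
import random


def gpd_loglik(beta, gamma, exceedances):
    """Log-likelihood l(beta, gamma) for exceedances y_i = x_i - x_t."""
    if beta <= 0.0:
        return -math.inf
    n = len(exceedances)
    if abs(gamma) < 1e-9:                     # exponential limit (gamma = 0)
        return -n * math.log(beta) - sum(exceedances) / beta
    total = 0.0
    for y in exceedances:
        if 1.0 + gamma * y / beta <= 0.0:
            return -math.inf                  # outside the support
        total += math.log1p(gamma * y / beta)
    return -n * math.log(beta) - (1.0 / gamma + 1.0) * total


def gpd_ml_fit(exceedances, tol=1e-6):
    """Maximize l over (log beta, gamma) with a simple compass search."""
    def f(p):
        return gpd_loglik(math.exp(p[0]), p[1], exceedances)
    p = [math.log(sum(exceedances) / len(exceedances)), 0.1]   # start values
    best, step = f(p), 0.5
    while step > tol:
        moves = [[p[0] + d0, p[1] + d1]
                 for d0, d1 in ((step, 0), (-step, 0), (0, step), (0, -step))]
        values = [f(m) for m in moves]
        if max(values) > best:
            best = max(values)
            p = moves[values.index(best)]     # accept the best neighbour
        else:
            step /= 2.0                       # no improvement: refine the step
    return math.exp(p[0]), p[1]


def gpd_ml_covariance(beta, gamma, exceedances, h=1e-3):
    """Inverse of the observed information matrix J (minus the Hessian of l)."""
    def l(db, dg):
        return gpd_loglik(beta + db, gamma + dg, exceedances)
    hb, hg = h * beta, h
    l0 = l(0.0, 0.0)
    j_bb = -(l(hb, 0) - 2 * l0 + l(-hb, 0)) / hb ** 2
    j_gg = -(l(0, hg) - 2 * l0 + l(0, -hg)) / hg ** 2
    j_bg = -(l(hb, hg) - l(hb, -hg) - l(-hb, hg) + l(-hb, -hg)) / (4 * hb * hg)
    det = j_bb * j_gg - j_bg ** 2
    return [[j_gg / det, -j_bg / det], [-j_bg / det, j_bb / det]]


def simulate_gpd(n, beta, gamma, seed=0):
    """Draw n GPD exceedances by inverse-transform sampling (gamma != 0)."""
    rng = random.Random(seed)
    return [beta / gamma * ((1.0 - rng.random()) ** (-gamma) - 1.0)
            for _ in range(n)]
```

A typical use is `beta_hat, gamma_hat = gpd_ml_fit(y)` followed by `cov = gpd_ml_covariance(beta_hat, gamma_hat, y)`, where `y` holds the exceedances above x_t; the square roots of the diagonal elements of `cov` are approximate standard errors of β and γ under the asymptotic normality assumption stated above.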

The L-moment based regional estimation procedure has been widely used in extreme rainfall analysis. In a regional analysis in Canada, Alila (1999) found that the L-skewness of annual maximum precipitation could be assumed constant in the entire country, and homogeneous regions with constant L-coefficient of variation (L-CV) could be defined according to the mean annual precipitation (MAP). Di Baldassarre et al. (2006) detected, for a dense network of rain gauges in northern central Italy, significant relationships between the L-moments of annual maximum rainfall intensities and the MAP. These relationships were valid for durations ranging from 15 min to 1 day. Wallis et al. (2007) divided Washington State into 12 regions, and within each region L-CV and L-skewness were found to vary systematically with MAP. A similar approach was applied by Haddad et al. (2011) for rainfall extremes in Australia, where regression equations were developed for L-skewness, L-CV and the mean of annual precipitation maxima. However, the use of MAP as an “index variable” may not be appropriate for other regions with different climatic or topographic conditions. For instance, the median of annual maximum rainfalls at a site was recommended as the index variable for regional estimation of extreme rainfalls by the UK Institute of Hydrology (1999). In general, one of the main difficulties in the application of this technique is related to the definition of “homogeneous” regions. Various methods have been proposed for determining regional homogeneity, but there is no generally accepted procedure in practice (Fitzgerald, 1989; Schaefer, 1990; Hosking & Wallis, 1993; Fernandez Mills, 1995; Nguyen et al. 2002).

Madsen et al. (2002) developed a regional PDS/POT model for rainfall extremes in Denmark and found that the GPD shape parameter could be considered constant for the entire country. Sub-regions were defined with constant PDS/POT mean value, and a regression model was developed for describing the regional variability of the Poisson parameter from MAP. For rainfall extremes in the Ebro Valley, Spain, Beguería and Vicente-Serrano (2006) developed a regional PDS/POT model using regression relations of the threshold level and GPD parameters with location and relief variables.