First Results of the Daily Stew Project
Ralf Lindau
First Steps
Data:
Climate stations of DWD with daily data (or even 7, 14, 21 h)
Although the project is focused on weather indices and extremes, we initially consider a much simpler parameter: monthly mean temperature
Derive and test methods to:
Create reference time series
Differences between a station and its reference are expected to be zero
Detect breaks
Maximize the external variance with a minimum number of breaks
DWD Climate Stations
[Figure: maps of the station network; panels 1900, 1920, 1960, 2000]
1200 stations in total, but not coexistent.
Number of stations: 1900: 25, 1940: 100, 1960: 500, 2000: 600
Kriging Approach
• n observations x_i at the locations P_i are given.
• Perform a prediction x_0 for the location P_0, where no observation is available.
• Construct the prediction as a weighted average of the observations x_i.
• Take into account the observation errors Δx_i.
• Determine the weights λ_i.
$$\sum_{t=1}^{m} \left( x_0 - \sum_{i=1}^{n} \lambda_i \, (x_i + \Delta x_i) \right)^2 = \min$$
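The minimization above can be sketched in a few lines. This is a minimal sketch, not the project's implementation. It assumes normalized anomalies (unit variance), observation errors that are uncorrelated with each other and with the signal, and error variances given in units of the signal variance. Under these assumptions the weights solve a linear system whose matrix is the station correlation matrix with the error variances added to the diagonal. The names `kriging_weights`, `corr`, `corr0` and `err_var`, and the example numbers, are illustrative.

```python
import numpy as np

def kriging_weights(corr, corr0, err_var):
    """Weights lambda_i that minimize the mean squared prediction error.

    corr    : (n, n) correlations between the n neighbouring stations
    corr0   : (n,)   correlations between each neighbour and the target P0
    err_var : (n,)   observation error variances (assumed uncorrelated)
    """
    corr = np.asarray(corr, dtype=float)
    corr0 = np.asarray(corr0, dtype=float)
    # Uncorrelated errors only inflate the diagonal of the matrix.
    matrix = corr + np.diag(np.asarray(err_var, dtype=float))
    return np.linalg.solve(matrix, corr0)

def predict(weights, obs):
    """Prediction x0 as the weighted average of the observations."""
    return float(np.dot(weights, obs))

# Two neighbours with made-up correlations and error variances.
corr = np.array([[1.0, 0.8], [0.8, 1.0]])
corr0 = np.array([0.9, 0.9])
weights = kriging_weights(corr, corr0, err_var=[0.1, 0.1])
x0 = predict(weights, [1.2, 0.8])
```

Larger error variances shrink the weights, so noisy neighbours contribute less to the reference.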
Matrix and Input
Correlations
The spatial autocorrelation is derived from all available data for each of the 12 months.
High correlations for monthly mean temperature.
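Since the stations are not coexistent, a pairwise correlation can only use the years in which both stations report. The following sketch shows one way to do this for a single calendar month. The function name and the minimum of three common years are assumptions made for illustration.

```python
import numpy as np

def monthly_correlation(series_a, series_b, min_years=3):
    """Correlation of two stations for one calendar month.

    series_a, series_b : one value per year, NaN where data are missing
    Returns (correlation, number of common years).
    """
    a = np.asarray(series_a, dtype=float)
    b = np.asarray(series_b, dtype=float)
    common = ~np.isnan(a) & ~np.isnan(b)      # years with data at both stations
    n_common = int(common.sum())
    if n_common < min_years:
        return float("nan"), n_common
    r = np.corrcoef(a[common], b[common])[0, 1]
    return float(r), n_common
```

Repeating this for every station pair and each of the 12 months gives the correlations as a function of station distance.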
Potsdam and Reference
A reference for each station is created by kriging of the surrounding 16 stations.
Normalized temperature anomaly in January for station Potsdam.
Station and reference seem to be nearly identical.
However, their difference shows a positive trend from 1930 to 2000.
Defining breaks
Breaks are defined by abrupt changes in the station-reference time series.
Internal variance: within the subperiods
External variance: between the means of different subperiods
Maximize the external variance with a minimum number of breaks
Decomposition of Variance
m years, N subperiods, n_k members in subperiod k
The external variance is a weighted measure of the variability of the subperiods' means.
The internal variance contains information about the error of the subperiods' means.
The apparent external variance has to be reduced by this error to obtain the true external variance.
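The decomposition can be checked numerically. The sketch below splits a difference series at given break positions and returns the external and internal variance, which add up to the total variance. The correction in `corrected_external` is an assumption: it uses the standard analysis-of-variance expectation that pure noise with the observed internal variance produces an external variance of internal × (N − 1)/(m − N). The slides do not state the exact formula used.

```python
import numpy as np

def decompose_variance(diff, breaks):
    """Split the variance of `diff` into external and internal parts.

    diff   : station minus reference, one value per year
    breaks : indices at which a new subperiod starts (0 < b < len(diff))
    Returns (external, internal); their sum is the total variance.
    """
    diff = np.asarray(diff, dtype=float)
    m = diff.size
    segments = np.split(diff, sorted(breaks))
    total_mean = diff.mean()
    # Weighted spread of the subperiod means around the overall mean.
    external = sum(s.size * (s.mean() - total_mean) ** 2 for s in segments) / m
    # Spread of the values around their own subperiod mean.
    internal = sum(((s - s.mean()) ** 2).sum() for s in segments) / m
    return external, internal

def corrected_external(external, internal, m, n_sub):
    """External variance minus the part expected from noise alone
    (ANOVA-style assumption, not taken from the slides)."""
    return external - internal * (n_sub - 1) / (m - n_sub)

external, internal = decompose_variance([0, 0, 0, 2, 2, 2], breaks=[3])
```

In this toy series the single break explains everything: the internal variance is zero and the external variance equals the total variance.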
Break Criterion
The true external variance is used as criterion for breaks.
The first break
The difference time series increases from 1930 to 2000 (as already shown).
Between 1965 and 1985 the criterion reaches maximum values.
More than 20% of the total variance can be explained by a break in one of these years.
Year           1970   1968   1969   1979   1967   1978   1980   1971   1972   1981
Criterion (%)  21.77  21.76  21.67  21.64  21.41  21.33  21.07  20.95  20.87  20.77
[Figure: criterion and difference time series]
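A ranking of first-break candidates like the one above can be produced by scanning every possible break position. The sketch below is illustrative only. The function `criterion` uses an assumed ANOVA-style noise correction, since the slides give no formula, and positions are array indices (year = first year + index).

```python
import numpy as np

def criterion(diff, breaks):
    """Corrected external variance as a fraction of the total variance."""
    diff = np.asarray(diff, dtype=float)
    m = diff.size
    segments = np.split(diff, sorted(breaks))
    n_sub = len(segments)
    internal = sum(((s - s.mean()) ** 2).sum() for s in segments) / m
    total = diff.var()
    external = total - internal
    # Assumed correction: external variance expected from noise alone.
    noise = internal * (n_sub - 1) / (m - n_sub)
    return (external - noise) / total

def rank_first_breaks(diff, n_best=10):
    """Return the n_best (position, criterion) pairs, best first."""
    m = len(diff)
    scores = [(criterion(diff, [b]), b) for b in range(1, m)]
    scores.sort(reverse=True)
    return [(b, c) for c, b in scores[:n_best]]

# Synthetic series with one step at index 50 (not project data).
diff = np.r_[np.zeros(50), np.ones(50)] + 0.01 * np.sin(np.arange(100))
ranked = rank_first_breaks(diff)
```

For a gradual increase, as in the Potsdam series, many neighbouring years reach similar values, which is why a whole list of candidates is kept.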
Break Searching Method
The first break is not simply fixed at the year of the maximum criterion (1970).
Instead, combinations of two breaks are tested that contain one of the 10 best first-break candidates (10 times 100 combinations).
The 10 best two-break combinations are used as seeds for the search for three-break combinations.
The 10 best second-break years and their criterion values for each first-break candidate:
First break 1970
  Second break  1930   1929   1928   1927   1925   1926   1924   1923   1920   1922
  Criterion     0.3197 0.3176 0.3150 0.3029 0.2968 0.2941 0.2904 0.2869 0.2857 0.2824
First break 1968
  Second break  1930   1929   1928   1927   1925   1926   1924   1923   1920   1922
  Criterion     0.3296 0.3270 0.3240 0.3110 0.3039 0.3014 0.2969 0.2931 0.2911 0.2881
First break 1969
  Second break  1930   1929   1928   1927   1925   1926   1924   1923   1920   1922
  Criterion     0.3232 0.3209 0.3181 0.3056 0.2991 0.2965 0.2924 0.2888 0.2872 0.2840
First break 1979
  Second break  1930   1929   1928   1927   1925   1926   1924   1920   1923   1922
  Criterion     0.2821 0.2815 0.2804 0.2718 0.2686 0.2656 0.2642 0.2632 0.2621 0.2591
First break 1967
  Second break  1930   1929   1928   1927   1925   1926   1924   1923   1920   1922
  Criterion     0.3301 0.3273 0.3240 0.3106 0.3032 0.3007 0.2959 0.2919 0.2896 0.2868
First break 1978
  Second break  1930   1929   1928   1927   1925   1926   1924   1920   1923   1922
  Criterion     0.2818 0.2810 0.2799 0.2710 0.2675 0.2646 0.2630 0.2617 0.2608 0.2577
First break 1980
  Second break  1930   1929   1928   1927   1925   1926   1924   1920   1923   1922
  Criterion     0.2720 0.2716 0.2707 0.2624 0.2595 0.2565 0.2553 0.2547 0.2534 0.2506
First break 1971
  Second break  1930   1929   1928   1927   1925   1926   1924   1923   1920   1922
  Criterion     0.3041 0.3023 0.3001 0.2887 0.2831 0.2804 0.2771 0.2739 0.2732 0.2697
First break 1972
  Second break  1930   1929   1928   1927   1925   1926   1924   1923   1920   1922
  Criterion     0.2987 0.2971 0.2951 0.2841 0.2790 0.2761 0.2732 0.2702 0.2698 0.2662
First break 1981
  Second break  1930   1929   1928   1927   1925   1926   1967   1924   1968   1920
  Criterion     0.2654 0.2651 0.2643 0.2564 0.2537 0.2508 0.2499 0.2497 0.2497 0.2495
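The seeded search can be sketched as a small beam search: every kept combination is extended by one further break, and only the best `n_best` results survive as seeds for the next step. This is a hedged illustration, not the project's code. The names `extend_seeds` and `search_breaks` are made up, and `criterion` again uses an assumed ANOVA-style noise correction.

```python
import numpy as np

def criterion(diff, breaks):
    """Corrected external variance as a fraction of the total variance."""
    diff = np.asarray(diff, dtype=float)
    m = diff.size
    segments = np.split(diff, sorted(breaks))
    n_sub = len(segments)
    internal = sum(((s - s.mean()) ** 2).sum() for s in segments) / m
    total = diff.var()
    # Assumed correction for the external variance expected from noise.
    noise = internal * (n_sub - 1) / (m - n_sub)
    return (total - internal - noise) / total

def extend_seeds(diff, seeds, n_best=10):
    """Add one break to every seed and keep the n_best combinations."""
    m = len(diff)
    scored = {}
    for seed in seeds:
        for b in range(1, m):
            if b in seed:
                continue
            combo = tuple(sorted(seed + (b,)))
            if combo not in scored:          # skip duplicates from other seeds
                scored[combo] = criterion(diff, combo)
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return [combo for combo, _ in ranked[:n_best]]

def search_breaks(diff, n_breaks, n_best=10):
    """Best combination of n_breaks break positions found by the search."""
    seeds = [()]                             # start without any break
    for _ in range(n_breaks):
        seeds = extend_seeds(diff, seeds, n_best)
    return seeds[0]

# Synthetic series with steps at indices 30 and 70 (not project data).
diff = np.r_[np.zeros(30), np.ones(40), 3 * np.ones(30)]
diff = diff + 0.01 * np.sin(np.arange(100))
best_two = search_breaks(diff, n_breaks=2)
```

Keeping several seeds instead of only the single best one lets a later step revise an earlier choice, at the cost of about ten times more tested combinations per step.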
The second break
[Figure panels: 1 break, 2 breaks, 3 breaks, 4 breaks]
Where to stop?
The search method is applied to a random time series to define a stopping criterion.
Random Time Series
[Figure panels: 2 breaks, 30 breaks]
Decrease of internal variance
1 to 400 breaks within 1000 years
1 to 50 breaks within 100 years
The remaining internal variance shrinks rather smoothly for a 1000-year time series.
Actually, we are dealing with only a 100-year time series.
Similar behaviour, but less regular.
Repeat the procedure 500 times and consider the change in variance for each added break.
Many Breaks for many random time series
On average, 6% of the variance is gained by the first breaks.
The 50th break gains only 0.3%.
The 90th and 95th percentiles remain nearly constant at a few percent.
The first step is an exception, as here only 100 possibilities are tested, whereas further breaks are searched from 1000 possibilities (10 candidates times 100 years).
[Figure legend: median, 90th and 95th percentiles]
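The random-series experiment can be sketched as a small Monte Carlo run. The code below is an illustration under stated assumptions: Gaussian white noise as the random series, the uncorrected drop in internal variance as the gain, and a compact seeded search with hypothetical function names. The slides use 500 repetitions, 100 years and up to 50 breaks; the example call uses much smaller numbers so that it runs quickly.

```python
import numpy as np

def internal_fraction(series, breaks):
    """Internal variance as a fraction of the total variance."""
    segments = np.split(series, sorted(breaks))
    internal = sum(((s - s.mean()) ** 2).sum() for s in segments)
    return internal / ((series - series.mean()) ** 2).sum()

def gains_per_break(series, max_breaks, n_best=10):
    """Variance fraction gained by each added break (seeded search)."""
    m = series.size
    seeds, previous, gains = [()], 1.0, []
    for _ in range(max_breaks):
        scored = {}
        for seed in seeds:
            for b in range(1, m):
                if b not in seed:
                    combo = tuple(sorted(seed + (b,)))
                    if combo not in scored:
                        scored[combo] = internal_fraction(series, combo)
        seeds = sorted(scored, key=scored.get)[:n_best]  # smallest internal first
        best = scored[seeds[0]]
        gains.append(previous - best)
        previous = best
    return gains

def random_gain_percentiles(n_series, m, max_breaks, n_best=10, seed=0):
    """50th, 90th and 95th percentile of the gain for each added break."""
    rng = np.random.default_rng(seed)
    gains = np.array([
        gains_per_break(rng.standard_normal(m), max_breaks, n_best)
        for _ in range(n_series)
    ])
    return np.percentile(gains, [50, 90, 95], axis=0)

percentiles = random_gain_percentiles(n_series=20, m=60, max_breaks=4, n_best=3)
```

The result has one row per percentile and one column per added break, which corresponds to the three curves in the figure.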
Observations vs Random
After 4 breaks the gained variance of the observations is comparable to that found for random time series.
4 breaks are realistic for the considered station.
[Figure legend: random time series (50th, 90th and 95th percentiles); observations]
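The comparison with random series can be turned into a simple stopping rule: accept breaks as long as the observed gain exceeds a chosen percentile of the random gains. The sketch below assumes such a threshold rule. The choice of percentile and all numbers in the example are hypothetical and are not read from the figure.

```python
def number_of_significant_breaks(observed_gains, random_threshold):
    """Count leading breaks whose gain exceeds the random threshold.

    observed_gains   : variance fraction gained by break 1, 2, 3, ...
    random_threshold : same length, e.g. a percentile of the random gains
    """
    count = 0
    for gain, limit in zip(observed_gains, random_threshold):
        if gain <= limit:
            break                      # this break is no better than chance
        count += 1
    return count

# Hypothetical values for illustration only.
observed = [0.22, 0.10, 0.06, 0.04, 0.02]
threshold = [0.10, 0.05, 0.05, 0.03, 0.03]
n_breaks = number_of_significant_breaks(observed, threshold)
```

With these made-up numbers the rule stops after four breaks, because the fifth gain falls below the threshold.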
Leaving out one station
[Figure panels: January, February; curves: reference from nearest 16 stations, reference without Berlin-Dahlem]
Conclusion
For monthly mean temperatures of DWD climate stations:
A method to create reference time series is derived.
A method to detect breaks in difference time series is derived.