Using the distribution of increments in Composite Reference difference time series to determine the temperature trend bias caused by inhomogeneities
Ralf Lindau
Inhomogeneities
Climate records are affected by breaks resulting from relocations or changes in the measuring techniques.
For the detection, differences of neighboring stations are considered to reduce the dominating natural variance.
Homogenization algorithms identify breaks by searching for the
maximum external variance (explained by the jumps).
Composite Reference method
Networks of neighboring stations are built (about 10 stations).
The average time series of the network is built (the composite reference).
This is subtracted from each candidate station.
As a result we obtain about 10 difference time series
with the full break signal from the candidate,
with a weakened break signal from the composite,
without climate signal (assumed to be equal within the network; desirable),
without trend bias (because it is simply subtracted out; undesirable).
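The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `difference_series`, the toy network of three stations, and the break size of 1.0 K are all assumptions made for the example.

```python
from statistics import mean

def difference_series(network, candidate):
    """Candidate series minus the composite reference (mean of the other stations)."""
    others = [series for i, series in enumerate(network) if i != candidate]
    reference = [mean(values) for values in zip(*others)]
    return [c - r for c, r in zip(network[candidate], reference)]

# Hypothetical toy network: a shared climate signal, plus one break of +1.0 K
# in station 0 from time step 3 onward.
climate = [0.0, 0.2, -0.1, 0.3, 0.1, -0.2]
network = [
    [c + (1.0 if t >= 3 else 0.0) for t, c in enumerate(climate)],
    list(climate),
    list(climate),
]

diff_candidate = difference_series(network, 0)  # full break signal, no climate signal
diff_neighbor = difference_series(network, 1)   # mirrored break, attenuated by 1/n (n = 2 here)
print(diff_candidate)
print(diff_neighbor)
```

In this toy case the climate signal cancels in both difference series, the candidate keeps its full break, and the neighbor sees the same break with opposite sign and half the size.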
Trend bias
Only if the jumps are non-zero on average do they introduce a trend bias (which is really harmful).
Otherwise they introduce only some additional scatter into the data.
Thus, we concentrate on the trend bias induced by inhomogeneities.
CR Approach Fails (1/2)
Panel (a) shows the step function of a candidate station and the idealized composite reference.
Panel (b) shows the saw-shaped difference time series between the two, together with the averages of the detected subperiods (thick).
Panel (c) shows the corrected (thick) and the original step function.
Some additional steps are inserted but the trend is not corrected.
CR Approach Fails (2/2)
In the last example, we assumed only a common network bias and no inter-stational variance of the breaks.
Normally, the latter effect is large (compared to the bias) and superimposed.
The method finds and corrects these station-specific breaks.
As they dominate the variance, the procedure seems to perform well.
However, exactly the bias remains uncorrected.
Nonetheless, the CR approach can be used to correct the trend bias.
Increment distribution
A few large jumps, positive on average
A negative trend in between,
consisting of many small jumps, negative on average
(or vice versa)
The mean of all increments within a network is exactly zero, because each break of a station appears with a mirrored sign n times but attenuated by 1/n in the reference.
However, the median is negative (when the trend bias is positive)
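The zero-mean property can be checked on a single time step. The sketch below assumes a hypothetical network of 11 stations (n = 10 reference stations per candidate) in which exactly one station jumps by +1.0 K; the function name `network_increments` is illustrative.

```python
from statistics import mean, median

def network_increments(break_increments):
    """Increments of all difference series of one network at one time step.

    break_increments[i] is the jump of station i at this time step
    (0.0 if the station has no break).
    """
    total = sum(break_increments)
    n = len(break_increments) - 1  # number of reference stations per candidate
    return [b - (total - b) / n for b in break_increments]

# Hypothetical example: 11 stations, one of them jumps by +1.0 K.
jumps = [1.0] + [0.0] * 10
incs = network_increments(jumps)

print(incs)          # one increment of +1.0 and ten increments of -0.1
print(mean(incs))    # zero up to rounding
print(median(incs))  # negative for a positive jump
```

The single positive jump is balanced by n small negative increments, so the mean vanishes while the median takes the sign opposite to the jump.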
Main classes and subclasses
Increments from modelled data (random Poisson-distributed):
Networks: 1000
Stations: 11
Months: 1001
Main classes:
0: no break in candidate
1: break in candidate
Subclasses k: k breaks in reference
Subclass means:
Some statistics (mean)
Frequency of main class 1:
Frequency of main class 0:
Relative frequency of the subclasses k (Binomial distribution):
Mean of class 0:
Mean of class 1:
The overall mean is indeed zero.
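The formulas of this slide did not survive extraction. The block below is a hedged reconstruction from the class definitions, assuming n reference stations, a break probability p per station and time step, and a mean break size B. The symbols \mu_{0,k}, \mu_{1,k}, \mu_0, \mu_1 are introduced here for illustration. The resulting class means agree with the model output quoted later (-0.050 and 0.950 for p = 0.05, B = 1 K).

```latex
\begin{align}
P(\text{class } 1) &= p, \qquad P(\text{class } 0) = 1 - p \\
P(k) &= \binom{n}{k}\, p^{k} (1-p)^{n-k} \\
\mu_{0,k} &= -\frac{k}{n}\, B, \qquad \mu_{1,k} = B - \frac{k}{n}\, B \\
\mu_0 &= \sum_{k} P(k)\, \mu_{0,k} = -pB, \qquad \mu_1 = \sum_{k} P(k)\, \mu_{1,k} = (1-p)\, B \\
(1-p)\, \mu_0 + p\, \mu_1 &= -(1-p)\, pB + p\, (1-p)\, B = 0
\end{align}
```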
Some statistics (variance)
The slightly different means of each subclass impose a small additional variance to the main class. However, as the bias is small, this is negligible. Then:
Variance of class 0: s0² (approx. the noise variance)
Variance of class 1: s1² (approx. the noise plus break variance)
The variances s0² and s1² are closely connected to the signal-to-noise ratio SNR.
Two main classes
Model input:
n = 10, p = 0.05, B = 1 K, sd = 1 K, se = 1 K
Model output:
Main class    0             1
Number:       10,450,296    549,704
Mean:         -0.050        0.950
Variance:     1.091         2.107
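A small simulation can reproduce the split into the two main classes. The sketch below is not the author's model: it assumes that breaks occur independently with probability p per station and time step, that break sizes are Gaussian with mean B and standard deviation sd, and that Gaussian noise with standard deviation se acts directly on the increments. The function name and the reduced sample size (20 networks instead of 1000) are choices made for the example, so only the class frequencies and means are meant to be compared with the table above.

```python
import random
from statistics import mean

def simulate_main_classes(networks=20, months=1000, n=10, p=0.05,
                          B=1.0, sd=1.0, se=1.0, seed=1):
    """Return (class0, class1): difference-series increments without/with a candidate break."""
    rng = random.Random(seed)
    class0, class1 = [], []
    for _ in range(networks * months):
        # One time step of one network with n + 1 stations.
        has_break = [rng.random() < p for _ in range(n + 1)]
        raw = [(rng.gauss(B, sd) if hb else 0.0) + rng.gauss(0.0, se)
               for hb in has_break]
        total = sum(raw)
        for hb, x in zip(has_break, raw):
            # Candidate increment minus the mean increment of the n other stations.
            inc = x - (total - x) / n
            (class1 if hb else class0).append(inc)
    return class0, class1

class0, class1 = simulate_main_classes()
freq1 = len(class1) / (len(class0) + len(class1))
print(f"frequency of class 1: {freq1:.3f}")
print(f"mean of class 0: {mean(class0):+.3f}")
print(f"mean of class 1: {mean(class1):+.3f}")
```

With these settings the class 1 frequency should come out close to p, and the class means close to -pB and (1 - p)B.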
Median of two Gaussians
We assume a positive bias.
We proceed to x0, the center of class 0. The median is not yet reached there, because half of class 0 now lies below, but not yet half of the class 1 data.
1. Wide horizontally striped: Class 0 data
2. Wide vertically striped: Class 1 data
3. Narrow horizontally striped: Increment necessary to reach the median (class 0)
4. Black (negligible): Increment necessary to reach the median (class 1)
Four terms (1/2)
I: We stand at x0 and consider class 0:
Half of class 0 is reached, but this class contains only a fraction of 1 – p.
II: We stand at x0 and consider class 1:
Half of class 1 is reached at x1, but we are –B/s1 away from x1.
III: We proceed by dx and stand approximately at 0 (class 0):
We are pB/s0 away from x0. The normalized increment is dx/s0.
IV: We proceed by dx and stand approximately at 0 (class 1):
We are (1 – p)B/s1 away from x1. The normalized increment is dx/s1.
Four terms (2/2)
Neglect IV while replacing 1-p by 1 in III:
Solve for dx:
with:
q is approx. 1, because we are in both
cases very near to 0 in the distribution
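The equations of these two slides were lost in extraction. The following is a hedged reconstruction from the four terms described in the text, not the author's original notation. It assumes Gaussian classes centered at x0 = -pB and x1 = (1 - p)B, with \Phi the standard normal distribution function and \varphi its density. The explicit expression for q is one possible definition that tends to 1 for small B/s1; \tilde{x} denotes the median.

```latex
\begin{align}
\tfrac12 &= \underbrace{(1-p)\,\tfrac12}_{\mathrm{I}}
  + \underbrace{p\,\Phi\!\left(-\tfrac{B}{s_1}\right)}_{\mathrm{II}}
  + \underbrace{(1-p)\,\varphi\!\left(\tfrac{pB}{s_0}\right)\tfrac{dx}{s_0}}_{\mathrm{III}}
  + \underbrace{p\,\varphi\!\left(\tfrac{(1-p)B}{s_1}\right)\tfrac{dx}{s_1}}_{\mathrm{IV}} \\
dx &\approx \frac{p\, s_0 \left[\tfrac12 - \Phi\!\left(-B/s_1\right)\right]}{\varphi\!\left(pB/s_0\right)}
  = q\, pB\, \frac{s_0}{s_1},
  \qquad q = \frac{\Phi\!\left(B/s_1\right) - \tfrac12}{\left(B/s_1\right)\varphi\!\left(pB/s_0\right)} \\
\tilde{x} &= x_0 + dx \approx -pB \left(1 - q\, \frac{s_0}{s_1}\right)
\end{align}
```

With q = 1 the median becomes -pB (1 - s0/s1), which is the linear relation with slope 1 - s0/s1 used on the following slides.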
Approximation works
Formula with q = 1 (thin) and model results (thick), with p = 0.05, s0 = 0.5 K, sd = 0.1 – 0.9 K.
The median is a linear function of the product pB.
This is equal to the total temperature change caused by the break bias.
The slope depends on the quotient s0/s1.
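The quality of the linear approximation can also be checked numerically without a full simulation. The sketch below compares the exact median of a two-Gaussian mixture, found by bisection, with the q = 1 formula -pB (1 - s0/s1). It assumes class centers at -pB and (1 - p)B and s1² = s0² + sd². The bias B = 0.1 K and the three sd values are hypothetical choices within the range quoted above.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def mixture_median(p, B, s0, s1, tol=1e-12):
    """Median of (1-p)*N(-pB, s0^2) + p*N((1-p)B, s1^2), found by bisection."""
    x0, x1 = -p * B, (1.0 - p) * B

    def cdf(x):
        return (1.0 - p) * norm_cdf((x - x0) / s0) + p * norm_cdf((x - x1) / s1)

    lo, hi = min(x0, x1), max(x0, x1)  # the median lies between the class centers
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def approx_median(p, B, s0, s1):
    """Linear approximation with q = 1."""
    return -p * B * (1.0 - s0 / s1)

p, B, s0 = 0.05, 0.1, 0.5  # B is a hypothetical value
for sd in (0.1, 0.5, 0.9):
    s1 = sqrt(s0 ** 2 + sd ** 2)  # assumption: noise plus break variance
    print(sd, mixture_median(p, B, s0, s1), approx_median(p, B, s0, s1))
```

For a small bias the two values should differ by only a few percent, which is the sense in which the approximation works.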
Real data
A 2°-by-2° grid box in the USA.
Fitting two functions for the inner and the outer distribution provides an estimate for the slope 1 – s0/s1 (0.6).
Together with the median (0.0028 K), we obtain a trend bias of –0.46 K/cty for this particular grid box.
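The arithmetic behind this estimate can be written out as follows. This is a sketch under two assumptions: q = 1, so that median = -pB (1 - s0/s1), and one increment per year, so that the bias per time step is scaled by 100 to obtain K per century. The function name is illustrative.

```python
def trend_bias_per_step(median, slope):
    """Invert median = -pB * slope for the bias per time step pB (assumes q = 1)."""
    return -median / slope

slope = 0.6              # 1 - s0/s1, as quoted for this grid box
median = 0.0028          # K
steps_per_century = 100  # assumption: one increment per year

bias = trend_bias_per_step(median, slope) * steps_per_century
print(f"{bias:.3f} K/cty")
```

The result is about -0.47 K/cty, in line with the value of roughly -0.46 K/cty quoted above.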
66 grid boxes
For 66 2°-by-2° grid boxes in the USA, we find a mean trend bias of –0.0515 K/cty with a standard deviation of 0.6996 K/cty.
0.6996 / 8.12 > 0.0515 (with 8.12 ≈ √66)
Not significant
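The significance check amounts to comparing the mean with its standard error. A minimal sketch using only the numbers from this slide (the variable names are illustrative, and the threshold of one standard error follows the comparison shown on the slide):

```python
from math import sqrt

mean_bias = -0.0515  # K/cty, mean over the grid boxes
stddev = 0.6996      # K/cty, standard deviation between grid boxes
n_boxes = 66

standard_error = stddev / sqrt(n_boxes)
significant = abs(mean_bias) > standard_error
print(f"standard error: {standard_error:.4f} K/cty, significant: {significant}")
```

The standard error of the mean exceeds the magnitude of the mean itself, so the mean trend bias cannot be distinguished from zero.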
Modelled data, one gridbox
Input:
n = 10, p = 0.10, sd = 1.0 K, se = 0.5 K
s0 = 0.522 K, s1 = 1.128 K
Output:
s0 = 0.552 K, s1 = 1.108 K
Remember:
$s_0^2 = \frac{n+1}{n}\, s_e^2$
$s_1^2 = \frac{n+1}{n}\, s_e^2 + s_d^2$
Modelled data, 900 grid boxes
900 networks containing 10+1 stations
Realistic circumstances:
s0 = 0.522 K, s1 = 1.128 K, p = 10 cty⁻¹, B = 0.05 K
Trend bias = 0.5 K/cty
Finding:
Mean deviation from inserted network trend: –0.017 K/cty
RMS error: 0.790 K/cty
Unbiased method to determine
Conclusion
We presented a method to correct the mean trend bias of a network of stations caused by inhomogeneities.
The classic Composite Reference method fails to correct it.
However, due to the specific characteristics of the data, the trend bias is a linear function of the median of consecutive differences.
A rough estimate of the SNR is additionally required to estimate the slope.
The trend bias in the USA derived from 66 2°-by-2° grid boxes is shown to be not significantly different from zero.