

In the document Computer Simulations (pages 39-43)

1.5 Statistical Analysis of Monte Carlo time series


Figure 1.6: Typical autocorrelation function C(t) versus t. Note the log-scale.

The last expression can be obtained by a lengthy calculation using the spectral representation of the Markov transition matrix P. It contains positive coefficients c_i(O) and time scales τ_i, with 1/τ_i = −log λ_i, which are related to the eigenvalues λ_i of the Markov matrix. These eigenvalues satisfy |λ_i|² ∈ (0, 1] and are usually real valued. We see that the autocorrelation function is a sum of exponentials, and it should therefore always be analyzed on a logarithmic scale. Typical cases are depicted in Fig. 1.6.

The largest eigenvalue λ_0 = 1 of the Markov transition matrix corresponds to the equilibrium distribution π, since π = πP. The next largest eigenvalue determines the largest time scale in the time evolution. It is called the exponential autocorrelation time τ_exp(O). For a specific operator O, the largest time scale τ_exp(O) occurring in ρ_O(t) is determined by the largest eigenvalue with c_i(O) ≠ 0. It sets the time scale for thermalization of O. In the autocorrelation function, plotted logarithmically, τ_exp(O) is the asymptotic slope at large time separations t.
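As an illustration of the sum-of-exponentials behaviour (a sketch added here, not part of the original notes: the AR(1) processes, their parameters, and all variable names are assumptions of this example), the sum of two independent AR(1) chains has ρ(t) = c₁ e^{−t/τ₁} + c₂ e^{−t/τ₂}, and at large t only the slow mode survives:

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(tau, n, rng):
    """AR(1) process x_t = a*x_{t-1} + noise with a = exp(-1/tau),
    whose autocorrelation function is exactly exp(-t/tau)."""
    a = np.exp(-1.0 / tau)
    eps = rng.normal(size=n)
    x = np.empty(n)
    x[0] = eps[0] / np.sqrt(1.0 - a * a)  # start in equilibrium
    for t in range(1, n):
        x[t] = a * x[t - 1] + eps[t]
    return x

# Sum of two independent AR(1) processes: rho(t) is a sum of two
# exponentials, c1*exp(-t/2) + c2*exp(-t/20).
n = 200_000
y = ar1(2.0, n, rng) + ar1(20.0, n, rng)

ym = y - y.mean()
var = np.mean(ym * ym)
rho = {t: np.mean(ym[:-t] * ym[t:]) / var for t in (1, 10, 40)}

# At large t only the slow mode survives, so on a log scale the effective
# slope between t = 10 and t = 40 is close to -1/20:
slope = -np.log(rho[40] / rho[10]) / 30.0
```

Plotting log ρ(t) for this series reproduces the bend of Fig. 1.6: a steep initial decay from the fast mode, then a straight line with slope −1/τ₂.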

1.5.3 Statistical error and Integrated Autocorrelation time

From the autocorrelation function, one can compute the true variance of the average value Ō = (1/n) Σ_{i=1}^{n} O_i. We now label the n measurements of O by O_i. We assume that the Markov chain has no memory of the initial configuration any more. Its probabilities are then time-translation invariant.

The variance of an individual measurement Oi can be expressed by expectation values:

σ²_{O_i} = ⟨O_i²⟩ − ⟨O_i⟩² .

We now compute the variance of the average value Ō. If the measurements were independent, it would be equal to σ²_{O_i}/n. In general, it is

    σ²_Ō ≡ ⟨Ō²⟩ − ⟨Ō⟩²
         = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} ( ⟨O_i O_j⟩ − ⟨O_i⟩⟨O_j⟩ )
         = (1/n²) Σ_{i=1}^{n} ( ⟨O_i²⟩ − ⟨O_i⟩² ) + (2/n²) Σ_{i=1}^{n} Σ_{j>i} ( ⟨O_i O_j⟩ − ⟨O_i⟩⟨O_j⟩ )
         = (1/n) σ²_{O_i} + (2/n²) Σ_{t=1}^{n−1} (n − t) ( ⟨O_i O_{i+t}⟩ − ⟨O_i⟩⟨O_{i+t}⟩ )

In the second line, we have inserted the definition of Ō, and in the third line we collected the diagonal terms i = j. The last line is obtained by using the assumed time translation invariance and writing j = i + t. The bracket after the last sum is equal to σ²_{O_i} ρ_O(t), so that

    σ²_Ō = (σ²_{O_i}/n) [ 1 + 2 Σ_{t=1}^{n−1} (1 − t/n) ρ_O(t) ] .

We arrive at the important result that the actual variance of Ō is

    σ²_Ō = (σ²_{O_i}/n) · 2τ_int(O)    (1.65)

with the so-called integrated autocorrelation time¹⁶

    τ_int(O) = 1/2 + Σ_{t=1}^{n−1} (1 − t/n) ρ_O(t) .    (1.66)

Equation (1.65) means that the effective number of measurements in the Monte Carlo run is a factor of 2τ_int smaller than n!

    n_eff = n / (2τ_int(O)) .    (1.67)

It is this effective number of measurements which needs to be large in order to obtain reliable averages, whereas the number of measurements n itself has no direct meaning for error estimates!
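A quick numerical check of Eqs. (1.65) and (1.67) (a sketch added here, not from the notes: the AR(1) stand-in for a Monte Carlo time series and all names are assumptions of this example). The variance of the run average over many independent runs should match (σ²_{O_i}/n) · 2τ_int:

```python
import numpy as np

rng = np.random.default_rng(1)

# AR(1) chain with rho(t) = a^t, for which the geometric sum gives
# tau_int = 1/2 + a/(1-a) (the (1 - t/n) factor is negligible for n >> tau).
tau = 5.0
a = np.exp(-1.0 / tau)
tau_int = 0.5 + a / (1.0 - a)

n, m = 2000, 500                  # n measurements per run, m independent runs
sigma2 = 1.0 / (1.0 - a * a)      # variance of a single measurement O_i

x = rng.normal(0.0, np.sqrt(sigma2), size=m)  # all m runs start in equilibrium
means = np.zeros(m)
for t in range(n):
    means += x
    x = a * x + rng.normal(size=m)
means /= n                         # m independent run averages

var_of_mean = means.var()                    # measured sigma^2 of the mean
predicted = (sigma2 / n) * 2.0 * tau_int     # Eq. (1.65)
n_eff = n / (2.0 * tau_int)                  # Eq. (1.67): here ~200, not 2000
```

The measured variance of the mean agrees with the prediction, and n_eff is an order of magnitude smaller than n.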

The autocorrelation function ρ_O(t) can be estimated from the data by the empirical autocorrelation function ρ_O^E(t), Eq. (1.38). Note that in order to estimate it reliably, the time series needs to be about O(100) times as long as the largest time scale τ_i occurring in ρ_O(t)! Because of the exponential decay of ρ_O(t), this also implies that the factor (1 − t/n) does not matter in practice.¹⁷ It is therefore often omitted.

¹⁶ When one performs the sum in (1.66) on Monte Carlo data, one needs to be careful since ρ_O(t) will eventually vanish into noise. Summing too far into the noise (which is like a random walk) could seriously influence the measured value of τ_int.

¹⁷ When n ≫ τ_exp, the autocorrelation function is exponentially small where (1 − t/n) would matter. When n is not ≫ τ_exp, then the time series is too short and has to be extended or thrown out, and (1 − t/n) does not matter either.
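In code, the empirical estimate and the windowed sum of footnote 16 might look as follows (a sketch: the AR(1) test series is an assumption, and the first-negative-value cutoff is just one simple choice of window):

```python
import numpy as np

rng = np.random.default_rng(2)

# Test series: AR(1) with tau = 10, so tau_int = 1/2 + a/(1-a) is about 10.
tau = 10.0
a = np.exp(-1.0 / tau)
n = 200_000
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0] / np.sqrt(1.0 - a * a)
for t in range(1, n):
    x[t] = a * x[t - 1] + eps[t]

xm = x - x.mean()
var = np.mean(xm * xm)

def rho_emp(t):
    """Empirical autocorrelation function rho^E_O(t)."""
    return np.mean(xm[:-t] * xm[t:]) / var

# Sum rho(t) only while it is clearly above the noise; summing further adds
# a random-walk contribution to tau_int (footnote 16). Here we simply stop
# at the first negative value.
tau_int = 0.5
for t in range(1, n // 2):
    r = rho_emp(t)
    if r < 0.0:
        break
    tau_int += r
```

More careful windowing schemes choose the cutoff self-consistently (e.g. at a fixed multiple of the current τ_int estimate), but any window that stops before the noise dominates gives a similar value here.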

Example: Let ρ_O(t) = e^{−t/τ_i} be a single exponential, with n ≫ τ_i ≫ 1. Then τ_int(O) ≈ 1/2 + Σ_{t=1}^{∞} e^{−t/τ_i} ≈ τ_i. On a logarithmic scale, the exponential contributions to ρ_O(t) appear as straight lines (but log ρ_O(t) itself is not straight when there is more than one term in the sum), each of which contributes its inverse slope τ_i with a factor c_i equal to the intercept with the y-axis. Note that time scales with very small coefficients c_i can thus still be important (see Fig. 1.6).
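The approximation τ_int ≈ τ_i for a single exponential can be checked directly from the geometric sum (a small arithmetic sketch added here; the closed form 1/(e^{1/τ} − 1) is just the summed series):

```python
import math

# For rho(t) = exp(-t/tau):
#   tau_int = 1/2 + sum_{t>=1} exp(-t/tau) = 1/2 + 1/(e^(1/tau) - 1) ~ tau.
tau_ints = {tau: 0.5 + 1.0 / math.expm1(1.0 / tau) for tau in (5.0, 20.0, 100.0)}
# The relative deviation from tau itself is of order 1/(12*tau^2),
# i.e. already below one percent for tau = 5.
```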

1.5.4 Binning: Determine autocorrelations and errors

The autocorrelation function is rather cumbersome to calculate and to analyze. A much easier determination of errors can be done by binning the time series. The idea is that averages over very long parts of the time series, much longer than the autocorrelation time, are independent of each other (similar to independent Monte Carlo runs), and provide correct error estimates.

Since we do not know the autocorrelation time in advance, we have to use blocks of increasing lengths until the error estimate converges, which means that the blocks are then long enough. We perform the following steps:

Figure 1.7: Error estimate by data binning. The value of σ²_{B_i}/N_B converges towards the true variance σ²_Ō of the mean value Ō.

• Cut the time series of length n into N_B = n/k blocks of identical length k, and disregard any leftover measurements. Repeat the following procedure for k = 2¹, 2², 2³, … (When the autocorrelation time is fairly small, then a less drastic increase of k, even k = 1, 2, 3, …, may work better.)

• Calculate the block average of each block:

      B_i = (1/k) Σ_{j=(i−1)k+1}^{ik} O_j .    (1.68)

  All blocks together have a mean value of Ō_B. The variance of the single-block averages B_i is

      σ²_{B_i} = (1/N_B) Σ_{i=1}^{N_B} ( B_i − Ō_B )² .    (1.69)

  When the block length k ≫ τ_int(O), then the block averages are uncorrelated, and this estimate converges to the correct variance of the mean Ō:

      Correct error² of mean:   σ²_{B_i}/N_B  −→  σ²_Ō = (σ²_{O_i}/n) · 2τ_int(O) .    (1.70)

This is illustrated in figure 1.7.
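The binning procedure itself is only a few lines of code. The sketch below is an added illustration, not from the notes: the AR(1) test data and all names are assumptions of this example; it computes σ²_{B_i}/N_B, the binning estimate of Eq. (1.70), for a ladder of block lengths:

```python
import numpy as np

rng = np.random.default_rng(3)

# Correlated test data: AR(1) with tau = 10, so 2*tau_int is about 20.
tau = 10.0
a = np.exp(-1.0 / tau)
n = 2**18
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0] / np.sqrt(1.0 - a * a)
for t in range(1, n):
    x[t] = a * x[t - 1] + eps[t]

def binning_error2(x, k):
    """sigma^2_{B_i} / N_B for block length k: the binning estimate of
    the true variance of the mean, Eq. (1.70)."""
    nb = len(x) // k
    blocks = x[: nb * k].reshape(nb, k).mean(axis=1)  # block averages B_i
    return blocks.var(ddof=1) / nb

errors2 = {k: binning_error2(x, k) for k in (1, 4, 16, 64, 256, 1024)}
# errors2[k] rises with k and levels off once k >> tau_int;
# the plateau value is the correct sigma^2 of the mean.
```

Plotting errors2 against k reproduces Fig. 1.7: a rise by a factor of roughly 2τ_int, then a plateau, then growing fluctuations once N_B becomes small.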

• The longer the block length k, the fewer the number N_B of such blocks. When N_B becomes smaller than roughly 100, then the fluctuations in the estimated variance σ²_{B_i}/N_B become too large, as can be seen in fig. 1.7 at large k.

• Convergence criterion: The measurement of Ō can be considered to be converged when σ²_{B_i}/N_B stays constant (to within a few percent) over several bin sizes.

Then the run has been long enough to overcome autocorrelations, and this estimate of σ²_Ō as well as the estimate ⟨O⟩ ≈ Ō are reliable.

Since the number of blocks N_B needs to be at least about 100 for stable results, the simulation must be considerably longer than 100 τ_int for binning to converge.

A useful early estimate can be obtained just by looking at the time series: the run should be longer than about 100 times the longest visible timescale. (Of course longer time scales might become visible during long runs.)

• Caution: When σ²_{B_i}/N_B has not converged, then Ō can be completely wrong and the time series has to be extended a lot, or thrown away! Example: magnetization in the Ising model at low temperature: the exact value is ⟨M⟩ = 0, but the mean M̄ is often far from 0 because of extremely long autocorrelations.

• From (1.65), the integrated autocorrelation time for the quantity O is given by the ratio of the last and the first value in the binning plot (fig. 1.7):

      2τ_int(O) = [ σ²_{B_i}/N_B (k → ∞) ] / [ σ²_{B_i}/N_B (k = 1) ] .    (1.71)
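As a consistency check of (1.71), one can evaluate the binning ratio in closed form for ρ_O(t) = a^t (a small analytic sketch added here, not a measurement; it ignores residual correlations between neighbouring blocks):

```python
import math

# For rho_O(t) = a^t, the binning estimate sigma^2_{B_i}/N_B at block
# length k equals (sigma^2_{O_i}/n) * [1 + 2*sum_{t<k} (1 - t/k) * a^t],
# so its ratio to the k = 1 value tends to 2*tau_int = 1 + 2a/(1-a).
a = math.exp(-1.0 / 10.0)   # tau = 10

def binning_ratio(k):
    return 1.0 + 2.0 * sum((1.0 - t / k) * a**t for t in range(1, k))

two_tau_int = 1.0 + 2.0 * a / (1.0 - a)
ratios = {k: binning_ratio(k) for k in (1, 10, 100, 10_000)}
# ratios[k] rises monotonically towards 2*tau_int as k grows.
```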

• Note that autocorrelation times can be very different for different measured quantities (example: magnetization and energy for the Ising model). The convergence therefore has to be checked independently for each measured quantity. Fortunately, the convergence criterion described above can easily be automated!

However, if some quantities have not converged, then one has to carefully discuss the validity of other, seemingly converged measurements (which may or may not be correct).

1.6 Summary: Recipe for reliable Monte Carlo
