
1.7 Appendix: Other Monte Carlo algorithms

1.7.3 Extended ensembles

Often, simulations are prevented by barriers in phase space from exploring important regions. This problem appears not only in standard Monte Carlo simulations of some probability distribution π, but also in optimization problems in a multitude of areas, which we will discuss in the next chapter.

Two successful strategies against such barriers are either to reduce their effective size or to overcome them. Examples of these strategies are the so-called Multicanonical Ensemble and Tempering. Both are versions of umbrella sampling.

[Figure: log π(E)N(E) plotted against E.]

Figure 1.8: Histogram of how often specific energies occur in a Monte Carlo. There is a large barrier between low and high energies, which can be overcome by a suitably chosen new weight function.

Multicanonical ensemble

The probability for configurations with a certain energy E to appear in a Monte Carlo simulation is proportional to the Boltzmann weight π(E) ∼ exp(−βE). It is also proportional to the number of states N(E) with energy E. This number is related to the entropy. Therefore the number of configurations with energy E in a Monte Carlo simulation is proportional to

π(E) N(E).

We now consider the case of a large energy and/or entropy barrier, as sketched in fig. 1.8. The barrier between the two populated regions can often be many orders of magnitude high. For example, in the Potts model with q > 4, there is a first-order phase transition, with a jump in the average energy as a function of temperature. The barrier then corresponds to a surface tension between the ordered and disordered phases, and can reach a factor of $10^{100}$ already for relatively small systems. It is then impossible for Monte Carlo configurations to move through this barrier. Sometimes, but rarely, one can devise methods to move from one favorable region directly to the other. For the Potts model, however, one even needs to explore the barrier itself in order to measure the surface tension.

The strategy of multicanonical simulations is to reweight energies in such a way that the histogram of energies becomes approximately flat.

A flat histogram as depicted in fig. 1.8 means that the Monte Carlo will perform a random walk in energies(!) and will therefore move in energies fairly quickly, much more quickly than the exponentially large time required to go through the minimum. The change of (unnormalized) weight is

$$\pi(E) = e^{-\beta E} \;\longmapsto\; \tilde{\pi}(E) := e^{-\beta E + f(E)} = \pi(E)\, e^{f(E)} \qquad (1.73)$$

with a suitably chosen function f(E), such that the histogram for a simulation with π̃ is approximately constant. The histogram does not need to be completely flat; it just needs to allow transitions between low and high energies. The function f(E) is the difference between the two curves in fig. 1.8. In practice, this function is determined iteratively, by starting with cases (e.g. small systems) where the barrier is not high.¹⁹
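To make the iteration concrete, here is a minimal sketch in Python for a toy periodic 1D Ising chain (our own illustration, with hypothetical names such as muca_sweep; not taken from the text). The chain is simulated with the modified weight exp(−βE + f(E)), and after each iteration f(E) is lowered where the histogram of visited energies piles up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy system: periodic 1D Ising chain. Its energies lie on the grid
# E = -L, -L+4, ..., L, so f(E) can be tabulated in a small dict.
L = 32
spins = rng.choice([-1, 1], size=L)

def energy(s):
    return int(-np.sum(s * np.roll(s, 1)))

beta = 1.0
f = {E: 0.0 for E in range(-L, L + 1, 4)}  # start canonical: f(E) = 0

def muca_sweep(s, E):
    """One sweep of single-spin-flip Metropolis with weight exp(-beta*E + f(E))."""
    for _ in range(L):
        i = rng.integers(L)
        dE = 2 * s[i] * (s[(i - 1) % L] + s[(i + 1) % L])
        # Acceptance ratio of the modified weights pi~(E) = exp(-beta*E + f(E)).
        if np.log(rng.random()) < -beta * dE + f[E + dE] - f[E]:
            s[i] = -s[i]
            E += dE
    return E

# Iterate: simulate with the current f, histogram the visited energies,
# then lower f where the histogram piles up, until it is roughly flat.
E = energy(spins)
for it in range(10):
    hist = {Ek: 0 for Ek in f}
    for _ in range(2000):
        E = muca_sweep(spins, E)
        hist[E] += 1
    for Ek, h in hist.items():
        if h > 0:
            f[Ek] -= np.log(h)   # visited often -> reduce the weight there
```

Real implementations refine this update more carefully (e.g. Wang–Landau style schedules), but the structure is the same.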

The Monte Carlo is now performed with the new weights π̃. Fortunately, one can still calculate expectation values with respect to the original weight π, by reweighting the measurements back (this works for any reweighting function f_x):

$$\langle O \rangle_{\pi} \;\equiv\; \frac{\sum_x O(x)\,\pi(x)}{\sum_x \pi(x)} \;=\; \frac{\big\langle O\, e^{-f} \big\rangle_{\tilde{\pi}}}{\big\langle e^{-f} \big\rangle_{\tilde{\pi}}}\,. \qquad (1.74)$$

Indeed, one can also determine expectation values with respect to different temperatures.²⁰ The only requirement is that the probability distribution at the desired temperature is contained completely within the distribution that is actually simulated (fig. 1.8), i.e. one simulates an umbrella of all desired distributions. Note that one can also reweight in quantities other than the energy.
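Continuing the sketch above, the back-reweighting of (1.74) takes a few lines (again our own illustration; f is assumed tabulated over the discrete energies, and the target temperature may differ from the simulated one):

```python
import numpy as np

def canonical_average(O, E, f, beta_sim, beta_target):
    """Estimate <O> in the canonical ensemble at beta_target from samples
    (O, E) of a run with the modified weight exp(-beta_sim*E + f(E)),
    following (1.74): each sample carries the weight pi/pi~."""
    O = np.asarray(O, float)
    E = np.asarray(E, float)
    fE = np.array([f[int(e)] for e in E])
    log_w = -(beta_target - beta_sim) * E - fE   # log(pi/pi~), up to a constant
    log_w -= log_w.max()                         # guard against overflow
    w = np.exp(log_w)
    return (w * O).sum() / w.sum()
```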

The multicanonical technique has found important applications in optimization problems, notably in the Traveling salesman and related problems, and in Protein folding. This will be discussed in the next chapter.

Free energy measurements

The free energy F of a statistical system can be defined via the partition function by

$$Z = e^{-\beta F}.$$

Knowledge of F is therefore equivalent to knowledge of the partition function, which in its parameter dependence contains all thermodynamic quantities, and also provides knowledge of the entropy S, via F = U − TS.
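Spelled out (in units with k_B = 1, so that β = 1/T):

$$F = -\frac{1}{\beta}\,\ln Z, \qquad S = \frac{U - F}{T}, \qquad U = \langle E \rangle.$$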

Unfortunately, in Markov Chain Monte Carlo, Z is just a normalization factor for the sum over all configurations; in a simulation it becomes the normalization by the number of measured configurations, which does not give any information about Z.

However, by clever application of the multicanonical method (and similarly for Tempering, see below), Z or the free energy F can indeed be measured.

¹⁹ Note that there should not be too many values for the argument of the reweighting function, because that many values of f(E) need to be determined. A quantity like the energy is fine here, but f should not be independently different for every configuration x.

²⁰ Hence the somewhat unfortunate name "multicanonical".

[Figure: β versus simulation time, showing a random walk between a = 0 and a = 1; labels mark the first, second, and third independent bins.]

Figure 1.9: Tempering: Random walk in β.

We first classify the sum over all configurations x into the different energies:

$$Z \;=\; \sum_x e^{-\beta E(x)} \;=\; \sum_E \sum_{x|E} e^{-\beta E} \;=\; \sum_E N_E\, e^{-\beta E}\,, \qquad (1.75)$$

where N_E is the number of states in phase space with energy E (the density of states). We then measure how many configurations of a given energy occur in a Monte Carlo. The ratio of expectation values is

$$\frac{\langle N_{E_1} \rangle}{\langle N_{E_2} \rangle} \;=\; \frac{N_{E_1}\, e^{-\beta E_1}}{N_{E_2}\, e^{-\beta E_2}}\,. \qquad (1.76)$$

The multicanonical method flattens the distribution of energies in the Monte Carlo over some chosen range of energies. We can choose to include all possible energies of the system, including the lowest (or highest), for which the number of configurations is usually known exactly (e.g. exactly two completely ferromagnetic states in the Ising model). From this exact knowledge and (1.76), all the other N_E can be measured, and then Z can be calculated via (1.75).
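As a sketch of this procedure (our own illustration, with hypothetical names; hist is the energy histogram of a multicanonical run with weight exp(−βE + f(E)), and (E_ref, N_ref) is the exactly known anchor point):

```python
import numpy as np

def log_Z_from_histogram(hist, f, beta, E_ref, N_ref):
    """Recover the density of states and log Z from a multicanonical run.

    In analogy to (1.76), but with the modified weight, the visit counts
    satisfy hist[E] ~ N_E * exp(-beta*E + f(E)).  (E_ref, N_ref) is an
    energy with exactly known degeneracy, e.g. N_ref = 2 for the
    ferromagnetic ground states of the Ising model.  Returns log N_E and
    a function evaluating log Z(beta') via (1.75)."""
    log_N = {E: np.log(h) + beta * E - f[E]
             for E, h in hist.items() if h > 0}   # up to one overall constant
    shift = np.log(N_ref) - log_N[E_ref]          # fix the constant with N_ref
    log_N = {E: lN + shift for E, lN in log_N.items()}

    def log_Z(beta_p):
        # log sum_E N_E exp(-beta'*E), evaluated stably in log space
        t = np.array([lN - beta_p * E for E, lN in log_N.items()])
        m = t.max()
        return m + np.log(np.exp(t - m).sum())

    return log_N, log_Z
```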

Tempering

The strategy of tempering is to overcome barriers by making the temperature a dynamical variable, with discrete values β_i of β which can now change during the simulation. The extended phase space of the simulation now consists of the original variables x and the index i of the temperature.²¹

Simulated Tempering

The partition function in this extended space for Simulated Tempering is

$$Z \;=\; \sum_i \sum_x e^{-\beta_i H_x + g_i} \;=\; \sum_i e^{g_i}\, Z(\beta_i)\,. \qquad (1.77)$$

²¹ Literally, this approach ought to be called "multicanonical", but it is not.

One chooses the constants g_i in such a way that the system approximately performs a random walk in temperatures.
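A temperature move then looks as follows (a minimal sketch under our own naming; the configuration x, and hence its energy E, is held fixed while only the index i changes):

```python
import numpy as np

rng = np.random.default_rng(1)

def temperature_move(i, E, betas, g):
    """Propose the temperature step i -> i +- 1 at fixed configuration x.
    The Metropolis ratio follows from the extended weight
    exp(-beta_i * E + g_i) of (1.77); E = H(x) does not change here."""
    j = i + (1 if rng.random() < 0.5 else -1)
    if j < 0 or j >= len(betas):
        return i                                  # reject moves off the grid
    log_r = -(betas[j] - betas[i]) * E + g[j] - g[i]
    if np.log(rng.random()) < log_r:
        return j
    return i
```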

When it is at a sufficiently high temperature, the system can easily overcome energy barriers. Indeed, if β = 0 is included in the simulation, the configurations will decorrelate completely whenever β = 0 is reached, cutting off any autocorrelations. When β = 0 is included, simulated tempering also allows the measurement of the entropy, which is usually impossible in Monte Carlo (see below).

When one considers only the configurations at a certain inverse temperature β_i, then (1.77) implies that they are from the normal canonical ensemble exp(−β_i E)! Therefore one can directly measure observables in the canonical ensemble at any of the β_i, including the properties of the system at some low temperature, where the barriers are felt fully.

For tempering to work, the β_i need to be chosen more densely close to a phase transition. At a strong first-order phase transition, the multicanonical method works better.

Partition function: Equation (1.77) can be regarded as a partition function in the space of indices i, with weights e^{g_i} Z(β_i). The Monte Carlo is done in this space, and the number of configurations N_i at certain temperatures β_i must then obey

$$\frac{\langle N_i \rangle}{\langle N_j \rangle} \;=\; \frac{e^{g_i}\, Z_i}{e^{g_j}\, Z_j}\,. \qquad (1.78)$$

Simulated tempering covers a range of β values. If one includes β = 0, the situation simplifies. At β = 0, the partition function is

$$Z_0 = \sum_x e^{0} = N_x\,,$$

namely just the total number of configurations in phase space. Thus Z_0 is known, and measurements of ⟨N_i⟩, together with the chosen constants g_i, provide Z_i via (1.78) also for the other temperatures.
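As a sketch (our own naming; visits[i] counts how often the run sat at β_i, with β_0 = 0 as the first entry):

```python
import numpy as np

def log_Z_from_visits(visits, g, log_Z0):
    """Absolute log Z_i from a simulated-tempering run including beta_0 = 0.

    By (1.78), visits[i] is proportional to exp(g_i) * Z_i, and Z_0 = N_x
    is known exactly (e.g. log_Z0 = N*log(2) for N Ising spins)."""
    v = np.asarray(visits, float)
    g = np.asarray(g, float)
    return log_Z0 + np.log(v / v[0]) - (g - g[0])
```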

Parallel Tempering

In parallel tempering, one simulates all inverse temperatures β_i in parallel, e.g. on different processors. Occasionally, the interchange of configurations at neighboring temperatures is proposed, and accepted with the Metropolis probability according to (1.77). For such an interchange, the constants g_i cancel. They are therefore not required in parallel tempering and need not be determined. As a drawback, the entropy cannot be determined.
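The swap test is then particularly simple (a sketch with our own naming; E_i and E_j are the current energies of the two replicas at β_i and β_j):

```python
import numpy as np

rng = np.random.default_rng(2)

def swap_accepted(E_i, E_j, beta_i, beta_j):
    """Metropolis test for exchanging the configurations at two neighboring
    temperatures; from (1.77) the g_i cancel in the ratio, leaving
    r = exp((beta_j - beta_i) * (E_j - E_i))."""
    return np.log(rng.random()) < (beta_j - beta_i) * (E_j - E_i)
```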

There are many applications of tempering outside of physics, including again optimization problems like the Traveling salesman and Protein folding, as well as Bayesian parameter determination from measured data. This will be discussed in more detail in the next chapter.


Chapter 2

Minimization/Optimization – Problems

Literature

• W. PRESS et al., Numerical Recipes, Cambridge University Press

• F.S. ACTON, Numerical Methods that Work, Mathematical Association of America

Optimization is of enormous importance in physics, engineering, economics, information technology, and in many other fields. The layout of electric circuits on a chip, timetables, or the optimum load of a pipeline system are a few typical examples.

Usually we have a problem which is determined by a set of n real parameters. Each set of parameter values specifies a state. Together they span the search space. Each state is mapped to a real number, its cost, by a cost function. One has to find the specific values of the parameters for which the cost function attains a maximum/minimum. Often, one also requires robustness of the solution against small fluctuations of the parameters.

If the cost function is at least once differentiable in its parameters, then mathematics provides us with well-defined (deterministic) algorithms to find at least local maxima/minima of the cost function. Such methods will be discussed first.

However, the state space is often far too large for such methods, or there are too many extrema, or the cost function is not differentiable. Then other approaches have to be used, often stochastic methods similar to Markov Chain Monte Carlo. In these, one starts from some state and modifies it. If the cost function changes to a better value, or does not worsen too much according to some strategy, then the new state is accepted.

This is repeated until an optimum has been found, as specified by some suitable criterion; a generic skeleton is sketched below.
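The following Python skeleton illustrates this loop (our own generic sketch, not a specific algorithm from the text; the acceptance rule accept_worse is left as a parameter and filled here with a hypothetical Metropolis-like rule whose effective temperature decreases with time):

```python
import numpy as np

rng = np.random.default_rng(3)

def stochastic_search(x0, cost, neighbor, accept_worse, n_steps=10_000):
    """Generic skeleton of the strategy described above: modify the state,
    keep the modification if the cost improves or, according to some
    strategy (encoded in accept_worse), does not worsen too much."""
    x, c = x0, cost(x0)
    best_x, best_c = x, c
    for step in range(n_steps):
        x_new = neighbor(x)
        c_new = cost(x_new)
        if c_new <= c or accept_worse(c_new - c, step):
            x, c = x_new, c_new
            if c < best_c:
                best_x, best_c = x, c
    return best_x, best_c

# Hypothetical usage: minimize a wiggly 1D cost with a Metropolis-like rule.
cost = lambda x: x * x + 10.0 * np.sin(3.0 * x)
neighbor = lambda x: x + rng.normal(scale=0.5)
accept_worse = lambda dc, step: rng.random() < np.exp(-dc * (1.0 + 1e-3 * step))
x_best, c_best = stochastic_search(0.0, cost, neighbor, accept_worse)
```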
