

2.2 Statistical mechanics

2.2.1 Physics of the canonical ensemble

Let us consider a system whose state is represented by the abstract variables x ∈ Ω, embedded in a heat bath. Imagine, for example, a discrete chain swimming in water at finite temperature T, such that Ω is the set of all 3N-dimensional chain conformations (phase space). The system also possesses an energy function E(x) (e.g. reflecting its elastic properties). This situation is known in thermodynamics as the canonical ensemble: the system of interest is coupled to a heat bath and is allowed to exchange heat with it, so that its total energy and momentum become fluctuating rather than conserved quantities.

When the microstate of a system is prepared to be x0 at time zero, all information about it is initially available: the probability distribution for x is a delta function, p(x, 0) = δ(x − x0). For times t > 0, the system's current microstate x is no longer known, since its evolution is stochastic. We imagine an ensemble of trajectories x(t): a set of infinitely many independent realizations of the experiment obeying the initial condition. The trajectories in the ensemble independently branch out from the initial condition, and to an outside observer who went for lunch after preparing the initial condition, the particular stochastic trajectory the system has chosen is unknown.

Its current state can therefore only be quantified in terms of a probability distribution p(x, t). This distribution is best understood as a histogram over all microstates in the ensemble at time t. For intuition, compare figure 2.4 (page 44), which shows trajectories in an ensemble, with figure 2.3 (page 40), which shows the corresponding probability distributions at different times.
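To make the ensemble picture concrete, the following minimal sketch (not part of the original text; it assumes overdamped Langevin dynamics in a one-dimensional harmonic potential E(x) = k x²/2, with all parameter values chosen arbitrarily) propagates many independent trajectories from the same initial microstate and histograms them at fixed times, which is precisely the p(x, t) described above:

```python
# Minimal sketch (illustration only, not the thesis' model): an ensemble of
# overdamped Langevin trajectories in a 1D harmonic potential E(x) = k x^2 / 2,
# all starting from the same microstate x0.  Histogramming the ensemble at a
# fixed time t estimates p(x, t), which relaxes from a delta peak towards a
# stationary distribution.
import numpy as np

rng = np.random.default_rng(0)
k, T, gamma = 1.0, 1.0, 1.0          # spring constant, temperature, friction (k_B = 1)
dt, n_steps, n_traj = 5e-3, 2000, 20_000
x0 = 2.0

x = np.full(n_traj, x0)              # the whole ensemble, one entry per trajectory
snapshots = {}
for step in range(1, n_steps + 1):
    noise = rng.normal(size=n_traj)
    x += -(k / gamma) * x * dt + np.sqrt(2 * T * dt / gamma) * noise
    if step in (20, 200, 2000):
        # snapshots[t] holds a density histogram, i.e. an estimate of p(x, t)
        snapshots[step * dt] = np.histogram(x, bins=80, range=(-4, 4), density=True)

# At late times the histogram no longer changes; for this potential its
# variance approaches T/k (the Boltzmann result derived further below).
print("ensemble variance at t = 10:", x.var(), "   T/k =", T / k)
```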

Entropy  We start our calculations with the following axiom: sufficiently long after knowledge about the system's initial state was available, p(x) has relaxed towards a stationary state which has maximum uncertainty S[p]. This situation is referred to as thermal equilibrium. Thanks to the groundbreaking work of Claude Shannon [102], the founder of information theory, the somewhat vague concept of uncertainty can be mathematically quantified by the probability distribution's entropy S. It can in fact be shown (see e.g. pages 68–71 in [103]) that the functional S is uniquely determined by the three axioms that (i) S be a real, continuous functional of p, (ii) if p is constant, say pi = 1/N, then S be an increasing function of N, and (iii) S be additive under groupings of the states x in Ω. The third axiom makes sure that the uncertainties on distinct regions in Ω add up when they are treated as a whole: uncertainties about independent systems must be additive. Up to an arbitrary prefactor, which we set to 1, the only function with these properties is known as entropy, and it reads

(2.35)   S[p] = −〈ln p〉 = −∫ p(x) ln p(x) dx .

Note that, contrary to entropy in conventional thermodynamics, this entropy is dimensionless, which corresponds to setting Boltzmann's constant to k_B = 1. This definition of entropy is also known as the Gibbs entropy: p(x) arises from an ensemble rather than from a time average.

Technically speaking, Shannon's axioms above only lead to a discrete version ("Shannon entropy") of the Gibbs entropy. Taking its continuum limit is not trivial, and requires singling out a set of variables x with respect to which p(x) shall be maximally ignorant. For instance, a uniform distribution on a spherical surface² is constant with respect to the variables (φ, cos θ) but not with respect to (φ, θ). We must thus keep in mind that x in our entropy definition represents Cartesian coordinates. For technical details on this issue, we refer to chapters 4.5 and 5.2 in reference [103].

²For a uniform distribution on the surface of a unit sphere, surface elements carry a probability ∝ |dA| = |sin θ dθ dφ| = |d(cos θ) dφ|. Therefore, cos θ and not θ is uniformly distributed.
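As a small illustration of the discrete (Shannon) form of this entropy, the following sketch (illustration only; the example distributions are arbitrary) confirms that the uniform distribution on N states has the largest uncertainty, S = ln N, while a delta distribution has none:

```python
# Minimal sketch (illustration only): the discrete entropy S[p] = -sum_i p_i ln p_i
# is largest for the uniform distribution (where it equals ln N) and vanishes
# for a delta distribution.
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * ln 0 = 0
    return -np.sum(p * np.log(p))

N = 4
uniform = np.ones(N) / N
peaked  = np.array([0.97, 0.01, 0.01, 0.01])
delta   = np.array([1.0, 0.0, 0.0, 0.0])

print(entropy(uniform), "vs ln N =", np.log(N))   # maximal uncertainty
print(entropy(peaked))                             # smaller
print(entropy(delta))                              # zero: no uncertainty
```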

Boltzmann distribution  At the same time as maximizing its entropy, p is required to fulfill two constraints. Firstly, any probability distribution must be normalized such that ∫ p(x) dx = 1. Secondly, the average energy E of the chain cannot be unbounded; the heat bath dictates some fixed average value U = 〈E〉, which is called its inner energy. Let us now find the function p(x) which maximizes its entropy while fulfilling these two constraints. The variational problem at hand can be solved by introducing two scalar Lagrange multipliers α and β representing the constraints, and solving

(2.36)   δ { S[p] − α ( ∫ p(x) dx − 1 ) − β ( ∫ E(x) p(x) dx − U ) } = ∫ δp(x) [ −ln p(x) − 1 − α − β E(x) ] dx = 0 .

Setting the integrand to zero yields

(2.37)   p(x) = exp(−1 − α) · exp(−β E(x)) ≡ (1/Z) · exp(−β E(x)) ,

where Z := exp(1 + α) is the normalization factor of p. It can hence be determined from the normalization constraint and is given by

(2.38)   Z = ∫ exp(−β E(x)) dx .

The second Lagrange multiplier, β, can be interpreted as follows. Since the argument of the exponential function must be dimensionless, β⁻¹ must be an energy. It determines the energy scale on which the probability to find the system in a certain state x decays. For β → ∞, states with more energy than the minimum energy the system can take on do not carry any statistical weight. For β → 0, all states carry the same statistical weight, regardless of their energy. At the same time, the system's inner energy U can be tuned by changing β, because

(2.39)   U ∝ ∫ E(x) · exp(−β E(x)) dx .

It would be perfectly reasonable to treat β as a free parameter characterizing the heat bath that the system is embedded in. It is more intuitive, however, to deal with the energy scale β⁻¹ instead, which is better known as the temperature

(2.40)   T = β⁻¹ .

The fact that the temperature is an energy is a result of our convention that k_B = 1.
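As a numerical illustration of the two limits of β discussed above, the following sketch (illustration only; the toy energy levels are an arbitrary choice) evaluates the inner energy U = 〈E〉 for a discrete spectrum under the Boltzmann weights:

```python
# Minimal sketch (illustration only): for a discrete spectrum, the inner energy
# U(beta) = sum_i E_i exp(-beta E_i) / sum_i exp(-beta E_i) interpolates between
# the two limits discussed above: the mean of all levels as beta -> 0 (all
# states equally likely) and the minimal energy as beta -> infinity.
import numpy as np

E = np.array([0.0, 1.0, 2.0, 5.0])          # toy energy levels

def inner_energy(beta):
    w = np.exp(-beta * E)
    return np.sum(E * w) / np.sum(w)         # U = <E> under the Boltzmann weights

for beta in (0.01, 1.0, 100.0):
    print(f"beta = {beta:6.2f}   T = {1/beta:7.2f}   U = {inner_energy(beta):.4f}")
# beta -> 0 gives U close to E.mean() = 2.0;  beta -> infinity gives U close to E.min() = 0.0
```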

The probability distribution therefore takes on the simple form

(2.41)   p(x) = (1/Z) · exp(−E(x)/T) ,

which is called the Boltzmann distribution. Note that it is entirely characterized by the system's energy landscape E and the heat bath's temperature T – no kinematic details enter it. For instance, whether or not all vertices (beads) of the chain have identical friction coefficients, and whether or not hydrodynamic interactions are considered for the chain's dynamics, does not alter the Boltzmann distribution, which is quite remarkable. The Boltzmann distribution has two further properties which are sometimes even used to derive it in a heuristic manner. Firstly, due to being normalized, it does not change when a constant is added to the energy function, E₀(r) = E(r) + const., as it should not, because only the gradients of potential functions have physical relevance. Secondly, when two non-interacting physical systems with energy functions E₁(x₁) and E₂(x₂) are treated as one global system with E = E₁ + E₂, the probability distribution for the global system factorizes, p(x₁, x₂) ∝ exp(−E₁/T) · exp(−E₂/T), which correctly reflects the statistical independence of the two subsystems' states.
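Both properties are easy to verify numerically. The following sketch (illustration only; the energy levels and temperature are arbitrary) checks, for discrete toy systems, that a constant energy shift leaves the Boltzmann probabilities unchanged and that the joint distribution of two non-interacting subsystems factorizes:

```python
# Minimal sketch (illustration only): two properties of the Boltzmann
# distribution on a discrete state space.  (i) Shifting all energies by a
# constant leaves the normalized probabilities unchanged.  (ii) For two
# non-interacting subsystems with E = E1 + E2 the joint distribution
# factorizes into the product of the subsystem distributions.
import numpy as np

T = 1.5
E1 = np.array([0.0, 0.7, 1.3])               # toy energy levels of subsystem 1
E2 = np.array([0.2, 2.1])                    # toy energy levels of subsystem 2

def boltzmann(E, T):
    w = np.exp(-E / T)
    return w / w.sum()

# (i) shift invariance
print(np.allclose(boltzmann(E1, T), boltzmann(E1 + 10.0, T)))       # True

# (ii) factorization: joint energies E1[i] + E2[j]
E_joint = E1[:, None] + E2[None, :]
p_joint = boltzmann(E_joint, T)
p_product = np.outer(boltzmann(E1, T), boltzmann(E2, T))
print(np.allclose(p_joint, p_product))                               # True
```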

Another side note is that the celebrated equipartition theorem directly follows from the Boltzmann distribution: consider any variable q which enters the system's energy function in a quadratic manner (e.g. a harmonic elastic potential, or a particle's velocity), such that E = (c/2) q². Then its Boltzmann distribution is the Gaussian

(2.42)   p(q) ∝ exp(−c q² / (2T)) ,

and we can immediately read off its variance to be

(2.43)   〈q²〉 = T/c .

This gives the average energy in q’s degree of freedom:

(2.44)   U = 〈E〉 = 〈(c/2) q²〉 = T/2 .

This is yet another perspective which allows us to interpret the Lagrange multiplier we named temperature as an energy scale relevant to the physical system.
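A quick numerical check of equipartition (illustration only; the temperature and stiffness values are arbitrary): sampling q from its Boltzmann distribution, which for E = (c/2) q² is a Gaussian of variance T/c, reproduces 〈q²〉 = T/c and 〈E〉 = T/2 independently of c.

```python
# Minimal sketch (illustration only): numerical check of equipartition for a
# quadratic degree of freedom E = c q^2 / 2.  Sampling q from its Boltzmann
# distribution (a Gaussian with variance T/c) reproduces <q^2> = T/c and hence
# <E> = T/2, independently of the stiffness c.
import numpy as np

rng = np.random.default_rng(1)
T = 2.0
for c in (0.5, 1.0, 4.0):
    q = rng.normal(scale=np.sqrt(T / c), size=1_000_000)
    print(f"c = {c}:  <q^2> = {np.mean(q**2):.4f}  (T/c = {T/c}),"
          f"  <E> = {0.5 * c * np.mean(q**2):.4f}  (T/2 = {T/2})")
```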

Free energy From equation 2.36 we observe that, within the space of all normalized probability distributions, the Boltzmann distribution minimizes the functional

(2.45)   F = U − T S ,

which is called the (Helmholtz) free energy. It is the minimized thermodynamic potential in the canonical ensemble.

The distinction between entropy maximization and free energy minimization is entirely formal:

the latter considers the system's inner energy U as part of the extremized functional, while the former uses its value solely as a mathematical constraint. Both perspectives treat the normalization of p as a mathematical constraint. Taking the viewpoint of free energy minimization, some physical insight is gained: we have at hand a balancing act between two opposing extremes, minimizing energy (causing a sharply peaked probability distribution) and maximizing entropy (causing a uniform probability distribution), with the temperature acting as a weighting factor that dictates which of these two extremes enjoys higher priority.
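The following sketch (illustration only; the toy spectrum with one ground state and nine degenerate excited states is an arbitrary choice) makes this balancing act explicit: at low temperature the Boltzmann distribution concentrates on the ground state, at high temperature it spreads over all states, with F = U − TS weighing the two contributions.

```python
# Minimal sketch (illustration only): energy-entropy competition in F = U - T*S
# for a toy system with one low-energy state and nine degenerate excited states.
# At low T the Boltzmann distribution concentrates on the ground state (energy
# wins); at high T it spreads over all states (entropy wins).
import numpy as np

E = np.array([0.0] + [1.0] * 9)              # 1 ground state, 9 excited states

def boltzmann(E, T):
    w = np.exp(-E / T)
    return w / w.sum()

for T in (0.1, 1.0, 10.0):
    p = boltzmann(E, T)
    U = np.sum(p * E)
    S = -np.sum(p * np.log(p))
    print(f"T = {T:5.1f}:  p(ground) = {p[0]:.3f},  U = {U:.3f},  S = {S:.3f},"
          f"  F = U - T*S = {U - T*S:.3f}")
```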

Inserting the Boltzmann distribution into the definition of entropy (equation 2.35) yields

(2.46)   S = −〈ln p〉 = 〈E〉/T + ln Z = U/T + ln Z ,

which (upon comparison with equation 2.45) shows us that the system’s free energy is given by

(2.47)   F = −T ln Z .

Finally, we consider a subset ΩA of Ω: a mesostate A. For example, if Ω is the set of all conformations (microstates) of a polymer, then ΩA (the set of all open states) could be the set of all conformations for which the end-to-end distance of the polymer exceeds some given value.

The total probability to find the system in ΩA is given by

(2.48)   p(A) = ∫_{ΩA} p(x) dx = ZA / Z ,   with ZA := ∫_{ΩA} exp(−E(x)/T) dx .

Thus, the restricted partition function ZA denotes, up to the overall normalization 1/Z, the total probability of the set of microstates x over which p was integrated. In analogy to the probability of a microstate, which is given by the Boltzmann distribution with that microstate's energy in the exponent, we may therefore write

(2.49)   p(A) = (1/Z) · exp(−FA/T) ,

where FA = −T ln ZA is the free energy of the mesostate A. The relative probabilities of microstates x₁ and x₂, and of mesostates A and B, are therefore given by Boltzmann factors

(2.50)   p(x₁)/p(x₂) = exp(−[E(x₁) − E(x₂)]/T)   and   p(A)/p(B) = exp(−[FA − FB]/T) .

In summary, one may interpret the free energy as the effective energy associated with a mesostate (or Ω itself) rather than with a microstate. The probability distribution for the microstates hidden within that mesostate minimizes its free energy.
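As a concrete toy example (illustration only; the microstate energies and the assignment to mesostate A are arbitrary), the following sketch computes p(A) from the restricted partition function ZA and verifies that the ratio of mesostate probabilities is indeed a Boltzmann factor of free energy differences:

```python
# Minimal sketch (illustration only): mesostate probabilities from restricted
# partition functions.  For a discrete state space split into mesostates A and B,
# p(A) = Z_A / Z and p(A)/p(B) = exp(-(F_A - F_B)/T) with F_A = -T ln Z_A.
import numpy as np

T = 1.0
E = np.array([0.0, 0.3, 0.5, 1.2, 1.4, 1.9])               # toy microstate energies
in_A = np.array([True, True, True, False, False, False])    # mesostate A; B is the rest

w = np.exp(-E / T)
Z, Z_A, Z_B = w.sum(), w[in_A].sum(), w[~in_A].sum()
F_A, F_B = -T * np.log(Z_A), -T * np.log(Z_B)

print("p(A) =", Z_A / Z)
print("p(A)/p(B) =", (Z_A / Z) / (Z_B / Z),
      " vs  exp(-(F_A - F_B)/T) =", np.exp(-(F_A - F_B) / T))   # identical
```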

Detailed balance  Note that, by construction, all the arguments above make no mention of time. That is because we have so far studied only the statistical properties of a physical system in thermal equilibrium, which is well known as classical thermodynamics. In my opinion, this terminology is very unfortunate, as thermodynamics explicitly does not concern itself with dynamics. We must bear in mind, however, that even in thermal equilibrium, a physical system constantly evolves over time. It changes its microstate x, sampling the Boltzmann distribution as its time average.

While the statistical properties of the system are easily found – only the energy function E(x) is required – examining its dynamical properties (anything requiring a notion of time) is far from trivial. Consider, for example, the fact that the sampled probability distribution of a sufficiently long trajectory x(t) does not change when it is sped up or slowed down, or when it is cut into parts which are then reassembled such that x(t) may perform discontinuous jumps.

To study dynamics in thermal equilibrium, we require a second physical axiom besides free energy minimization (which determines all statistical properties), and this is called the detailed balance condition. It states that, a sufficiently long time after x was prepared to obey some known initial condition, the trajectories x(t) obey time-reversal symmetry. For instance, this means that there is no way of telling whether a video of a polymer chain in thermal equilibrium is played forward or backward. This is a stronger requirement than stating that the probability distribution p(x) be stationary. Consider, for example, a three-state system Ω = {x₁, x₂, x₃} whose dynamics cyclically move through the states x₁ → x₂ → x₃ → x₁ → x₂ → … with unit speed: these trajectories sample a stationary distribution but do not obey time-reversal symmetry. The detailed balance condition must hold for any two states xᵢ and xⱼ:

(2.51)   probability flux(xᵢ → xⱼ) = probability flux(xⱼ → xᵢ)
         p(xᵢ) · k(xᵢ, xⱼ) = p(xⱼ) · k(xⱼ, xᵢ)
         ⇔   k(xᵢ, xⱼ) / k(xⱼ, xᵢ) = exp( −[E(xⱼ) − E(xᵢ)] / T ) ,

where k(xᵢ, xⱼ) is the transition rate per unit time from state xᵢ to state xⱼ.
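The following sketch (illustration only; Metropolis-type rates k(xᵢ → xⱼ) = min(1, exp(−(Eⱼ − Eᵢ)/T)) are merely one convenient choice, not the dynamics used in this thesis) verifies that such rates satisfy equation 2.51 with respect to the Boltzmann distribution, whereas purely cyclic rates on three states keep a stationary distribution but violate detailed balance:

```python
# Minimal sketch (illustration only, not the thesis' dynamics): Metropolis-type
# rates k(i -> j) = min(1, exp(-(E_j - E_i)/T)) satisfy detailed balance,
# p_i k_ij = p_j k_ji, with respect to the Boltzmann distribution, so the rate
# ratio reproduces equation 2.51.
import numpy as np

T = 1.0
E = np.array([0.0, 0.8, 1.5])                # toy three-state system
p = np.exp(-E / T); p /= p.sum()

def k(i, j):
    return min(1.0, np.exp(-(E[j] - E[i]) / T))

for i in range(3):
    for j in range(i + 1, 3):
        print(f"flux {i}->{j}: {p[i]*k(i,j):.4f}   flux {j}->{i}: {p[j]*k(j,i):.4f}",
              "  k_ij/k_ji =", round(k(i, j) / k(j, i), 4),
              "  exp(-(E_j-E_i)/T) =", round(np.exp(-(E[j] - E[i]) / T), 4))

# By contrast, purely cyclic rates k(0->1) = k(1->2) = k(2->0) = 1 (all others 0)
# leave the uniform distribution stationary but carry a net probability current,
# so they violate detailed balance and time-reversal symmetry.
```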