Lviv Ising Lecture Janke Corrected
Chapter 3
Wolfhard Janke
Institut für Theoretische Physik and Centre for Theoretical Sciences (NTZ),
Universität Leipzig, Postfach 100 920, 04009 Leipzig, Germany
Contents
1. Introduction
2. The Monte Carlo Method
2.1. Random sampling
2.2. Importance sampling
2.3. Local update algorithms
2.4. Temporal correlations
2.5. Cluster algorithms
3. Statistical Analysis of Monte Carlo Data
June 18, 2012 13:35 World Scientific Review Volume - 9in x 6in master
1. Introduction
Fig. 1. Sketch of the relationship between theory, experiment and computer simulation.
in detail and for further study the reader is referred to recent textbooks,1–4 where
some of the material presented here is discussed in more depth. The rest of this
chapter is organized as follows. In Sect. 2, the definition of the
standard Ising model is first briefly recalled. Then the basic method underlying all
importance sampling Monte Carlo simulations is described and some properties of
local update algorithms (Metropolis, Glauber, heat-bath) are discussed. The fol-
lowing subsection is devoted to non-local cluster algorithms which in some cases
can dramatically speed up the simulations. A fairly detailed account of statistical
error analyses is given in Sect. 3. Here temporal correlation effects and auto-
correlation times are discussed, which explain the problems with critical slowing
down at a continuous phase transition and exponentially large flipping times at a
first-order transition. Reweighting techniques, which are particularly important
for finite-size scaling studies, are discussed in Sect. 4. More advanced generalized
ensemble simulation methods are briefly outlined in Sect. 5, focusing on simu-
lated and parallel tempering, the multicanonical ensemble and the Wang-Landau
method. In Sect. 6 suitable observables for scaling analyses (specific heat, mag-
netization, susceptibility, correlation functions, . . . ) are briefly discussed. Some
characteristic properties of phase transitions, scaling laws, the definition of criti-
cal exponents and the method of finite-size scaling are summarized. In order to
illustrate how all these techniques can be put to good use, in Sect. 7 two concrete
applications are discussed: The phase diagram of a quenched, diluted ferromagnet
where O stands for any quantity of the system defined by its Hamiltonian H and
\[ Z = e^{-\beta F} = \sum_{\text{states } \sigma} e^{-\beta H(\sigma)} = \sum_{E} \Omega(E)\, e^{-\beta E} \tag{2} \]
is the (canonical) partition function. The first sum runs over all possible mi-
crostates of the system and the second sum runs over all energies, where the
density of states Ω(E) counts the number of microstates contributing to a given
energy E. The state space may be discrete or continuous (where sums become
integrals etc.). As usual $\beta \equiv 1/(k_B T)$ denotes the inverse temperature fixed by an
external heat bath and $k_B$ is Boltzmann’s constant.
In the following most simulation methods will be illustrated for the minimalistic
Ising model5 where
\[ H(\sigma) = -J \sum_{\langle ij \rangle} \sigma_i \sigma_j - h \sum_i \sigma_i , \qquad \sigma_i = \pm 1 . \tag{3} \]
One way out is stochastic sampling of the huge state space. Simple random sam-
pling, however, does not work for statistical systems with many degrees of free-
dom. Here the problem is that the region of state space that contributes signifi-
cantly to canonical expectation values at a given temperature T ≪ ∞ is extremely
narrow and hence far too rarely hit by random sampling. In fact, random sampling
corresponds to setting β = 1/T = 0, i.e., exploring mainly the typical microstates
at infinite temperature. Of course, the low-energy states in the tails of this distri-
bution contain theoretically (that is, for infinite statistics) all information about
the system’s properties at finite temperature, too, but this is of very little practical
relevance since the probability to hit this tail in random sampling is by far too
small. With finite statistics consisting of typically $10^9$ to $10^{12}$ randomly drawn
microstates, this tail region is virtually not sampled at all.
The solution to this problem has long been known as importance sampling6,7
where a Markov chain8–10 is set up to draw a microstate σi not at random but
according to the given equilibrium distribution
\[ \cdots \xrightarrow{W} \sigma^{(k)} \xrightarrow{W} \sigma^{(k+1)} \xrightarrow{W} \sigma^{(k+2)} \xrightarrow{W} \cdots , \]
where $\sigma^{(k)}$ is the current state of the system after the $k$th step of the Markov chain.
To ensure that, after an initial transient or equilibration period, microstates occur
with the given probability (4), the transition probability Wij has to satisfy three
conditions:
i) $W_{ij} \ge 0 \quad \forall\, i, j$ , (5)
ii) $\sum_j W_{ij} = 1 \quad \forall\, i$ , (6)
The first two conditions merely formalize that, for any initial state σi , Wij should
be a properly normalized probability distribution. The equal sign in (5) may oc-
cur and, in fact, does so for almost all pairs of microstates i, j in any realistic
implementation of the Markov process. To ensure ergodicity one additionally
has to require that starting from any given microstate σi any other σj can be
reached in a finite number of steps, i.e., an integer n < ∞ must exist such that
$(W^{n+1})_{ij} = \sum_{k_1, k_2, \dots, k_n} W_{i k_1} W_{k_1 k_2} \cdots W_{k_n j} > 0$. In other words, at least one
(finite) path connecting $\sigma_i$ and $\sigma_j$ must exist in state space that can be realized
with non-zero probability (see footnote b).
The balance condition (7) implies that the transition probability W has to be
chosen such that the desired equilibrium distribution (4) is a fixed point of W , i.e.,
an eigenvector of W with unit eigenvalue. The usually employed detailed balance
is a stronger, sufficient condition:
By summing over i and using the normalization condition (6), one easily proves
the more general balance condition (7).
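As a concrete check of these conditions, the following minimal sketch builds the full transition matrix for a tiny system and tests non-negativity (5), normalization (6), the fixed-point property of the Boltzmann distribution, and detailed balance in the form $W_{ij} P_i^{\rm eq} = W_{ji} P_j^{\rm eq}$. The three-spin open Ising chain, the value of β, the Metropolis acceptance rule and all function names are illustrative assumptions and are not taken from the text above.

```python
import itertools
import math

def ising_energy(spins, J=1.0):
    """Energy of an open Ising chain, H = -J * sum_i s_i s_{i+1} (no field)."""
    return -J * sum(a * b for a, b in zip(spins, spins[1:]))

def metropolis_matrix(n_spins=3, beta=0.7):
    """Transition matrix W[i][j] for: pick a site uniformly, propose a flip."""
    states = list(itertools.product((-1, 1), repeat=n_spins))
    index = {s: i for i, s in enumerate(states)}
    energies = [ising_energy(s) for s in states]
    W = [[0.0] * len(states) for _ in states]
    for i, s in enumerate(states):
        for site in range(n_spins):
            t = list(s)
            t[site] = -t[site]
            j = index[tuple(t)]
            accept = min(1.0, math.exp(-beta * (energies[j] - energies[i])))
            W[i][j] = accept / n_spins   # selection probability times acceptance
        W[i][i] = 1.0 - sum(W[i])        # rejected proposals keep the old state
    z = sum(math.exp(-beta * e) for e in energies)
    p_eq = [math.exp(-beta * e) / z for e in energies]
    return W, p_eq

W, p_eq = metropolis_matrix()
n = len(p_eq)
nonneg = all(W[i][j] >= 0.0 for i in range(n) for j in range(n))
row_sums = [sum(row) for row in W]
# Largest violation of sum_i P_i W_ij = P_j (fixed-point property)
balance_err = max(abs(sum(p_eq[i] * W[i][j] for i in range(n)) - p_eq[j])
                  for j in range(n))
# Largest violation of W_ij P_i = W_ji P_j (detailed balance)
detailed_err = max(abs(p_eq[i] * W[i][j] - p_eq[j] * W[j][i])
                   for i in range(n) for j in range(n))
print(nonneg, max(abs(r - 1.0) for r in row_sums), balance_err, detailed_err)
```

Most entries of this 8 × 8 matrix vanish, which illustrates the remark that the equal sign in (5) occurs for almost all pairs of microstates.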
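After an initial equilibration period, expectation values can be estimated as
an arithmetic mean over the Markov chain,
\[ \langle O \rangle = \sum_{\sigma} O(\sigma) P^{\rm eq}(\sigma) \approx \overline{O} \equiv \frac{1}{N} \sum_{k=1}^{N} O(\sigma^{(k)}) , \tag{9} \]
where $\sigma^{(k)}$ stands for a microstate at “time” $k$ (see footnote c). Since in equilibrium
$\langle O(\sigma^{(k)}) \rangle = \langle O \rangle$ at any “time” $k$, one immediately sees that
$\langle \overline{O} \rangle = \langle O \rangle$, showing that the mean
value $\overline{O}$ is a so-called unbiased estimator of the expectation value $\langle O \rangle$. A more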
detailed exposition of the mathematical concepts underlying any Markov chain
Monte Carlo algorithm can be found in many textbooks and reviews.1–4,11–13
b In practice, one may nevertheless observe “effective” ergodicity breaking when $(W^{n+1})_{ij}$ is so
small that this event will typically not happen in finite simulation time.
c In Monte Carlo simulations, “time” refers to the stochastic evolution in state space and is not directly
related to physical time as for instance in molecular dynamics simulations where the trajectories are
determined by Newton’s deterministic equations.
Finally we show that $W_{ij}$ satisfies the detailed balance condition (8). We first
consider the case $f_{ji} P_j^{\rm eq} > f_{ij} P_i^{\rm eq}$. Then, from (11), one immediately finds
$W_{ij} P_i^{\rm eq} = f_{ij} P_i^{\rm eq}$ for the l.h.s. of (8). Since
$W_{ji} = f_{ji} \min\left(1, \frac{f_{ij} P_i^{\rm eq}}{f_{ji} P_j^{\rm eq}}\right)$, the
r.h.s. of (8) becomes
\[ W_{ji} P_j^{\rm eq} = f_{ji} \frac{f_{ij} P_i^{\rm eq}}{f_{ji} P_j^{\rm eq}} P_j^{\rm eq} = f_{ij} P_i^{\rm eq} , \tag{14} \]
which completes the proof. For the second case $f_{ji} P_j^{\rm eq} < f_{ij} P_i^{\rm eq}$, the proof
proceeds precisely along the same lines.
The update prescription (10), (11) is still very general: (a) The selection probability
may be asymmetric ($f_{ij} \neq f_{ji}$), (b) it has not yet been specified how to
pick the trial state $\sigma_j$ given $\sigma_o$, and (c) $P^{\rm eq}$ could be “some” arbitrary probability
distribution. The last point (c) is obviously trivial, but the resulting formulas
simplify when a Boltzmann weight as in (4) is assumed. Then
\[ \frac{P_j^{\rm eq}}{P_i^{\rm eq}} = e^{-\beta \Delta E} \tag{15} \]
where $\Delta E = E_j - E_i = E_n - E_o$ is the energy difference between the proposed
new and the old microstate. The second point (b), on the other hand, is of great
practical relevance since an arbitrary proposal for σn would typically lead to a
large ∆E and hence a high rejection rate if β > 0. One therefore commonly tries
to update only one degree of freedom at a time. Then σn differs only locally from
σo . For short-range interactions this automatically has the additional advantage
that only the local neighborhood of the selected degree of freedom contributes to
∆E, so that there is no need to compute the total energies in each update step.
These two specializations are usually employed, but the selection probabilities
may still be chosen asymmetrically. If this is the case, one refers to this update
prescription as the Metropolis-Hastings15 update algorithm. For a recent example
with asymmetric fij in the context of polymer simulations see, e.g., Ref. 16.
zero and the Metropolis method degenerates to a minimization algorithm for the
energy functional. With some additional refinements, this is the basis for the sim-
ulated annealing technique,17 which is often applied to hard optimization and
minimization problems.
For the Ising model with only two states per spin, a spin flip is the only admis-
sible local update proposal. Hence in this simple example there is no parameter
available by which one could tune the acceptance ratio, which is defined as the
fraction of trial moves that are accepted. For models with many states per spin
(e.g., q-state Potts or Zn clock models) or in continuous systems (e.g., Heisenberg
spin model or off-lattice molecular systems), however, it is in most cases not
advisable to propose the new state uniformly out of all available possibilities.
Rather, one usually restricts the trial states to a neighborhood of the current
“old” state. For example, in a continuous atomic system, a trial move may consist
of displacing a randomly chosen atom by a random step size up to some maximum
Smax in each Cartesian direction. If Smax is small, almost all attempted moves
will be accepted and the acceptance ratio is close to unity, but the configuration
space is explored slowly. On the other hand, if Smax is large, a successful move
would make a large step in configuration space, but many trial moves would be
rejected because trial configurations with low Boltzmann weight are very likely,
yielding an acceptance ratio close to zero. As a compromise between these two extreme
situations, one often applies the common rule of thumb that $S_{\max}$ is adjusted to
achieve an acceptance ratio of 0.5.18,19
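The dependence of the acceptance ratio on the maximal step size can be illustrated with a toy calculation. The sketch below is only an assumed example: a single coordinate $x$ in the harmonic potential $E(x) = x^2/2$ at β = 1, updated with uniform trial displacements in $[-S_{\max}, S_{\max}]$. The potential, the step sizes and the function name `acceptance_ratio` are hypothetical choices made for illustration.

```python
import math
import random

def acceptance_ratio(s_max, beta=1.0, n_steps=20000, seed=1):
    """Fraction of accepted Metropolis moves for the toy energy E(x) = x^2 / 2."""
    rng = random.Random(seed)
    x, accepted = 0.0, 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-s_max, s_max)   # symmetric trial displacement
        delta_e = 0.5 * (x_new * x_new - x * x)
        if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
            x, accepted = x_new, accepted + 1
    return accepted / n_steps

ratios = {s: acceptance_ratio(s) for s in (0.1, 1.0, 3.0, 10.0, 30.0)}
for s_max, ratio in ratios.items():
    print(f"S_max = {s_max:5.1f}   acceptance ratio = {ratio:.3f}")
```

Small steps are accepted almost always and large steps almost never, so a step size with an intermediate acceptance ratio has to be found by such trial runs before the production simulation.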
Empirically this value proves to be a reasonable but at best heuristically justi-
fied choice. In principle, one should measure the statistical error bars as a function
of Smax for otherwise identical simulation conditions and then choose that Smax
which minimises the statistical error. In general the optimal Smax depends on the
model at hand and even on the considered observable, so finally some “best aver-
age” would have to be used. At any rate, the corresponding acceptance ratio would
certainly not coincide with 0.5. Example computations of this type reported values
in the range 0.4 to 0.6 (Refs. 18,20), but for certain models much smaller
(or larger) values may also be favourable. Incidentally, a proof recently appeared
in the mathematical literature21 claiming an optimal acceptance ratio of 0.234
which, however, relies on assumptions22 not met in a typical statistical physics
simulation (see footnote d).
Whether relying on the rule of thumb value 0.5 or trying to optimise Smax ,
this should be done before the actual simulation run. Trying to maintain a given
acceptance ratio automatically during the run by periodically updating Smax is at
d Thanks are due to Yuko Okamoto who pointed to this paper and to Bob Swendsen who immediately
commented on it during the CompPhys11 Workshop in November 2011 in Leipzig.
least potentially dangerous.19 The reason is that the accumulated average of the
acceptance ratio and hence the updated Smax are dependent on the recent history
of the Monte Carlo trajectory, and not only on the current configuration, which
violates the Markovian requirement. Consequently the balance condition is no
longer fulfilled which may lead to more or less severe systematic deviations (bias).
As claimed already a while ago in Ref. 18 and reemphasized recently in Ref. 20,
by following a carefully determined schedule for the adjustments of Smax , the
systematic error may be kept smaller than the statistical error in a controlled way,
but to be on the safe side one should be very cautious with this type of refinement.
Finally a few remarks on the practical implementation of the Metropolis
method. To decide whether a proposed update should be accepted or not, one
draws a uniformly distributed random number r ∈ [0, 1), and if r ≤ wij , the new
state is accepted. Otherwise one keeps the old configuration and continues with
the next spin. In computer simulations, random numbers are generated by means
of “pseudo-random number generators” (RNGs), which produce – according to
some deterministic rule – (more or less) uniformly distributed numbers whose
values are “very hard” to predict.23 In other words, given a finite sequence of
subsequent pseudo-random numbers, it should be (almost) impossible to predict
the next one or to even uncover the deterministic rule underlying their generation.
The “goodness” of an RNG is thus assessed by the difficulty to derive its underlying
deterministic rule. Related requirements are the absence of correlations and a
very long period, which can be particularly important in high-statistics simulations.
Furthermore, an RNG should be portable among different computer platforms and,
very importantly, it should yield reproducible results for testing purposes. The
design of RNGs is a science in itself, and many things can go wrong with them (see footnote e). As
a recommendation, one should not experiment too much with some fancy
RNG picked up somewhere from the WWW, say, but rely on well-documented
and well-tested subroutines.
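The accept/reject step described above can be sketched for the 2D Ising model as follows. This is a minimal illustration under stated assumptions: a periodic L × L square lattice, sequential site order, the standard Metropolis form $\min(1, e^{-\beta \Delta E})$ for the acceptance probability, and Python's built-in Mersenne Twister generator with a fixed seed for reproducibility. The lattice size, β and the function names are hypothetical choices.

```python
import math
import random

def metropolis_sweep(spins, L, beta, rng, J=1.0, h=0.0):
    """One sweep of L*L single-spin-flip attempts; returns the number accepted."""
    accepted = 0
    for x in range(L):
        for y in range(L):
            s = spins[x][y]
            neighbours = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                          + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
            delta_e = 2.0 * s * (J * neighbours + h)   # energy change of the flip
            # Accept if r <= min(1, exp(-beta * delta_e)) with r uniform in [0, 1)
            if delta_e <= 0.0 or rng.random() <= math.exp(-beta * delta_e):
                spins[x][y] = -s
                accepted += 1
    return accepted

def energy_per_site(spins, L, J=1.0, h=0.0):
    """Energy per site; each nearest-neighbour bond is counted once."""
    e = 0.0
    for x in range(L):
        for y in range(L):
            s = spins[x][y]
            e -= J * s * (spins[(x + 1) % L][y] + spins[x][(y + 1) % L]) + h * s
    return e / (L * L)

rng = random.Random(12345)            # fixed seed gives reproducible runs
L, beta = 16, 0.3
spins = [[1] * L for _ in range(L)]
for _ in range(200):                  # equilibration sweeps, no measurements
    metropolis_sweep(spins, L, beta, rng)
energies = []
for _ in range(300):                  # measurement sweeps
    metropolis_sweep(spins, L, beta, rng)
    energies.append(energy_per_site(spins, L))
mean_e = sum(energies) / len(energies)
print(f"mean energy per site at beta = {beta}: {mean_e:.3f}")
```

Only the four neighbours of the selected spin enter the energy difference, so the total energy never has to be recomputed during the update.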
As indicated earlier the Markov chain conditions (5)–(7) are rather general and
the Metropolis rule (11) or (16) for the acceptance probability wij is not the only
possible choice. For instance, when flipping a spin at site i0 in the Ising model,
wij can also be taken as25
\[ w_{ij} = w(\sigma_{i_0} \to -\sigma_{i_0}) = \frac{1}{2} \left[ 1 - \sigma_{i_0} \tanh(\beta S_{i_0}) \right] , \tag{17} \]
e A prominent example is the failure of the by then very prominent and apparently well-tested R250
generator when applied to the single-cluster algorithm.24
Fig. 2. Comparison of the acceptance ratio for a spin flip in the two-dimensional Ising model with
the Glauber (or equivalently heat-bath) and Metropolis update algorithm for three different inverse
temperatures β = 0.2, 0.44 and 1.0. [Plot: acceptance ratio (0 to 1) versus energy difference ∆E (from −8 to 8).]
where $S_{i_0} = \sum_k \sigma_k + h$ is an effective spin or field collecting all neighboring
spins (in their “old” states) interacting with the spin at site $i_0$ and $h$ is the
external magnetic field. This is the Glauber update algorithm. Detailed balance is
straightforward to prove. Rewriting $\sigma_{i_0} \tanh(\beta S_{i_0}) = \tanh(\beta \sigma_{i_0} S_{i_0})$ (making
use of $\sigma_{i_0} = \pm 1$ and the point symmetry of the hyperbolic tangent) and noting
that $\Delta E = E_n - E_o = 2 \sigma_{i_0} S_{i_0}$ (where $\sigma_{i_0}$ is the “old” spin value and $(-\sigma_{i_0})$ the
“new” one), Eq. (17) becomes
\[ w(\sigma_{i_0} \to -\sigma_{i_0}) = \frac{1}{2} \left[ 1 - \tanh(\beta \Delta E / 2) \right] = \frac{e^{-\beta \Delta E/2}}{e^{\beta \Delta E/2} + e^{-\beta \Delta E/2}} , \tag{18} \]
showing explicitly that the acceptance probability of the Glauber algorithm also
only depends on the total energy change as in the Metropolis case. In this form
it is thus possible to generalize the Glauber update rule from the Ising model
with only two states per spin to any general model that can be simulated with the
Metropolis procedure. The acceptance probability (18) is plotted in Fig. 2 as a
function of ∆E for various (inverse) temperatures and compared with the corre-
sponding probability (16) of the Metropolis algorithm. Note that for all values of
∆E and temperature, the Metropolis acceptance probability is higher than that of
the Glauber algorithm. As we shall see in the next paragraph, for the Ising model,
the Glauber and heat-bath algorithms are identical.
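The comparison can also be made numerically. The short sketch below tabulates both acceptance probabilities for the energy differences that occur for a single spin flip in the 2D Ising model with J = 1 and h = 0. It assumes the standard Metropolis form $\min(1, e^{-\beta \Delta E})$; the function names are illustrative.

```python
import math

def w_metropolis(delta_e, beta):
    """Metropolis acceptance probability min(1, exp(-beta * delta_e))."""
    return min(1.0, math.exp(-beta * delta_e))

def w_glauber(delta_e, beta):
    """Glauber acceptance probability in the form of Eq. (18)."""
    return 0.5 * (1.0 - math.tanh(0.5 * beta * delta_e))

table = {}
for beta in (0.2, 0.44, 1.0):
    for delta_e in (-8, -4, 0, 4, 8):
        table[(beta, delta_e)] = (w_metropolis(delta_e, beta),
                                  w_glauber(delta_e, beta))
        print(f"beta = {beta:4.2f}  dE = {delta_e:+d}  "
              f"Metropolis = {table[(beta, delta_e)][0]:.4f}  "
              f"Glauber = {table[(beta, delta_e)][1]:.4f}")
```

At ∆E = 0 the Glauber rule accepts only every second proposal, whereas the Metropolis rule accepts all of them.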
The Glauber update algorithm for the Ising model is also theoretically of
interest since for the one-dimensional case the dynamics of the Markov chain
can be calculated analytically. For the relaxation of the magnetisation
one finds the remarkably simple result25 $m(t) = m(0) \exp(-t/\tau_{\rm relax})$ with
$\tau_{\rm relax} = 1/[1 - \tanh(2\beta)]$. For two and higher dimensions, however, no exact
solutions are known.
The heat-bath algorithm is different from the two previous update algorithms in
that it does not follow the previous scheme “update proposal plus accept/reject
step”. Rather, the new value of σi0 at a randomly selected site i0 is determined by
testing all its possible states in the “heat bath” of its (fixed) neighbors (e.g., 4 on a
square lattice and 6 on a simple-cubic lattice with nearest-neighbor interactions).
For models with a finite number of states per degree of freedom the transition
probability reads
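\[ w(\sigma_o \to \sigma_n) = \frac{e^{-\beta H(\sigma_n)}}{\sum_{\sigma_{i_0}} e^{-\beta H(\sigma_n)}} = \frac{e^{-\beta \sum_k H_{i_0 k}}}{\sum_{\sigma_{i_0}} e^{-\beta \sum_k H_{i_0 k}}} , \tag{19} \]
where $\sum_k H_{i_0 k}$ collects all terms involving the spin $\sigma_{i_0}$. All other contributions to
the energy not involving $\sigma_{i_0}$ cancel due to the ratio in (19), so that for the update
at each site $i_0$ only a small number of computations is necessary (e.g., about 4 for
a square and 6 for a simple-cubic lattice of arbitrary size). Detailed balance (8) is
obviously satisfied since
\[ e^{-\beta H(\sigma_o)} \frac{e^{-\beta H(\sigma_n)}}{\sum_{\sigma_{i_0}} e^{-\beta H(\sigma_n)}} = e^{-\beta H(\sigma_n)} \frac{e^{-\beta H(\sigma_o)}}{\sum_{\sigma_{i_0}} e^{-\beta H(\sigma_o)}} . \tag{20} \]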
How is the probability (19) realized in practice? Due to the summation over
all local states, special tricks are necessary when each degree of freedom can
take many different states, and only in special cases can the heat-bath method
be efficiently generalized to continuous degrees of freedom. In many applications,
however, the admissible local states of $\sigma_{i_0}$ can be labeled by a small number
of integers, say $n = 1, \dots, N$, which occur with probabilities $p_n$ according
to (19). Since this probability distribution is normalized to unity, the sequence
$(p_1, p_2, \dots, p_n, \dots, p_N)$ decomposes the unit interval into segments of length
$p_n$. If one now draws a random number $R \in [0, 1)$ and compares the accumulated
probabilities $\sum_{k=1}^{n} p_k$ with $R$, then the new state is the smallest $n$
that satisfies $\sum_{k=1}^{n} p_k \ge R$. Clearly, for a large number of possible local
states, the determination of n can become quite time-consuming (in particular,
if many small pn are at the beginning of the sequence, in which case a clever
permutation of the pn by relabeling the admissible local states can improve the
performance).
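The selection of the new state from the accumulated probabilities can be sketched as follows. The three local energies in the example are purely hypothetical numbers, and the function name `heat_bath_choice` is an illustrative assumption. Instead of normalizing the $p_n$ first, the random number is scaled by the sum of the weights, which is equivalent.

```python
import math
import random

def heat_bath_choice(local_energies, beta, rng):
    """Return the index n of a state drawn with p_n proportional to exp(-beta * E_n)."""
    e_min = min(local_energies)                 # shift avoids overflow in exp
    weights = [math.exp(-beta * (e - e_min)) for e in local_energies]
    r = rng.random() * sum(weights)             # R in [0, 1) scaled by the norm
    cumulative = 0.0
    for n, w in enumerate(weights):
        cumulative += w                         # accumulated (unnormalized) p_k
        if cumulative >= r:
            return n                            # smallest n with sum >= R
    return len(weights) - 1                     # guard against rounding errors

rng = random.Random(3)
local_energies = [-2.0, 0.0, 1.0]               # hypothetical local energies
beta, n_draws = 1.0, 20000
counts = [0] * len(local_energies)
for _ in range(n_draws):
    counts[heat_bath_choice(local_energies, beta, rng)] += 1
z = sum(math.exp(-beta * e) for e in local_energies)
expected = [math.exp(-beta * e) / z for e in local_energies]
for c, p in zip(counts, expected):
    print(f"observed {c / n_draws:.3f}   expected {p:.3f}")
```

Placing the most probable state first, as in this example, keeps the average number of comparisons small.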
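In the special case of the Ising model with only two states per spin, $\sigma_i = \pm 1$,
(19) simplifies to
\[ w(\sigma_o \to \sigma_n) = \frac{e^{\beta \sigma_{i_0} S_{i_0}}}{e^{\beta S_{i_0}} + e^{-\beta S_{i_0}}} , \tag{21} \]
where $\sigma_{i_0}$ is the new spin value and $S_{i_0} = \sum_k \sigma_k + h$ represents the effective spin
interacting with $\sigma_{i_0}$ as defined already below (17). And since $\Delta E = E_n - E_o =
-(\sigma_{i_0} - (-\sigma_{i_0})) S_{i_0} = -2 \sigma_{i_0} S_{i_0}$, the probability for a spin flip becomes26
\[ w(-\sigma_{i_0} \to \sigma_{i_0}) = \frac{e^{-\beta \Delta E/2}}{e^{\beta \Delta E/2} + e^{-\beta \Delta E/2}} . \tag{22} \]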
This is identical to the acceptance probability (18) for a spin flip in the Glauber
update algorithm, that is, for the Ising model, the Glauber and heat-bath update
rules give precisely the same results.
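This equivalence is easy to confirm numerically. The following sketch compares the heat-bath probability (21) for the new spin value with the Glauber flip probability (17) for all effective fields that occur on a square lattice with J = 1 and h = 0. The chosen values of β and the function names are illustrative assumptions.

```python
import math

def w_heat_bath(new_spin, S, beta):
    """Eq. (21): probability that the updated spin takes the value new_spin."""
    return math.exp(beta * new_spin * S) / (math.exp(beta * S) + math.exp(-beta * S))

def w_glauber_flip(old_spin, S, beta):
    """Eq. (17): Glauber probability for the flip old_spin -> -old_spin."""
    return 0.5 * (1.0 - old_spin * math.tanh(beta * S))

max_diff = 0.0
for beta in (0.2, 0.44, 1.0):
    for S in (-4, -2, 0, 2, 4):          # sums of four neighbouring Ising spins
        for old_spin in (-1, 1):
            diff = abs(w_heat_bath(-old_spin, S, beta)
                       - w_glauber_flip(old_spin, S, beta))
            max_diff = max(max_diff, diff)
print(f"largest deviation: {max_diff:.2e}")
```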
Fig. 3. (a) Part of the time evolution of the energy $e = E/V$ for the 2D Ising model on a 16 × 16
lattice at $\beta_c = \ln(1 + \sqrt{2})/2 = 0.440\,686\,793\ldots$ and (b) the resulting autocorrelation function. In
the inset the same data are plotted on a logarithmic scale, revealing a fast initial drop for very small $k$
and noisy behaviour for large $k$. The solid lines show a fit to the ansatz $A(k) = a \exp(-k/\tau_{e,\rm exp})$
in the range $10 \le k \le 40$ with $\tau_{e,\rm exp} = 11.3$ and $a = 0.432$.
[Plot axes: (a) $-E/V$ versus MC time $k$; (b) $A(k)$ versus $k$.]
The last 1000 sweeps of the time evolution of the energy are shown in Fig. 3(a).
Using the complete time series the autocorrelation function was computed accord-
ing to (23) which is shown in Fig. 3(b). On the linear-log scale of the inset we
clearly see the asymptotic linear behaviour of ln A(k). A linear fit of the form
(24), ln A(k) = ln a − k/τe,exp , in the range 10 ≤ k ≤ 40 yields an estimate for
the exponential autocorrelation time of τe,exp ≈ 11.3. In the small k behaviour
of A(k) we observe an initial fast drop, corresponding to faster relaxing modes,
before the asymptotic behaviour sets in. This is the generic behaviour of auto-
correlation functions in realistic models where the small-k deviations are, in fact,
often much more pronounced than for the 2D Ising model.
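The analysis just described can be sketched in a few lines. Since no Ising time series is available here, the example uses synthetic data from a first-order autoregressive process whose exponential autocorrelation time is known by construction (τ = 5). The normalized form of $A(k)$ used below, the fit range, the test process and the function names are assumptions made for illustration.

```python
import math
import random

def autocorrelation(series, k_max):
    """Normalized autocorrelation A(k), k = 0..k_max, with A(0) = 1."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    result = []
    for k in range(k_max + 1):
        cov = sum(dev[i] * dev[i + k] for i in range(n - k)) / (n - k)
        result.append(cov / var)
    return result

def fit_exponential_time(a, k_lo, k_hi):
    """Least-squares line through ln A(k); returns (tau_exp, amplitude a)."""
    ks = [k for k in range(k_lo, k_hi + 1) if a[k] > 0.0]
    ys = [math.log(a[k]) for k in ks]
    k_mean = sum(ks) / len(ks)
    y_mean = sum(ys) / len(ys)
    slope = (sum((k - k_mean) * (y - y_mean) for k, y in zip(ks, ys))
             / sum((k - k_mean) ** 2 for k in ks))
    return -1.0 / slope, math.exp(y_mean - slope * k_mean)

# Synthetic data: x_{t+1} = rho * x_t + noise has A(k) = rho^k = exp(-k / tau)
rng = random.Random(7)
tau_true = 5.0
rho, x, series = math.exp(-1.0 / tau_true), 0.0, []
for _ in range(100000):
    x = rho * x + rng.gauss(0.0, 1.0)
    series.append(x)
a = autocorrelation(series, 15)
tau_exp, amplitude = fit_exponential_time(a, 2, 10)
print(f"tau_exp = {tau_exp:.2f} (input value {tau_true}), amplitude = {amplitude:.2f}")
```

For this purely exponential test process the fit range is uncritical; for real simulation data it must start beyond the initial fast drop and end before the noise dominates.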
The influence of autocorrelation times is particularly pronounced for phase transitions
and critical phenomena.27–30 For instance, close to a critical point, the
autocorrelation time typically scales in the infinite-volume limit as
\[ \tau_{O,\rm exp} \propto \xi^{z} , \tag{25} \]
where $z \ge 0$ is the so-called dynamical critical exponent. Since the spatial correlation
length $\xi \propto |T - T_c|^{-\nu} \to \infty$ when $T \to T_c$, also the autocorrelation time
$\tau_{O,\rm exp}$ diverges when the critical point is approached, $\tau_{O,\rm exp} \propto |T - T_c|^{-\nu z}$. This
leads to the phenomenon of critical slowing down at a continuous phase transition
which can be observed experimentally for instance in critical opalescence.31 The
reason is that local spin-flip Monte Carlo dynamics (or diffusion dynamics in a
lattice-gas picture) describes at least qualitatively the true physical dynamics of a
system in contact with a heat bath. In a finite system, the correlation length ξ is
limited by the linear system size L, so that the characteristic length scale is then
L and the scaling law (25) is replaced by
\[ \tau_{O,\rm exp} \propto L^{z} . \tag{26} \]
For local dynamics, the critical slowing down effect is quite pronounced since
the dynamical critical exponent takes a rather large value around
\[ z \approx 2 , \tag{27} \]
which is only weakly dependent on the dimensionality and can be understood by
a simple random-walk or diffusion argument in energy space. Non-local update
algorithms such as multigrid schemes32–36 or in particular the cluster methods dis-
cussed in the next section can reduce the value of the dynamical critical exponent
z significantly, albeit in a strongly model-dependent fashion.
At a first-order phase transition, a completely different mechanism leads to
an even more severe “slowing-down” problem.37,38 Here, the keyword is “phase
coexistence”. A finite system close to the (pseudo-) transition point can flip be-
tween the coexisting pure phases by crossing a two-phase region. Relative to the
weight of the pure phases, this region of state space is strongly suppressed by an
additional Boltzmann factor $\exp(-2\sigma L^{d-1})$, where $\sigma$ denotes the interface tension
between the coexisting phases, $L^{d-1}$ is the (projected) “area” of the interface
and the factor 2 accounts for periodic boundary conditions, which always enforce
an even number of interfaces for simple topological reasons. The time spent for
crossing this highly suppressed rare-event region scales proportionally to the inverse
of this interfacial Boltzmann factor, implying that the autocorrelation time
increases exponentially with the system size,
\[ \tau_{O,\rm exp} \propto e^{2\sigma L^{d-1}} . \tag{28} \]
In the literature, this behaviour is sometimes termed supercritical slowing down,
even though, strictly speaking, nothing is “critical” at a first-order phase transi-
tion. Since this type of slowing-down problem is directly related to the shape of
the probability distribution, it appears for all types of update algorithms, i.e., in
contrast to the situation at a second-order transition, here it cannot be cured by
employing multigrid or cluster techniques. It can be overcome, however, at least
in part by means of multicanonical methods which are briefly discussed at the end
of this chapter in Sect. 5.
relations. This suggests that some sort of non-local update rules should be able
to alleviate this problem. Natural candidates are rules where whole clusters or
droplets of spins are flipped at a time. Still, it took until 1987 before Swend-
sen and Wang39 proposed the first legitimate cluster update procedure satisfying
detailed balance. For the Ising model this follows from the identity
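\begin{align}
Z &= \sum_{\{\sigma_i\}} \exp\Big( \beta \sum_{\langle ij \rangle} \sigma_i \sigma_j \Big) \tag{29} \\
  &= \sum_{\{\sigma_i\}} \prod_{\langle ij \rangle} e^{\beta} \left[ (1 - p) + p\, \delta_{\sigma_i, \sigma_j} \right] \tag{30} \\
  &= \sum_{\{\sigma_i\}} \sum_{\{n_{ij}\}} \prod_{\langle ij \rangle} e^{\beta} \left[ (1 - p)\, \delta_{n_{ij}, 0} + p\, \delta_{\sigma_i, \sigma_j} \delta_{n_{ij}, 1} \right] , \tag{31}
\end{align}
where
\[ p = 1 - e^{-2\beta} . \tag{32} \]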
Here the nij are bond occupation variables which can take the values nij = 0
or 1, interpreted as “deleted” or “active” bonds. The representation (30) follows
from the observation that the product σi σj of two Ising spins can only take the
two values ±1, so that $\exp(\beta \sigma_i \sigma_j) = x + y\, \delta_{\sigma_i, \sigma_j}$ can easily be solved for $x$
and $y$. And in the third line (31) we made use of the trivial (but clever) identity
$a + b = \sum_{n=0}^{1} (a\, \delta_{n,0} + b\, \delta_{n,1})$. Going one step further and performing in (31) the
summation over spins, one arrives at the so-called Fortuin-Kasteleyn representa-
tion.40–43
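The identity between (29) and (31) can be verified by brute force on a very small lattice. The sketch below sums over all spin and bond configurations of a single 2 × 2 plaquette with open boundaries; the lattice, the value of β and the function names are illustrative assumptions.

```python
import itertools
import math

def z_spin(bonds, n_sites, beta):
    """Eq. (29): direct sum over all spin configurations."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        total += math.exp(beta * sum(spins[i] * spins[j] for i, j in bonds))
    return total

def z_spin_bond(bonds, n_sites, beta):
    """Eq. (31): joint sum over spins and bond variables n_ij in {0, 1}."""
    p = 1.0 - math.exp(-2.0 * beta)               # Eq. (32)
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        for occupied in itertools.product((0, 1), repeat=len(bonds)):
            weight = 1.0
            for (i, j), n_ij in zip(bonds, occupied):
                if n_ij == 0:                     # "deleted" bond
                    weight *= math.exp(beta) * (1.0 - p)
                else:                             # "active" bond needs like spins
                    weight *= math.exp(beta) * p * (1.0 if spins[i] == spins[j] else 0.0)
            total += weight
    return total

bonds = [(0, 1), (1, 3), (3, 2), (2, 0)]          # four sites, four bonds
beta = 0.44
z29 = z_spin(bonds, 4, beta)
z31 = z_spin_bond(bonds, 4, beta)
print(f"Z from (29): {z29:.10f}   Z from (31): {z31:.10f}")
```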
Fig. 4. Illustration of the bond variable update. The bond between unlike spins is always “deleted”
as indicated by the dashed line. A bond between like spins is only “active” with probability
$p = 1 - \exp(-2\beta)$. Only at zero temperature ($\beta \to \infty$) do stochastic and geometrical clusters coincide.
Notice the difference between the just defined stochastic clusters and geometrical
clusters whose boundaries are defined by drawing lines through bonds between
unlike spins. In fact, since in the stochastic cluster definition bonds between
like spins are “deleted” with probability $p_0 = 1 - p = \exp(-2\beta)$, stochastic clusters
are on the average smaller than geometrical clusters. Only at zero temperature
($\beta \to \infty$) does $p_0$ approach zero, so that the two cluster definitions coincide. It is worth
pointing out that at least for the 2D Ising and more generally 2D Potts models the
geometrical clusters also do encode critical properties – albeit those of different
but related (tricritical) models.48
As described above, the cluster algorithm is referred to as Swendsen-Wang
(SW) or multiple-cluster update.39 The distinguishing point is that the whole lat-
tice is decomposed into stochastic clusters whose spins are assigned a random
value +1 or −1. In one sweep one thus attempts to update all spins of the lattice.
In the single-cluster algorithm of Wolff49 one constructs only the one cluster con-
nected with a randomly chosen site and then flips all spins of this cluster. Typical
configuration plots before and after the cluster flip are shown in Fig. 5, which
also illustrates the difference between stochastic and geometrical clusters men-
tioned in the last paragraph: The upper right plot clearly shows that, due to the
randomly distributed inactive bonds between like spins, the stochastic cluster is
much smaller than the underlying black geometrical cluster which connects all
neighboring like spins.
Fig. 5. Illustration of the Wolff single-cluster update for the 2D Ising model on a 100 × 100 square
lattice at $0.97 \times \beta_c$. Upper left: Initial configuration. Upper right: The stochastic cluster is marked.
Note how it is embedded into the larger geometric cluster connecting all neighboring like (black) spins.
Lower left: Final configuration after flipping the spins in the cluster. Lower right: The flipped cluster.
In the single-cluster variant some care is necessary with the definition of the
unit of “time” since the number of flipped spins varies from cluster to cluster. It
also depends crucially on temperature since the average cluster size automatically
adapts to the correlation length. With $\langle |C| \rangle$ denoting the average cluster size, a
sweep is usually defined to consist of $V / \langle |C| \rangle$ single cluster steps, assuring that on
the average V spins are flipped in one sweep. With this definition, autocorrelation
times are directly comparable with results from the Swendsen-Wang or Metropolis
algorithm. Apart from being somewhat easier to program, Wolff’s single-cluster
variant is usually more efficient than the Swendsen-Wang multiple-cluster algo-
rithm, especially in 3D. The reason is that with the single-cluster method, on the
average, larger clusters are flipped.
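A single-cluster step can be sketched as follows for the 2D Ising model with J = 1 and h = 0 on a periodic square lattice. This is a minimal illustration and not a reference implementation: bonds to like neighbours are activated with the probability $p = 1 - e^{-2\beta}$ of (32), and the spins are flipped while the cluster is grown, which is a common programming shortcut. The lattice size, the number of steps and the function name `wolff_update` are assumptions.

```python
import math
import random

def wolff_update(spins, L, beta, rng):
    """Grow one stochastic cluster from a random site, flip it, return its size."""
    p_add = 1.0 - math.exp(-2.0 * beta)      # bond activation probability
    x0, y0 = rng.randrange(L), rng.randrange(L)
    cluster_spin = spins[x0][y0]
    spins[x0][y0] = -cluster_spin            # flipped sites cannot be added again
    stack = [(x0, y0)]
    size = 1
    while stack:
        x, y = stack.pop()
        for nx, ny in (((x + 1) % L, y), ((x - 1) % L, y),
                       (x, (y + 1) % L), (x, (y - 1) % L)):
            # Only like (not yet flipped) neighbours can join the cluster
            if spins[nx][ny] == cluster_spin and rng.random() < p_add:
                spins[nx][ny] = -cluster_spin
                stack.append((nx, ny))
                size += 1
    return size

rng = random.Random(2012)
L, beta = 16, 0.44
spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
sizes = [wolff_update(spins, L, beta, rng) for _ in range(2000)]
measured = sizes[500:]                       # discard the first steps
mean_size = sum(measured) / len(measured)
steps_per_sweep = L * L / mean_size          # V / <|C|> cluster steps per sweep
print(f"<|C|> = {mean_size:.1f}, cluster steps per sweep = {steps_per_sweep:.2f}")
```

The last line converts the measured average cluster size into the number of single-cluster steps that make up one sweep in the sense defined above.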
one needs a new strategy.49,51–53 The basic idea is to isolate Ising degrees of
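freedom by projecting the spins $\vec\sigma_i$ onto a randomly chosen unit vector $\vec r$,
\[ \vec\sigma_i = \vec\sigma_i^{\parallel} + \vec\sigma_i^{\perp} , \qquad \vec\sigma_i^{\parallel} = \epsilon_i\, |\vec\sigma_i \cdot \vec r|\, \vec r , \qquad \epsilon_i = {\rm sign}(\vec\sigma_i \cdot \vec r) . \tag{34} \]
Inserting this in (33) one ends up with an effective Hamiltonian
\[ H_{O(n)} = - \sum_{\langle ij \rangle} J_{ij}\, \epsilon_i \epsilon_j + {\rm const} , \tag{35} \]
with positive random couplings $J_{ij} = J\, |\vec\sigma_i \cdot \vec r|\, |\vec\sigma_j \cdot \vec r| \ge 0$, whose Ising degrees
of freedom $\epsilon_i$ can be updated with a cluster algorithm as described above.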
For the Ising model ($n = 1$) this reduces to $\chi'/\beta = \langle |C| \rangle$, i.e., the improved
estimator of the susceptibility is just the average cluster size of the single-cluster
3.1.1. Estimators
When discussing the importance sampling idea in Sect. 2.2 we already saw in
Eq. (9) that within Markov chain Monte Carlo simulations, the expectation value
$\langle O \rangle$ of some quantity $O$, for instance the energy, can be estimated as an arithmetic
mean,
\[ \langle O \rangle = \sum_{\sigma} O(\sigma) P^{\rm eq}(\sigma) \approx \overline{O} = \frac{1}{N} \sum_{k=1}^{N} O_k , \tag{40} \]
where the “measurement” $O_k = O(\sigma^{(k)})$ is obtained from the $k$th microstate $\sigma^{(k)}$
and N is the number of measurement sweeps. Of course, this is only valid after
a sufficiently long thermalization period without measurements, which is needed
to equilibrate the system after starting the Markov chain in an arbitrarily chosen
initial configuration.
Conceptually it is important to distinguish between the expectation value $\langle O \rangle$,
an ordinary number representing the exact result (which is usually unknown, of
course), and the mean value $\overline{O}$, which is a so-called estimator of the former. In
contrast to $\langle O \rangle$, the estimator $\overline{O}$ is a random variable which for finite $N$ fluctuates
around the theoretically expected value. Certainly, from a single Monte Carlo
simulation with $N$ measurements, we obtain only a single number for $\overline{O}$ at the
end of the day. For estimating the statistical uncertainty due to the fluctuations,
i.e., the statistical error, it seems at first sight that one would have to repeat the
whole simulation many times. Fortunately, this is not so because one can express
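the variance of $\overline{O}$,
\[ \sigma_{\overline{O}}^2 = \langle [\overline{O} - \langle \overline{O} \rangle]^2 \rangle = \langle \overline{O}^2 \rangle - \langle \overline{O} \rangle^2 , \tag{41} \]
in terms of the statistical properties of the individual measurements $O_k$, $k =
1, \dots, N$, of a single Monte Carlo run.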
where we have collected diagonal and off-diagonal terms. The second, off-
diagonal term encodes the “temporal” correlations between measurements at
“times” k and l and thus vanishes for completely uncorrelated data (which is, of
course, never really the case for importance sampling Monte Carlo simulations).
Assuming equilibrium, the variances $\sigma_{O_k}^2 = \langle O_k^2 \rangle - \langle O_k \rangle^2$ of individual measurements
appearing in the first, diagonal term do not depend on “time” $k$, such
that $\sigma_{O_k}^2 = \sigma_O^2$ and (42) simplifies to
$$ \sigma_{\overline{O}}^2 = \sigma_O^2 / N . \tag{43} $$
Whatever form the distribution $P(O_k)$ assumes (which, in fact, is often close to
Gaussian because the $O_k$ are usually already lattice averages over many degrees of
freedom), by the central limit theorem the distribution of the mean value is Gaussian,
at least for weakly correlated data in the asymptotic limit of large $N$. The
variance of the mean, $\sigma_{\overline{O}}^2$, is the squared width of this ($N$ dependent) distribution
which is usually taken as the “one-sigma” squared error, $\epsilon_{\overline{O}}^2 \equiv \sigma_{\overline{O}}^2$, and quoted
together with the mean value $\overline{O}$. Under the assumption of a Gaussian distribution
for the mean, the interpretation is that about 68% of all simulations under the
same conditions would yield a mean value in the range $[\langle O \rangle - \sigma_{\overline{O}}, \langle O \rangle + \sigma_{\overline{O}}]$.61
For a “two-sigma” interval which also is sometimes used, this percentage goes
up to about 95.4%, and for a “three-sigma” interval which is rarely quoted, the
confidence level is higher than 99.7%.
$\sum_{k \neq l}^{N}$ as $2 \sum_{k=1}^{N} \sum_{l=k+1}^{N}$, reordering the summation, and using time-translation
invariance in equilibrium, one obtains66
$$ \sigma_{\overline{O}}^2 = \frac{1}{N} \left[ \sigma_O^2 + 2 \sum_{k=1}^{N} \left( \langle O_1 O_{1+k} \rangle - \langle O_1 \rangle \langle O_{1+k} \rangle \right) \left( 1 - \frac{k}{N} \right) \right] , \tag{44} $$
where, due to the last factor $(1 - k/N)$, the $k = N$ term may be trivially kept in
the summation. Factoring out $\sigma_O^2$, this can be written as
$$ \sigma_{\overline{O}}^2 = \frac{\sigma_O^2}{N}\, 2\tau_{O,\rm int} , \tag{45} $$
where we have introduced the integrated autocorrelation time
$$ \tau_{O,\rm int} = \frac{1}{2} + \sum_{k=1}^{N} A(k) \left( 1 - \frac{k}{N} \right) , \tag{46} $$
with
$$ A(k) \equiv \frac{\langle O_1 O_{1+k} \rangle - \langle O_1 \rangle \langle O_{1+k} \rangle}{\sigma_O^2} \;\stackrel{k \to \infty}{\longrightarrow}\; a\, e^{-k/\tau_{O,\rm exp}} \tag{47} $$
describing the effective statistics. This shows more clearly that only every $2\tau_{O,\rm int}$
iterations the measurements are approximately uncorrelated and gives a better idea
of the relevant effective size of the statistical sample. In view of the scaling behaviour
of the autocorrelation time in (25), (26) or (28), it is obvious that without
extra care this effective sample size may become very small close to a continuous
or first-order phase transition, respectively.
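To make the role of the effective statistics concrete, here is a minimal Python sketch of the error formula (45). The function names and the numerical inputs are hypothetical, and the effective sample size is taken as $N_{\rm eff} = N/2\tau_{O,\rm int}$.

```python
import math

def effective_sample_size(n_measurements, tau_int):
    """N_eff = N / (2 tau_int): number of effectively uncorrelated measurements."""
    return n_measurements / (2.0 * tau_int)

def error_of_mean(variance, n_measurements, tau_int=0.5):
    """Square root of Eq. (45); tau_int = 1/2 recovers the uncorrelated case (43)."""
    return math.sqrt(variance * 2.0 * tau_int / n_measurements)

# Hypothetical numbers: unit variance, 10^5 sweeps, tau_int = 50.
naive = error_of_mean(1.0, 100_000)
corrected = error_of_mean(1.0, 100_000, tau_int=50.0)
n_eff = effective_sample_size(100_000, 50.0)
```

In this example the naive error underestimates the true one by a factor $\sqrt{2\tau_{O,\rm int}} = 10$, because only 1000 of the $10^5$ measurements are effectively independent.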
3.1.4. Bias
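An effective sample size that is too small does not only affect the error bars, but for some
quantities even the mean values can be severely underestimated. This happens for
so-called biased estimators, as is for instance the case for the specific heat and
susceptibility. The specific heat can be computed as $C = \beta^2 V \left( \langle e^2 \rangle - \langle e \rangle^2 \right) = \beta^2 V \sigma_e^2$,
with the standard estimator for the variance
$$ \hat{\sigma}_O^2 = \overline{O^2} - \overline{O}^2 = \overline{(O - \overline{O})^2} = \frac{1}{N} \sum_{k=1}^{N} (O_k - \overline{O})^2 . \tag{49} $$
Subtracting and adding $\langle O \rangle^2$, one finds for the expected value of $\hat{\sigma}_O^2$,
$$ \langle \hat{\sigma}_O^2 \rangle = \langle \overline{O^2} - \overline{O}^2 \rangle = \langle O^2 \rangle - \langle O \rangle^2 - \left( \langle \overline{O}^2 \rangle - \langle O \rangle^2 \right) = \sigma_O^2 - \sigma_{\overline{O}}^2 . \tag{50} $$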
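The size of this bias can be made explicit. The expected value of the variance estimator falls short of $\sigma_O^2$ by the variance of the mean, and inserting the error formula (45) gives the following short worked step (a sketch that only uses (45) and the definition $N_{\rm eff} = N/2\tau_{O,\rm int}$):

```latex
\langle \hat{\sigma}_O^2 \rangle
  = \sigma_O^2 - \sigma_{\overline{O}}^2
  = \sigma_O^2 \left( 1 - \frac{2\tau_{O,\mathrm{int}}}{N} \right)
  = \sigma_O^2 \left( 1 - \frac{1}{N_{\mathrm{eff}}} \right) ,
\qquad
N_{\mathrm{eff}} = \frac{N}{2\tau_{O,\mathrm{int}}} .
```

For uncorrelated data ($\tau_{O,\rm int} = 1/2$) this is the familiar factor $(1 - 1/N)$, whereas for strongly correlated data with small $N_{\rm eff}$ the underestimate of the specific heat or susceptibility can become sizeable.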
Fig. 6. (a) Integrated autocorrelation time $\hat{\tau}_{e,\rm int}(k_{\max})$ approaching $\tau_{e,\rm int} \approx 5.93$ for large upper cutoff $k_{\max}$
and (b) binning analysis ($n_B \sigma_B^2/\sigma_0^2$ versus block length $n_B$) for the energy of the 2D Ising model on a $16 \times 16$ lattice at $\beta_c$, using the
same data as in Fig. 3. The horizontal line in (b) shows $2\tau_{e,\rm int} = 11.86$ with $\tau_{e,\rm int}$ read off from (a).
numbers) by mean values (random variables), e.g., $\langle O_1 O_{1+k} \rangle$ by $\overline{O_1 O_{1+k}}$. With
increasing separation $k$ the relative variance of $\hat{A}(k)$ diverges rapidly. To get at
least an idea of the order of magnitude of $\tau_{O,\rm int}$ and thus the correct error estimate
(45), it is useful to record the “running” autocorrelation time estimator
$$ \hat{\tau}_{O,\rm int}(k_{\max}) = \frac{1}{2} + \sum_{k=1}^{k_{\max}} \hat{A}(k) , \tag{52} $$
which approaches $\tau_{O,\rm int}$ in the limit of large $k_{\max}$ where, however, the statistical
error rapidly increases. As an example, Fig. 6(a) shows results for the 2D Ising
model from an analysis of the same raw data as in Fig. 3.
As a compromise between systematic and statistical errors, an often employed
procedure is to determine the upper limit $k_{\max}$ self-consistently by cutting off the
summation once $k_{\max} \ge 6\hat{\tau}_{O,\rm int}(k_{\max})$, where $A(k) \approx e^{-6} \approx 10^{-3}$. In this case
an a priori error estimate is available,34,35,63
$$ \epsilon_{\tau_{O,\rm int}} = \tau_{O,\rm int} \sqrt{\frac{2(2k_{\max} + 1)}{N}} \approx \tau_{O,\rm int} \sqrt{\frac{12}{N_{\rm eff}}} . \tag{53} $$
For a 5% relative accuracy one thus needs at least $N_{\rm eff} \approx 5\,000$ or $N \approx 10\,000\, \tau_{O,\rm int}$
measurements. For an order of magnitude estimate consider the
2D Ising model on a square lattice with $L = 100$ simulated with a local update
algorithm. Close to criticality, the integrated autocorrelation time for this example
is of the order of $L^z \approx L^2 \approx 100^2$ (ignoring an unknown prefactor of “order
unity” which depends on the considered quantity), implying $N \approx 10^8$. Since in
each sweep $L^2$ spins have to be updated and assuming that each spin update takes
about 0.1 µsec, we end up with a total time estimate of about $10^5$ seconds $\approx 1$
CPU-day to achieve this accuracy.
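The self-consistent cutoff described above can be sketched in a few lines of Python. The function names are hypothetical, and the synthetic AR(1) series is only a stand-in for real Monte Carlo data, chosen because its autocorrelation function $A(k) = \phi^k$ and hence $\tau_{\rm int} = 1/2 + \phi/(1 - \phi)$ are known exactly.

```python
import random

def autocorrelation(data, k, mean, var):
    """Estimator of A(k): normalized autocorrelation at lag k."""
    n = len(data) - k
    return sum((data[t] - mean) * (data[t + k] - mean) for t in range(n)) / (n * var)

def tau_int_windowed(data, factor=6.0):
    """Sum A(k) as in Eq. (52) until k_max >= factor * tau_int(k_max)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    tau = 0.5
    for k in range(1, n // 2):
        tau += autocorrelation(data, k, mean, var)
        if k >= factor * tau:
            return tau, k
    return tau, n // 2 - 1   # window never closed: series too short

# Synthetic AR(1) series x_t = phi * x_{t-1} + noise with phi = 0.8,
# for which the exact value is tau_int = 0.5 + 0.8 / 0.2 = 4.5.
rng = random.Random(12345)
phi, x, series = 0.8, 0.0, []
for _ in range(20000):
    x = phi * x + rng.gauss(0.0, 1.0)
    series.append(x)
tau_hat, k_max = tau_int_windowed(series)
```

According to (53), the relative accuracy of `tau_hat` in this example is expected to be of the order of $\sqrt{12/N_{\rm eff}} \approx 7\%$, so the estimate should scatter around 4.5 by a few tenths.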
An alternative is to approximate the tail end of $A(k)$ by a single exponential
as in (24). Summing up the small $k$ part exactly, one finds68
$$ \tau_{O,\rm int}(k_{\max}) = \tau_{O,\rm int} - c\, e^{-k_{\max}/\tau_{O,\rm exp}} , \tag{54} $$
where $c$ is a constant. The latter expression may be used for a numerical estimate
of both the exponential and integrated autocorrelation times.68
Even if the data are completely uncorrelated in time, one still has to handle the
problem of error estimation for quantities that are not “directly” measured in the
simulation but are computed as a non-linear combination of “basic” observables
such as $\langle O \rangle^2$ or $\langle O_1 \rangle / \langle O_2 \rangle$. This problem can either be solved by error
propagation or by using the Jackknife method,69,70 where instead of considering rather
small blocks of length $n_B$ and their fluctuations as in the binning analysis, one
forms $N_B$ large Jackknife blocks $O_j^{(J)}$ containing all data but the $j$th block of the
previous binning method,
$$ O_j^{(J)} = \frac{N \overline{O} - n_B O_j^{(B)}}{N - n_B} , \qquad j = 1, \dots, N_B , \tag{58} $$
cf. the schematic sketch in Fig. 7. Each of the Jackknife blocks thus consists of
$N - n_B = N(1 - 1/N_B)$ data, i.e., it contains almost as many data as the original
time series. When non-linear combinations of basic variables are estimated,
the bias is hence comparable to that of the total data set (typically $1/(N - n_B)$
compared to $1/N$). The $N_B$ Jackknife blocks are, of course, trivially correlated
because one and the same original data is re-used in $N_B - 1$ different Jackknife
blocks. This trivial correlation caused by re-using the original data over and over
again has nothing to do with temporal correlations. As a consequence, the Jackknife
block variance $\sigma_J^2$ will be much smaller than the variance estimated in the
binning method. Because of the trivial nature of the correlations, however, this
reduction can be corrected by multiplying $\sigma_J^2$ with a factor $(N_B - 1)^2$, leading to
$$ \epsilon_{\overline{O}}^2 \equiv \sigma_{\overline{O}}^2 = \frac{N_B - 1}{N_B} \sum_{j=1}^{N_B} \left( O_j^{(J)} - \overline{O^{(J)}} \right)^2 . \tag{59} $$
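A minimal Python sketch of the Jackknife procedure for a function of two mean values, such as the ratio $\langle O_1 \rangle / \langle O_2 \rangle$, might look as follows. The function name `jackknife`, its signature and the toy data are assumptions for illustration, and the average of the Jackknife estimates is used as the central value.

```python
import math

def jackknife(series_a, series_b, n_blocks, func):
    """Jackknife estimate and error of func(mean_a, mean_b)."""
    n_b = len(series_a) // n_blocks            # block length n_B
    n = n_b * n_blocks                         # drop an incomplete last block
    tot_a, tot_b = sum(series_a[:n]), sum(series_b[:n])
    estimates = []
    for j in range(n_blocks):
        blk_a = sum(series_a[j * n_b:(j + 1) * n_b])
        blk_b = sum(series_b[j * n_b:(j + 1) * n_b])
        # Eq. (58): average over all data except block j
        mean_a = (tot_a - blk_a) / (n - n_b)
        mean_b = (tot_b - blk_b) / (n - n_b)
        estimates.append(func(mean_a, mean_b))
    center = sum(estimates) / n_blocks
    # Eq. (59): the prefactor (N_B - 1)/N_B corrects for the re-use of data
    err2 = (n_blocks - 1) / n_blocks * sum((x - center) ** 2 for x in estimates)
    return center, math.sqrt(err2)

# Toy data (hypothetical), analysed with N_B = 4 blocks of length n_B = 2.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [2.0, 2.0, 4.0, 4.0, 6.0, 6.0, 8.0, 8.0]
ratio, ratio_err = jackknife(a, b, 4, lambda x, y: x / y)
mean_a, mean_a_err = jackknife(a, b, 4, lambda x, y: x)
```

For a linear quantity such as the plain mean, the Jackknife error coincides with the error of the ordinary binning analysis; the benefit of the method shows up for non-linear combinations like the ratio.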
To summarize this section, any realization of a Markov chain Monte Carlo up-
date algorithm is characterised by autocorrelation times which enter directly into
the statistical errors of Monte Carlo estimates. Since temporal correlations always
increase the statistical errors, it is thus a very important issue to develop Monte
Carlo update algorithms that keep autocorrelation times as small as possible. This
is the reason why cluster and other non-local algorithms are so important.
4. Reweighting Techniques
Fig. 7. Sketch of the organization of Jackknife blocks. The grey part of the $N$ data points is used for
calculating the total and the Jackknife block averages. The white blocks enter into the more conventional
binning analysis using non-overlapping blocks.
their power in practice has been realized only relatively late in 1988. The impor-
tant observation by Ferrenberg and Swendsen71,72 was that the best performance
is achieved near criticality where histograms are usually broad. In this sense
reweighting techniques are complementary to improved estimators, which usually
perform best off criticality.
If we were to normalize $P_{\beta_0}(e)$ to unit area, the r.h.s. would have to be divided by
$\sum_e P_{\beta_0}(e) = Z(\beta_0)$, but the normalization will be unimportant in what follows.
Let us assume we have performed a Monte Carlo simulation at inverse temperature
g For simplicity we consider here only models with discrete energies. If the energy varies continuously,
sums have to be replaced by integrals, etc. Also lattice size dependences are suppressed to keep the
notation short.
the normalization of $P_\beta(e)$ indeed cancels. This gives for instance the energy
$\langle e \rangle(\beta)$ and the specific heat $C(\beta) = \beta^2 V [\langle e^2 \rangle(\beta) - \langle e \rangle(\beta)^2]$, in principle, as
a continuous function of $\beta$ from a single Monte Carlo simulation at $\beta_0$, where
$V = L^D$ is the system size.
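The single-histogram reweighting step can be sketched in Python as follows, assuming the histogram is stored as a dictionary mapping the total energy $E$ to counts and is reweighted with the factor $e^{-(\beta - \beta_0)E}$. The function name and the toy check against the density of states of a $2 \times 2$ periodic Ising lattice are illustrative choices, not part of the analysis described in the text.

```python
import math

def reweight_histogram(hist, beta0, beta, volume):
    """Reweight an energy histogram {E: counts} taken at beta0 to beta.

    Returns (<e>, C) with e = E/V and C = beta^2 V (<e^2> - <e>^2).
    """
    # Work with logarithms and shift by the maximum to avoid overflow;
    # the shift cancels in the ratio, like the normalization of P_beta(e).
    log_w = {E: math.log(h) - (beta - beta0) * E for E, h in hist.items() if h > 0}
    shift = max(log_w.values())
    w = {E: math.exp(lw - shift) for E, lw in log_w.items()}
    norm = sum(w.values())
    e1 = sum(wi * E / volume for E, wi in w.items()) / norm
    e2 = sum(wi * (E / volume) ** 2 for E, wi in w.items()) / norm
    return e1, beta ** 2 * volume * (e2 - e1 ** 2)

# Toy check: 2 x 2 periodic Ising lattice with Omega(-8) = 2, Omega(0) = 12,
# Omega(8) = 2, using the exact distribution at beta0 as a noise-free "histogram".
omega = {-8: 2, 0: 12, 8: 2}
beta0, beta = 0.2, 0.3
exact_hist = {E: g * math.exp(-beta0 * E) for E, g in omega.items()}
e_rew, c_rew = reweight_histogram(exact_hist, beta0, beta, volume=4)
z = sum(g * math.exp(-beta * E) for E, g in omega.items())
e_exact = sum(g * math.exp(-beta * E) * E / 4 for E, g in omega.items()) / z
```

With a noise-free input the reweighted energy agrees with the exact result at the new $\beta$; with real simulation data the agreement degrades once the reweighted histogram is dominated by the poorly sampled tails.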
As an example of this reweighting procedure, using actual Swendsen-Wang
cluster simulation data (with 5000 sweeps for equilibration and 50 000 sweeps
for measurements) of the 2D Ising model at $\beta_0 = \beta_c = \ln(1 + \sqrt{2})/2 = 0.440\,686\dots$
on a $16 \times 16$ lattice with periodic boundary conditions, the
reweighted data points for the specific heat $C(\beta)$ are shown in Fig. 8(a) and compared
with the continuous curve obtained from the exact Kaufman solution73,74
for finite $L_x \times L_y$ lattices. Note that the location of the peak maximum is slightly
displaced from the infinite-volume transition point $\beta_c$ due to the rounding and
shifting of $C(\beta)$ caused by finite-size effects discussed in more detail in Sect. 6.
This comparison clearly demonstrates that, in practice, the $\beta$-range over which
reweighting can be trusted is limited. The reason for this limitation is the unavoidable
statistical errors in the numerical determination of $P_{\beta_0}$ using a Monte
Carlo simulation. In the tails of the histograms the relative statistical errors are
largest, and the tails are exactly the regions that contribute most when multiplying
$P_{\beta_0}(e)$ with the exponential reweighting factor to obtain $P_\beta(e)$ for $\beta$-values
far off the simulation point $\beta_0$. This is illustrated in Fig. 8(b) where the simulated
histogram at $\beta_0 = \beta_c$ is shown together with the reweighted histograms at
$\beta = 0.375 \approx \beta_0 - 0.065$ and $\beta = 0.475 \approx \beta_0 + 0.035$, respectively. For the 2D
Ising model the quality of the reweighted histograms can be judged by comparing
with the curves obtained from Beale’s75 exact expression for $\Omega(e)$.
Fig. 8. (a) The specific heat of the 2D Ising model on a $16 \times 16$ square lattice computed by reweighting
from a single Monte Carlo simulation at $\beta_0 = \beta_c$, marked by the filled data symbol. The continuous
line shows for comparison the exact solution of Kaufman.73,74 (b) The corresponding energy
histogram at $\beta_0$, and reweighted to $\beta = 0.375$ and $\beta = 0.475$. The dashed lines show for comparison
the exact histograms obtained from Beale’s expression.75
this range is wide enough to locate from a single simulation, e.g., the specific-heat
maximum by applying a standard maximization subroutine to the continuous
function $C(\beta)$. This is by far more convenient, accurate and faster than the traditional
way of performing many simulations close to the peak of $C(\beta)$ and trying
to determine the maximum by splines or least-squares fits.
For an analytical estimate of the reweighting range we now require that the
peak of the reweighted histogram is within the width $\langle e \rangle(T_0) \pm \Delta e(T_0)$ of the input
histogram (where a Gaussian histogram would have decreased to $\exp(-1/2) \approx 0.61$
of its maximum value),
$$ |\langle e \rangle(T) - \langle e \rangle(T_0)| \le \Delta e(T_0) , \tag{64} $$
where we assumed that for a not too asymmetric histogram $P_{\beta_0}(e)$ the maximum
location approximately coincides with $\langle e \rangle(T_0)$. Recalling that the half width
$\Delta e$ of a histogram is related to the specific heat via $(\Delta e)^2 \equiv \langle (e - \langle e \rangle)^2 \rangle = \langle e^2 \rangle - \langle e \rangle^2 = C(\beta_0)/\beta_0^2 V$
and using the Taylor expansion $\langle e \rangle(T) = \langle e \rangle(T_0) + C(T_0)(T - T_0) + \dots$,
this can be written as $C(T_0)|T - T_0| \le T_0 \sqrt{C(T_0)/V}$ or
$$ \frac{|T - T_0|}{T_0} \le \frac{1}{\sqrt{V}} \frac{1}{\sqrt{C(T_0)}} . \tag{65} $$
Since $C(T_0)$ is known from the input histogram this is quite a general estimate of
the reweighting range. For the example in Fig. 8 with $V = 16 \times 16$, $\beta_0 = \beta_c \approx 0.44$
and $C(T_0) \approx 1.5$, this estimate yields $|\beta - \beta_0|/\beta_0 \approx |T - T_0|/T_0 \le 0.05$,
i.e., $|\beta - \beta_0| \le 0.02$ or $0.42 \le \beta \le 0.46$. By comparison with the exact solution
we see that this is indeed a fairly conservative estimate of the reliable reweighting
range.
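The arithmetic of this estimate is easily reproduced. The short Python sketch below evaluates Eq. (65) for the parameters quoted for Fig. 8; the function name is hypothetical, and the conversion to a $\beta$ window uses the approximation $|\beta - \beta_0|/\beta_0 \approx |T - T_0|/T_0$.

```python
import math

def reweighting_range(volume, specific_heat, beta0):
    """Relative range from Eq. (65) and the resulting beta window."""
    rel = 1.0 / math.sqrt(volume * specific_heat)
    return rel, beta0 * (1.0 - rel), beta0 * (1.0 + rel)

# Parameters of the example: V = 16 x 16, C(T0) ~ 1.5, beta0 ~ 0.44.
rel, beta_lo, beta_hi = reweighting_range(16 * 16, 1.5, 0.44)
```

This gives a relative range of about 5% and a window of roughly $0.42 \le \beta \le 0.46$, in line with the numbers quoted above.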
If we only want to know the scaling behaviour with system size $V = L^D$, we
can go one step further by considering three generic cases:
$$ \frac{|T - T_0|}{T_0} \propto V^{-1/2} = L^{-D/2} . \tag{66} $$
ii) Critical, where $C(T_0) \simeq a_1 + a_2 L^{\alpha/\nu}$, with $a_1$ and $a_2$ being constants, and $\alpha$
and $\nu$ denoting the standard critical exponents of the specific heat and correlation
length, respectively. For $\alpha > 0$, the leading scaling behaviour becomes
$|T - T_0|/T_0 \propto L^{-D/2} L^{-\alpha/2\nu}$. Assuming hyperscaling ($\alpha = 2 - D\nu$) to be
valid, this simplifies to
$$ \frac{|T - T_0|}{T_0} \propto L^{-1/\nu} , \tag{67} $$
i.e., the typical scaling behaviour of pseudo-transition temperatures in the
finite-size scaling regime of a second-order phase transition.76 For $\alpha < 0$,
$C(T_0)$ approaches asymptotically a constant and the leading scaling behaviour
of the reweighting range is as in the off-critical case.
$$ \frac{|T - T_0|}{T_0} \propto V^{-1} = L^{-D} , \tag{68} $$
which is again the typical finite-size scaling behaviour of pseudo-transition
temperatures close to a first-order phase transition.38
Fig. 9. Microcanonical expectation values for (a) the absolute magnetization and (b) the magnetization
squared obtained from the 2D Ising model simulations shown in Fig. 8.
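Recalling that $\sum_\mu \Omega(e, \mu) e^{-\beta_0 E}/Z(\beta_0) = \Omega(e) e^{-\beta_0 E}/Z(\beta_0) = P_{\beta_0}(e)$ and
we arrive at
$$ \langle g(\mu) \rangle = \sum_e \langle\langle g(\mu) \rangle\rangle(e)\, P_{\beta_0}(e) . \tag{71} $$
Identifying $\langle\langle g(\mu) \rangle\rangle(e)$ with $f(e)$ in Eq. (63), the actual reweighting procedure
is precisely as before. An example for computing $\langle\langle |\mu| \rangle\rangle(e)$ and $\langle\langle \mu^2 \rangle\rangle(e)$ using
the data of Fig. 8 is shown in Fig. 9. Mixed quantities, e.g. $\langle e^k \mu^l \rangle$, can be treated
similarly. One caveat of this method is that one has to decide beforehand which
“lists” $\langle\langle g(\mu) \rangle\rangle(e)$ one wants to store during the simulation, e.g., which powers $k$
in $\langle\langle \mu^k \rangle\rangle(e)$ are relevant.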
An alternative and more flexible method is based on time series. Suppose
we have performed a Monte Carlo simulation at $\beta_0$ and stored the time series
of $N$ measurements $e_1, e_2, \dots, e_N$ and $\mu_1, \mu_2, \dots, \mu_N$. Then the most general
i.e., in particular all moments $\langle e^k \mu^l \rangle$ can be computed. Notice that this can also
be written as
$$ \langle f(e, \mu) \rangle = \langle f(e, \mu)\, e^{-(\beta - \beta_0)E} \rangle_0 / \langle e^{-(\beta - \beta_0)E} \rangle_0 , \tag{73} $$
where the subscript 0 refers to expectation values taken at $\beta_0$. Another very important
advantage of the last formulation is that it works without any systematic
discretization error also for continuously distributed energies and magnetizations.
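In code, the time-series formulation (73) amounts to a weighted average over the stored measurements. The following Python sketch uses a hypothetical function name and subtracts the largest exponent before exponentiating, a numerical precaution that cancels in the ratio.

```python
import math

def reweight_time_series(f_values, energies, beta0, beta):
    """Estimate <f>(beta) from measurements taken at beta0, cf. Eq. (73)."""
    exponents = [-(beta - beta0) * E for E in energies]
    shift = max(exponents)            # cancels between numerator and denominator
    weights = [math.exp(x - shift) for x in exponents]
    return sum(f * w for f, w in zip(f_values, weights)) / sum(weights)

# Two-point toy series: the weights are 1 and 1/2, giving (0*1 + 1*0.5)/1.5.
value = reweight_time_series([0.0, 1.0], [0.0, 1.0], beta0=0.0, beta=math.log(2.0))
# At beta = beta0 all weights are equal and the plain mean is recovered.
unchanged = reweight_time_series([1.0, 3.0], [5.0, 7.0], beta0=0.4, beta=0.4)
```

Here `energies` must contain the total energies $E = eV$, while `f_values` may hold any function of the stored $e_k$ and $\mu_k$.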
As nowadays hard-disk space is no real limitation anymore, it is advisable
to store time series in any case. This guarantees the greatest flexibility in the
data analysis. As far as the memory requirement of the actual reweighting code
is concerned, however, the method of choice is sometimes not so clear. Using
histograms and lists directly, one typically has to store about $(6 - 8)V$ data, while
working directly with the time series one needs $2N$ computer words. The cheaper
solution (also in terms of CPU time) thus obviously depends on both the system
size $V$ and the run length $N$. It is hence sometimes faster to first generate the
histograms and the required lists from the time series and then proceed with reweighting the
latter quantities.
where $Z_i \equiv Z(\beta_i)$ so that $\sum_e P_i(e) = 1$. This can be estimated by the empirical
histogram $H_i(e)$ obtained from the simulation at $\beta_i$,
$$ \hat{P}_i(e) = \frac{H_i(e)}{N_i} , \tag{75} $$
which also satisfies the normalization constraint $\sum_e \hat{P}_i(e) = 1$. Rearranging (74)
and replacing the exact $P_i(e)$ by its estimator $\hat{P}_i(e)$ yields an estimator for the
density of states (this corresponds to choosing the common reference point as
$\beta_0 = 0$):
$$ \hat{\Omega}_i(e) = \frac{H_i(e)}{N_i} Z_i e^{\beta_i E} . \tag{76} $$
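Notice that we have introduced a subscript $i$ to label the $m$ estimators $\hat{\Omega}_i(e)$.
The expectation value of each $\hat{\Omega}_i(e)$ should be the exact $\Omega(e)$, but being random
variables their statistical properties are different as can be quantified by estimating
their variance. This is most simply done by interpreting the histogram entries $H_i(e)$
as result of measuring $O = \delta_{e_t,e}$ where $e_t$ denotes the energy after the $t$th sweep
of the simulation at $\beta_i$:
$$ \frac{H_i(e)}{N_i} = \overline{\delta_{e_t,e}} = \frac{1}{N_i} \sum_{t=1}^{N_i} \delta_{e_t,e} . \tag{77} $$
As in (40) and (41) the expected value is $\langle H_i(e)/N_i \rangle = (1/N_i) \sum_{t=1}^{N_i} \langle \delta_{e_t,e} \rangle = P_i(e)$
and, neglecting temporal correlations for the moment,
$$ \begin{aligned}
\left\langle \left( \frac{H_i(e)}{N_i} \right)^2 \right\rangle
 &= \left\langle \frac{1}{N_i^2} \sum_{t,t'=1}^{N_i} \delta_{e_t,e}\, \delta_{e_{t'},e} \right\rangle \\
 &= \frac{1}{N_i^2} \left[ N_i (N_i - 1) \langle \delta_{e_t,e} \rangle \langle \delta_{e_{t'},e} \rangle + N_i \langle \delta_{e_t,e}\, \delta_{e_t,e} \rangle \right] \\
 &= P_i(e)^2 + \frac{1}{N_i} P_i(e) [1 - P_i(e)] ,
\end{aligned} \tag{78} $$
such that
$$ \sigma_{H_i(e)/N_i}^2 = \left\langle \left( \frac{H_i(e)}{N_i} \right)^2 \right\rangle - \left\langle \frac{H_i(e)}{N_i} \right\rangle^2 = \frac{1}{N_i} P_i(e) [1 - P_i(e)] . \tag{79} $$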
For sufficiently many energy bins, the normalized probabilities $P_i(e)$ are much
smaller than unity, such that the second factor $[1 - P_i(e)]$ can usually be replaced by unity.
Taking autocorrelations into account, as in (45) the variance (79) would be enhanced
by a factor $2\tau_{{\rm int},i}(e)$. Recall that the subscript $i$ of $\tau_{{\rm int},i}(e)$ refers to the
simulation point and the argument $e$ to the energy bin. Note that the autocorrelation
times of the histogram bins are usually much smaller than the autocorrelation
time $\tau_{{\rm int},e}$ of the mean energy. For the following it is useful to define the effective
statistics parameter $N_{{\rm eff},i}(e) = N_i/2\tau_{{\rm int},i}(e)$. Recalling (76), the variance of the
$m$ estimators $\hat{\Omega}_i(e)$ can then be written as
$$ \sigma_{\hat{\Omega}_i(e)}^2 = \frac{Z_i^2 e^{2\beta_i E}}{N_{{\rm eff},i}(e)} P_i(e) = \frac{Z_i e^{\beta_i E}}{N_{{\rm eff},i}(e)} \Omega(e) . \tag{80} $$
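As usual the error weighted average
$$ \hat{\Omega}_{\rm opt}(e) = \frac{\sum_{i=1}^{m} w_i(e)\, \hat{\Omega}_i(e)}{\sum_{i=1}^{m} w_i(e)} \tag{81} $$
with $w_i(e) = 1/\sigma_{\hat{\Omega}_i(e)}^2$ is an optimised estimator with minimal variance
$\sigma_{\hat{\Omega}_{\rm opt}(e)}^2 = 1/\sum_{i=1}^{m} w_i(e)$. This can be simplified to
$$ \hat{\Omega}_{\rm opt}(e) = \frac{\sum_{i=1}^{m} H_i(e)/2\tau_{{\rm int},i}(e)}{\sum_{i=1}^{m} N_{{\rm eff},i}(e)\, Z_i^{-1} e^{-\beta_i E}} \tag{82} $$
and
$$ \sigma_{\hat{\Omega}_{\rm opt}(e)}^2 / \Omega^2(e) = \frac{1}{\sum_{i=1}^{m} \langle H_i(e) \rangle / 2\tau_{{\rm int},i}(e)} . \tag{83} $$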
So far the partition function values $Z_i \equiv Z(\beta_i)$ have been assumed to be exact
(albeit usually unknown) parameters which are now self-consistently determined
from
$$ Z_j = \sum_e \hat{\Omega}_{\rm opt}(e)\, e^{-\beta_j E} = \sum_e \frac{\sum_{i=1}^{m} H_i(e)/2\tau_{{\rm int},i}(e)}{\sum_{i=1}^{m} (N_i/2\tau_{{\rm int},i}(e))\, Z_i^{-1} e^{-\beta_i E}}\, e^{-\beta_j E} , \tag{84} $$
up to an unimportant overall constant. A good starting point for the recursion is
to fix, say, $Z_1 = 1$ and use single histogram reweighting to get an estimate of
$Z_2/Z_1 = \exp[-(\hat{F}_2 - \hat{F}_1)]$, where $\hat{F}_i = \beta_i F(\beta_i)$. Once $Z_2$ is determined, the
same procedure can be applied to estimate $Z_3$ and so on. In the limit of infinite
statistics, this would already yield the solution of (84). In realistic simulations the
statistics is of course limited and the remaining recursions average this uncertainty
to get a self-consistent set of $Z_i$. In order to work in practice, the histograms at
neighboring $\beta$-values must have sufficient overlap, i.e., the spacings of the simulation
points must be chosen according to the estimates (66)–(68). The issue of
optimal convergence of the WHAM equations (84) has recently been discussed in
detail in Ref. 80.
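A bare-bones Python sketch of the self-consistent iteration of (84) is given below. It ignores autocorrelations (all $2\tau_{{\rm int},i}(e)$ set to one), starts from $Z_i = 1$ instead of the single-histogram initial guess described above, and uses plain fixed-point iteration; the function name `wham`, the dictionary-based histograms and the noise-free toy input are assumptions for illustration.

```python
import math

def wham(histograms, betas, n_iter=100):
    """Iterate Eq. (84) with all 2*tau_int set to 1 (autocorrelations ignored).

    histograms: list of dicts {E: H_i(E)}; returns (Z_i with Z_1 = 1, Omega).
    """
    energies = sorted(set().union(*histograms))
    counts = [sum(h.values()) for h in histograms]              # N_i
    total = {E: sum(h.get(E, 0.0) for h in histograms) for E in energies}
    z = [1.0] * len(betas)
    for _ in range(n_iter):
        omega = {
            E: total[E] / sum(n / zi * math.exp(-b * E)
                              for n, zi, b in zip(counts, z, betas))
            for E in energies
        }
        z = [sum(omega[E] * math.exp(-b * E) for E in energies) for b in betas]
        z = [zi / z[0] for zi in z]                              # fix overall constant
    return z, omega

# Toy input: exact (noise-free) histograms of a three-level system at two betas.
omega_exact = {-8: 2.0, 0: 12.0, 8: 2.0}
betas = [0.1, 0.3]
z_exact = [sum(g * math.exp(-b * E) for E, g in omega_exact.items()) for b in betas]
hists = [{E: 1000.0 * g * math.exp(-b * E) / zb for E, g in omega_exact.items()}
         for b, zb in zip(betas, z_exact)]
z_est, _ = wham(hists, betas)
```

For this well-overlapping toy case the iteration converges quickly to the exact ratio $Z_2/Z_1$; for realistic data with large energies one would work with logarithms of the $Z_i$ to avoid overflow.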
Multiple-histogram reweighting has been employed in a wide spectrum of applications.
In many applications the influence of autocorrelations has been neglected
since it is quite cumbersome to estimate the $\tau_{{\rm int},i}(e)$ for each of the $m$
simulations and all energy bins. For work dealing with autocorrelations in this
context see, e.g., Refs. 81,82. Note that, even when ignoring the $\tau_{{\rm int},i}(e)$, the error
weighted average in (81) does still give a correct estimator for $\Omega(e)$ – it is only
no longer properly optimised. Moreover, since for each energy bin typically only
the histograms at neighboring simulation points contribute significantly, the two
or three $\tau_{{\rm int},i}(e)$ values relevant for each energy bin $e$ are close to each other. And
since an overall constant drops out of the WHAM equation (84), the influence of
autocorrelations on the final result turns out to be very minor anyway.
Alternatively59 one may also compute from each of the $m$ independent simulations
by reweighting all quantities of interest as a function of $\beta$, together
with their proper statistical errors including autocorrelation effects as discussed
in Sect. 3.1.3. As a result one obtains, at each $\beta$-value, $m$ estimates, e.g.
$e_1(\beta) \pm \Delta e_1, e_2(\beta) \pm \Delta e_2, \dots, e_m(\beta) \pm \Delta e_m$, which may be optimally combined
according to their error bars to give $e(\beta) \pm \Delta e$, where
$$ e(\beta) = \left( \frac{e_1(\beta)}{(\Delta e_1)^2} + \frac{e_2(\beta)}{(\Delta e_2)^2} + \dots + \frac{e_m(\beta)}{(\Delta e_m)^2} \right) (\Delta e)^2 , \tag{85} $$
and
$$ \frac{1}{(\Delta e)^2} = \frac{1}{(\Delta e_1)^2} + \frac{1}{(\Delta e_2)^2} + \dots + \frac{1}{(\Delta e_m)^2} . \tag{86} $$
Notice that by this method the average for each quantity can be individually optimised.
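The combination rule (85), (86) is a standard inverse-variance weighted mean and can be sketched in Python as follows (the function name and the numbers are illustrative only):

```python
import math

def combine_estimates(values, errors):
    """Error-weighted mean and its error, Eqs. (85) and (86)."""
    inv_var = [1.0 / err ** 2 for err in errors]
    total = sum(inv_var)                          # 1/(Delta e)^2, Eq. (86)
    mean = sum(v * w for v, w in zip(values, inv_var)) / total
    return mean, math.sqrt(1.0 / total)

# Two hypothetical estimates with equal error bars.
mean, err = combine_estimates([1.0, 3.0], [1.0, 1.0])
```

With equal error bars the result is the plain average, and the combined error shrinks by $1/\sqrt{2}$, as expected for two independent estimates.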
All Monte Carlo methods described so far assumed a conventional canonical en-
semble where the probability distribution of microstates is governed by a Boltz-
mann factor ∝ exp(−βE). A simulation at some inverse temperature β0 then
covers a certain range of the state space but not all (recall the discussion of the
reweighting range). In principle a broader range can be achieved by patching
several simulations at different temperatures using the multi-histogram method.
Loosely speaking generalized ensemble methods aim at replacing this “static”
patching by a single simulation in an appropriately defined “generalized ensem-
ble”. The purpose of this section is to give at least a brief survey of the available
methods.
and all $m$ systems at different simulation points $\beta_1 < \beta_2 < \dots < \beta_m$ are simulated
in parallel, using any legitimate update algorithm (Metropolis, cluster, ...).
This freedom in the choice of update algorithm is a big advantage of a parallel
tempering simulation88 which is a special case of the earlier replica exchange
Monte Carlo method85 proposed in the context of spin-glass simulations (to some
extent the focus on this special application hides the general aspects of the method
as becomes clearer in Ref. 86). After a certain number of sweeps, exchanges of
the current configurations $\sigma_i$ and $\sigma_j$ are attempted (equivalently, the $\beta_i$ may be
exchanged, as is done in most implementations). Adapting the Metropolis criterion
(16) to the present situation, the proposed exchange will be accepted with
probability
sampling was first realized in the so-called “umbrella sampling” method,100 but
it took many years before the introduction of the multicanonical ensemble turned
non-Boltzmann sampling into a widely appreciated practical tool in computer sim-
ulation studies of phase transitions. Once the feasibility of such a generalized
ensemble approach was realized, many related methods and further refinements
were developed. By now the applications of the method range from physics and
chemistry to biophysics, biochemistry and biology to engineering problems.
Conceptually the method can be divided into two main strategies. The first
strategy can be best described as “avoiding rare events” which is close in spirit to
the alternative tempering methods. In this variant one tries to connect the impor-
tant parts of phase space by “easy paths” which go around suppressed rare-event
regions which hence cannot be studied directly. The second approach is based
on “enhancing the probability of rare event states”, which is for example the typ-
ical strategy for dealing with the highly suppressed mixed-phase region of first-
order phase transitions38,97 and the very rugged free-energy landscapes of spin
glasses.101–104 This allows a direct study of properties of the rare-event states such
as, e.g., interface tensions or more generally free energy barriers, which would be
very difficult (or practically impossible) with canonical simulations and also with
the tempering methods described in Sects. 5.1 and 5.2.
In general the idea goes as follows. With $\sigma$ representing generically the
degrees of freedom (discrete spins or continuous field variables), the canonical
Boltzmann distribution
$$ P_{\rm can}(\sigma) \propto e^{-\beta H(\sigma)} \tag{92} $$
is replaced by an auxiliary multicanonical distribution
$$ P_{\rm muca}(\sigma) \propto W(Q(\sigma))\, e^{-\beta H(\sigma)} \equiv e^{-\beta H_{\rm muca}(\sigma)} , \tag{93} $$
introducing a multicanonical weight factor $W(Q)$ where $Q$ stands for any macroscopic
observable such as the energy or magnetization. This defines formally
$H_{\rm muca} = H - (1/\beta) \ln W(Q)$ which may be interpreted as an effective “multicanonical”
Hamiltonian. The Monte Carlo sampling can then be implemented as
usual by comparing $H_{\rm muca}$ before and after a proposed update of $\sigma$, and canonical
expectation values can be recovered exactly by inverse reweighting,
$$ \langle O \rangle_{\rm can} = \langle O\, W^{-1}(Q) \rangle_{\rm muca} / \langle W^{-1}(Q) \rangle_{\rm muca} , \tag{94} $$
similarly to Eq. (73). The goal is now to find a suitable weight factor $W$ such that
the dynamics of the multicanonical simulation profits most.
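Both ingredients, the update rule based on $H_{\rm muca}$ and the inverse reweighting (94), can be sketched in Python. The function names are hypothetical, the weights are passed as logarithms $\ln W(Q)$ for numerical convenience, and a Metropolis-type acceptance rule is assumed.

```python
import math

def muca_accept_probability(delta_h, log_w_old, log_w_new, beta):
    """Metropolis probability for H_muca = H - ln(W)/beta (assumed update rule)."""
    exponent = -beta * delta_h + (log_w_new - log_w_old)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

def canonical_average(observables, log_weights):
    """Eq. (94): <O>_can from a multicanonical series, given ln W(Q_k)."""
    shift = min(log_weights)                      # avoids overflow in exp(-ln W)
    inv_w = [math.exp(-(lw - shift)) for lw in log_weights]
    return sum(o * w for o, w in zip(observables, inv_w)) / sum(inv_w)

# Toy series with W = 1 and W = 2, i.e. inverse weights 1 and 1/2.
avg = canonical_average([1.0, 2.0], [0.0, math.log(2.0)])
p_flat = muca_accept_probability(1.0, 0.0, 0.0, beta=1.0)
```

For a constant weight the acceptance probability reduces to the usual Metropolis factor $e^{-\beta \Delta H}$, and the reweighted average reduces to the plain mean.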
Fig. 10. The canonical energy density $P_{\rm can}(E)$ of the 2D 7-state Potts model on a $60 \times 60$ lattice at
inverse temperature $\beta_{{\rm eqh},L}$, where the two peaks are of equal height, together with the multicanonical
energy density $P_{\rm muca}(E)$, which is approximately constant between the two peaks.
To be specific, let us assume in the following that the relevant macroscopic observable
is the energy $E$ itself. This is for instance the case at a temperature driven
first-order phase transition, where the canonical energy distribution $P_{\rm can}(E)$ develops
a characteristic double-peak structure.38 As an illustration, simulation data
for the 2D 7-state Potts model105 are shown in Fig. 10. With increasing system
size, the region between the two peaks becomes more and more suppressed
by the interfacial Boltzmann factor $\propto \exp(-2\sigma_{od} L^{D-1})$, where $\sigma_{od}$ is the (reduced)
interface tension, $L^{D-1}$ the cross-section of a $D$-dimensional system, and
the factor 2 accounts for the fact that with the usually employed periodic boundary
condition at least two interfaces are present due to topological reasons. The
time needed to cross this strongly suppressed rare-event two-phase region thus
grows exponentially with the system size $L$, i.e., the autocorrelation time scales
as $\tau \propto \exp(+2\sigma_{od} L^{D-1})$. In the literature, this is sometimes termed “supercritical
slowing down” (even though nothing is “critical” here). Given such a
situation, one usually adjusts $W = W(E)$ such that the multicanonical distribution
$P_{\rm muca}(E)$ is approximately constant between the two peaks of $P_{\rm can}(E)$, thus
aiming at a random-walk (pseudo-) dynamics of the Monte Carlo process,106,107
cf. Fig. 10.
The crucial non-trivial point is, of course, how this can be achieved. On a
piece of paper, $W(E) \propto 1/P_{\rm can}(E)$ – but we do not know $P_{\rm can}(E)$ (otherwise
there would be little need for the simulation ...). The solution of this problem is
a recursive computation. Starting with the canonical distribution, or some initial
guess based on results for already simulated smaller systems together with finite-
The recursion is initialized with $p_0(E) = 0$. To derive this recursion one assumes
that (unnormalized) histogram entries $H_n(E)$ have an a priori statistical error
$\sqrt{H_n(E)}$ and (quite crudely) that all data are uncorrelated. Due to the accumulation
of statistics, this procedure is rather insensitive to the length of the $n$th run in
the first step and has proved to be rather stable and efficient in practice.
In most applications local update algorithms have been employed, but for
certain classes of models also non-local multigrid methods34,35,111 are applicable.68,112
A combination with non-local cluster update algorithms, on the other
hand, is not straightforward. Only by making direct use of the random-cluster
representation as a starting point, a multibondic variant113–115 has been developed.
For a recent application to improved finite-size scaling studies of second-order
phase transitions, see Ref. 116. If $P_{\rm muca}$ were completely flat and the Monte Carlo
update moves performed an ideal random walk, one would expect that after
$V^2$ local updates the system has travelled on average a distance $V$ in total
energy. Since one lattice sweep consists of $V$ local updates, the autocorrelation
time should scale in this idealized picture as $\tau \propto V$. Numerical tests for various
models with a first-order phase transition have shown that in practice the data
are at best consistent with a behaviour $\tau \propto V^\alpha$, with $\alpha \ge 1$. While for the
6. Scaling Analyses
Equipped with the various technical tools discussed above, the purpose of this
section is to outline typical scaling and finite-size scaling (FSS) analyses of Monte
Carlo simulations of second-order phase transitions. The described procedure is
generally applicable but to keep the notation short, all formulas are formulated
for Ising like systems. For instance for O(n) symmetric models, $m$ should be
replaced by $\vec{m}$ etc. The main results of such studies are usually estimates of the
critical temperature and the critical exponents characterising the universality class
of the transition.
Basic observables are the internal energy per site, $u = U/V$, with $U = -d \ln Z/d\beta = \langle H \rangle \equiv \langle E \rangle$,
and the specific heat,
$$ C = \frac{du}{dT} = \beta^2 \left( \langle E^2 \rangle - \langle E \rangle^2 \right)/V = \beta^2 V \left( \langle e^2 \rangle - \langle e \rangle^2 \right) , \tag{102} $$
where we have set $H \equiv E = eV$ with $V$ denoting the number of lattice sites, i.e.,
the “lattice volume”. In simulations one usually employs the variance definition
(since any discretized numerical differentiation would introduce some systematic
error). The magnetization per site $m = M/V$ and the susceptibility $\chi$ are defined
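as^h
$$ m = \langle |\mu| \rangle , \qquad \mu = \frac{1}{V} \sum_i \sigma_i , \tag{103} $$
and
$$ \chi = \beta V \left( \langle \mu^2 \rangle - \langle |\mu| \rangle^2 \right) . \tag{104} $$
In the disordered phase for $T > T_c$, where $m = \langle \mu \rangle = 0$ by symmetry, one often
works with the definition
$$ \chi' = \beta V \langle \mu^2 \rangle . \tag{105} $$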
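Given stored time series of the energy and magnetization per site, the definitions (102)–(105) translate directly into code. The following Python sketch uses a hypothetical function name and toy numbers, and it employs the simple (biased) variance estimators without any error analysis.

```python
def thermodynamic_observables(e_series, mu_series, beta, volume):
    """Specific heat (102), |m| (103), chi (104) and chi' (105) from time
    series of the energy per site e and the magnetization per site mu."""
    n = len(e_series)
    e1 = sum(e_series) / n
    e2 = sum(e * e for e in e_series) / n
    abs_mu = sum(abs(m) for m in mu_series) / n
    mu2 = sum(m * m for m in mu_series) / n
    return {
        "C": beta ** 2 * volume * (e2 - e1 ** 2),
        "m": abs_mu,
        "chi": beta * volume * (mu2 - abs_mu ** 2),
        "chi_prime": beta * volume * mu2,
    }

# Two hypothetical measurements on a lattice with V = 4 sites.
obs = thermodynamic_observables([-2.0, -1.0], [1.0, -0.5], beta=0.5, volume=4)
```

In a real analysis the statistical errors of these non-linear combinations would be estimated with the Jackknife method, and the bias of the variance estimators should be kept in mind for short or strongly correlated runs.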
The correlation between spins $\sigma_i$ and $\sigma_j$ at sites labeled by $i$ and $j$ can be measured
by considering correlation functions like the two-point spin-spin correlation
where $\xi$ is the spatial correlation length and the exponent $\kappa$ of the power-law
prefactor depends in general on the dimension and on whether one studies the
ordered or disordered phase. Strictly speaking $\xi$ depends on the direction of $\vec{r}$.
formal definition of the zero-field magnetization $m(\beta) = (1/V\beta) \lim_{h \to 0} \partial \ln Z(\beta, h)/\partial h$. The
reason is that for a symmetric model on finite lattices one obtains $\langle \mu \rangle(\beta) = 0$ for all temperatures
due to symmetry. Only in the proper infinite-volume limit, that is $\lim_{h \to 0} \lim_{V \to \infty}$, spontaneous
symmetry breaking can occur below $T_c$. In a simulation on finite lattices, this is reflected by a symmetric
double-peak structure of the magnetization distribution (provided the runs are long enough).
By averaging $\mu$ one thus gets zero by symmetry, while the peak locations $\pm m_0(L)$ are close to the
spontaneous magnetization so that the average of $|\mu|$ is a good estimator. Things become more involved
for slightly asymmetric models, where this recipe would produce a systematic error and thus
cannot be employed. For strongly asymmetric models, on the other hand, one peak clearly dominates
and the average of $\mu$ can usually be measured without too many problems.
with the same critical exponent ν but a different critical amplitude $\xi_0^{-} \neq \xi_0^{+}$.
The singularities of the specific heat, magnetization (for T < Tc ), and suscep-
tibility are similarly parameterized by the critical exponents α, β, and γ, respec-
tively,
$$ m = m_0 (1 - T/T_c)^{\beta} + \dots , \tag{111} $$
where $C_{\rm reg}$ is a regular background term, and the amplitudes are again in general
different on the two sides of the transition. Right at the critical temperature $T_c$,
two further exponents δ and η are defined through
$$ m \propto h^{1/\delta} \quad (T = T_c) , \tag{113} $$
In the conventional scaling scenario, Rushbrooke’s and Griffiths’ laws can be de-
duced from the Widom scaling hypothesis that the Helmholtz free energy is a
homogeneous function.131 Widom scaling and the remaining two laws can in turn
be derived from the Kadanoff block-spin construction132 and ultimately from RG
considerations.133 Josephson’s law can also be derived from the hyperscaling
hypothesis, namely that the free-energy density behaves near criticality as the inverse
correlation volume: $f \sim \xi^{-D}$. Twice differentiating this relation and inserting the
scaling law (110) for the specific heat gives immediately (118).
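The differentiation step can be spelled out as a short worked derivation. This is a sketch under two assumptions not restated in this passage: the reduced temperature is written as $t = 1 - T/T_c$, and the correlation length scales as $\xi \sim |t|^{-\nu}$. The result agrees with the hyperscaling relation α = 2 − Dν quoted in the caption of Table 1.

```latex
% Assumptions: t = 1 - T/T_c and \xi \sim |t|^{-\nu} near criticality.
\begin{align}
  f &\sim \xi^{-D} \sim |t|^{D\nu} , \\
  C &\sim \frac{\partial^2 f}{\partial t^2} \sim |t|^{D\nu - 2} \equiv |t|^{-\alpha} , \\
  \Rightarrow\quad D\nu &= 2 - \alpha .
\end{align}
```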
The paradigm model for systems exhibiting a continuous (or, roughly speak-
ing, second-order) phase transition is the Ising model. When the temperature is
varied the system passes at Tc from an ordered low-temperature to a disordered
high-temperature phase. In two dimensions (2D), the thermodynamic limit of this
model in zero external field has been solved exactly by Onsager,134 and even for
finite Lx × Ly lattices the exact partition function is known.73,74 Also the exact
density of states can be calculated by means of computer algebra up to reason-
ably large lattice sizes.75 This provides a very useful testing ground for any new
algorithmic idea in computer simulations. For infinite lattices, the correlation
length has been calculated in arbitrary lattice directions.135,136 The exact magne-
tization for h = 0, apparently already known to Onsager,137 was first derived by
Yang138 and later generalized by Chang.139 The only quantity which to date
is not truly exactly known is the susceptibility. However, its properties have been
characterized to very high precision140–142 (for both low- and high-temperature
series expansions, 2000 terms are known exactly141). In three dimensions (3D)
no exact solutions are available, but analytical and numerical results from vari-
ous methods give a consistent and very precise picture. In four dimensions (4D)
the so-called upper critical dimension Du is reached and for D ≥ Du = 4 the
critical exponents take their mean-field values (in 4D up to multiplicative loga-
rithmic corrections143). The critical exponents of the Ising model are collected in
Table 1.144–146
Table 1. Critical exponents of the Ising model. All 2D exponents are exactly known.144,145
For the 3D Ising model the “world-average” for ν and γ calculated in Ref. 146 is quoted. The
other exponents follow from hyperscaling (α = 2 − Dν) and scaling (β = (2 − α − γ)/2,
δ = γ/β + 1, η = 2 − γ/ν) relations. For all D ≥ Du = 4 the mean-field exponents are
valid (in 4D up to multiplicative logarithmic corrections).

         ν              α           β           γ               δ         η
D = 2    1              0 (log)     1/8         7/4             15        1/4
D = 3    0.630 05(18)   0.109 85    0.326 48    1.237 17(28)    4.7894    0.036 39
D ≥ 4    1/2            0 (disc)    1/2         1               3         0
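The derived entries of Table 1 can be reproduced from ν and γ alone with the relations given in the caption. The sketch below does exactly that; the function name is illustrative.

```python
def derived_exponents(nu, gamma, dim):
    """Apply the relations quoted in the caption of Table 1.

    alpha = 2 - D*nu            (hyperscaling)
    beta  = (2 - alpha - gamma)/2
    delta = gamma/beta + 1
    eta   = 2 - gamma/nu
    """
    alpha = 2.0 - dim * nu
    beta = (2.0 - alpha - gamma) / 2.0
    delta = gamma / beta + 1.0
    eta = 2.0 - gamma / nu
    return {"alpha": alpha, "beta": beta, "delta": delta, "eta": eta}


# 2D Ising input (nu = 1, gamma = 7/4) and the 3D values of Table 1
exponents_2d = derived_exponents(nu=1.0, gamma=1.75, dim=2)
exponents_3d = derived_exponents(nu=0.63005, gamma=1.23717, dim=3)
```

For the 2D input the relations return the exact values listed in the first row; for the 3D input they agree with the tabulated numbers up to rounding in the last digit.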
$$ m \propto L^{-\beta/\nu} + \dots , \tag{121} $$
$$ \chi \propto L^{\gamma/\nu} + \dots , \tag{122} $$
where $C_{\rm reg}$ is a regular, smooth background term and $a$ is a constant. As a mnemonic
rule, a critical exponent x in a temperature scaling law is replaced by −x/ν in the
corresponding FSS law. This describes the rounding of the singularities quantitatively.
In general these scaling laws are valid in a vicinity of $T_c$ as long as the scaling
variable
$$ x = (1 - T/T_c) L^{1/\nu} \tag{123} $$
is kept fixed.11,147–149 In this more general formulation the scaling law for, e.g.,
the susceptibility reads
$$ \chi(T, L) = L^{\gamma/\nu} f(x) , \tag{124} $$
where $f(x)$ is a scaling function. By plotting $\chi(T, L)/L^{\gamma/\nu}$ versus the scaling
variable x, one thus expects that the data for different T and L fall onto a master
curve described by $f(x)$. This is a nice visual method for demonstrating the
scaling properties.
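The rescaling behind such a collapse plot takes only a few lines. The following sketch maps one data point (T, L, χ) to the coordinates (x, χ/L^{γ/ν}) of Eqs. (123) and (124); the trial values of T_c, ν and γ are inputs supplied by the user, and the function name is hypothetical.

```python
def collapse_point(T, L, chi, Tc, nu, gamma):
    """Return (x, y) with x = (1 - T/Tc) L^(1/nu) and y = chi / L^(gamma/nu)."""
    x = (1.0 - T / Tc) * L ** (1.0 / nu)
    y = chi / L ** (gamma / nu)
    return x, y


def collapse_curve(temperatures, L, chis, Tc, nu, gamma):
    """Rescale a whole chi(T) curve measured at fixed lattice size L."""
    return [collapse_point(T, L, chi, Tc, nu, gamma)
            for T, chi in zip(temperatures, chis)]
```

Plotting the output of `collapse_curve` for several L in one diagram shows the quality of the collapse; a poor collapse suggests that the trial values of T_c, ν or γ are off or that corrections to scaling are still important.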
For given L the maximum of χ(T, L) as a function of temperature occurs at
some $x_{\max}$. For the location $T_{\max}$ of the maximum this implies an FSS behaviour
of the form
$$ T_{\max} = T_c (1 - x_{\max} L^{-1/\nu} + \dots) = T_c + c L^{-1/\nu} + \dots . \tag{125} $$
This quantifies the shift of the so-called pseudo-critical points, which depends on the
observables considered. Only in the thermodynamic limit L → ∞ do all quantities
diverge at the same temperature $T_c$.
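Further useful quantities in FSS analyses are the energetic fourth-order parameter
$$ V(\beta) = 1 - \frac{\langle e^4\rangle}{3\langle e^2\rangle^2} , \tag{126} $$
the magnetic cumulants (Binder parameters)
$$ U_2(\beta) = 1 - \frac{\langle\mu^2\rangle}{3\langle|\mu|\rangle^2} , \tag{127} $$
$$ U_4(\beta) = 1 - \frac{\langle\mu^4\rangle}{3\langle\mu^2\rangle^2} , \tag{128} $$
and their slopes
$$ \frac{dU_2(\beta)}{d\beta} = \frac{V}{3\langle|\mu|\rangle^2} \left[ \langle\mu^2\rangle\langle e\rangle - 2\,\frac{\langle\mu^2\rangle\langle|\mu|e\rangle}{\langle|\mu|\rangle} + \langle\mu^2 e\rangle \right] = V(1 - U_2) \left[ \langle e\rangle - 2\,\frac{\langle|\mu|e\rangle}{\langle|\mu|\rangle} + \frac{\langle\mu^2 e\rangle}{\langle\mu^2\rangle} \right] , \tag{129} $$
$$ \frac{dU_4(\beta)}{d\beta} = V(1 - U_4) \left[ \langle e\rangle - 2\,\frac{\langle\mu^2 e\rangle}{\langle\mu^2\rangle} + \frac{\langle\mu^4 e\rangle}{\langle\mu^4\rangle} \right] . \tag{130} $$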
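As an illustration of Eqs. (127)–(130), the Binder parameters and their slopes can be estimated from the same time series of e and μ. This is a minimal sketch with illustrative names; error bars are not included.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def binder_parameters(e_series, mu_series, volume):
    """Estimate U2, U4 and their beta-derivatives, Eqs. (127)-(130)."""
    pairs = list(zip(e_series, mu_series))
    e_avg = mean(e_series)
    abs_mu = mean([abs(m) for m in mu_series])
    mu2 = mean([m**2 for m in mu_series])
    mu4 = mean([m**4 for m in mu_series])
    # mixed moments entering the slopes
    abs_mu_e = mean([abs(m) * e for e, m in pairs])
    mu2_e = mean([m**2 * e for e, m in pairs])
    mu4_e = mean([m**4 * e for e, m in pairs])

    u2 = 1.0 - mu2 / (3.0 * abs_mu**2)
    u4 = 1.0 - mu4 / (3.0 * mu2**2)
    du2 = volume * (1.0 - u2) * (e_avg - 2.0 * abs_mu_e / abs_mu + mu2_e / mu2)
    du4 = volume * (1.0 - u4) * (e_avg - 2.0 * mu2_e / mu2 + mu4_e / mu4)
    return {"U2": u2, "U4": u4, "dU2_dbeta": du2, "dU4_dbeta": du4}
```

A quick sanity check: for a perfectly ordered series with μ = ±1 both cumulants equal 2/3, and if the energy does not fluctuate both slopes vanish.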
The Binder parameters scale according to
$$ U_{2p} = f_{U_{2p}}(x)[1 + \dots] , \tag{131} $$
i.e., for constant scaling variable x, $U_{2p}$ takes approximately the same value for all
lattice sizes, in particular $U_{2p}^{*} \equiv f_{U_{2p}}(0)$ at $T_c$. Applying the differentiation to this
scaling representation, one picks up a factor of $L^{1/\nu}$ from the scaling function,
$$ \frac{dU_{2p}}{d\beta} = (dx/d\beta) f'_{U_{2p}}[1 + \dots] = L^{1/\nu} f'_{U_{2p}}(x)[1 + \dots] . \tag{132} $$
For example it is no problem to experiment with the size and number of Jackknife
bins. Since a reasonable choice depends on the a priori unknown autocorrelation
time, it is quite cumbersome to do a reliable error analysis “on the fly” during
the simulation. Furthermore, basing data reweighting on time-series data is more
efficient since histograms, if needed or more convenient, can still be produced
from these data, whereas working in the reverse direction is obviously impossible.
For some models it is sufficient to perform for each lattice size a single long
run at some coupling β0 close to the critical point βc . This is, however, not al-
ways the case and also depends on the observables of interest. In this more gen-
eral case, one may use several simulation points βi and combine the results by the
multi-histogram reweighting method or may apply a recently developed finite-size
adapted generalized ensemble method.116,171 In both situations, one can compute
the relevant quantities from the time series of the energies e = E/V (if E happens
to be integer valued, this should be stored of course) and $\mu = \sum_i \sigma_i/V$ by
reweighting.
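For the single-run case, reweighting a time series from the simulation point β0 to a nearby β amounts to a weighted average with weights exp[−(β − β0)E]. The sketch below assumes the total energies E = eV are available for every measurement; it covers single-histogram reweighting only, not the multi-histogram or generalized-ensemble variants, and the function name is illustrative.

```python
import math


def reweight(observable_series, energy_series, beta0, beta):
    """Estimate <O> at inverse temperature beta from a time series taken at beta0.

    energy_series holds the total energies E (not energies per site).
    """
    exponents = [-(beta - beta0) * E for E in energy_series]
    shift = max(exponents)                    # keeps exp() from overflowing
    weights = [math.exp(x - shift) for x in exponents]
    norm = sum(weights)
    return sum(w * o for w, o in zip(weights, observable_series)) / norm


# Example: reweight the mean energy of a toy series
energies = [0.0, 0.0, 0.0, 1.0]
energy_at_beta0 = reweight(energies, energies, beta0=1.0, beta=1.0)
```

The reliable reweighting range is limited by the statistics in the wings of the energy histogram, so β should stay close to β0.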
By using one of these techniques one first determines the temperature depen-
dence of C(β), χ(β), . . . , in the neighborhood of the simulation point β0 ≈ βc
(a reasonably “good” initial guess for β0 is usually straightforward to obtain).
Once the temperature dependence is known, one can determine the maxima,
e.g., $C_{\max}(\beta_{\max_C}) \equiv \max_\beta C(\beta)$, by applying standard extremization routines:
When reweighting is implemented as a subroutine, C(β), for instance, can be handled
as a normal function with a continuously varying argument β, i.e., no interpolation
or discretization error is involved when iterating towards the maximum.
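To illustrate the idea of treating the reweighted C(β) as an ordinary function, the sketch below combines single-histogram reweighting with a golden-section search. The golden-section routine is one possible choice of extremization method, not necessarily the one used by the author; it assumes that C(β) has a single maximum inside the bracket [lo, hi]. All names are illustrative.

```python
import math


def reweighted_specific_heat(energies, beta0, beta, volume):
    """C(beta) = beta^2 (<E^2> - <E>^2) / V, reweighted from a run at beta0."""
    exponents = [-(beta - beta0) * E for E in energies]
    shift = max(exponents)                    # keeps exp() from overflowing
    weights = [math.exp(x - shift) for x in exponents]
    norm = sum(weights)
    e1 = sum(w * E for w, E in zip(weights, energies)) / norm
    e2 = sum(w * E * E for w, E in zip(weights, energies)) / norm
    return beta**2 * (e2 - e1**2) / volume


def golden_section_max(func, lo, hi, tol=1e-8):
    """Locate the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc > fd:                # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:                      # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)


def find_specific_heat_maximum(energies, beta0, volume, lo, hi):
    """Return (beta_max, C_max) of the reweighted specific heat."""
    beta_max = golden_section_max(
        lambda beta: reweighted_specific_heat(energies, beta0, beta, volume), lo, hi)
    return beta_max, reweighted_specific_heat(energies, beta0, beta_max, volume)
```

The same pattern applies to χ or the cumulant slopes by swapping the function passed to `golden_section_max`.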
The locations of the maxima of C, χ, $dU_2/d\beta$, $dU_4/d\beta$, $d\langle|\mu|\rangle/d\beta$, $d\ln\langle|\mu|\rangle/d\beta$,
and $d\ln\langle\mu^2\rangle/d\beta$ provide us with seven sequences of pseudo-transition points
$\beta_{\max_i}(L)$ which all should scale according to $\beta_{\max_i}(L) = \beta_c + a_i L^{-1/\nu} + \dots$.
In other words, the scaling variable $x = (\beta_{\max_i}(L) - \beta_c) L^{1/\nu} = a_i + \dots$ should
be constant, if we neglect the small higher-order corrections indicated by the dots.
Notice that while the precise estimates of $a_i$ do depend on the value of ν, the
qualitative conclusion that x ≈ const for each of the $\beta_{\max_i}(L)$ sequences does
not require any a priori knowledge of ν or $\beta_c$. Using this information one thus has
several possibilities to extract unbiased estimates of the critical exponents ν, α/ν,
β/ν, and γ/ν from least-squares fits assuming the FSS behaviours (120), (121),
(122), (132), (136), and (137).
Considering only the asymptotic behaviour, e.g., $d\ln\langle|\mu|\rangle/d\beta = aL^{1/\nu}$, and
taking the logarithm, $\ln(d\ln\langle|\mu|\rangle/d\beta) = c + (1/\nu)\ln(L)$, one ends up with a
linear two-parameter fit yielding estimates for the constant c = ln(a) and the
exponent 1/ν. For small lattice sizes the asymptotic ansatz is, of course, not
justified. Taking into account the (effective) correction term $[1 + bL^{-w}]$ would
result in a non-linear four-parameter fit for a, b, 1/ν and w. Even if we fixed
w to some “theoretically expected” value (as is sometimes done), we would still
be left with a non-linear fit which is usually much harder to control than a linear
fit (where only a set of linear equations with a unique solution has to be solved,
whereas a non-linear fit involves a numerical minimization of the $\chi^2$-function,
which may possess several local minima). The alternative method is to use the
linear fit ansatz and to discard successively more and more small lattice sizes until
the $\chi^2$ per degree-of-freedom or the goodness-of-fit parameter61 Q has reached
an acceptable value and does not show any further trend. Of course, all this relies
heavily on correct estimates of the statistical error bars on the original data for
$d\ln\langle|\mu|\rangle/d\beta$.
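The procedure of repeating the linear log-log fit while discarding small lattices can be sketched as follows. For brevity this example uses an unweighted least-squares fit and does not compute χ² or Q; a production analysis would weight each point by its statistical error, as the text stresses. Function names are illustrative.

```python
import math


def fit_power_law(sizes, values):
    """Unweighted least-squares fit of ln(values) = c + slope * ln(sizes)."""
    xs = [math.log(L) for L in sizes]
    ys = [math.log(v) for v in values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope     # (c, 1/nu)


def effective_exponents(sizes, values, min_points=3):
    """Repeat the fit, dropping the smallest lattice sizes one at a time.

    Returns a list of (L_min, slope) pairs; sizes must be sorted ascending.
    """
    result = []
    for start in range(len(sizes) - min_points + 1):
        _, slope = fit_power_law(sizes[start:], values[start:])
        result.append((sizes[start], slope))
    return result
```

If the returned slopes still drift systematically with L_min, corrections to scaling are not yet negligible in the fitted range.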
Once ν is estimated one can use the scaling form $\beta_{\max_i}(L) = \beta_c + a_i L^{-1/\nu} + \dots$
to extract $\beta_c$ and $a_i$. As a useful check, one should repeat these fits at the error
margins of ν, but usually this dependence turns out to be very weak. As a further
cross-check one can determine $\beta_c$ also from the Binder parameter crossings,
which is the most convenient and fastest method for a first rough estimate. As
a rule of thumb, an accuracy of about 3 to 4 digits for $\beta_c$ can be obtained with
this method without any elaborate infinite-volume extrapolations, since the crossing
points usually lie much closer to $\beta_c$ than the various maxima locations. For high
precision, however, it is quite cumbersome to control the necessary extrapolations
and often more accurate estimates can be obtained by considering the scaling of
the maxima locations. Also, error estimates of crossing points involve the data for
two different lattice sizes, which tends to be rather unwieldy.
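With ν fixed, the scaling form for the maxima locations is linear in the variable L^{-1/ν}, so β_c and a_i follow from a straight-line fit. The sketch below uses an unweighted fit for brevity (an assumption; real data would be weighted by their errors) and also repeats the fit at ν ± Δν as suggested in the text. Names are illustrative.

```python
def fit_beta_c(sizes, beta_max, nu):
    """Unweighted least-squares fit of beta_max(L) = beta_c + a * L**(-1/nu)."""
    zs = [L ** (-1.0 / nu) for L in sizes]
    n = len(zs)
    z_mean = sum(zs) / n
    b_mean = sum(beta_max) / n
    szz = sum((z - z_mean) ** 2 for z in zs)
    szb = sum((z - z_mean) * (b - b_mean) for z, b in zip(zs, beta_max))
    a = szb / szz
    return b_mean - a * z_mean, a             # (beta_c, a)


def beta_c_at_nu_margins(sizes, beta_max, nu, delta_nu):
    """Repeat the fit at nu - delta_nu, nu, nu + delta_nu; return the three beta_c."""
    return [fit_beta_c(sizes, beta_max, trial)[0]
            for trial in (nu - delta_nu, nu, nu + delta_nu)]
```

The spread of the three values returned by `beta_c_at_nu_margins` gives a rough measure of how strongly the uncertainty in ν feeds into β_c.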
Next, similarly to ν, the ratios of critical exponents α/ν, β/ν, and γ/ν can be
obtained from fits to (120), (121), (122), and (136). Again the maxima of these
quantities or any of the FSS sequences $\beta_{\max_i}$ can be used. As far as the
fitting procedure is concerned, the same remarks apply as for ν. The specific heat C usually
plays a special role in that the exponent α is difficult to determine. The reason
is that α is usually relatively small (3D Ising model: α ≈ 0.1), may be zero
(logarithmic divergence as in the 2D Ising model) or even negative (as for instance
in the 3D XY and Heisenberg models). In all these cases, the constant background
contribution $C_{\rm reg}$ in (120) becomes important, which enforces a non-linear
three-parameter fit with the problems just described. Also for the susceptibility χ, a
regular background term cannot be excluded, but it is usually much less important
since γ ≫ α. Therefore, in (121), (122), and (136), similar to the fits for ν, one
may take the logarithm and deal with much more stable linear fits.
As a final step one may re-check the FSS behaviour of C, χ, dU2 /dβ, . . .
at the numerically determined estimate of βc . These fits should be repeated also
at βc ± ∆βc in order to estimate by how much the uncertainty in βc propagates
into the thus determined exponent estimates. In the (rather rare) cases where $\beta_c$
is known exactly (e.g., through self-duality), this latter option is by far the most
accurate one. This is the reason why numerically estimated critical exponents
are usually quite precise for such models.
When combining the various fit results for, e.g. βc or ν, to a final average
value, some care is necessary with the optimal weighted average and the final
statistical error estimate, since the various fits for determining βc or ν are of course
correlated (since they all use the data from one and the same simulation). In
principle this can be dealt with by applying a cross-correlation analysis.172
7. Applications
Fig. 11. Neutron scattering measurements of the susceptibility in $\mathrm{Mn_{0.75}Zn_{0.25}F_2}$ close to
criticality, governed by the disorder fixed point of the Ising model, over the reduced temperature
interval $4 \times 10^{-4} < |T/T_c - 1| < 2 \times 10^{-1}$. The solid lines show power-law fits with exponent
γ = 1.364(76) above and below $T_c$ [after Mitchell et al. (Ref. 174)].
arithmic modifications.176
Figure 11 shows an experimental verification of the qualitative influence of
disorder for a three-dimensional Ising-like system where the measured critical
exponent γ = 1.364(76) of the susceptibility $\chi \propto |T - T_c|^{-\gamma}$ is clearly different
from that of the pure 3D Ising model, $\gamma_{\rm pure}$ = 1.2396(13). Theoretical results,
on the other hand, remained relatively scarce in 3D until recently. Most analyt-
ical renormalization group and computer simulation studies focused on the Ising
model,177,178 usually assuming site dilution when working numerically. This mo-
tivated us to consider the case of bond dilution179–181 which enables one to test
the expected universality with respect to the type of disorder distribution and, in
addition, facilitates a quantitative comparison with recent high-temperature series
expansions.182–184
The Hamiltonian (in a Potts model normalisation) is given as
$$ -\beta H = \sum_{\langle i,j\rangle} K_{ij}\, \delta_{\sigma_i,\sigma_j} , \tag{138} $$
where the spins take the values $\sigma_i = \pm 1$ and the sum goes over all
nearest-neighbor pairs $\langle i,j\rangle$. The coupling strengths $K_{ij}$ are drawn from the bimodal
distribution
$$ \wp[K_{ij}] = \prod_{\langle i,j\rangle} P(K_{ij}) = \prod_{\langle i,j\rangle} \left[ p\,\delta(K_{ij} - K) + (1 - p)\,\delta(K_{ij} - RK) \right] . \tag{139} $$
Besides bond dilution (R = 0), which we will consider here, this also includes
random-bond ferromagnets (0 < R < 1) and the physically very different class
of spin glasses (R = −1) as special cases. For the case of bond dilution, the
couplings are thus allowed to take two different values $K_{ij} = K \equiv J\beta \equiv J/k_B T$
and 0 with probabilities p and 1 − p, respectively, with c = 1 − p being the
concentration of missing bonds, which play the role of the non-magnetic impurities. The
pure case thus corresponds to p = 1. Below the bond-percolation threshold185
$p_c$ = 0.248 812 6(5) one does not expect any finite-temperature phase transition
since without a percolating (infinite) cluster of spins long-range order cannot
develop.
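One disorder realisation of the bimodal distribution (139) can be drawn bond by bond. The sketch below is illustrative only: it returns a flat list of couplings and leaves the mapping of bonds to lattice links to the caller; the function name and the use of Python's `random.Random` generator are choices made for this example.

```python
import random


def draw_couplings(num_bonds, p, K, R=0.0, seed=None):
    """Draw K_ij = K with probability p and K_ij = R*K with probability 1 - p.

    R = 0 gives bond dilution, 0 < R < 1 a random-bond ferromagnet,
    and R = -1 the spin-glass case mentioned in the text.
    """
    rng = random.Random(seed)
    return [K if rng.random() < p else R * K for _ in range(num_bonds)]


# Example: one realisation for an L = 8 simple cubic lattice (3 L^3 bonds,
# assuming periodic boundary conditions) at p = 0.7
bonds = draw_couplings(3 * 8**3, p=0.7, K=0.65, seed=12345)
```

In a quenched simulation each such realisation is kept fixed during the whole Monte Carlo run and a new one is drawn for the next sample.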
The model (138), (139) with R = 0 was studied by means of large-scale
Monte Carlo simulations using the Swendsen-Wang (SW) cluster algorithm39
(which in the strongly diluted case is better suited than the single-cluster Wolff
variant). To arrive at final results in the quenched case, for each dilution,
temperature and lattice size, the Monte Carlo estimates for $\langle Q_{\{J\}}\rangle$ of thermodynamic
quantities $Q_{\{J\}}$ for a given random distribution {J} of diluted bonds (realized as
usual by averages over the time series of measurements) have to be averaged over
many different disorder realisations,
$$ Q \equiv [\langle Q_{\{J\}}\rangle]_{\rm av} = \frac{1}{\#\{J\}} \sum_{\{J\}} \langle Q_{\{J\}}\rangle , \tag{140} $$
where a discretized evaluation of the integrals for finite #{J} is implied.
While conceptually straightforward, the quenched average in (140) is
computationally very demanding since the number of realisations #{J} usually must
be large, often of the order of a few thousand. In fact, if this number is chosen too
small one may observe typical rather than average values186 which may differ
significantly when the distribution $P(\langle Q_{\{J\}}\rangle)$ exhibits a long tail (which in general
is hard to predict beforehand).
To get a rough overview of the phase diagram we first studied the depen-
dence of the susceptibility peaks on the dilution, where the susceptibility χ =
performed for p = 0.95, 0.90, . . . , 0.36 and moderate system sizes SW cluster
MC simulations with $N_{\rm MCS}$ = 2 500 MC sweeps (MCS) each. By performing
quite elaborate analyses of autocorrelation times, these statistics were judged to be
reasonable ($N_{\rm MCS} > 250\,\tau_e$). By applying single-histogram reweighting to the
data for each of the 2 500 to 5 000 disorder realisations and then averaging the
resulting χ(K) curves, we finally arrived at the data shown in Fig. 12.
From the locations of the maxima one obtains the phase diagram of the model
in the p–T plane shown in Fig. 13 which turned out to be in excellent agreement
with a “single-bond effective-medium” (EM) approximation,187
$$ K_c^{\rm EM}(p) = \ln\left[ \frac{(1 - p_c)\, e^{K_c(1)} - (1 - p)}{p - p_c} \right] , \tag{142} $$
where $K_c(1) = J/k_B T_c(1)$ = 0.443 308 8(6) is the precisely known transition
point of the pure 3D Ising model.188 As an independent confirmation of (142),
the phase diagram also coincides extremely well with recent results from
high-temperature series expansions.184
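Equation (142) is easy to evaluate numerically. The sketch below uses the values of p_c and K_c(1) quoted in the text as defaults; the function name is illustrative.

```python
import math

P_C = 0.2488126        # bond-percolation threshold quoted in the text
K_C_PURE = 0.4433088   # K_c(1) of the pure 3D Ising model (Potts normalisation)


def kc_effective_medium(p, p_c=P_C, kc_pure=K_C_PURE):
    """Effective-medium estimate K_c^EM(p) of Eq. (142), valid for p_c < p <= 1."""
    if not p_c < p <= 1.0:
        raise ValueError("p must satisfy p_c < p <= 1")
    numerator = (1.0 - p_c) * math.exp(kc_pure) - (1.0 - p)
    return math.log(numerator / (p - p_c))


# Critical couplings for the three dilutions studied in detail
estimates = {p: kc_effective_medium(p) for p in (0.4, 0.55, 0.7)}
```

By construction the formula reduces to K_c(1) at p = 1 and diverges as p approaches p_c, where the transition temperature drops to zero.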
The quality of the disorder averages can be judged as in Fig. 14 by computing
running averages over the disorder realisations taken into account and looking at
[Figure 12: plot of $[\chi_L]_{\rm av}$ versus $J/k_B T$, with groups of curves labelled from p = 0.95 to p = 0.36.]
Fig. 12. The average magnetic susceptibility $[\chi_L]_{\rm av}$ of the 3D bond-diluted Ising model versus
$K = J/k_B T$ for several concentrations p and L = 8, 10, 12, 14, 16, 18, and 20. For each value of p and
each lattice size L, the curves are obtained by standard single-histogram reweighting of the simulation
data at one value of K.
[Figure 13: plot of $k_B T_c/J$ versus p; legend: MC, HTS, mean-field approx., effective-medium approx.]
Fig. 13. Phase diagram of the bond-diluted Ising model on a three-dimensional simple cubic lattice
in the dilution-temperature plane. The percolation point $p_c$ ≈ 0.2488 is marked by the diamond and
p = 1 is the pure case without impurities. The results from the Monte Carlo (MC) simulations are
compared with analyses of high-temperature series (HTS) expansions and with (properly normalized)
mean-field and effective-medium approximations.
the distributions $P(\chi_i)$. The plots show that the fluctuations in the running average
disappear already after a few hundred realisations and that the dispersion of the
$\chi_i$ values is moderate. The histogram also shows, however, that the distributions
of physical observables typically do not become sharper with increasing system
size at a finite-randomness disorder fixed point. Rather their relative widths stay
constant, a phenomenon called non-self-averaging. More quantitatively,
non-self-averaging can be checked by evaluating the normalized squared width
$R_\chi(L) = V_\chi(L)/[\chi(L)]_{\rm av}^2$, where $V_\chi(L) = [\chi(L)^2]_{\rm av} - [\chi(L)]_{\rm av}^2$ is the variance of the
susceptibility distribution. Figure 15 shows this ratio for three concentrations of
the bond-diluted Ising model as a function of inverse lattice size. The fact that $R_\chi$
approaches a constant when L increases, as predicted by Aharony and Harris,189
is the signature of a non-self-averaging system, in qualitative agreement with the
results of Wiseman and Domany190 for the site-diluted 3D Ising model.$^{i}$
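Both diagnostics discussed here, the running average over disorder realisations and the normalized squared width R_χ, are simple to compute from the list of per-sample susceptibilities χ_i. The following sketch uses illustrative function names.

```python
def running_average(samples):
    """Running mean after 1, 2, ..., n disorder realisations."""
    total = 0.0
    averages = []
    for count, value in enumerate(samples, start=1):
        total += value
        averages.append(total / count)
    return averages


def normalized_squared_width(chi_samples):
    """R_chi = ([chi^2]_av - [chi]_av^2) / [chi]_av^2 over disorder realisations."""
    n = len(chi_samples)
    mean = sum(chi_samples) / n
    mean_sq = sum(c * c for c in chi_samples) / n
    return (mean_sq - mean**2) / mean**2
```

Evaluating `normalized_squared_width` for each lattice size and plotting the result against 1/L gives the kind of plot described for Fig. 15.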
$^{i}$ Our estimate of $R_\chi$ is about an order of magnitude smaller since we worked with
$\chi = KV(\langle\mu^2\rangle - \langle|\mu|\rangle^2)$ whereas in Ref. 190 the “high-temperature” expression
$\chi' = KV\langle\mu^2\rangle$ was used.
[Figure 14: left panel, $\chi_i$ versus sample index i; right panel, $P(\chi_i/[\chi]_{\rm av})$ versus $\chi_i/[\chi]_{\rm av}$ for L = 40, 64, and 96 at p = 0.7.]
Fig. 14. Left: Susceptibility for the different disorder realisations of the three-dimensional
bond-diluted Ising model for L = 96 and a concentration of magnetic bonds p = 0.7 at
K = 0.6535 ≈ $K_c(L)$. The running average over the samples is shown by the solid (red) line. Right: The resulting
probability distribution of the susceptibility scaled by its quenched average $[\chi]_{\rm av}$, such that the results
for the different lattice sizes L = 40, 64, and 96 collapse. The vertical dashed line indicates the
average susceptibility $\chi_i/[\chi]_{\rm av} = 1$.
[Figure 15: plot of $R_\chi$ versus 1/L for p = 0.4, 0.55, and 0.7.]
Fig. 15. Normalized squared width of the susceptibility distribution versus the inverse lattice size for
the three concentrations p = 0.4, 0.55, and 0.7 at the effective critical coupling $K_c(L)$. The straight
lines are linear fits used as guides to the eye.
In order to study the critical behaviour in more detail, we concentrated on the
three particular dilutions p = 0.4, 0.55, and 0.7. In a first set of simulations we
focused on the FSS behaviour for lattice sizes up to L = 96. It is well known
that ratios of critical exponents are almost equal for pure and disordered
models, e.g., γ/ν = 1.966(6) (pure191) and γ/ν = 1.963(5) (disordered192). The
only distinguishing quantity is the correlation length exponent ν which can be
extracted, e.g., from the derivative of the magnetisation versus inverse temperature,
$d\ln[m]_{\rm av}/dK \propto L^{1/\nu}$, at $K_c$ or the locations of the susceptibility maxima.
Using the latter unbiased option and performing least-squares fits including data from
$L_{\min}$ to $L_{\max}$ = 96 we obtained the effective critical exponents shown in Fig. 16.
For the dilution closest to the pure model (p = 0.7), the system is influenced by
the pure fixed point with 1/ν = 1.5863(33). On the other hand, when the bond
concentration is small (p = 0.4), the vicinity of the percolation fixed point where
[Figure 16: plot of $(1/\nu)_{\rm eff}$ versus $1/L_{\min}$ for p = 0.7, 0.55, and 0.4, with reference levels labelled “Pure” and “Dis.”]
Fig. 16. Effective exponents $(1/\nu)_{\rm eff}$ as obtained from fits to the behaviour of
$d\ln[m]_{\rm av}/dK \propto L^{1/\nu}$ as a function of $1/L_{\min}$ for p = 0.4, 0.55, and 0.7. The upper limit of the fit range is
$L_{\max}$ = 96.
1/ν ≈ 1.12 induces a decrease of 1/ν below its expected disorder value. The
dilution for which the cross-over effects are smallest is around p = 0.55, which
suggests that the scaling corrections should be rather small for this specific dilution.
The main problem of the FSS study is the competition between different fixed
points (pure, disorder, percolation) in combination with corrections-to-scaling
terms $\propto L^{-\omega}$, which we found hard to control for bond dilution. In contrast to
recent claims for the site-diluted model that ω ≈ 0.4, we were not able to extract
a reliable estimate of ω from our data for bond dilution.
[Figure 17: top panel, $\gamma_{\rm eff}(|\tau|)$ versus $L^{1/\nu}|\tau|$ with horizontal lines labelled “Dis.” and “Pure”; bottom panel, $[\chi]_{\rm av}|\tau|^{\gamma}$ versus $L^{1/\nu}|\tau|$ for L = 10, 14, 18, 22, 30, 35, and 40 at p = 0.7.]
Fig. 17. Top: Variation of the temperature dependent effective critical exponent
$\gamma_{\rm eff}(|\tau|) = -d\ln[\chi]_{\rm av}/d\ln|\tau|$ (in the low-temperature phase) as a function of the rescaled temperature $L^{1/\nu}|\tau|$
for the bond-diluted Ising model with p = 0.7 and several lattice sizes L. The horizontal solid and
dashed lines indicate the site-diluted and pure values of γ, respectively. Bottom: The
critical amplitudes $\Gamma_\pm$ above and below the critical temperature.
In a second set of simulations we examined the temperature scaling of the
magnetisation and susceptibility for lattice sizes up to L = 40. These data
allow direct estimates of the exponents β and γ whose relative deviation from
the pure model is comparable to that of ν, e.g., γ = 1.2396(13) (pure191)
and γ = 1.342(10) (disordered192). As a function of the reduced temperature
$\tau = (K_c - K)$ (τ < 0 in the low-temperature (LT) phase and τ > 0 in the
high-temperature (HT) phase) and the system size L, the susceptibility is expected to
scale as
$$ [\chi(\tau, L)]_{\rm av} \sim |\tau|^{-\gamma} g_\pm(L^{1/\nu}|\tau|) , \tag{143} $$
where $g_\pm$ is a scaling function of the variable $x = L^{1/\nu}|\tau|$ and the subscript
± stands for the HT/LT phases. Assuming $[\chi(\tau)]_{\rm av} \propto |\tau|^{-\gamma_{\rm eff}}$ without any
corrections-to-scaling terms, we can define a temperature dependent effective critical
exponent $\gamma_{\rm eff}(|\tau|) = -d\ln[\chi]_{\rm av}/d\ln|\tau|$, which should converge towards the
asymptotic critical exponent γ when L → ∞ and |τ| → 0. Our results for p = 0.7
are shown in Fig. 17. For the largest sizes, the effective exponent $\gamma_{\rm eff}(|\tau|)$ is
stable around 1.34 when |τ| is not too small, i.e., when the finite-size effects are not
too strong. The plot of $\gamma_{\rm eff}(|\tau|)$ vs. the rescaled variable $L^{1/\nu}|\tau|$ shows that the
critical power-law behaviour holds in different temperature ranges for the
different sizes studied. By analysing the temperature behaviour of the susceptibility, we
have also directly extracted the power-law exponent γ using error-weighted
least-squares fits and choosing the temperature range that gives the smallest $\chi^2$/d.o.f.
for several system sizes. The results are consistent with γ ≈ 1.34 to 1.36, cf.
Table 2.
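The effective exponent defined above is a logarithmic derivative, which can be estimated from tabulated data by finite differences. The sketch below uses centered differences on a grid of reduced temperatures; this discretisation is an assumption made for the example (with reweighted data the derivative could be evaluated on an arbitrarily fine grid), and the function name is illustrative.

```python
import math


def effective_gamma(taus, chis):
    """Centered finite-difference estimate of gamma_eff = -d ln(chi) / d ln|tau|.

    taus : reduced temperatures (all of the same sign, none equal to zero)
    chis : disorder-averaged susceptibilities at those temperatures
    Returns a list of (|tau|, gamma_eff) pairs for the interior points.
    """
    log_tau = [math.log(abs(t)) for t in taus]
    log_chi = [math.log(c) for c in chis]
    result = []
    for i in range(1, len(taus) - 1):
        slope = (log_chi[i + 1] - log_chi[i - 1]) / (log_tau[i + 1] - log_tau[i - 1])
        result.append((abs(taus[i]), -slope))
    return result
```

For a pure power law the function returns the same exponent at every interior point; deviations at small |τ| then signal finite-size rounding.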
From the previous expression of the susceptibility as a function of the reduced
temperature and size, it is instructive to plot the scaling function $g_\pm(x)$. For finite
size and $|\tau| \neq 0$, the scaling functions may be Taylor expanded in powers of
the inverse scaling variable $x^{-1} = (L^{1/\nu}|\tau|)^{-1}$,
$[\chi_\pm(\tau, L)]_{\rm av} = |\tau|^{-\gamma}[g_\pm(\infty) + x^{-1} g'_\pm(\infty) + O(x^{-2})]$,
where the amplitude $g_\pm(\infty)$ is usually denoted by $\Gamma_\pm$.
mined.
h site dilution, p = 0.8. The observed correction to scaling could be the
next-to-leading one.
i site dilution, p = 0.8.
j bond dilution, p = 0.6 to 0.7.
Multiplying by $|\tau|^{\gamma}$ leads to
When |τ| → 0 but with L still larger than the correlation length ξ, one should
recover the critical behaviour given by $g_\pm(x) = O(1)$. The critical amplitudes $\Gamma_\pm$
follow, as shown in the lower plot of Fig. 17. Some experimental and numerical
estimates are compiled in Table 2.
To summarize, this application is a good example of how large-scale Monte
Carlo simulations employing the cluster update algorithm can be used to
investigate the influence of quenched bond dilution on the critical properties of the 3D
Ising model. It also illustrates how scaling and finite-size scaling analyses can be applied
to a non-trivial problem.
The first two terms are a standard 12-6 Lennard-Jones (LJ) potential and a
weak bending energy describing the bulk behaviour. The distance between the
monomers i and j is $r_{ij}$ and $0 \le \vartheta_i \le \pi$ denotes the bending angle between the
Fig. 18. Sketch of a single polymer subject to an attractive substrate at z = 0. The hard wall at
z = Lz prevents a non-grafted polymer from escaping.
ith, (i + 1)th, and (i + 2)th monomer. The third term is specific to an attractive
substrate. This 9-3 LJ surface potential follows by integration over the continuous
half-space z < 0 (cf. Fig. 18), where every space element interacts with each
monomer by the usual 12-6 LJ expression.218 The relative strength of the two LJ
interactions is continuously varied by considering $\epsilon_s$ as a control parameter.
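The integration that turns the 12-6 form into a 9-3 form can be written out explicitly. The following is a sketch under stated assumptions: ρ denotes a uniform number density of substrate elements, ε and σ are the 12-6 LJ parameters, and z > 0 is the height of the monomer above the surface. None of these symbols are fixed by the passage above, where the overall prefactor is absorbed into $\epsilon_s$.

```latex
% Assumptions: uniform substrate density \rho filling z < 0; LJ parameters \epsilon, \sigma;
% s is the in-plane distance and z' the perpendicular distance to a substrate element.
\begin{align}
  \int_{\text{half-space}} \frac{d^3r'}{r^{n}}
    &= \int_{z}^{\infty} dz' \int_{0}^{\infty} \frac{2\pi s\, ds}{(s^2 + z'^2)^{n/2}}
     = \frac{2\pi}{(n-2)(n-3)}\, z^{3-n} , \\
  V_{\rm surf}(z)
    &= 4\epsilon\rho \left[ \frac{2\pi\sigma^{12}}{90\, z^{9}} - \frac{2\pi\sigma^{6}}{12\, z^{3}} \right]
     = \frac{2\pi\epsilon\rho\sigma^{3}}{3}
       \left[ \frac{2}{15}\left(\frac{\sigma}{z}\right)^{9} - \left(\frac{\sigma}{z}\right)^{3} \right] .
\end{align}
```

The first line is applied with n = 12 and n = 6, which produces the powers 9 and 3 that give the surface potential its name.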
We applied parallel tempering simulations to a 40mer, once grafted with
one end to the substrate in the potential minimum and once freely moving in the
space between the substrate and a hard wall a distance $L_z$ = 60 away. There
exist several attempts to optimise the choice of the simulation points $\beta_i$,89,90 but
usually one already gets a reasonable performance when observing the histograms
and ensuring the acceptance probability to be around 50%, which approximately
requires an equidistribution in β. We employed 64 to 72 different replicas with
50 000 000 sweeps each, from which every 10th value was stored in a time series;
the autocorrelation time in units of sweeps turned out to be of the order of
thousands. Finally, all data are combined by the multi-histogram technique (using
the variant of Ref. 219).
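The acceptance probability mentioned here refers to the exchange of configurations between neighbouring replicas. A minimal sketch of the standard replica-exchange criterion is given below; it assumes canonical weights exp(−βE) for each replica, and the function names are illustrative.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability min(1, exp[(beta_i - beta_j)(E_i - E_j)]) for a swap."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the configurations of replicas i and j should be exchanged."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)
```

Monitoring the fraction of accepted swaps for each neighbouring pair is one practical way to check whether the chosen β_i give the acceptance rate aimed for.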
Apart from the internal energy and specific heat, a particularly useful
quantity for polymeric systems is the squared radius of gyration
$R_{\rm gyr}^2 = \sum_{i=1}^{N} (\vec{r}_i - \vec{r}_{\rm cm})^2$, with
$\vec{r}_{\rm cm} = (x_{\rm cm}, y_{\rm cm}, z_{\rm cm}) = \sum_{i=1}^{N} \vec{r}_i/N$ being the
center-of-mass of the polymer. In the presence of a symmetry breaking substrate, it
is useful to also monitor the tensor components parallel and perpendicular to the
substrate, $R_\parallel^2 = \sum_{i=1}^{N} [(x_i - x_{\rm cm})^2 + (y_i - y_{\rm cm})^2]$ and
$R_\perp^2 = \sum_{i=1}^{N} (z_i - z_{\rm cm})^2$.
As an indicator for adsorption one may take the distance of the center-of-mass of
the polymer to the surface. Additionally, we also analyzed the mean number of
monomers docked to the surface $n_s$ where for the continuous substrate potential
we defined a monomer i to be docked if $z_i < z_c \equiv 1.5$.
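These structural observables can be evaluated for a single conformation as sketched below. The sums follow the expressions as printed here, without a 1/N normalisation (dividing by N is a common alternative convention); monomer positions are assumed to be given as (x, y, z) tuples with the substrate at z = 0, and the function names are illustrative.

```python
def gyration_components(positions):
    """Return center of mass, R_gyr^2, R_par^2 and R_perp^2 for one conformation."""
    n = len(positions)
    x_cm = sum(p[0] for p in positions) / n
    y_cm = sum(p[1] for p in positions) / n
    z_cm = sum(p[2] for p in positions) / n
    r2_par = sum((p[0] - x_cm) ** 2 + (p[1] - y_cm) ** 2 for p in positions)
    r2_perp = sum((p[2] - z_cm) ** 2 for p in positions)
    return {
        "cm": (x_cm, y_cm, z_cm),          # z_cm is the distance to the surface
        "R2_gyr": r2_par + r2_perp,
        "R2_par": r2_par,
        "R2_perp": r2_perp,
    }


def docked_monomers(positions, z_c=1.5):
    """Number of monomers with z_i < z_c, i.e. docked to the substrate."""
    return sum(1 for p in positions if p[2] < z_c)
```

Averaging these quantities over the stored time series, and reweighting where needed, yields the canonical expectation values and their temperature derivatives.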
The main results are summarized in the phase diagram shown in Fig. 19. It
is constructed using the profile of several canonical fluctuations as shown for the
specific heat in Fig. 20. For the non-grafted polymer this plot clearly reveals the
freezing and adsorption transitions. Freezing leads to a pronounced peak near
T = 0.25 (we use units in which kB = 1) almost independently of the surface
attraction strengths. That this is indeed the freezing transition is confirmed by
the very rigid crystalline structures found below this temperature. To differentiate
between the different crystalline structures, the radius of gyration, its tensor
components parallel and perpendicular to the substrate, and the number of surface
contacts were analyzed. This revealed that the crystalline phases arrange
in a different number of layers to minimize the energy. For high surface attraction
strengths, a single layer is favored (AC1), and for decreasing $\epsilon_s$ the number
of layers increases until for the 40mer a maximal number of 4 layers is reached
(AC4), cf. the representative conformations depicted in the right panel of Fig.
19. The fewer layers are involved in a layering transition, the more pronounced
is that transition. Raising the temperature above the freezing temperature, polymers
form adsorbed and still rather compact conformations. This is the phase
of adsorbed globular (AG) conformations that can be subdivided into droplet-like
globules for surface interactions $\epsilon_s$ that are not strong enough to induce a single
layer below the freezing transition and more pancake-like flat conformations
(AG1) at temperatures above the AC1 phase. At higher temperatures, two scenarios
can be distinguished. For small adsorption strength $\epsilon_s$, a non-grafted polymer
first desorbs from the surface [from AG to the desorbed globular (DG) bulk phase]
and disentangles at even higher temperatures [from DG to the desorbed expanded
bulk phase (DE)]. For larger $\epsilon_s$, the polymer expands while it is still adsorbed to
the surface (from AG/AG1 to AE) and desorbs at higher temperatures (from AE
to DE). The collapse transition in the adsorbed phase takes place at a lower tem-
perature compared to the desorbed phase because the deformation at the substrate
leads to an effective reduction of the number of contacts.
Grafting the polymer to the substrate mainly influences the adsorption transition.
Figure 20(b), e.g., reveals that it is strongly weakened for all $\epsilon_s$. Due to
grafting, the translational entropy for desorbed chains is strongly reduced. As a
consequence adsorption of finite grafted polymers appears to be continuous, in
contrast to the non-grafted case where this behaviour becomes apparent for very
long chains only. The reason is that all conformations of a grafted polymer are
influenced by the substrate, because they cannot escape. Hence, the first-order-like
conformational rearrangement of extended non-grafted polymers upon adsorption
is not necessary and the adsorption is continuous.
Fig. 19. The pseudo-phase diagram parametrized by adsorption strength $\epsilon_s$ and temperature $T$ for a
40mer. The gray transition regions have a broadness that reflects the variation of the corresponding
peaks of the fluctuations of canonical expectation values we investigated. Phases with an ‘A/D’ are
adsorbed/desorbed. ‘E’, ‘G’ and ‘C’ denote phases with increasing order: expanded, globular and
compact/crystalline. The right panel shows representative conformations of the individual phases.
Fig. 20. Specific-heat profile, $c_V(\epsilon_s, T)$, for (a) the non-grafted and (b) the grafted polymer.
The case of globular chains has to be discussed separately. While non-grafted
globular chains adsorb continuously, for grafted globular chains it is nontrivial
even to identify an adsorption transition. A globular chain attached to a substrate
always has several surface contacts such that a “desorbed globule” ceases to be
a well-defined description here. For stronger surface attraction one might, however,
identify the transition from attached globules that only have a few contacts
to docked conformations with the wetting transition. This roughly coincides with
the position of the adsorption transition for the free chain between DG and AG
in the phase diagram and is illustrated for $\epsilon_s = 0.7$ in Fig. 21. For a non-grafted
polymer, at the adsorption transition a peak is visible in $c_V(T)$, $d\langle R^2_{{\rm gyr},\perp}\rangle/dT$
and $d\langle n_s\rangle/dT$. For the grafted polymer, on the other hand, the first two peaks
disappear and with them the adsorption transition. Only a signal in the number of
surface contacts is left. This change of surface contacts in an otherwise unchanged
attached globule signals the wetting transition.
Fig. 21. (a) Specific heat $c_V(T)$, (b) fluctuation of the radius of gyration component perpendicular
to the substrate, $d\langle R^2_{{\rm gyr},\perp}\rangle(T)/dT$, and (c) fluctuation of the number of monomers in contact with
the substrate, $d\langle n_s\rangle(T)/dT$, for weak surface attraction, $\epsilon_s = 0.7$, where the adsorption occurs at a
lower temperature than the collapse.
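The temperature derivatives used as transition indicators here can be estimated from canonical fluctuations through the standard identity $d\langle O\rangle/dT = (\langle OE\rangle - \langle O\rangle\langle E\rangle)/T^2$ in units with $k_B = 1$. The following Python sketch illustrates this for time series measured at a fixed temperature. The function names and the normalization of the specific heat per monomer are assumptions of this sketch, not the author's implementation, and no statistical error analysis is included.

```python
def mean(values):
    """Arithmetic mean of a sequence of measurements."""
    return sum(values) / len(values)


def thermal_derivative(obs, energy, temperature):
    """Estimate d<O>/dT = (<O E> - <O><E>) / T^2 with k_B = 1.

    obs and energy are time series of equal length, measured in a
    canonical simulation at the given temperature.
    """
    if len(obs) != len(energy):
        raise ValueError("obs and energy must have the same length")
    mean_oe = mean([o * e for o, e in zip(obs, energy)])
    return (mean_oe - mean(obs) * mean(energy)) / temperature ** 2


def specific_heat(energy, temperature, n_monomers):
    """c_V = (<E^2> - <E>^2) / (N T^2); per-monomer normalization assumed."""
    return thermal_derivative(energy, energy, temperature) / n_monomers
```

Passing a time series of $R^2_{{\rm gyr},\perp}$ or $n_s$ as `obs` gives the corresponding fluctuation quantity. Reliable error bars would additionally require a treatment of autocorrelations, for instance by a binned jackknife analysis.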
To summarize, this example was chosen to illustrate the application of ex-
tensive parallel tempering simulations to analyze and compare the whole phase
8. Concluding Remarks
The aim of this chapter is to give an elementary introduction to the basic principles
underlying modern Markov chain Monte Carlo simulations and to illustrate
their usefulness by two advanced applications to quenched, disordered spin systems
and adsorption phenomena of polymers.
The simulation algorithms employing local update rules are very generally ap-
plicable but suffer from critical slowing down at second-order phase transitions.
Non-local cluster update methods are much more efficient but more specialized.
Some generalizations from Ising to Potts and O(n) symmetric spin models have
been indicated. In principle also other models may be efficiently simulated by
cluster updates, but there does not exist a general strategy for their construction.
Reweighting techniques and generalized ensemble ideas such as simulated and
parallel tempering, the multicanonical ensemble and the Wang-Landau method can
be adapted to almost any statistical physics problem where rare-event states hamper
the dynamics. Well-known examples are first-order phase transitions and spin
glasses, but also some macromolecular systems fall into this class. The performance
of the various algorithms can be judged by statistical error analysis, which
is completely general. Finally, the outlined scaling and finite-size scaling
analyses can also be applied to virtually any model exhibiting critical phenomena, as
was exemplified for a disordered spin system.
Acknowledgements
I thank Yurij Holovatch for his kind invitation to present one of the Ising Lectures
at the Institute for Condensed Matter Physics of the National Academy of Sciences
of Ukraine, Lviv, Ukraine.
I gratefully acknowledge the contributions to the work reviewed here by
my collaborators, in particular Michael Bachmann, Bertrand Berche, Pierre-
Emmanuel Berche, Elmar Bittner, Christophe Chatelain, Monika Möddel,
Thomas Neuhaus, Andreas Nußbaumer, Stefan Schnabel, and Martin Weigel, and
thank Bernd Berg, Kurt Binder, David Landau, Yuko Okamoto, and Bob Swend-
sen for many useful discussions.
This work was partially supported by DFG Grant No. JA 483/24-3, DFG