Next: 1.4 Problems for Chapter Up: 1. Why is water Previous: 1.2 Model systems

# 1.3 Fundamentals of Statistics

By statistics we denote the investigation of regularities in apparently non-deterministic processes. An important basic quantity in this context is the "relative frequency" of an "event". Let us consider a repeatable experiment - say, the throwing of a die - which in each instance leads to one of several possible results - say, $x \in \{1, 2, \dots, 6\}$. Now repeat this experiment $n$ times under equal conditions and register the number of cases in which the specific result $A$ occurs; call this number $n_A$. The relative frequency of $A$ is then defined as $f_A \equiv n_A / n$.

Following R. von Mises we denote as the "probability" $P(A)$ of an event $A$ the expected value of the relative frequency in the limit of infinitely many experiments:

$P(A) \equiv \lim_{n \to \infty} \frac{n_A}{n}$   (1.24)

EXAMPLE: Game die; 100-1000 trials; the relative frequency of any given number of points settles near $1/6 \approx 0.17$.
Now, this definition does not seem very helpful. It implies that we have already done some experiments to determine the relative frequency, and it tells us no more than that we should expect more or less the same relative frequencies when we go on repeating the trials. What we want, however, is a recipe for the prediction of .

To obtain such a recipe we have to reduce the event to so-called "elementary events" that obey the postulate of equal a priori probability. Since the probability of any particular one among $M$ possible elementary events is just $1/M$, we may then derive the probability of a compound event by applying the rules

$P(A \text{ or } B) = P(A) + P(B) \quad$ (mutually exclusive events)   (1.25)
$P(A \text{ and } B) = P(A) \cdot P(B) \quad$ (independent events)   (1.26)

Thus the predictive calculation of probabilities reduces to the counting of the possible elementary events that make up the event in question.
EXAMPLE: The result of a throw is one among $M = 6$ mutually exclusive elementary events with equal a priori probabilities ($P_i = 1/6$). The compound event $A$ = "even number of points" consists of the elementary events $x = 2, 4, 6$; its probability is thus $P(A) = 3/6 = 1/2$.
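The frequency definition lends itself to a quick numerical check. The following Python sketch (added for illustration; the events and trial counts are our own choice) throws a simulated die $n$ times and compares the relative frequency of the compound event "even number of points" with the counting prediction $P = 1/2$.

```python
import random

def relative_frequency(event, n_trials, seed=0):
    """Estimate P(event) as the relative frequency n_A / n for a fair die."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) in event)
    return hits / n_trials

# Compound event "even number of points" = {2, 4, 6};
# counting elementary events predicts P = 3/6 = 1/2.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency({2, 4, 6}, n))
```

As $n$ grows the frequencies scatter less and less around the predicted value, in line with von Mises' definition.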

How might this apply to statistical mechanics? - Let us assume that we have $N$ equivalent mechanical systems with possible states $1, 2, \dots, M$. A relevant question is then: what is the probability of a situation in which
$N_1$ systems are in state $1$, $N_2$ in state $2$, \dots, $N_M$ in state $M$, with $\sum_{i=1}^{M} N_i = N$?   (1.27)

EXAMPLE: $N$ dice are thrown (or one die is thrown $N$ times!). What is the probability that $N_i$ dice each show $i$ points ($i = 1, \dots, 6$)? What, in contrast, is the probability that all $N$ dice show a "one"?

The same example, but with more obviously physical content:
Let $N$ gas atoms be contained in a volume $V$, which we imagine to be divided into $K$ equal partial volumes. What is the probability that at any given time we find $N_i$ particles in each subvolume $i$? And how probable is the extreme particle distribution $(N, 0, \dots, 0)$? (Answer: see below under the heading "multinomial distribution".)
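The subvolume picture can be made concrete with a short simulation (an illustrative Python sketch; the particle and cell numbers are chosen arbitrarily): each atom is placed independently and uniformly in one of $K$ equal cells, and the occupation numbers are recorded.

```python
import random
from collections import Counter

def occupation_numbers(n_atoms, n_cells, seed=1):
    """Place each atom independently and uniformly in one of n_cells
    equal subvolumes and return the occupation numbers N_1 .. N_K."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n_cells) for _ in range(n_atoms))
    return [counts.get(k, 0) for k in range(n_cells)]

# With many atoms each cell holds close to N/K particles; the extreme
# partitioning (all atoms in one cell) practically never occurs.
print(occupation_numbers(60_000, 6))
```

The printed occupation numbers cluster tightly around $N/K = 10000$, anticipating the sharpness results derived below.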
We can generally assume that both the number of systems and the number of accessible states are very large - in the so-called "thermodynamic limit" they are actually taken to approach infinity. This gives rise to certain mathematical simplifications.

Before advancing into the field of physical applications we will review the fundamental concepts and truths of statistics and probability theory, focussing on events that take place in number space, either $x \in \mathbb{R}$ (real numbers) or $k \in \mathbb{N}$ (natural numbers).

DISTRIBUTION FUNCTION
Let $x$ be a real random variate in the region $[a, b]$. The distribution function

$P(x) \equiv \text{prob}\{x' < x\}$   (1.28)

is defined as the probability that a sample $x'$ is smaller than the given value $x$. The function $P(x)$ is monotonically increasing, with $P(a) = 0$ and $P(b) = 1$. The distribution function is dimensionless: $[P(x)] = 1$.

The simplest example is the equidistribution, for which

$P(x) = \frac{x - a}{b - a}, \quad x \in [a, b]$   (1.29)

Another important example, with $a = -\infty$, $b = +\infty$, is the normal distribution
$P(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2}\, dt$   (1.30)

and its generalization, the Gaussian distribution
$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{x} e^{-(t - \langle x \rangle)^{2}/2\sigma^{2}}\, dt$   (1.31)

where the parameters $\langle x \rangle$ and $\sigma$ define an ensemble of functions.

DISTRIBUTION DENSITY
The distribution or probability density $p(x)$ is defined by
$p(x)\, dx \equiv P(x + dx) - P(x)$   (1.32)

In other words, $p(x)$ is just the differential quotient of the distribution function:
$p(x) = \frac{dP(x)}{dx}$   (1.33)

$p(x)$ has a dimension; it is the reciprocal of the dimension of its argument $x$:
$[p(x)] = [x]^{-1}$   (1.34)

For the equidistribution we have
$p(x) = \frac{1}{b - a}, \quad x \in [a, b]$   (1.35)

and for the normal distribution
$p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}$   (1.36)
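The relation $p(x) = dP/dx$ is easy to verify numerically. A small Python sketch (added here for illustration, using the standard normal distribution of eqs. 1.30 and 1.36):

```python
import math

def normal_cdf(x):
    """Distribution function P(x) of the normal distribution (eq. 1.30)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """Distribution density p(x) of the normal distribution (eq. 1.36)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# p(x) = dP/dx: a central finite difference of P reproduces p.
h = 1e-6
for x in (-1.0, 0.0, 0.5, 2.0):
    fd = (normal_cdf(x + h) - normal_cdf(x - h)) / (2.0 * h)
    print(x, fd, normal_pdf(x))
```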

If $x$ is limited to discrete values $x_k$ with a step $\Delta x$ one often writes

$p_k \equiv \text{prob}\{x = x_k\}$   (1.37)

for the probability of the event $x = x_k$. This $p_k$ is by definition dimensionless, although it is related to the distribution density for continuous arguments by $p_k \approx p(x_k)\, \Delta x$. The definition 1.37 includes the special case that $x$ is restricted to integer values $k$; in that case $x_k = k$ and $\Delta x = 1$.

MOMENTS OF A DENSITY
By this we denote the quantities
$\langle x^{n} \rangle \equiv \int_{a}^{b} x^{n}\, p(x)\, dx$   (1.38)

The first moment $\langle x \rangle$ is also called the expectation value or mean value of the distribution density $p(x)$, and the second moment is related to the variance and the standard deviation: variance $\sigma^{2} \equiv \langle x^{2} \rangle - \langle x \rangle^{2}$ (standard deviation $\sigma$ = square root of variance).

EXAMPLES:
1.) For an equidistribution we have $\langle x \rangle = (a + b)/2$, and $\sigma^{2} = (b - a)^{2}/12$.
2.) For the normal distribution we find $\langle x \rangle = 0$ and $\sigma^{2} = 1$.
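The moment formulas for the equidistribution can be checked by direct numerical integration (a Python sketch added here; the limits $a = 0$, $b = 2$ are chosen arbitrarily):

```python
def uniform_moments(a, b, n=200_000):
    """First two moments of p(x) = 1/(b - a) on [a, b], by midpoint rule."""
    dx = (b - a) / n
    p = 1.0 / (b - a)
    m1 = m2 = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        m1 += x * p * dx
        m2 += x * x * p * dx
    return m1, m2 - m1 * m1   # mean <x> and variance sigma^2

mean, var = uniform_moments(0.0, 2.0)
print(mean, var)   # expected: (a+b)/2 = 1 and (b-a)^2/12 = 1/3
```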

SOME IMPORTANT DISTRIBUTIONS

Equidistribution: Its great significance stems from the fact that this distribution is central both to statistical mechanics and to practical numerics. In the theory of statistical-mechanical systems, one of the fundamental assumptions is that all states of a system that have the same energy are equally probable (axiom of equal a priori probability). And in numerical computing the generation of homogeneously distributed pseudo-random numbers is relatively easy; to obtain differently distributed random variates one usually "processes" such primary equidistributed numbers.

Gauss distribution: This distribution pops up everywhere in the quantifying sciences. The reason for its ubiquity is the "central limit theorem": Every random variate that can be expressed as a sum of arbitrarily distributed random variates will in the limit of many summation terms be Gauss distributed. For example, when we have a complex measuring procedure in which a number of individual errors (or uncertainties) add up to a total error, then this error will be nearly Gauss distributed, regardless of how the individual contributions may be distributed. In addition, several other physically relevant distributions, such as the binomial and multinomial densities (see below), approach the Gauss distribution under certain - quite common - circumstances.

Binomial distribution: This discrete distribution describes the probability that in $n$ independent trials an event that has a single-trial probability $p$ will occur exactly $k$ times:
$w_k = \binom{n}{k}\, p^{k} (1 - p)^{n - k}$   (1.39)

For the first two moments of the binomial distribution we have $\langle k \rangle = np$ (not necessarily integer) and $\langle k^{2} \rangle = np(1 - p) + (np)^{2}$ (i.e. $\sigma^{2} = np(1 - p)$).
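These moment relations may be verified by brute-force summation over the binomial density (an illustrative Python sketch; $n = 100$ and $p = 0.3$ are our own choice):

```python
from math import comb

def binomial_moments(n, p):
    """<k> and sigma^2 of w_k = C(n,k) p^k (1-p)^(n-k), by direct summation."""
    w = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * wk for k, wk in enumerate(w))
    second = sum(k * k * wk for k, wk in enumerate(w))
    return mean, second - mean**2

print(binomial_moments(100, 0.3))   # expected: (np, np(1-p)) = (30.0, 21.0)
```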

APPLICATION: Fluctuation processes in statistical systems are often described in terms of the binomial distribution. For example, consider a particle freely roaming a volume $V$. The probability to find it at some given time in a certain partial volume $v$ is $p = v/V$. Considering now $N$ independent particles in $V$, the probability of finding just $n$ of them in $v$ is given by

$w_n = \binom{N}{n} \left(\frac{v}{V}\right)^{n} \left(1 - \frac{v}{V}\right)^{N - n}$   (1.40)

The average number of particles in $v$ and its standard deviation are

$\langle n \rangle = N\,\frac{v}{V}\,, \qquad \sigma = \sqrt{N\,\frac{v}{V}\left(1 - \frac{v}{V}\right)}$   (1.41)

Note that for $v \ll V$ we have for the variance $\sigma^{2} \approx \langle n \rangle$, meaning that the squared population fluctuations in $v$ are then of the same order of magnitude (namely, $\langle n \rangle$) as the mean number of particles itself.

For large $N$ such that $Np \gg 1$ the binomial distribution approaches a Gauss distribution with mean $\langle n \rangle = Np$ and variance $\sigma^{2} = Np(1 - p)$ (theorem of Moivre-Laplace):
$w_n \approx \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left[-\frac{(n - \langle n \rangle)^{2}}{2\sigma^{2}}\right]$   (1.42)

with $\langle n \rangle = Np$ and $\sigma^{2} = Np(1 - p)$.
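The quality of the Moivre-Laplace approximation is easy to probe numerically (a Python sketch; $N = 1000$ and $p = 1/2$ are chosen for illustration):

```python
import math
from math import comb

def binom(n, p, k):
    """Exact binomial density w_k (eq. 1.39)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def gauss(n, p, k):
    """Moivre-Laplace approximation: Gaussian with <n> = Np, sigma^2 = Np(1-p)."""
    mu, var = n * p, n * p * (1 - p)
    return math.exp(-((k - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

N, p = 1000, 0.5
for k in (480, 500, 520):
    print(k, binom(N, p, k), gauss(N, p, k))
```

Near the peak the two densities agree to a fraction of a percent.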

If $N \to \infty$ and $p \to 0$ such that their product $\lambda \equiv Np$ remains finite, the density 1.39 approaches

$w_n = \frac{\lambda^{n}}{n!}\, e^{-\lambda}$   (1.43)

which goes by the name of Poisson distribution.
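The Poisson limit can likewise be watched numerically: hold $\lambda = Np$ fixed while $N$ grows and $p$ shrinks (a Python sketch; the parameters are our own choice):

```python
import math
from math import comb, factorial

def binom(n, p, k):
    """Exact binomial density w_k (eq. 1.39)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson(lam, k):
    """Poisson density w_k = lam^k e^(-lam) / k!  (eq. 1.43)."""
    return lam**k * math.exp(-lam) / factorial(k)

lam = 2.0   # lam = Np held fixed while N grows and p = lam/N shrinks
for n in (10, 100, 10_000):
    print(n, binom(n, lam / n, 3), poisson(lam, 3))
```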

An important element in the success story of statistical mechanics is the fact that with increasing $N$ the sharpness of the distribution 1.39 or 1.42 becomes very large. The relative width of the maximum, i.e. $\sigma / \langle n \rangle$, decreases as $1/\sqrt{N}$. For $N = 10^{4}$ the width of the peak is no more than about one percent of $\langle n \rangle$, and for "molar" orders of particle numbers ($N \approx 10^{24}$) the relative width is already a mere $10^{-12}$. Thus the density approaches a "delta distribution". This, however, renders the calculation of averages particularly simple:

$\langle f(n) \rangle \equiv \sum_{n} f(n)\, w_n \approx f(\langle n \rangle)$   (1.44)

or
$\langle f(x) \rangle \equiv \int f(x)\, p(x)\, dx \approx f(\langle x \rangle)$   (1.45)
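The $1/\sqrt{N}$ decrease of the relative width can be tabulated directly (a Python sketch; $p = 1/2$ is chosen for illustration):

```python
import math

# Relative width sigma/<n> = sqrt((1-p)/(N p)) of the binomial peak:
# for p = 1/2 it is exactly 1/sqrt(N).
p = 0.5
for N in (10**2, 10**4, 10**8, 10**24):
    print(N, math.sqrt((1 - p) / (N * p)))
```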

Multinomial distribution: This is a generalization of the binomial distribution to more than 2 possible results of a single trial. Let $A_1, \dots, A_K$ be the (mutually exclusive) possible results of an experiment; their probabilities in a single trial are $p_i \equiv P(A_i)$, with $\sum_{i} p_i = 1$. Now do the experiment $N$ times; then
$W(N_1, \dots, N_K) = \frac{N!}{N_1!\, N_2! \cdots N_K!}\; p_1^{N_1} p_2^{N_2} \cdots p_K^{N_K}$   (1.46)

is the probability to have the event $A_1$ just $N_1$ times, $A_2$ accordingly $N_2$ times, etc.

We get an idea of the significance of this distribution in statistical physics if we interpret the possible events as "states" that may be taken on by the particles of a system (or, in another context, by the systems in an ensemble of many-particle systems). The above formula then tells us the probability to find $N_1$ among the $N$ particles in state $1$, $N_2$ in state $2$, etc.

EXAMPLE: A die is cast $N$ times (with $N$ a multiple of 6). The probability to find each number of points just $N/6$ times is

$W\!\left(\frac{N}{6}, \dots, \frac{N}{6}\right) = \frac{N!}{[(N/6)!]^{6}}\left(\frac{1}{6}\right)^{N}$   (1.47)

To compare, any less uniform partitioning has a smaller probability. Finally, for the quite improbable case that all $N$ throws show a "one" we have $W(N, 0, \dots, 0) = (1/6)^{N}$.
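Formula 1.46 is simple to evaluate directly. The following Python sketch (our own illustration, with $N = 12$ throws) compares the uniform partitioning of die throws with the extreme one:

```python
from math import factorial

def multinomial_prob(counts, probs):
    """W(N_1..N_K) = N!/(N_1!..N_K!) * p_1^N_1 .. p_K^N_K  (eq. 1.46)."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    w = float(coef)
    for c, p in zip(counts, probs):
        w *= p**c
    return w

# Illustration with our own numbers: 12 throws of a fair die.
print(multinomial_prob([2] * 6, [1 / 6] * 6))              # each face twice
print(multinomial_prob([12, 0, 0, 0, 0, 0], [1 / 6] * 6))  # all "ones": (1/6)^12
```

Even for so few throws the uniform partitioning is about seven million times more probable than the all-ones case.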

Due to its large number of variables ($N_1, \dots, N_K$) we cannot give a graph of the multinomial distribution. However, it is easy to derive the following two important properties:

Approach to a multivariate Gauss distribution: just as the binomial distribution approaches, for large $N$, a Gauss distribution, the multinomial density approaches an appropriately generalized - "multivariate" - Gauss distribution.

Increasing sharpness: if $N$ and the $N_i$ become very large (multiparticle systems; or ensembles of many elements), the function $W(N_1, \dots, N_K)$ has an extremely sharp maximum for a certain partitioning $\{N_i^{*}\}$, namely $N_i^{*} = N p_i$. This particular partitioning of the particles to the various possible states is then "almost always" realized, and all other allotments (or distributions) occur very rarely and may safely be neglected.

This is the basis of the method of the most probable distribution which is used with great success in several areas of statistical physics.[2.2]

STIRLING'S FORMULA
For large values of $n$ the evaluation of the factorial $n!$ is difficult. A handy approximation is Stirling's formula

$n! \approx \sqrt{2\pi n}\, \left(\frac{n}{e}\right)^{n}$   (1.48)

EXAMPLE (near most pocket calculators' limit): $69! = 1.711 \cdot 10^{98}$; Stirling: $1.709 \cdot 10^{98}$.
The same name "Stirling's formula" is often used for the logarithm of the factorial:
$\ln n! \approx n \ln n - n + \frac{1}{2}\ln(2\pi n)$   (1.49)

(The term $\frac{1}{2}\ln(2\pi n)$ may usually be neglected in comparison to $n \ln n - n$.)
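Both forms of the approximation are easily tested against the exact $\ln n!$ (a Python sketch using the log-gamma function):

```python
import math

def ln_factorial(n):
    """Exact ln n! via the log-gamma function: ln n! = lgamma(n + 1)."""
    return math.lgamma(n + 1)

def stirling_ln(n):
    """Stirling: ln n! ~ n ln n - n + (1/2) ln(2 pi n)  (eq. 1.49)."""
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)

# Exact value, full Stirling formula, and the short form n ln n - n.
for n in (10, 100, 10_000):
    print(n, ln_factorial(n), stirling_ln(n), n * math.log(n) - n)
```

The full formula is accurate to about $1/(12n)$, while the short form carries an absolute error of order $\frac{1}{2}\ln(2\pi n)$, which indeed becomes negligible relative to $n \ln n - n$.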

EXAMPLE 1: For $n = 69$: $\ln 69! = 226.2$; the short form yields $n \ln n - n = 223.2$.

EXAMPLE 2: The die is cast again, but now there are $N \gg 1$ trials. When asked to produce $(N/6)!$ most pocket calculators will cancel their cooperation. So we apply Stirling's approximation:
The probability of throwing each number of points just $N/6$ times is

$W\!\left(\frac{N}{6}, \dots, \frac{N}{6}\right) = \frac{N!}{[(N/6)!]^{6}}\left(\frac{1}{6}\right)^{N} \approx \frac{6^{3}}{(2\pi N)^{5/2}}$   (1.50)

and the probability of the extreme partitioning $(N, 0, \dots, 0)$ is $(1/6)^{N}$.

STATISTICAL (IN)DEPENDENCE
Two random variates $x, y$ are statistically mutually independent (uncorrelated) if the distribution density of the compound probability (i.e. the probability density for the joint occurrence of $x$ and $y$) equals the product of the individual densities:
$p(x, y) = p(x)\, p(y)$   (1.51)

EXAMPLE: In a fluid or gas the distribution density for a single component $v_x$ of the particle velocity is given by (Maxwell-Boltzmann)

$p(v_x) = \sqrt{\frac{m}{2\pi kT}}\; e^{-m v_x^{2}/2kT}$   (1.52)

The degrees of freedom $v_x, v_y, v_z$ are statistically independent; therefore the compound probability density is given by

$p(v_x, v_y, v_z) = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-m(v_x^{2} + v_y^{2} + v_z^{2})/2kT}$   (1.53)

By conditional distribution density we denote the quantity
$p(x|y) \equiv \frac{p(x, y)}{p(y)}$   (1.54)

(For uncorrelated $x, y$ we have $p(x|y) = p(x)$.)

The density of a marginal distribution describes the density of one of the variables regardless of the specific value of the other one, meaning that we integrate the joint density over all possible values of the second variable:

$p(x) = \int p(x, y)\, dy$   (1.55)
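Marginalization can be checked numerically for an uncorrelated pair as in eq. 1.51 (a Python sketch; two unit Gaussians are assumed for simplicity): integrating the joint density over $y$ must return the one-dimensional density of $x$.

```python
import math

def joint(x, y):
    """Uncorrelated example: p(x, y) = p(x) p(y) with two unit Gaussians."""
    g = lambda t: math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    return g(x) * g(y)

def marginal(x, ymin=-8.0, ymax=8.0, n=4000):
    """Marginal density p(x) = integral p(x, y) dy (trapezoid rule, eq. 1.55)."""
    dy = (ymax - ymin) / n
    s = sum(joint(x, ymin + i * dy) for i in range(n + 1))
    s -= 0.5 * (joint(x, ymin) + joint(x, ymax))
    return s * dy

# Integrating out y recovers the one-dimensional Gaussian density of x.
print(marginal(0.0), 1.0 / math.sqrt(2.0 * math.pi))
```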

TRANSFORMATION OF DISTRIBUTION DENSITIES
From 1.32 we can immediately conclude how the density will transform if we substitute the variable $x$ by $y = f(x)$. Given some bijective mapping $f$, the conservation of probability requires
$p(y)\, |dy| = p(x)\, |dx|$   (1.56)

(The absolute value appears because we have not required $f(x)$ to be an increasing function.) This leads to
$p(y) = \frac{p(x)}{|dy/dx|}$   (1.57)

or
$p(y) = p(x(y)) \left|\frac{dx}{dy}\right|$   (1.58)

Incidentally, this relation is true for any kind of density, such as mass or spectral densities, and not only for distribution densities.
EXAMPLE 1: A particle of mass $m$ moving in one dimension is assumed to have any velocity in the range $[-v_0, v_0]$ with equal probability; so we have $p(v) = 1/2v_0$. The distribution density for the kinetic energy $E = mv^{2}/2$ is then given by (see Figure 1.11)

$p(E) = 2\, p(v) \left|\frac{dv}{dE}\right| = \frac{1}{v_0 \sqrt{2mE}}$   (1.59)

in the limits $0 \le E \le E_0$, where $E_0 = m v_0^{2}/2$. (The factor $2$ in front of $p(v)$ comes from the ambiguity of the mapping $v \to E$: the velocities $v$ and $-v$ yield the same energy.)
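A Monte Carlo histogram confirms the transformed density of Example 1 (a Python sketch; the values $m = v_0 = 1$ are assumed for convenience):

```python
import math
import random

def energy_histogram(m=1.0, v0=1.0, n=200_000, nbins=20, seed=2):
    """Sample v uniformly on (-v0, v0), histogram E = m v^2 / 2, and return
    bin centers and estimated density, to compare with p(E) = 1/(v0 sqrt(2mE))."""
    rng = random.Random(seed)
    e0 = 0.5 * m * v0 * v0
    de = e0 / nbins
    hist = [0] * nbins
    for _ in range(n):
        e = 0.5 * m * rng.uniform(-v0, v0) ** 2
        hist[min(int(e / de), nbins - 1)] += 1
    centers = [(k + 0.5) * de for k in range(nbins)]
    return centers, [h / (n * de) for h in hist]

es, ps = energy_histogram()
for e, p_est in zip(es, ps):
    # sampled density vs. analytic 1/sqrt(2E) (for m = v0 = 1)
    print(round(e, 4), round(p_est, 3), round(1.0 / math.sqrt(2.0 * e), 3))
```

The histogram reproduces the $1/\sqrt{E}$ divergence at small energies predicted by eq. 1.59.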

EXAMPLE 2: An object is found with equal probability at any point along a circular periphery; so we have $p(\phi) = 1/2\pi$ for $\phi \in [0, 2\pi)$. Introducing cartesian coordinates $x = R\cos\phi$, $y = R\sin\phi$, we find for the distribution density of the coordinate $x$, with $|x| \le R$, that

$p(x) = 2\, p(\phi) \left|\frac{d\phi}{dx}\right| = \frac{1}{\pi\sqrt{R^{2} - x^{2}}}$   (1.60)

(see Figure 1.12)

Problems equivalent to these examples:
a) A homogeneously blackened glass cylinder - or a semitransparent drinking straw - held sideways against a light source: absorption as a function of the distance from the axis?
b) Distribution of the $x$-velocity of a particle that can move randomly in two dimensions, keeping its kinetic energy constant.

Simulation 1.6: Stadium Billiard. Distribution of the velocity component $v_x$. [Code: Stadium]

c) Distribution of the velocity of either of two particles arbitrarily moving in one dimension, keeping only the sum of their kinetic energies constant.

For the joint probability density of several variables $(x_1, \dots, x_n)$ the transformation formula is a direct generalization of 1.58, viz.
$p(y_1, \dots, y_n) = p(x_1, \dots, x_n) \left|\frac{\partial(x_1, \dots, x_n)}{\partial(y_1, \dots, y_n)}\right|$   (1.61)

Here we write $\partial(x_1, \dots, x_n)/\partial(y_1, \dots, y_n)$ for the functional determinant (or Jacobian) of the mapping $(y_1, \dots, y_n) \to (x_1, \dots, x_n)$,
$\frac{\partial(x_1, \dots, x_n)}{\partial(y_1, \dots, y_n)} \equiv \det\left(\frac{\partial x_i}{\partial y_j}\right)$   (1.62)

EXAMPLE 3: Again, let $\mathbf{v} = (v_x, v_y, v_z)$, and $p(v_x, v_y, v_z)$ as in equ. 1.53. Now we write $(v_x, v_y, v_z) \to (v, \theta, \phi)$, with

$v_x = v \sin\theta \cos\phi, \quad v_y = v \sin\theta \sin\phi, \quad v_z = v \cos\theta$   (1.63)

The Jacobian of the mapping is

$\frac{\partial(v_x, v_y, v_z)}{\partial(v, \theta, \phi)} = v^{2} \sin\theta$   (1.64)

Therefore we have for the density of the modulus of the particle velocity

$p(v) = \int_{0}^{2\pi}\! d\phi \int_{0}^{\pi}\! d\theta\; p(v_x, v_y, v_z)\, v^{2} \sin\theta$   (1.65)
$\phantom{p(v)} = 4\pi \left(\frac{m}{2\pi kT}\right)^{3/2} v^{2}\, e^{-m v^{2}/2kT}$   (1.66)
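Example 3 can be verified by sampling: draw the three Cartesian velocity components from the Gaussian density 1.53 and histogram the modulus $v$; the result should follow eq. 1.66 (a Python sketch in reduced units $m/kT = 1$, our own choice):

```python
import math
import random

def speed_histogram(n=200_000, nbins=30, vmax=5.0, seed=3):
    """Sample v_x, v_y, v_z from unit Gaussians (reduced units m/kT = 1) and
    histogram v = |v|; compare with p(v) = sqrt(2/pi) v^2 exp(-v^2/2)."""
    rng = random.Random(seed)
    dv = vmax / nbins
    hist = [0] * nbins
    for _ in range(n):
        v = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3)))
        if v < vmax:
            hist[int(v / dv)] += 1
    centers = [(k + 0.5) * dv for k in range(nbins)]
    est = [h / (n * dv) for h in hist]
    ana = [math.sqrt(2.0 / math.pi) * v * v * math.exp(-0.5 * v * v) for v in centers]
    return centers, est, ana

c, est, ana = speed_histogram()
for k in (3, 9, 15):
    print(round(c[k], 3), round(est[k], 4), round(ana[k], 4))
```

The sampled histogram matches the Maxwell speed distribution within statistical noise, confirming the factor $4\pi v^{2}$ produced by the Jacobian.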

Franz Vesely
2005-01-25