Stanford Encyclopedia of Philosophy
This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Boltzmann's Work in Statistical Physics

First published Wed Nov 17, 2004

Ludwig Boltzmann (1844-1906) is generally acknowledged as one of the most important physicists of the nineteenth century. Particularly famous is his statistical explanation of the second law of thermodynamics. The celebrated formula S = k logW, expressing a relation between entropy S and probability W has been engraved on his tombstone (even though he never actually wrote this formula down). Boltzmann's views on statistical physics continue to play an important role in contemporary debates on the foundations of that theory.

However, Boltzmann's ideas on the precise relationship between the thermodynamical properties of macroscopic bodies and their microscopic constitution, and on the role of probability in this relationship, are involved, and they differed quite remarkably in different periods of his life. Indeed, in his first paper in statistical physics of 1866, he claimed to obtain a completely general theorem from mechanics that would prove the second law. However, thirty years later he stated that the second law could never be proved by mechanical means alone, but depended essentially on probability theory. In his lifelong struggle with the problem he employed a varying arsenal of tools and assumptions. (To mention a few: the so-called Stoßzahlansatz, the ergodic hypothesis, ensembles, the permutational argument, the hypothesis of molecular disorder.) However, the exact role of these assumptions, and the results he obtained from them, also shifted in the course of time. Particularly notorious are the role of the ergodic hypothesis and the status of the so-called H-theorem. Moreover, he used ‘probability’ in four different technical meanings. It is, therefore, not easy to speak of a consistent, single "Boltzmannian approach" to statistical physics. It is the purpose of this essay to describe the evolution of a selection of these approaches and their conceptual problems.


1. Introduction

1.1 Popular perceptions of Boltzmann

Boltzmann's work met with mixed reactions during his lifetime, and continues to do so even today. It may be worthwhile, therefore, to devote a few remarks to the perception and reception of his work.

Boltzmann is often portrayed as a staunch defender of the atomic view of matter, at a time when the dominant opinion in the German-speaking physics community, led by influential authors like Mach and Ostwald, disapproved of this view. Indeed, the story goes, in the late nineteenth century any attempt at all to search for a hypothetical, microphysical underpinning of macroscopic phenomena was regarded as suspicious. Further, serious criticism of his work was raised by Loschmidt and Zermelo. Various passages in Boltzmann's writing, especially in the late 1890s, complain that his work was hardly noticed (he entitled one article "On some of my lesser-known papers on gas theory and their relation to the same" (1879b)), or even speak of a "hostile attitude" (1898a, v) towards gas theory, and of his awareness of "being a powerless individual struggling against the currents of the time" (ibid.).

Thus, the myth has arisen that Boltzmann was ignored or resisted by his contemporaries.[1] Sometimes, his suicide in 1906 is attributed to the injustice he thus suffered. The fact that his death occurred just at the dawn of the definitive victory of the atomic view in the works of Einstein, Smoluchowski, Perrin et al. adds a further touch of drama to this picture.

As a matter of fact, Boltzmann was widely known and well respected as a theoretical physicist. In 1888 he was offered (but declined, after a curious sequence of negotiations) a most prestigious chair in Berlin. Later, several universities (Vienna, Munich, Leipzig) competed to get him appointed, sometimes putting the salaries of several professorships together in their effort (Lindley 2001). He was elected to membership or honorary membership in many academies (cf. Höflechner 1994, 192), received honorary doctorates, and was also awarded various medals. In short, there is no factual evidence for the claim that Boltzmann was ignored or suffered any unusual lack of recognition from his contemporaries. His suicide seems to have been due to factors in his personal life (depressions and decline of health) rather than to any academic matters.

1.2 Debates and controversies

Boltzmann was involved in various disputes. But this is not to say that he was the innocent victim of hostilities. In many debates he took the initiative by launching a polemic attack on his colleagues. The most important debates were those with Mach and Ostwald on the reality of atoms, and with colleagues who criticized Boltzmann's own work in the form of the famous reversibility objection (Loschmidt) and the recurrence objection (Zermelo).

Ostwald and Mach clearly resisted the atomic view of matter (although for different reasons). Boltzmann certainly defended and promoted this view. But he was not the naive realist or unabashed believer in the existence of atoms that the more popular literature has made of him. Instead, he stressed from the 1880s onwards that the atomic view yielded at best an analogy, or a picture or model of reality (cf. de Regt 1999). In his debate with Mach he advocated (1897c, 1897d) this approach as a useful or economical way to understand the thermal behavior of gases. This means that his views were quite compatible with Mach's views on the goal of science.[2] What divided them was more a strategic issue. Boltzmann claimed that no approach in natural science that avoids hypotheses completely could ever succeed. He argued that those who reject the atomic hypothesis in favor of a continuum view of matter were guilty of adopting hypotheses too. Ultimately, the choice between such views should depend on their fruitfulness, and here Boltzmann had no doubt that the atomic hypothesis would be more successful.[3]

In the case of Ostwald, and his ‘energetics’, Boltzmann did become involved in a more heated dispute at a meeting in Lübeck in 1895. Roughly speaking, energetics presented a conception of nature that took energy as the most fundamental physical entity, and thus represented physical processes as transformations of various forms of energy. It resisted attempts to comprehend energy, or these transformations in terms of mechanical pictures.

It has been suggested that in the 1890s "the adherents of energetics reigned supreme in the German school and even throughout Europe" (Dugas 1959, 82). But this is surely a great exaggeration. It seems closer to the truth to say that energetics represented a rather small but vocal minority in the physics community, that claimed to put forward a seemingly attractive conception of natural science, and being promoted in the mid-90s by reputed scientists, could no longer be dismissed as the work of amateurs (cf. Deltete 1999).

The 1895 gathering of the Naturforscherversammlung in Lübeck (the annual meeting of physicists, chemists, biologists and physicians) was programmed to devote special sessions to the state of the art of energetics. Boltzmann, who was a member of the programme committee, had already shown interest in the development of energetics in private correspondence with Ostwald. Georg Helm was asked to prepare a report, and at Boltzmann's own suggestion, Ostwald also contributed a lecture. All agreed that the meeting should follow the "British style", i.e., manuscripts would be circulated beforehand and there would be ample room for discussion, following the example of the British Association for the Advancement of Science meeting that Boltzmann had attended the previous year.

Both Helm and Ostwald, apparently, anticipated that they would have the opportunity to discuss their views on energetics in an open-minded atmosphere. But at the meeting Boltzmann surprised them with devastating criticism. According to those who were present, Boltzmann was the clear winner of the debate.[4] Yet the energeticists experienced the confrontation as an ambush (Höflechner 1994, I, 169), for which they had not been prepared. Nevertheless, Boltzmann and Ostwald remained friends, and in 1902 Ostwald made a great effort to persuade his home university in Leipzig to appoint Boltzmann (cf. Blackmore 1995, 61–65).

Neither is there any hostile attitude in the famous ‘reversibility objection’ by Loschmidt in 1876. Loschmidt was Boltzmann's former teacher and later colleague at the University of Vienna, and a life-long friend. He had no philosophical reservations against the existence of atoms at all. (Indeed, he is best known for his estimate of their size.) Rather, his main objection was against the prediction by Maxwell and Boltzmann that a gas column in thermal equilibrium in a gravitational field has the same temperature at all heights. His now famous reversibility objection arose in his attempts to undermine this prediction. Whether Boltzmann succeeded in refuting the objection or not is still a matter of dispute, as we shall see below (section 4.1).

Zermelo's opposition had a quite different background. When he put forward the recurrence objection in 1896, he was an assistant to Planck in Berlin. And like his mentor, he did not favor the mechanical underpinning of thermal phenomena. Yet his 1896 paper (Zermelo 1896a) is by no means hostile. It presents a careful logical argument that leads him to a dilemma: thermodynamics with its Second Law on the one hand and gas theory (in the form in which Zermelo understood it) on the other cannot both be literally true. By contrast, it is Boltzmann's (1896b) reaction to Zermelo, drenched in sarcasm and bitterness, which (if anything) may have led to hostile feelings between these two authors. In any case, the tone of Zermelo's (1896b) is considerably sharper. Still, Zermelo maintained a keen, yet critical, interest in gas theory and statistical physics, and subsequently played an important role in making Gibbs' work known in Germany.

In fact, I think that Boltzmann's rather aggressive reactions to Zermelo and Ostwald should be compared to other polemical exchanges in which he was involved, and which he sometimes initiated himself (e.g. against Clausius, Tait, Planck, and Bertrand, not to mention his essay on Schopenhauer). It seems to me that Boltzmann enjoyed polemics, and the use of sharp language for rhetorical effect.[5] Boltzmann's complaints in 1896–1898 about a hostile environment are, I think, partly explained by his love of polemic exaggerations, partly also by his mental depression in that period. (See Höflechner 1994, 198–202, for details.) Certainly, the debates with Ostwald and Zermelo might well have contributed to this personal crisis. But it would be wrong to interpret Boltzmann's plaintive moods as evidence that his critics were, in fact, hostile.

Even today, commentators on Boltzmann's works are divided in their opinion. Some praise them as brilliant and exceptionally clear. Often one finds passages suggesting he possessed all the right answers all along the way — or at least in his later writings, while his critics were simply prejudiced, confused or misguided (von Plato, Lebowitz, Kac, Bricmont, Goldstein). Others (Ehrenfests, Klein, Truesdell) have emphasized that Boltzmann's work is not always clear and that he often failed to indicate crucial assumptions or important changes in his position, while friendly critics helped him in clarifying and developing his views.

Fans and critics of Boltzmann's work alike agree that he pioneered many of the approaches currently used in statistical physics, but also that he did not leave behind a unified coherent theory. His scientific papers, collected in Wissenschaftliche Abhandlungen, contain more than 100 papers on statistical physics alone. Some of these papers are forbiddingly long, full of tedious calculations and lack a clear coherent structure. Sometimes, vital assumptions, or even a complete change of approach, are stated only somewhere tucked away between the calculations, or on the very last page. Even Maxwell, who might have been in the best position to appreciate Boltzmann's work, expressed his difficulty with Boltzmann's longwindedness (in a letter to Tait, August 1873; see Garber, Brush, and Everett 1995, 123).[6] But not all of his prose is cumbersome and heavy-going. Boltzmann at his best could be witty, passionate and a delight to read. He excelled in such qualities in much of his popular work and some of his polemical articles.

1.3 Boltzmann's relevance for the foundations of statistical physics

The foundations of statistical physics may today be characterized as a battlefield between a dozen or so different schools, each firmly dug into their own trenches, e.g., ergodic theory, coarse-graining (Markovianism), interventionism, BBGKY, Jaynes, Prigogine, etc. Still, many of the protagonists of these schools, regardless of their disagreements, frequently express their debt to ideas first formulated by Boltzmann. Even to those who consider the concept of ensembles as the most important tool of statistical physics, and claim Gibbs rather than Boltzmann as their champion, it has been pointed out that Boltzmann introduced ensembles long before Gibbs. And those who advocate Boltzmann while rejecting ergodic theory may similarly be reminded that the latter theory too originated with Boltzmann himself.

It appears, therefore, that Boltzmann is the father of many approaches, even if these approaches are presently seen as conflicting with each other. This is due to the fact that during his forty years of work on the subject, Boltzmann pursued many lines of thought. Typically, he would follow a particular train of thought that seemed promising and fruitful, only to discard it in the next paper for another one, and then pick it up again years later. This meandering approach is of course not unusual among theoretical physicists but it makes it hard to pin down Boltzmann on a particular set of rock-bottom assumptions, that would reveal his true colors in the modern debate on the foundations of statistical physics. The Ehrenfests (1912) in their famous Encyclopedia article, set themselves the task of constructing a more or less coherent framework out of Boltzmann's legacy. But their presentation of Boltzmann was, as is rather well known, not historically adequate.

Without going into a more detailed description of the landscape of the battlefield of the foundations of statistical physics, or a sketch of the various positions occupied, it might be useful to mention only the roughest of distinctions. I use the term ‘statistical physics’ as a deliberately vague term that includes at least two more sharply distinguished theories: the kinetic theory of gases and statistical mechanics proper.

The first theory aims to explain the properties of gases by assuming that they consist of a very large number of molecules in rapid motion. (The term ‘kinetic’ is meant to underline the vital importance of motion here, and to distinguish the approach from older static molecular gas models.) During the 1860s probability considerations were imported into this theory. The aim then became to characterize the properties of gases, in particular in thermal equilibrium, in terms of probabilities of various molecular states. This is what the Ehrenfests call "kineto-statistics of the molecule". Here, molecular states, in particular their velocities, are regarded as stochastic variables, and probabilities are attached to such molecular states of motion. These probabilities themselves are conceived of as mechanical properties of the state of the total gas system. Either they represent the relative number of molecules with a particular state, or the relative time during which a molecule has that state.

In the course of time a transition was made to what the Ehrenfests called "kineto-statistics of the gas model", or what is nowadays known as statistical mechanics. In this latter approach, probabilities are not attached to the state of a molecule but of the entire gas system. Thus, the state of the gas, instead of determining the probability distribution, now itself becomes a stochastic variable.

A merit of this latter approach is that interactions between molecules can be taken into account. Indeed, the approach is not restricted to gases, but also applicable to liquids or solids. The price to be paid, however, is that the probabilities themselves become more abstract. Since probabilities are attributed to the mechanical states of the total system, they are no longer determined by such mechanical states. Instead, in statistical mechanics, the probabilities are usually determined by means of an ‘ensemble’, i.e., a fictitious collection of replicas of the system in question.

It is not easy to pinpoint this transition in the course of history, except to say that Maxwell's work in the 1860s definitely belongs to the first category, and Gibbs' book of 1902 to the second. Boltzmann's own works fall somewhere in the middle. His earlier contributions clearly belong to the kinetic theory of gases (although his 1868 paper already applies probability to an entire gas system), while his work after 1877 is usually seen as elements in the theory of statistical mechanics. However, Boltzmann himself never indicated a clear distinction between these two different theories, and any attempt to draw a demarcation at an exact location in his work seems somewhat arbitrary.

From a conceptual point of view, the transition from kinetic gas theory to statistical mechanics poses two main foundational questions. On what grounds do we choose a particular ensemble, or the probability distribution characterizing the system? Gibbs did not enter into a systematic discussion of this problem, but only discussed special cases of equilibrium ensembles (i.e. canonical, micro-canonical etc.). A second problem is to relate the ensemble-based probabilities with the probabilities obtained in the earlier kinetic approach for a single gas model.

The Ehrenfests' (1912) paper was the first to recognize these questions, and to provide a partial answer: Assuming a certain hypothesis of Boltzmann's, which they dubbed the ergodic hypothesis, they pointed out that for an isolated system the micro-canonical distribution is the unique stationary probability distribution. Hence, if one demands that an ensemble of isolated systems describing thermal equilibrium must be represented by a stationary distribution, the only choice for this purpose is the micro-canonical one. Similarly, they pointed out that under the ergodic hypothesis infinite time averages and ensemble averages are identical. This, then, would provide a desired link between the probabilities of the older kinetic gas theory and those of statistical mechanics, at least in equilibrium and in the infinite time limit. Yet the Ehrenfests simultaneously expressed strong doubts about the validity of the ergodic hypothesis. These doubts were soon substantiated when in 1913 Rosenthal and Plancherel proved that the hypothesis was untenable for realistic gas models.

The Ehrenfests' reconstruction of Boltzmann's work thus gave a prominent role to the ergodic hypothesis, suggesting that it played a fundamental and lasting role in his thinking. Although this view indeed produces a more coherent view of his multifaceted work, it is certainly not historically correct. Boltzmann himself also had grave doubts about this hypothesis, and expressly avoided it whenever he could, in particular in his two great papers of 1872 and 1877b. Since the Ehrenfests, many other authors have presented accounts of Boltzmann's work. Particularly important are Klein (1973) and Brush (1976). Still, much confusion remains about what exactly his approach to statistical physics was, and how it developed.

1.4 A concise chronography of Boltzmann's writings

Roughly speaking, one may divide Boltzmann's work into four periods. The period 1866-1871 is more or less his formative period. In his first paper (1866), Boltzmann set himself the problem of deriving the full second law from mechanics. The notion of probability does not appear in this paper. The following papers, from 1868 and 1871, were written after Boltzmann had read Maxwell's work of 1860 and 1867. Following Maxwell's example, they deal with the characterization of a gas in thermal equilibrium, in terms of a probability distribution. Even then, he was set on obtaining more general results, and extended the discussion to cases where the gas is subject to a static external force, and might consist of poly-atomic molecules. He regularly switched between different conceptions of probability: sometimes this referred to a time average, sometimes a particle average or, in an exceptional paper (1871b), it referred to an ensemble average. The main result of those papers is that, given the Stoßzahlansatz (SZA) (or an analogous assumption), the Maxwellian distribution function is stationary, and thus an appropriate candidate for the equilibrium state. In some cases Boltzmann also argued it was the unique such state.

However, in this period he also presented a completely different method, which did not rely on the SZA but rather on the ergodic hypothesis. This approach led to a new form of the distribution function that, in the limit N → ∞, reduces to the Maxwellian form. In the same period, he also introduced the concept of ensembles, but this concept would not play a prominent role in his thinking until the 1880s.

The next period is that of 1872-1878, in which he wrote his two most famous papers: (1872) (Weitere Studien) and (1877b) (Über die Beziehung). The 1872 paper contained the Boltzmann equation and the H-theorem. Boltzmann claimed that the H-theorem provided the desired theorem from mechanics corresponding to the second law. However, this claim came under a serious objection due to Loschmidt's criticism of 1876. The objection was simply that no purely mechanical theorem could ever produce a time-asymmetrical result. Boltzmann's response to this objection will be summarized later.

The result was, however, that Boltzmann rethought the basis of his approach and in 1877b produced a conceptually very different analysis, which might be called the permutational argument, of equilibrium and evolutions towards equilibrium, and the role of probability theory. The distribution function, which formerly represented the probability distribution, was now conceived of as a stochastic variable (nowadays called a macrostate) subject to a probability distribution. That probability distribution was now determined by the size of the volume in phase space corresponding to all the microstates giving rise to the same macrostate (essentially given by calculating all permutations of the particles in a given macrostate). Equilibrium was now conceived of as the most probable macrostate instead of a stationary macrostate. The evolution towards equilibrium could then be reformulated as an evolution from less probable to more probable states.
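The counting behind the permutational argument can be illustrated with a small sketch. It rests on simplifying assumptions that are not taken from Boltzmann's text: the particles are labelled, the energy levels are the integers 0, 1, ..., K−1, and the function names (`multiplicity`, `most_probable_macrostate`) are hypothetical. The multiplicity of a macrostate, specified by its occupation numbers, is the number of permutations of the particles that leave those occupation numbers unchanged; the most probable macrostate is the one of greatest multiplicity among those with the prescribed particle number and total energy.

```python
import math
from itertools import product


def multiplicity(occupations):
    """Number of assignments of labelled particles to levels that yield the
    given occupation numbers: N! / (n_0! * n_1! * ...)."""
    w = math.factorial(sum(occupations))
    for n in occupations:
        w //= math.factorial(n)
    return w


def most_probable_macrostate(n_particles, total_energy, n_levels):
    """Brute-force search over occupation numbers (n_0, ..., n_{K-1}) of the
    toy energy levels 0, 1, ..., K-1, at fixed particle number and energy."""
    best, best_w = None, 0
    for occ in product(range(n_particles + 1), repeat=n_levels):
        if sum(occ) != n_particles:
            continue  # wrong number of particles
        if sum(level * n for level, n in enumerate(occ)) != total_energy:
            continue  # wrong total energy
        w = multiplicity(occ)
        if w > best_w:
            best, best_w = occ, w
    return best, best_w


if __name__ == "__main__":
    print(most_probable_macrostate(n_particles=3, total_energy=3, n_levels=4))
```

The brute-force enumeration is only feasible for very small systems; it is meant to make the notion of "most probable macrostate" concrete, not to reproduce Boltzmann's analytical treatment.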

Even though all commentators agree on the importance of these two papers, there is still disagreement about what Boltzmann's claims actually were, and whether he succeeded (or indeed even attempted) in avoiding the reversibility objection in this new permutational argument, whether he intended or succeeded to prove that most evolutions go from less probable to more probable states and whether or not he (implicitly) relied on the ergodic hypothesis in these works. I shall comment on these issues in due course.

The third period is taken up by the papers Boltzmann wrote during the 1880s, which have attracted much less attention. During this period, he abandoned the permutational argument, and went back to an approach that relied on a combination of the ergodic hypothesis and the use of ensembles. For a while Boltzmann worked on an application of this approach to Helmholtz's concept of monocyclic systems. However, after finding that this concept did not always provide the desired thermodynamical analogies, he abandoned this topic again.

Next, in the 1890s the reversibility problem resurfaced, this time in a debate in the columns of Nature. Here Boltzmann chose an entirely different line of counterargument from the one he had used in his debate with Loschmidt. A few years later, Zermelo presented another objection, now called the recurrence objection. The same period also saw the publication of the two volumes of his Lectures on Gas Theory. In this book, he takes the hypothesis of molecular disorder (a close relative of the SZA) as the basis of his approach. The permutational argument is only discussed as an aside, and the ergodic hypothesis is not mentioned at all. His last paper is an Encyclopedia article with Nabl presenting a survey of kinetic theory.

2. The Stoßzahlansatz and the ergodic hypothesis

Boltzmann's first paper (1866) in statistical physics aimed to reduce the second law to mechanics. Within the next two years he became acquainted with Maxwell's papers on gas theory of 1860 and 1867, which introduced probability notions in the description of the gas. Maxwell had studied specific mechanical models for a gas (as a system of hard spheres (1860) or of point particles exerting a mutual force on each other inversely proportional to the fifth power of their distance (1867)), and characterized the state of such a gas by means of a probability distribution $f$ over the various values of the molecular velocities $\vec{v}$. For Maxwell, the probability $f(\vec{v})\,d^3\vec{v}$ denoted the relative number of particles in the gas with a velocity between $\vec{v}$ and $\vec{v} + d^3\vec{v}$. In particular, he had argued that the state of equilibrium is characterized by the so-called Maxwell distribution function:

(1)     $f(\vec{v}) = A e^{-|\vec{v}|^2/B}$

where A is a normalization constant and B is proportional to the absolute temperature.
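As a rough numerical illustration of (1), the sketch below fixes the normalization constant under the assumption (not stated explicitly in the text) that f integrates to one over velocity space, which gives A = (πB)^(−3/2). The function names and the value B = 2 are arbitrary choices made for the example. The sketch also shows that f depends on the velocity only through its magnitude, and that it is a product of three identical one-dimensional factors.

```python
import math


def maxwell_density(vx, vy, vz, B=2.0):
    """Maxwell distribution f(v) = A * exp(-|v|^2 / B)."""
    A = (math.pi * B) ** -1.5  # assumption: f is normalized to 1
    return A * math.exp(-(vx * vx + vy * vy + vz * vz) / B)


def component_density(u, B=2.0):
    """One-dimensional factor g(u), with f(vx, vy, vz) = g(vx) g(vy) g(vz)."""
    return (math.pi * B) ** -0.5 * math.exp(-u * u / B)


def integrate(func, lo=-20.0, hi=20.0, steps=40000):
    """Midpoint-rule integral; adequate for a rapidly decaying integrand."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    # Each one-dimensional factor should integrate to (approximately) 1.
    print(integrate(component_density))
    # Two velocities of equal magnitude receive equal density.
    print(maxwell_density(1.0, 2.0, 2.0), maxwell_density(3.0, 0.0, 0.0))
```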

The argument that Maxwell had given in 1860 to single out this distribution relied on the fact that this is the only probability distribution that is both spherically symmetric and factorizes into functions of the orthogonal components $v_x$, $v_y$, $v_z$ separately. In 1867, however, he replaced these desiderata with the more natural requirement that the equilibrium distribution should be stationary, i.e. it should not change shape as a result of the continual collisions between the particles. This called for a more elaborate argument, involving a detailed consideration of the collisions between particles. The crucial assumption in this argument is what is now known as the SZA. Roughly speaking, it states that the number of particle pairs, $dN(\vec{v}_1,\vec{v}_2)$, with initial velocities between $\vec{v}_1$ and $\vec{v}_1 + d^3\vec{v}_1$, and between $\vec{v}_2$ and $\vec{v}_2 + d^3\vec{v}_2$, respectively, which are about to collide in a time span $dt$ is proportional to

(2)     $dN(\vec{v}_1,\vec{v}_2) \propto N^2 f(\vec{v}_1) f(\vec{v}_2)\, dt\, d^3\vec{v}_1\, d^3\vec{v}_2$

where the proportionality constant depends on the geometry of the collision and the relative velocity. For Maxwell, and Boltzmann later, this assumption seemed almost self-evident. One ought to note, however, that by choosing the initial, rather than the final velocities of the collision, the assumption introduced an explicit time-asymmetric element. This, however, was not noticed until 1895. Maxwell showed that, under the SZA, the distribution (1) is indeed stationary. He also argued, but much less convincingly, that it should be the only stationary distribution.
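The stationarity of (1) under the SZA can be made plausible with a small numerical check. The sketch assumes equal-mass hard spheres colliding elastically, and its function names are hypothetical. By (2), the rate of collisions that remove a pair with velocities v1, v2 is proportional to f(v1)f(v2), while the rate of the inverse collisions that restore such a pair is proportional to f(v1')f(v2'), where the primes denote the outgoing velocities. For the Maxwell distribution the two products coincide, because an elastic collision conserves the total kinetic energy, so gains and losses balance for every type of collision.

```python
import math
import random


def maxwell(v, B=2.0):
    """Unnormalized Maxwell factor exp(-|v|^2 / B) for a velocity 3-vector."""
    return math.exp(-sum(c * c for c in v) / B)


def collide(v1, v2, n):
    """Elastic collision of two equal-mass hard spheres (an assumption of
    this sketch); n is the unit vector along the line of centres."""
    rel = sum((a - b) * c for a, b, c in zip(v1, v2, n))  # (v1 - v2) . n
    v1_out = tuple(a - rel * c for a, c in zip(v1, n))
    v2_out = tuple(b + rel * c for b, c in zip(v2, n))
    return v1_out, v2_out


def random_unit_vector(rng):
    """Uniformly distributed direction, from a normalized Gaussian vector."""
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in x))
    return tuple(c / norm for c in x)


def max_imbalance(trials=1000, seed=0):
    """Largest relative difference between f(v1) f(v2) and f(v1') f(v2')
    over randomly chosen collisions."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        v1 = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        v2 = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        w1, w2 = collide(v1, v2, random_unit_vector(rng))
        before = maxwell(v1) * maxwell(v2)
        after = maxwell(w1) * maxwell(w2)
        worst = max(worst, abs(before - after) / before)
    return worst


if __name__ == "__main__":
    print(max_imbalance())  # should be at the level of rounding error
```

The check only concerns stationarity; it says nothing about Maxwell's further, less convincing claim that (1) is the only stationary distribution.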

In his (1868), Boltzmann set out to apply this argument to a variety of other models (including gases in a static external force field). However, Boltzmann started out with a somewhat different interpretation of probability in mind than Maxwell. For him, $f(\vec{v})\,d^3\vec{v}$ is introduced firstly as the relative time during which a (given) particle has a velocity between $\vec{v}$ and $\vec{v} + d^3\vec{v}$ (WA I, 50). But, in the same breath, he identifies this with the relative number of particles with this velocity. This equivocation between different meanings of probability returned again and again in Boltzmann's writing.[7] Either way, of course, whether we average over time or particles, probabilities are defined here in strictly mechanical terms, and are therefore objective properties of the gas. Yet apart from this striking difference in interpretation, the first section of the paper is a straightforward continuation of the ideas Maxwell had developed in his 1867. In particular, the main role is always played by the SZA, or a version of that assumption suitably modified for the case discussed.

But in the last section of the paper he suddenly shifts course. He now focuses on a general Hamiltonian system, i.e., a system of $N$ material points with an arbitrary interaction potential. The state of this system may be represented as a phase point $x = (\vec{p}_1,\ldots,\vec{p}_N,\vec{q}_1,\ldots,\vec{q}_N)$ in the mechanical phase space $\Gamma$. By the Hamiltonian equations of motion, this point evolves in time, and thus describes a trajectory $x_t$. This trajectory is constrained to lie on a given energy hypersurface $H(x) = E$, where $H(x)$ denotes the Hamiltonian function.

Now consider an arbitrary probability density $\rho(x)$ over this phase space. He shows, by (what is now known as) Liouville's theorem, that $\rho$ remains constant along a trajectory, i.e., $\rho(x_0) = \rho(x_t)$. Assuming now for simplicity that all points in a given energy hypersurface lie on a single trajectory, the probability should be a constant over the energy hypersurface. In other words, the only stationary probability with fixed total energy is the microcanonical distribution

(3)     $\rho_{mc} = \delta(H(x) - E)$,

where δ is Dirac's delta function.
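A toy case in which the simplifying assumption is trivially satisfied may help to fix ideas. For a one-dimensional harmonic oscillator with H = (p² + q²)/2 (an illustrative choice, not one of Boltzmann's models), the energy "hypersurface" is a circle that a single trajectory traverses completely. The sketch below, with hypothetical function names, compares the time average of q² along a numerically integrated trajectory with its average under the distribution (3). Since the gradient of H has constant length on the circle, the weight δ(H − E) is uniform in the polar angle, which is what the second function uses.

```python
import math


def time_average_q2(energy, periods=100, steps_per_period=1000):
    """Time average of q^2 along a leapfrog-integrated trajectory of
    H = (p^2 + q^2) / 2, started at the turning point (q, p) = (sqrt(2E), 0)."""
    dt = 2.0 * math.pi / steps_per_period
    q, p = math.sqrt(2.0 * energy), 0.0
    n_steps = periods * steps_per_period
    total = 0.0
    for _ in range(n_steps):
        p_half = p - 0.5 * dt * q   # half kick: dp/dt = -q
        q += dt * p_half            # drift:     dq/dt = p
        p = p_half - 0.5 * dt * q   # half kick
        total += q * q
    return total / n_steps


def microcanonical_average_q2(energy, steps=10000):
    """Average of q^2 with weight delta(H - E). For this Hamiltonian the
    weight is uniform in the polar angle of the circle p^2 + q^2 = 2E."""
    radius = math.sqrt(2.0 * energy)
    total = 0.0
    for i in range(steps):
        theta = 2.0 * math.pi * (i + 0.5) / steps
        total += (radius * math.cos(theta)) ** 2
    return total / steps


if __name__ == "__main__":
    print(time_average_q2(1.0), microcanonical_average_q2(1.0))
```

In more than one degree of freedom nothing guarantees that a single trajectory covers the energy hypersurface, which is exactly the point at which the assumption becomes problematic.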

By integrating this expression over all momenta but one, and dividing this by the integral of $\rho_{mc}$ over all momenta, Boltzmann obtained the marginal probability density $\rho_{mc}(\vec{p}_1 \mid \vec{q}_1,\ldots,\vec{q}_N)$ for particle 1's momentum, conditionalized on the particle positions $\vec{q}_1,\ldots,\vec{q}_N$. He then showed that this marginal probability distribution tends to the Maxwell distribution when the number of particles tends to infinity.
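The limit can be checked numerically in the simplest special case. The sketch assumes an ideal gas (interaction potential set to zero), particles of equal mass m, and a total energy E = N·e that grows in proportion to N; these are assumptions of the example, not of Boltzmann's more general argument, and the function names are hypothetical. Integrating δ(H − E) over the remaining 3N − 3 momentum components then gives a marginal density proportional to (1 − |p|²/(2mE))^((3N−5)/2) for |p|² < 2mE, which should approach the Maxwellian exp(−|p|²/(2mkT)) with kT = 2e/3.

```python
import math


def microcanonical_marginal(p, n_particles, energy_per_particle=1.5, mass=1.0):
    """Unnormalized marginal density of one particle's momentum (of magnitude
    p) for an ideal gas of n_particles with total energy N * e."""
    total_energy = n_particles * energy_per_particle
    r2 = 2.0 * mass * total_energy          # squared radius of the energy sphere
    if p * p >= r2:
        return 0.0                          # momentum exceeds the total energy
    exponent = (3 * n_particles - 5) / 2.0
    return (1.0 - p * p / r2) ** exponent


def maxwell_limit(p, energy_per_particle=1.5, mass=1.0):
    """Unnormalized Maxwellian exp(-p^2 / (2 m kT)) with kT = 2 e / 3."""
    kT = 2.0 * energy_per_particle / 3.0
    return math.exp(-p * p / (2.0 * mass * kT))


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, microcanonical_marginal(1.0, n), maxwell_limit(1.0))
```

With an interaction potential, the same computation goes through with E replaced by E − V(q₁, ..., q_N), which is why the marginal is conditional on the positions.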

Some comments on this result.

First, the difference between the approach relying on the ergodic hypothesis and that relying on the SZA is rather striking. Instead of concentrating on a specific gas model, Boltzmann here assumes a much more general model with an arbitrary interaction potential $V(\vec{q}_1,\ldots,\vec{q}_N)$. Moreover, the probability density $\rho$ is defined over phase space, instead of the space of molecular velocities. This is the first occasion where probability considerations are applied to the state of the mechanical system as a whole, instead of its individual particles. If the transition between kinetic gas theory and statistical mechanics may be identified with this caesura (as argued by the Ehrenfests and by Klein), it would seem that the transition has already been made right here.

But of course, for Boltzmann the transition did not involve a major conceptual move, thanks to his conception of probability as a relative time. Thus, the probability of a particular state of the total system is still identified with the fraction of time in which that state is occupied by the system. In other words, he had no need for ensembles or non-mechanical probabilistic assumptions in this paper.

However, note that the equivocation between relative times and relative numbers, which was relatively harmless in the first section of the 1868 paper, is no longer possible in the interpretation of $\rho$. The probability $\rho_{mc}(\vec{p}_1 \mid \vec{q}_1,\ldots,\vec{q}_N)\,d^3\vec{p}_1$ gives us the relative time that the total system is in a state for which particle 1 has a momentum between $\vec{p}_1$ and $\vec{p}_1 + d^3\vec{p}_1$, for given values of all positions. There is no route back to infer that this has anything to do with the relative number of particles with this momentum.

Second, and more importantly, these results open up a perspective of great generality. They suggest that the probability of the molecular velocities for an isolated system in a stationary state will always assume the Maxwellian form if the number of particles tends to infinity. Notably, this argument completely dispenses with any particular assumption about collisions, or other details of the mechanical model involved, apart from the assumption that it is Hamiltonian. Indeed, it need not even represent a gas.

Third, and finally, the main weakness of the present result is its assumption that the trajectory actually visits all points on the energy hypersurface. This is what the Ehrenfests called the ergodic hypothesis.[8] Boltzmann returned to this issue on the final page of the paper (WA I, 96). He notes there that exceptions to his theorem might occur, if the microscopic variables would not, in the course of time, take on all values compatible with the conservation of energy. For example this would be the case when the trajectory is periodic. However, Boltzmann observed, such cases would be immediately destroyed by the slightest disturbance from outside, e.g., by the interaction of a single external atom. He argued that these exceptions would thus only provide cases of unstable equilibrium.

Still, Boltzmann must have felt unsatisfied with his own argument. According to an editorial footnote in the collection of his scientific papers (WA I, 96), Boltzmann's personal copy of the paper contains a hand-written remark in the margin stating that the point was still dubious and that it had not been proven that, even including interaction with an external atom, the trajectory would traverse all points on the energy hypersurface.

2.1 Doubts about the ergodic hypothesis

However, his doubts were still not laid to rest. His next paper on gas theory (1871a) returns to the study of a detailed mechanical gas model, this time consisting of polyatomic molecules, and avoids any reliance on the ergodic hypothesis. And when he did return to the ergodic hypothesis in (1871b), it was with much more caution. Indeed, it is here that he actually first described the worrying assumption as an hypothesis, formulated as follows:

The great irregularity of the thermal motion and the multitude of forces that act on a body make it probable that its atoms, due to the motion we call heat, traverse all positions and velocities which are compatible with the principle of [conservation of] energy. (WA I, 284)

Note that Boltzmann formulates this hypothesis for an arbitrary body, i.e., it is not restricted to gases. He also remarks, at the end of the paper, that "the proof that this hypothesis is fulfilled for thermal bodies, or even is fulfillable, has not been provided" (WA I, 287).

There is considerable confusion among modern commentators about the role and status of the ergodic hypothesis in Boltzmann's thinking. Indeed, the question has often been raised how Boltzmann could ever have believed that a trajectory traverses all points on the energy hypersurface, since, as the Ehrenfests conjectured in 1911, and as was shown almost immediately afterwards, in 1913, by Plancherel and Rosenthal, this is mathematically impossible when the energy hypersurface has a dimension larger than 1.

It is a fact that both (1868) [WA I, 96] and (1871b) [WA I, 284] mention external disturbances as an ingredient in the motivation for the ergodic hypothesis. This might be taken as evidence for ‘interventionalism’, i.e., the viewpoint that such external influences are crucial in the explanation of thermal phenomena (see Blatt 1959, Ridderbos & Redhead 1998). Yet even though Boltzmann clearly expressed the thought that these disturbances might help to motivate the ergodic hypothesis, he never took the idea very seriously. The marginal note in the 1868 paper mentioned above indicated that, even if the system is disturbed, there is still no easy proof of the ergodic hypothesis, and all his further investigations concerning this hypothesis assume a system that is either completely isolated from its environment or at most acted upon by a static external force. Thus, interventionalism did not play a significant role in his thinking.[9]

It has also been suggested, in view of Boltzmann's later habit of discretizing continuous variables, that he somehow thought of the energy hypersurface as a discrete manifold containing only finitely many discrete cells (Gallavotti 1994). In this reading, obviously, the mathematical no-go theorems of Rosenthal and Plancherel no longer apply. Now it is definitely true that Boltzmann developed a preference for discretizing continuous variables, and would later apply this procedure more and more (although usually adding that this was fictitious and purely for purposes of illustration and easier understanding). However, there is no evidence in the (1868) and (1871b) papers that Boltzmann implicitly assumed a discrete structure of mechanical phase space or the energy hypersurface.

Instead, the context of his (1871b) makes clear enough how he intended the hypothesis, as has already been argued by Brush (1976). Immediately preceding the section in which the hypothesis is introduced, Boltzmann discusses trajectories for a simple example: a two-dimensional harmonic oscillator with potential V(x, y) = ax² + by². For this system, the configuration point (x, y) moves through the surface of a rectangle. See Figure 1 below. (See also Cercignani 1998, 148.)

Figure 1: a/b is rational (= 4/7).

He then notes that if a/b is rational (actually: if √(a/b) is rational), this motion is periodic. However, if this value is irrational, the trajectory will, in the course of time, traverse "allmählich die ganze Fläche" ("gradually the entire surface") (WA I, 271) of the rectangle. See Figure 2:

Figure 2: a/b is irrational (= 1/e).

He says in this case that x and y are independent, since for each value of x an infinity of values for y in any interval in its range are possible. The very fact that Boltzmann considers intervals for the values of x and y of arbitrarily small size, and stresses the distinction between rational and irrational values of the ratio a/b, indicates that he did not silently presuppose that phase space was essentially discrete, where those distinctions would make no sense.

Now clearly, in modern language, one should say in the second case that the trajectory lies densely in the surface, but not that it traverses all points. Boltzmann did not possess this language. In fact, he could not have been aware of Cantor's insight that the continuum contains more than a countable infinity of points. Thus, the correct statement that, in the case that √(a/b) is irrational, the trajectory will traverse, for each value of x, an infinity of values of y within any interval however small, could easily have led him to believe (incorrectly) that all values of x and y are traversed in the course of time.
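The contrast between the periodic and the dense case can be made tangible with a small numerical sketch. It is only a finite-time, finite-resolution illustration, not a proof: a floating-point number is never truly irrational, and visiting every cell of a grid is exactly the "dense" behaviour, not the traversal of every point. The function name `visited_fraction` and its parameters are illustrative choices, and the amplitudes and phases of the two oscillations are set to 1 and 0 for simplicity; `freq_ratio` stands for the ratio of the two oscillation frequencies, which is rational exactly when √(a/b) is.

```python
import math

def visited_fraction(freq_ratio, grid=20, steps=200_000, dt=0.01):
    """Fraction of grid cells of the square [-1, 1] x [-1, 1] that contain
    at least one sampled point of (x, y) = (cos t, cos(freq_ratio * t))."""
    seen = set()
    for k in range(steps):
        t = k * dt
        x = math.cos(t)
        y = math.cos(freq_ratio * t)
        # Map [-1, 1] to a cell index 0 .. grid-1 (the endpoint 1.0 is clamped).
        i = min(int((x + 1.0) / 2.0 * grid), grid - 1)
        j = min(int((y + 1.0) / 2.0 * grid), grid - 1)
        seen.add((i, j))
    return len(seen) / (grid * grid)

# Rational frequency ratio: the motion is periodic and stays on a closed curve.
rational = visited_fraction(2.0)
# Stand-in for an irrational ratio: the sampled orbit spreads over the square.
irrational = visited_fraction(math.sqrt(2.0))
print(rational, irrational)
```

In the rational case only the cells along a single curve are ever visited, however long one waits. In the second case the visited fraction approaches 1 at any fixed grid resolution, which is what "lying densely" amounts to in practice.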

It thus seems eminently plausible, by the fact that this discussion immediately precedes the formulation of the ergodic hypothesis, that the intended reading of the ergodic hypothesis is really what the Ehrenfests dubbed the quasi-ergodic hypothesis, namely, the assumption that the trajectory lies densely (i.e. passes arbitrarily close to every point) on the energy hypersurface.[10] The quasi-ergodic hypothesis is not mathematically impossible in higher-dimensional phase spaces. However, the quasi-ergodic hypothesis does not entail the desired conclusion that the only stationary probability distribution over the energy surface is microcanonical. One might then still conjecture that if the system is quasi-ergodic, the only continuous stationary distribution is microcanonical. But even this fails in general (Nemytskii and Stepanov 1960).

Nevertheless, Boltzmann remained skeptical about the validity of his hypothesis. For this reason, he attempted to explore different routes to his goal of characterizing thermal equilibrium in mechanics. Indeed, both the preceding (1871a) and his next paper (1871c) present alternative arguments, with the explicit recommendation that they avoid hypotheses. In fact, he did not return to this hypothesis until the 1880s (stimulated by Maxwell's 1879 review of the last section of Boltzmann's 1868 paper). At that time, perhaps feeling fortified by Maxwell's authority, he would express much more confidence in the ergodic hypothesis (see Section 5).

So what role did the ergodic hypothesis play? It seems that Boltzmann regarded the ergodic hypothesis as a special dynamical assumption that may or may not be true, depending on the nature of the system, and perhaps also on its initial state. Its role was simply to help derive a result of great generality: for any system for which the hypothesis is true, its equilibrium state is characterized by (3), from which a form of the Maxwell distribution may be recovered in the limit N → ∞, regardless of any details of the inter-particle interactions, or indeed whether the system represented is a gas, fluid, solid or any other thermal body.

The Ehrenfests have suggested that the ergodic hypothesis played a much more fundamental role. In particular they have pointed out that if the hypothesis is true, averaging over an (infinitely) long time would be identical to phase averaging with the microcanonical distribution. Thus, they suggested that Boltzmann relied on the ergodic hypothesis in order to equate time averages and phase averages, or in other words, to equate two meanings of probability (relative time and relative volume in phase space). There is however no evidence that Boltzmann ever followed this line of reasoning. He simply never gave any justification for equivocating time and particle averages, or phase averages, at all. Presumably, he thought nothing much depended on this issue and that it was a matter of taste.

3. The H-theorem and the reversibility objection

3.1 1872: The Boltzmann equation and H-theorem

In 1872 Boltzmann published one of his most important papers (Weitere Studien). It contained two celebrated results nowadays known as the Boltzmann equation and the H-theorem. The latter result was the basis of Boltzmann's renewed claim to have obtained a general theorem corresponding to the second law. This paper has been studied and commented upon by numerous authors. Indeed, a complete translation of the text has been provided by Brush (1966). Thus, for the present purposes, a succinct summary of the main points might have been sufficient. However, there is still dispute among modern commentators about its actual content.

The issue at stake is the question whether the results obtained in this paper are presented as necessary consequences of the mechanical equations of motion, or whether Boltzmann explicitly acknowledged that they would allow for exceptions. Klein has written

I can find no indication in his 1872 memoir that Boltzmann conceived of possible exceptions to the H-theorem, as he later called it. (Klein 1973, 73)

Klein argues that Boltzmann only came to acknowledge the existence of such exceptions thanks to Loschmidt's critique in 1877. An opposite opinion is expressed by von Plato (1994). He argues that, already in 1872, Boltzmann was well aware that his H-theorem had exceptions, and thus "already had a full hand against his future critics". Indeed, von Plato states that

… contrary to a widely held opinion, Boltzmann is not in 1872 claiming that the Second Law and the Maxwellian distribution are necessary consequences of kinetic theory. (von Plato 1994, 81)

It might be of some interest to try and settle this dispute.

The Weitere Studien starts with an appraisal of the role of probability theory in the context of gas theory. The number of particles in a gas is so enormous, and their movements are so swift that we can observe nothing but average values. The determination of averages is the province of probability calculus. Therefore, "the problems of the mechanical theory of heat are really problems in probability calculus" (WA I, 317). But, Boltzmann says, it would be a mistake to believe that the theory of heat would therefore contain uncertainties.

He emphasizes that one should not confuse incompletely proven assertions with rigorously derived theorems of probability theory. The latter are necessary consequences from their premisses, as in any other theory. They will be confirmed by experience as soon as one has observed a sufficiently large number of cases. This last condition, however, should be no significant problem in the theory of heat because of the enormous number of molecules in macroscopic bodies. Yet, in this context, one has to make doubly sure that we proceed with the utmost rigor.

Thus, the message expressed in the opening pages of this paper seems clear enough: the results Boltzmann is about to derive are advertised as doubly checked and utterly rigorous. Of course, their relationship with experience might be less secure, since any probability statement is only reproduced in observations by sufficiently large numbers of independent data. Thus, Boltzmann would have allowed for exceptions in the relationship between theory and observation, but not in the relation between premisses and conclusion.

He continues by saying what he means by probability, and repeats its equivocation as a fraction of time and the relative number of particles that we have seen earlier in 1868a:

If one wants […] to build up an exact theory […] it is before all necessary to determine the probabilities of the various states that one and the same molecule assumes in the course of a very long time, and that occur simultaneously for different molecules. That is, one must calculate how the number of those molecules whose states lie between certain limits relates to the total number of molecules (WA I, 317).

This equivocation is not vicious however. For most of the paper the intended meaning of probability is always the relative number of molecules with a particular molecular state. Only at the final stages of his paper (WA I, 400) does the time-average interpretation of probability (suddenly) recur.

Boltzmann says that both he and Maxwell had attempted the determination of these probabilities for a gas system but without reaching a complete solution. Yet, on a closer inspection, "it seems not so unlikely that these probabilities can be derived on the basis of the equations of motion alone…" (WA I, 317). Indeed, he announces, he has solved this problem for gases whose molecules consist of an arbitrary number of atoms. His aim is to prove that whatever the initial state in such a system of gas molecules, it must inevitably approach the state characterized by the Maxwell distribution (WA I, 320).

The next section specializes to the simplest case of monatomic gases and also provides a more complete specification of the problem he aims to solve. The gas molecules are modelled as hard spheres, contained in a fixed vessel with perfectly elastic walls (WA I, 320). Boltzmann represents the state of the gas by a time-dependent distribution function ft(vec(v)) which gives us, at each time t, the relative number of molecules with velocity vec(v).[11] He also states three more special assumptions:

  1. Already in the initial state of the gas, each direction of velocity is equally probable, i.e.,
    f0(vec(v)) = f0(v).
  2. The gas is spatially uniform. That is, the relative number of molecules with their velocities in any given interval does not depend on the location within the vessel.
  3. The Stoßzahlansatz (2).

After a few well-known manipulations, the result from these assumptions is an integro-differential equation (the Boltzmann equation) that determines the evolution of the distribution function ft(v) from any given initial form.

There are also a few unstated assumptions that go into the derivation of this equation. First, the number of molecules must be large enough so that the (discrete) distribution of their velocities can be well approximated by a continuous and differentiable function f. Secondly, f changes under the effect of binary collisions only. This means that the density of the gas should be low (so that three-particle collisions can be ignored) but not too low (so that collisions would be too infrequent to change f at all). (The modern procedure to put these requirements in a mathematically precise form is that of taking the so-called Boltzmann-Grad limit.) A final ingredient is that all the above assumptions are not only valid at an instant but remain true in the course of time.

The H-theorem. Assuming that the Boltzmann equation is valid for all times, one can prove without difficulty the "H-theorem": the quantity

(4)     $H[f_t] := \int f_t(\vec{v}) \ln f_t(\vec{v}) \, d^3\vec{v}$

decreases monotonically in time, i.e.,

(5)     $\frac{dH[f_t]}{dt} \leq 0,$

as well as its stationarity for the Maxwell distribution, i.e.,

$\frac{dH[f_t]}{dt} = 0, \quad \text{if } f_t(\vec{v}) = A e^{-Bv^2}.$

Boltzmann concludes this section of the paper as follows:

It has thus been rigorously proved that whatever may have been the initial distribution of kinetic energy, in the course of time it must necessarily approach the form found by Maxwell. […] This [proof] actually gains much in significance because of its applicability to the theory of multi-atomic gas molecules. There too, one can prove for a certain quantity E that, because of the molecular motion, this quantity can only decrease or in the limiting case remain constant. Thus, one may prove that, because of the atomic movement in systems consisting of arbitrarily many material points, there always exists a quantity which, due to these atomic movements, cannot increase, and this quantity agrees, up to a constant factor, exactly with the value that I found in [Boltzmann 1871c] for the well-known integral ∫dQ/T.

This provides an analytical proof of the Second Law in a way completely different from those attempted so far. Up till now, one has attempted to prove that ∫dQ/T = 0 for reversible (umkehrbaren) cyclic[12] processes, which however does not prove that for an irreversible cyclic process, which is the only one that occurs in nature, it is always negative; the reversible process being merely an idealization, which can be approached more or less but never perfectly. Here, however, we immediately reach the result that ∫dQ/T is in general negative and zero only in a limit case… (WA I, 345)

Thus, as in his 1866 paper, Boltzmann claims to have a rigorous, analytical and general proof of the Second Law.
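The decrease of H stated in (5) can be illustrated with a toy simulation. The sketch below does not follow Boltzmann's hard-sphere derivation. It swaps in a Kac-style caricature: particles have one-dimensional velocities, and a "collision" rotates the velocities of a randomly chosen pair by a random angle, which conserves their combined kinetic energy. Choosing the colliding pair at random, regardless of its history, plays a role loosely analogous to the Stoßzahlansatz. All names (`h_functional`, `simulate`, `bin_width`) and parameter values are illustrative assumptions, and the binned sum is only a discrete stand-in for the integral in (4).

```python
import math
import random

def h_functional(velocities, bin_width=0.25):
    """Discrete stand-in for H: sum of p ln p over velocity bins, where p is
    the relative number of particles whose velocity falls in each bin."""
    counts = {}
    for v in velocities:
        b = math.floor(v / bin_width)
        counts[b] = counts.get(b, 0) + 1
    n = len(velocities)
    return sum((c / n) * math.log(c / n) for c in counts.values())

def simulate(n_particles=2000, n_collisions=20000, seed=0):
    """Run random pair 'collisions' and record H every 2000 collisions."""
    rng = random.Random(seed)
    # Far-from-equilibrium start: every particle has speed 1, random direction.
    v = [rng.choice((-1.0, 1.0)) for _ in range(n_particles)]
    h_values = [h_functional(v)]
    for step in range(1, n_collisions + 1):
        i, j = rng.sample(range(n_particles), 2)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        vi, vj = v[i], v[j]
        # A rotation in the (v_i, v_j) plane conserves v_i^2 + v_j^2.
        v[i] = vi * math.cos(theta) + vj * math.sin(theta)
        v[j] = -vi * math.sin(theta) + vj * math.cos(theta)
        if step % 2000 == 0:
            h_values.append(h_functional(v))
    mean_square_velocity = sum(x * x for x in v) / n_particles
    return h_values, mean_square_velocity

h_values, mean_square_velocity = simulate()
print(h_values[0], h_values[-1], mean_square_velocity)
```

Because the model is stochastic and the particle number finite, the recorded values of H need not decrease at every single step; what the sketch shows is the overall drop from the initial two-valued distribution towards a bell-shaped one at constant energy.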

3.2 Remarks and problems

1. As we have seen, the H-theorem formed the basis of a renewed claim by Boltzmann to have obtained a theorem corresponding to the second law, at least for gases. A main difference with his previous (1866) claim is that he now strongly emphasized the role of probability calculus in his derivation. Even so, it will be noted that his conception of probability is still a fully mechanical one. Thus, there is no conflict between his claims that, on the one hand, "the problems of the mechanical theory of heat are really problems in probability calculus" and that, on the other hand, the probabilities themselves are "derived on the basis of the equations of motion alone". Indeed, it seems to me that Boltzmann's emphasis on the crucial role of probability is only intended to convey that probability theory provides a particularly useful and appropriate language for discussing mechanical problems in gas theory. There is no indication in this paper yet that probability theory could play a role by furnishing assumptions of a non-mechanical nature, i.e., independent of the equations of motion.

2. Note that Boltzmann stresses the generality, rigor and "analyticity" of his proof. He put no emphasis on the special assumptions that go into the argument. Indeed, the Stoßzahlansatz, later identified as the key assumption that is responsible for the time-asymmetry of the H-theorem, is announced as follows:

The determination [of the number of collisions] can only be obtained in a truly tedious manner, by consideration of the relative velocities of both particles. But since this consideration has, apart from its tediousness, not the slightest difficulty, nor any special interest, and because the result is so simple that one might almost say it is self-evident I will only state this result. (WA I, 323)

It thus seems natural that Boltzmann's contemporaries must have understood him as claiming that the H-theorem followed necessarily from the dynamics of the mechanical gas model. Indeed this is exactly how Boltzmann's claims were understood. For example, the recommendation written in 1888 for his membership of the Prussian Academy of Sciences mentions as Boltzmann's main feat that he had proven that, whatever its initial state, a gas must necessarily approach the Maxwellian distribution (Kirsten and Körber 1975, 109).

Is there then no evidence at all for von Plato's reading of the paper? Von Plato quotes a passage from Section II, where Boltzmann repeats the previous analysis by assuming that energy can take on only discrete values, and replacing all integrals by sums. He recovers, of course, the same conclusion, but now adds a side remark, which touches upon the case of non-uniform gases:

Whatever may have been the initial distribution of states, there is one and only one distribution which will be approached in the course of time. […] This statement has been proved for the case that the distribution of states was already initially uniform. It must also be valid when this is not the case, i.e. when the molecules are initially distributed in such a way that in the course of time they mix among themselves more and more, so that after a very long time the distribution of states becomes uniform. This will always be the case, with the exception of very special cases, e.g., when all molecules were initially situated along a straight line, and were reflected by the walls onto this line. (WA I, 358)

True enough, Boltzmann in the above quote indicates that there are exceptions. But he mentions them only in connection with an extension of his results to the case when the gas is not initially uniform, i.e., when condition 2 above is dropped. There can be no doubt that under the assumption of conditions 1 to 3, Boltzmann claimed rigorous validity of the H-theorem.

3. Note that Boltzmann misconstrues, or perhaps understates, the significance of his results. Both the Boltzmann equation and the H-theorem refer to a body of gas in a fixed container that evolves in complete isolation from its environment. There is no question of heat being exchanged by the gas during a process, let alone in an irreversible cyclic process. His comparison with Clausius' integral ∫dQ/T (i.e., ∮đQ/T in modern notation) is therefore really completely out of place.

The true import of Boltzmann's results is rather that they provide a generalization of the entropy concept to non-equilibrium states, and a claim that this non-equilibrium entropy −kH increases monotonically as the isolated gas evolves from non-equilibrium towards an equilibrium state. The relationship with the second law is, therefore, indirect. On the one hand, Boltzmann proves much more than was required, since the second law does not speak of non-equilibrium entropy, nor of monotonic increase; on the other hand, he also proves less, since he does not consider more general adiabatic processes.

3.3 1877: The reversibility objection

According to Klein (1973) Boltzmann seemed to have been satisfied with his treatments of 1871 and 1872 and turned his attention to other matters for a couple of years. He did come back to gas theory in 1875 to discuss an extension of the Boltzmann equation to gases subjected to external forces. But this paper does not present any fundamental changes of thought. However, the 1875 paper did contain a result which, two years later, led to a debate with Loschmidt. It showed that a gas in equilibrium in an external force field (such as the earth's gravity) should have a uniform temperature, and therefore, the same average kinetic energy at all heights. This conclusion conflicted with the intuition that rising molecules must do work against the gravitational field, and pay for this by having a lower kinetic energy at greater heights.

Now Boltzmann (1875) was not the first to reach the contrary result, and Loschmidt was not the first to challenge it. Maxwell and Guthrie entered into a debate on the very same topic in 1873. But actually their main point of contention need not concern us very much. The discussion between Loschmidt and Boltzmann is important for quite another issue which Loschmidt only introduced as a side remark:

By the way, one should be careful about the claim that in a system in which the so-called stationary state has been achieved, starting from an arbitrary initial state, this average state can remain intact for all times. […]

Indeed, if in the above case [i.e. starting in a state where one particle is moving, and all the others lie still on the bottom], after a time τ which is long enough to obtain the stationary state, one suddenly assumes that the velocities of all atoms are reversed, we would obtain an initial state that would appear to have the same character as the stationary state. For a fairly long time this would be appropriate, but gradually the stationary state would deteriorate, and after passage of the time τ we would inevitably return to our original state: only one atom has absorbed all kinetic energy of the system […], while all other molecules lie still on the bottom of the container.

Obviously, in every arbitrary system the course of events must become retrograde when the velocities of all its elements are reversed. (Loschmidt 1876, 139)

Putting the point in more modern terms, the laws of (Hamiltonian) mechanics are such that for every solution one can construct another solution by reversing all velocities and replacing t by −t. Since H[f] is invariant under the velocity reversal, it follows that if H[f] decreases for the first solution, it will increase for the second. Accordingly, the reversibility objection is that the H-theorem cannot be a general theorem for all mechanical evolutions of the gas.
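The step from time-reversal invariance to an increasing H can be written out explicitly. The notation below is an assumption introduced for this sketch and is not in the original: x(t) = (q(t), p(t)) denotes a solution, R(q, p) = (q, −p) is velocity reversal, and f_t is the distribution function determined by x(t).

```latex
% Assumed notation: x(t) a solution, R(q,p) = (q,-p), f_t the distribution of x(t).
\begin{align*}
  \tilde{x}(t) &:= R\,x(-t)
     && \text{is again a solution (time-reversal invariance)},\\
  \tilde{f}_t(\vec{v}) &= f_{-t}(-\vec{v})
     && \text{is the distribution function along } \tilde{x},\\
  H[\tilde{f}_t] &= \int f_{-t}(-\vec{v}) \ln f_{-t}(-\vec{v})\, d^3\vec{v}
       = H[f_{-t}]
     && \text{(substituting } \vec{v} \to -\vec{v}\text{)},\\
  \frac{d}{dt} H[\tilde{f}_t] &= -\left.\frac{d}{ds} H[f_s]\right|_{s=-t}.
\end{align*}
```

So wherever H strictly decreases along the original solution, it strictly increases along the reversed one at the corresponding time.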

Boltzmann's response (1877a). Boltzmann's responses to the reversibility objection are not easy to make sense of, and varied in the course of time. In his immediate response to Loschmidt he acknowledges that certain initial states of the gas would lead to an increase of the H-function, and hence a violation of the H-theorem. The crux of his rebuttal was that such initial states were extremely improbable, and could hence safely be ignored.

This argument shows that Boltzmann was already implicitly embarking on an approach that differed from the context of the 1872 paper. Recall that this paper used the concept of probability only in the guise of a distribution function, giving the probability of molecular velocities. There was no such thing in that paper as the probability of a state of the gas as a whole. This conceptual shift would become more explicit in Boltzmann's next paper (1877b).

This rebuttal of Loschmidt is far from satisfactory. Any reasonable probability assignment to gas states is presumably invariant under the velocity reversal of the molecules. If an initial state leading to an increase of H is to be ignored on account of its small probability, one ought to assume the same for the state from which it was constructed by velocity reversal. In other words, any non-equilibrium state would have to be ignored. But that in effect saves the H-theorem by restricting it to those cases where it is trivially true, i.e., where H is constant.

The true source of the reversibility problem was only identified by Burbury (1894a) and Bryan (1894), who pointed out that the Stoßzahlansatz itself already contains a time-asymmetric assumption. Indeed, if we replace the SZA by the assumption that the number of collisions is proportional to the product f(vec(v)1′)f(vec(v)2′) for the velocities vec(v)1′, vec(v)2′ after the collision, we would obtain, by a similar reasoning, dH/dt ≥ 0. The question is now, of course, why we would prefer one assumption above the other, without falling into some kind of double standard. One thing is certain, and that is that any such preference cannot be obtained from mechanics and probability theory alone.

4. 1877b: The permutational argument

Succinctly, and rephrased in modern terms, the argument is as follows. Apart from Γ, the mechanical phase space containing the possible states x for the total gas system, we consider the so-called μ-space, i.e., the state space of a single molecule. For monatomic gases, this space is just a six-dimensional space with (vec(p),vec(q)) as coordinates. With each state x is associated a collection of N points in μ-space.

We now partition μ into m disjoint cells: μ = ω1∪…∪ωm. These cells are taken to be rectangular in the position and momentum coordinates and of equal size. Further, it is assumed we can characterize each cell in μ with a molecular energy εi.

For each x, henceforth also called the microstate (Boltzmann's term was Komplexion), we define the macrostate (Boltzmann's term was Zustandsverteilung) as Ζ := (n1,…,nm), where ni is the number of particles that have their molecular state in cell ωi. The relation between macro- and microstate is obviously non-unique since many different microstates, e.g., obtained by permuting the molecules, lead to the same macrostate. One may associate with every given macrostate Ζ0 the corresponding set of microstates:

$A_{Z_0} = \{x \in \Gamma : Z(x) = Z_0\}$

The volume $|A_{Z_0}|$ of this set is proportional to the number of permutations that lead to this macrostate. Boltzmann poses the problem of determining for which macrostate Ζ the volume $|A_Z|$ is maximal, under the constraints of a given total number of particles, and a given total energy:

(6)     $N = \sum_{i=1}^{m} n_i, \qquad E = \sum_{i=1}^{m} n_i \varepsilon_i.$

This problem can easily be solved with the Lagrange multiplier technique. Under the Stirling approximation for ni ≫ 1 we find

$n_i \propto e^{\lambda \varepsilon_i},$

which is a discrete version of the Maxwell distribution.
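The Lagrange multiplier step can be spelled out as follows. This is a sketch in modern notation: W denotes the number of permutations to which the volume of the macrostate is proportional, and α is a second multiplier for the particle-number constraint; both symbols are introduced here and are not in the original. The sign convention is chosen so that the result has the form e^{λε_i}, which means λ must be negative for the distribution to fall off with energy.

```latex
% Assumed symbols: W = number of permutations, alpha = multiplier for N.
\begin{align*}
  \ln W &= \ln \frac{N!}{n_1!\cdots n_m!}
     \;\approx\; N\ln N - \sum_{i=1}^{m} n_i \ln n_i
     && (\ln n! \approx n\ln n - n),\\
  0 &= \frac{\partial}{\partial n_i}\Big[\ln W
       + \alpha\Big(\sum_{j} n_j - N\Big)
       + \lambda\Big(\sum_{j} n_j\varepsilon_j - E\Big)\Big]
     = -\ln n_i + \alpha + \lambda\,\varepsilon_i,\\
  n_i &= e^{\alpha}\, e^{\lambda \varepsilon_i}.
\end{align*}
```

The two multipliers are then fixed by the constraints (6): α by the total number of particles and λ by the total energy.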

Moreover, the volume of the corresponding set in Γ is related to a discrete approximation of the H-function. Indeed, one finds

(7)     $-NH \approx \ln |A_Z|$

In other words, if we take −kNH as the entropy of a macrostate, it is also proportional to the logarithm of the volume of the corresponding region in phase space.
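Relation (7) can be checked numerically for a concrete macrostate. The sketch below compares the exact logarithm of the number of permutations with the Stirling estimate −N Σ p_i ln p_i, where p_i = n_i/N. It takes the discrete H to be Σ p_i ln p_i and ignores the constant factor relating the permutation count to the phase-space volume, both of which are simplifying assumptions. The function names and the occupation numbers are hypothetical, chosen only for illustration.

```python
import math

def log_multiplicity(occupations):
    """Exact ln W, with W = N! / (n_1! ... n_m!) the number of ways of
    assigning N labelled particles to cells with the given occupations."""
    n_total = sum(occupations)
    return math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in occupations)

def minus_n_h(occupations):
    """Stirling estimate -N * sum_i p_i ln p_i, with p_i = n_i / N."""
    n_total = sum(occupations)
    return -sum(n * math.log(n / n_total) for n in occupations if n > 0)

occupations = [4000, 3000, 2000, 1000]  # hypothetical macrostate, N = 10000
exact = log_multiplicity(occupations)
estimate = minus_n_h(occupations)
relative_error = abs(exact - estimate) / exact
print(exact, estimate, relative_error)
```

For occupation numbers of this size the two quantities agree to well within one percent, and the agreement improves as all n_i grow, in line with the condition ni ≫ 1 under which the Stirling approximation is applied.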

Boltzmann also refers to these volumes as the "probability" of the macrostate. He therefore now expresses the second law as a tendency to evolve towards ever more probable macrostates.

4.1 Remarks and problems

1. No dynamical assumption is made; i.e., it is not relevant to the argument whether or how the particles collide. It might seem that this makes the present argument more general than the previous one. Indeed, Boltzmann suggests at the end of the paper that the same argument might be applicable also to dense gases and even to solids.

However, it should be noticed that the assumption that the total energy can be expressed in the form E = ∑iniεi means that the energy of each particle depends only on the cell in which it is located, and not the state of other particles. This can only be maintained, independently of the number N, if there is no interaction at all between the particles. The validity of the argument is thus really restricted to ideal gases.

2. The procedure of dividing μ-space into cells is essential here. Indeed, the whole prospect of using combinatorics would disappear if we did not adopt a partition. But the choice to take all cells equal in size in position and momentum variables is not quite self-evident, as Boltzmann himself shows. In fact, before he develops the argument above, his paper first discusses an analysis in which the particles are characterized by their energy instead of position and momentum. This leads him to carve up μ-space into cells of equal size in energy. He then shows that this analysis fails to reproduce the desired Maxwell distribution as the most probable state. This failure is remedied by taking equally sized cells in position and momentum variables. The latter choice is apparently ‘right’, in the sense that it leads to the desired result. However, since the choice clearly cannot be relegated to a matter of convention, it leaves open the question of its justification.

3. A crucial new ingredient in the argument is the distinction between micro- and macrostates. Note in particular that where in the previous work the distribution function f was identified with a probability (namely of a molecular state), in the present paper it, or its discrete analogue Ζ, is a description of the macrostate of the gas. Probabilities are not assigned to the particles, but to the macrostate of the gas as a whole. According to Klein (1973, 84), this conceptual transition in 1877b marks the birth of statistical mechanics. While this view is not completely correct (as we have seen, Boltzmann 1868 already applied probability to the total gas), it is true that (1877b) is the first occasion where Boltzmann identifies the probability of a gas state with relative volume in phase space, rather than its relative time of duration.

Another novelty is that Boltzmann has changed his concept of equilibrium. Whereas previously the essential characteristic of an equilibrium state was always that it is stationary, in Boltzmann's new view it is conceived as the macrostate (i.e., a region in phase space) that can be realized in the largest number of ways. As a result, an equilibrium state need not be stationary: in the course of time, the system may fluctuate in and out of equilibrium.

5. Some later work

5.1 Return of the ergodic hypothesis

As we have seen, the 1877 papers introduced some conceptual shifts in Boltzmann's approach. Accordingly, this year is frequently seen as a watershed in Boltzmann's thinking. In line with that view, one would expect his subsequent work to build on his new insights and turn away from the themes and assumptions of his earlier papers. Actually, Boltzmann's subsequent work in gas theory in the next decade and a half was predominantly concerned with technical applications of his 1872 Boltzmann equation, in particular to gas diffusion and gas friction. And when he did touch on fundamental aspects of the theory, he returned to the issues and themes raised in his 1868–1871 papers, in particular the ergodic hypothesis and the use of ensembles.

This step was again triggered by a paper of Maxwell, this time one that must have pleased Boltzmann very much, since it was called "On Boltzmann's theorem" (Maxwell 1879) and dealt with the theorem discussed in the last section of his (1868). Maxwell pointed out that this theorem does not rely on any collision assumption. But he also made some pertinent observations along the way. He was critical of Boltzmann's ergodic hypothesis, pointing out that "it is manifest that there are cases in which this does not take place" (Maxwell 1879, 694). Apparently, Maxwell had not noticed that Boltzmann's later papers had also expressed similar doubts. He rejected Boltzmann's time-average view of probability and instead preferred to interpret ρ as an ensemble density. Further, he stated that any claim that the distribution function obtained was the unique stationary distribution "remained to be investigated" (Maxwell 1879, 722). Maxwell's paper seems to have revived Boltzmann's interest in the ergodic hypothesis, which he had been avoiding for a decade. This renewed confidence is expressed, for example, in Boltzmann (1887):

Under all purely mechanical systems, for which equations exist that are analogous to the so-called second law of the mechanical theory of heat, those which I and Maxwell have investigated … seem to me to be by far the most important. … It is likely that thermal bodies in general are of this kind [i.e., they obey the ergodic hypothesis]

However, he does not return to this conviction in later work. His Lectures on Gas Theory (1896, 1898), for example, does not even mention the ergodic hypothesis.

5.2 Return of the reversibility objection

The first occasion on which Boltzmann returned to the reversibility objection is in (1887b). This paper delves into a discussion between Tait and Burbury about the approach to equilibrium for a system consisting of gas particles of two different kinds. The details of the debate need not concern us, except to note that Tait raised the reversibility objection to show that, given any evolution approaching equilibrium, one may construct, by reversal of the velocities, another evolution moving away from equilibrium. At this point Boltzmann entered the discussion:

I remark only that the objection of Mr. Tait regarding the reversal of the direction of all velocities, after the special state [i.e., equilibrium] has been reached, […] has already been refuted in my [(1877a)]. If one starts with an arbitrary non-special state, one will get […] to the special state (of course, perhaps after a very long time). When one reverses the directions of all velocities in this initial state, then, going backwards, one will not (or perhaps only during some time) reach states that are even further removed from the special state; instead, in this case too, one will eventually again reach the special state. (WA III, 304)

This reply to the reversibility objection uses an entirely different strategy from his (1877a). Here, Boltzmann does not exclude the reversed motions on account of their vanishing probability, but rather argues that, sooner or later, they too will reach the equilibrium state.

Note how much Boltzmann's strategy has shifted: whereas previously the idea was that a gas system should approach equilibrium because of the H-theorem, Boltzmann's idea is now, apparently, that regardless of the behavior of H as a function of time, there are independent reasons for assuming that the system approaches equilibrium. Boltzmann's contentions may of course very well be true. But they do not follow from the H-theorem, or by ignoring its exceptions, and would have to be proven otherwise.

5.3 The debate in Nature

The 1890s brought three major events in Boltzmann's work on statistical physics. The foremost of these was his participation in the 1894 meeting of the British Association for the Advancement of Science (BAAS) in Oxford, where issues in gas theory were vigorously debated. Another was his debate with Zermelo, in 1896–1897. The third was the appearance of his two-volume book Lectures on Gas Theory in 1896 and 1898.

The BAAS meeting had a lively aftermath: a discussion between half a dozen authors in the columns of Nature in the years 1894–1895. The central topic of this debate was the paradoxical relation between the H-theorem and the reversibility objection.

The kick-off was a letter by Culverwell (1894) with the innocent-sounding question, "Will anyone say exactly what the H-theorem proves?" The question triggered several responses. Culverwell noted, with some amusement, in a subsequent letter (1895) that they all seemed to argue quite differently. Perhaps the most important responses were Burbury (1894a) and (1894b) and Bryan (1894). Burbury argued that the H-theorem rests on an additional assumption, independent of mechanical theory, which he dubbed "Condition A". His actual phrasing of the condition was somewhat involved. In any case, Burbury pointed out that the reversed motion would not satisfy Condition A, and that the H-theorem would thus fail to be applicable. He thus succeeded in finally illuminating the logical situation between the reversibility objection and the H-theorem. The theorem assumed the validity of a condition, which would be violated in the reversed motion; i.e., Condition A itself already contained a time-asymmetrical element. Further, given the time-reversal invariance of the mechanical laws governing the system, any motivation for the condition would have to come from beyond these laws and must thus be non-mechanical.

Although Burbury had thus succeeded in clarifying the logic of the H-theorem, his actual formulation of Condition A was not particularly transparent. By contrast, Bryan's contribution to the debate may be said to possess the opposite qualities. He was the first to clearly point out that the SZA itself is not invariant under time-reversal. Less convincing is his argument that the time-reversed assumption would require the molecules to possess some kind of foresight.

Finally Boltzmann himself intervened in the debate (Boltzmann 1895). He recognized, of course, that the same issues that he discussed with Loschmidt 20 years earlier were again at stake, and elucidated:

My minimum theorem as well as the so-called Second Law of Thermodynamics are only theorems of probability (WA III, 539).

It can never be proved from the equations of motion alone, that the minimum function H must always decrease. It can only be deduced from the laws of probability, that if the initial state is not specially arranged for a certain purpose, but haphazard governs freely, the probability that H decreases is always greater than that it increases. (WA III, 540)

In more detail, his argument is as follows. Consider a gas in a vessel with perfectly smooth and elastic walls, in an arbitrary initial state and let it evolve in the course of time. At each time t we can calculate H(t). Now draw a graph of this function: the H-curve. (In a later discussion, Boltzmann 1897, he actually produced a diagram.)

Barring all cases in which the motion is ‘regular’, e.g., when all the molecules move in one plane, Boltzmann claims the following properties of the curve:

  (i) For most of the time, H(t) will be very close to its minimum value, say Hmin.
  (ii) Because greater values of H are improbable but not impossible, the curve will occasionally, but very rarely, rise to a peak or summit, that may be well above Hmin.
  (iii) Higher summits are far less probable than lower summits.

Suppose that, at some time t = 0, the function attains a certain value H0, well above the minimum value. Now, Boltzmann says, two cases are possible for the evolution of H in time. (a) H0 lies at or near the top of a peak. Then H(t) will decrease, whether we move in the positive or the negative time direction. (b) H0 lies on an ascending or descending part of the curve, so that H(t) decreases or increases. But, because of (iii), case (a) is much more probable than case (b). Hence, Boltzmann says,

… if we choose an ordinate of given magnitude H0 guided by haphazard in the curve, it will not be certain but very probable that the ordinate decreases if we go in either direction. (WA III, 540)

And,

What I have proved in my papers is as follows: It is extremely probable that H is very near to its minimum value; if it is greater, it may increase or decrease, but the probability that it decreases is always greater. (WA III, 541)

Together, the claims (i)–(iii) constitute a statement of what the Ehrenfests called a ‘statistical H-theorem’.
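The qualitative shape of such an H-curve can be illustrated with a toy simulation. The sketch below uses the Ehrenfests' two-urn model rather than a mechanical gas, so it illustrates what claims of this kind assert and does not prove them. The function name, the parameter values, the fixed random seed, and the two-cell definition of H are assumptions made here for illustration only.

```python
import random
from math import log


def urn_h_curve(n_balls=100, steps=20000, seed=0):
    """Simulate the Ehrenfest urn model and return the sequence of H values.

    At every step one ball, chosen uniformly at random, moves to the other
    urn. H is the two-cell analogue p*log(p) + (1-p)*log(1-p), where p is
    the fraction of balls in urn A.
    """
    rng = random.Random(seed)
    in_a = n_balls  # start far from equilibrium: all balls in urn A

    def h_value(k):
        total = 0.0
        for m in (k, n_balls - k):
            if m > 0:  # the limit of p*log(p) as p -> 0 is 0
                p = m / n_balls
                total += p * log(p)
        return total

    curve = [h_value(in_a)]
    for _ in range(steps):
        # The chosen ball sits in urn A with probability in_a / n_balls.
        if rng.randrange(n_balls) < in_a:
            in_a -= 1
        else:
            in_a += 1
        curve.append(h_value(in_a))
    return curve


curve = urn_h_curve()
h_min = -log(2)  # value of H when the balls are split evenly
near_min = sum(1 for h in curve if h - h_min < 0.05) / len(curve)
print(f"fraction of time within 0.05 of the minimum: {near_min:.3f}")
print(f"highest value after the initial relaxation: {max(curve[1000:]):.4f}")
```

In this model the curve starts at its maximum, relaxes quickly, and then stays close to the minimum for the great majority of the steps, with occasional small excursions upward.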

After having thus elucidated the content of his theorem, Boltzmann addressed the reversibility objection. Suppose the gas is initially in a non-equilibrium state, with a large value of H. Then it is probable, but not certain, that it will decrease, and eventually reach its minimum.

If at an intermediate stage we reverse all velocities we get an exceptional state where H increases for a certain time and decreases again. But the existence of such cases does not disprove our theorem. On the contrary the theory of probability itself shows that the probability of such cases is not mathematically zero, only extremely small. (WA III, 541)

It is not immediately clear how this refutes the reversibility objection. If we focus on the fact that the reversed state is "exceptional", and take it to belong to those cases ("regular motions") that were explicitly barred in the statement of the theorem, then his reply is like that to Loschmidt. But perhaps his central idea here is rather that in the reversed motion, H increases for a short while, but will eventually decrease again, and continue to do so. In that case his defence is more like the reply to Tait. In actual fact the two are hard to combine. If we admit that the reversed state is exceptional in the sense that the statistical H-theorem does not hold for it, we have no grounds for claiming that eventually H will decrease again.

This particular article is often cited as one of the clearest expositions that Boltzmann ever wrote, and one in which "he is right on the money" (Lebowitz 1999). Still, there are a few things that deserve attention.

1. It seems to me that Boltzmann here adopts an (implicit) identification of probability and relative duration. At the very least, he does not indicate any difference in meaning between phrases like "During the greater part of that time, H will be very near to its minimum" and "It is extremely probable that H lies near its minimum". Also, it seems to me that his claim (ii) that peaks in the curve are "very rare", and his claim (iii) that they are "very improbable" are intended as synonymous. This way of reading Boltzmann would indeed not fall out of tune with his previous interpretations of probability.

2. But a larger problem looms. Boltzmann says that the statistical H-theorem is something he had "proved in his papers". But that is surely a gross overstatement. No proof of the claims (i)–(iii) for a gas system is to be found anywhere.

The best-known view on how to explain this lacuna is, again, supplied by the Ehrenfests. In essence, they suggest that Boltzmann somehow silently relied on the ergodic hypothesis.

It is indeed evident that if the ergodic hypothesis holds, a state will spend time in the various regions of phase space in proportion to their volume. That is to say, during the evolution of the system along its trajectory, regions with a small volume are visited only sporadically, and regions with larger volume more often.
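The proportionality between sojourn time and volume can be seen in a very simple system that is ergodic in the modern measure-theoretic sense. The sketch below uses a rotation of the unit circle by an irrational angle. This is an assumption chosen purely for illustration: it is not a gas model and not Boltzmann's own formulation of the hypothesis, and the function name, step count, and intervals are hypothetical.

```python
from math import sqrt


def time_fraction_in_interval(a, b, steps=100_000, x0=0.0):
    """Fraction of time the orbit x_k = (x0 + k*alpha) mod 1 spends in [a, b).

    alpha is the golden-ratio conjugate, an irrational number, so the
    rotation is ergodic with respect to length on the unit circle.
    """
    alpha = (sqrt(5) - 1) / 2
    hits = 0
    for k in range(steps):
        # Compute each point directly to avoid accumulating rounding errors.
        x = (x0 + k * alpha) % 1.0
        if a <= x < b:
            hits += 1
    return hits / steps


# A large region and a small region: the time fractions should be close
# to the lengths 0.3 and 0.05 respectively.
print(time_fraction_in_interval(0.2, 0.5))
print(time_fraction_in_interval(0.0, 0.05))
```

Here "volume" is simply the length of the interval. The small interval is visited only sporadically and the large one more often, in proportion to their lengths.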

This would make it plausible how Boltzmann could identify probabilities with relative times. But also, it would make it plausible that if a system starts out from a very small region (an improbable state) it will display a ‘tendency’ to evolve towards the overwhelmingly larger equilibrium state. Of course, this ‘tendency’ would have to be interpreted in a qualified sense: the same ergodic hypothesis would imply that the system cannot stay inside the equilibrium state forever and thus there would indeed be incessant fluctuations in and out of equilibrium. Indeed, one would have to state that the tendency to evolve from improbable to probable states is itself a probabilistic affair: as something that holds true for most of the initial states, or for most of the time, or as some type of expectation value or another. In short, we would then hopefully obtain some statistical version of the H-theorem. Exactly how this statistical H-theorem should be formulated remains an open problem in the Ehrenfests' point of view. Indeed they distinguish between several possible statistical interpretations of the theorem.

The Ehrenfests' reading of Boltzmann's intentions thus has some undeniable advantages. However, there is no evidence that Boltzmann really had the ergodic hypothesis in mind. It seems more likely that he relied on a naive identification of the various meanings of probability. Further, nobody has ever succeeded in proving a statistical H-theorem on the basis of the ergodic hypothesis, or on the basis of a modern relative such as the hypothesis of ‘metrical transitivity’. Much stronger dynamical and probabilistic conditions seem to be needed for such a proof, and even then it remains a difficult problem whether these are satisfied in a realistic gas model.

Bibliography

In the foregoing and below, "WA" refers to:

Boltzmann, L. (1909), Wissenschaftliche Abhandlungen, Vol. I, II, and III, F. Hasenöhrl (ed.), Leipzig: Barth; reissued New York: Chelsea, 1969.

Primary Sources

Secondary Sources
