The Fastest, Simplest, Quickest Derivation Ever of the Ideal Gas Law

John D. Norton
Department of History and Philosophy of Science, University of Pittsburgh
Pittsburgh PA 15260. Homepage: www.pitt.edu/~jdnorton
This page is available at www.pitt.edu/~jdnorton/goodies

Consider a gas in a homogeneous gravitational field, such as described in the main text, where it is presumed that the gas is governed by Maxwell-Boltzmann statistical physics. It is shown that (micro to macro) the presumption that the gas consists of finitely many, spatially localized, independent molecules leads to the ideal gas law. The derivation is sufficiently spare in its assumptions that it applies to many other systems as well. And the converse (macro to micro) is shown in the following sense: if we assume that the gas consists of finitely many, spatially localized molecules and it is governed by Maxwell-Boltzmann statistical physics, then if it obeys the ideal gas law, its molecules are independent.

Quick and Dirty Version

For an ideal gas in a homogeneous gravitational field, the probability that a molecule is at height h is proportional to exp(-E(h)/kT), where E(h) is the energy of a molecule at height h.

Therefore, the density rho of the gas is given by
   rho(h) = rho(0) . exp(-E(h)/kT)

The density gradient is found by differentiation
   d rho/dh = -(1/kT) . (dE/dh) . rho

The gravitational force density f is just
   f = - (dE/dh) . rho
and it is balanced by a pressure gradient for which
   f = dP/dh

Combining the last three equations we have
   (d/dh)(P - rho kT) = 0

Its solution is
   P = rho kT
which is equivalent to the usual expression for the ideal gas law for the case of a gas of n molecules of uniform density spread over volume V in which rho = n/V
   PV = nkT
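
The quick derivation can be checked numerically. The sketch below assumes the simplest case E(h) = mgh and uses illustrative values for the temperature, molecular mass and ground-level density (none of these numbers come from the text above). It integrates the force balance dP/dh = -(dE/dh) . rho downward from a great height, where the pressure is taken to vanish, and compares the result with rho kT.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # temperature, K (assumed value)
M = 4.65e-26         # molecular mass, kg, roughly that of N2 (assumed value)
G = 9.81             # gravitational acceleration, m/s^2 (assumed value)
RHO0 = 2.5e25        # number density at h = 0, per m^3 (assumed value)

def density(h):
    """rho(h) = rho(0) . exp(-E(h)/kT) with E(h) = M*G*h."""
    return RHO0 * math.exp(-M * G * h / (K_B * T))

def pressure(h, h_top=400_000.0, steps=200_000):
    """Integrate dP/dh = -(dE/dh) . rho from h up to h_top, taking P(h_top) ~ 0."""
    dh = (h_top - h) / steps
    total = 0.0
    for i in range(steps):
        mid = h + (i + 0.5) * dh             # midpoint rule
        total += M * G * density(mid) * dh   # weight per unit area of this slab
    return total

for h in (0.0, 5_000.0, 20_000.0):
    # The two printed numbers should agree: P(h) and rho(h) kT.
    print(h, pressure(h), density(h) * K_B * T)
```

The agreement does not depend on the particular values chosen; they only set the scale of the numbers printed.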

The derivation is sufficiently direct for it to be plausible that it can be reversed and the independence of the molecules deduced from the ideal gas law. Of course the details of the inference in both directions are a little more complicated as the following shows.

Before we proceed, note what is not in the derivation. It is not assumed that a gas must consist of molecules moving uniformly in straight lines between collisions; or that the gas molecules are the only matter present. As a result, the derivation works for many other systems such as: a component gas or vapor in a gas mixture; a solute exerting osmotic pressure in a dilute solution; and larger, microscopically visible particles suspended in a liquid.

Slightly Longer and Messier Version

Micro to macro

The gas consists of a large number n of molecules at thermal equilibrium at temperature T in a homogeneous gravitational field. According to the Boltzmann distribution, the probability of any given configuration of molecules is determined by the total energy E_tot of the n molecules and is proportional to exp(-E_tot/kT).
More precisely, this factor gives the probability density in the system's canonical phase space.

This total energy is given by the sum of the energies of the individual molecules
E_tot = E_1 + ... + E_n
The independence of the molecules is represented by the absence of an interaction term in the expression for the energy E_tot. The total energy is just the arithmetic sum of the energies E_i of the individual molecules; so each molecule may change its energy without affecting the energies of the others.
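
What the absence of an interaction term buys can be seen in a toy case. The sketch below is an illustration only: it assumes two molecules on a small discrete grid of heights, a hypothetical energy of height, and a hypothetical coupling term of the form coupling . h1 . h2. With no coupling, the Boltzmann distribution of one molecule's height is the same wherever the other molecule is; with coupling, it is not.

```python
import math

KT = 1.0                          # units in which kT = 1 (assumed)
HEIGHTS = [0.0, 0.5, 1.0, 1.5]    # hypothetical discrete grid of heights

def single_energy(h):
    """Hypothetical energy of height for one molecule."""
    return 2.0 * h

def conditional_height_distribution(h_other, coupling):
    """Probabilities of molecule 1's height, given molecule 2 sits at h_other.

    E_tot = E(h1) + E(h2) + coupling * h1 * h2; coupling = 0 means independence.
    """
    weights = [
        math.exp(-(single_energy(h1) + single_energy(h_other)
                   + coupling * h1 * h_other) / KT)
        for h1 in HEIGHTS
    ]
    norm = sum(weights)
    return [w / norm for w in weights]

# No interaction term: the two lists printed are the same.
print(conditional_height_distribution(0.0, 0.0))
print(conditional_height_distribution(1.5, 0.0))
# With an interaction term: the distribution shifts with the other molecule's height.
print(conditional_height_distribution(0.0, 1.0))
print(conditional_height_distribution(1.5, 1.0))
```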

The energy E_i of each individual (i-th) molecule is in turn determined by the molecule's speed and height h in the gravitational field
E_i = E_KE + E(h)
where E_KE is the kinetic energy of the molecule and E(h) is the energy of height for a molecule at height h. By convention, we set E(0)=0.

Since
exp(-(E_KE + E(h))/kT) = exp(-E_KE/kT) . exp(-E(h)/kT)
the kinetic energy of the molecule will be probabilistically independent of the energy of height. Thus the kinetic energy is independent of height and so can be neglected in what follows.

Factoring the above exponential term from the Boltzmann distribution, and absorbing the kinetic energy factor into the constant, we find that the probability that a given molecule will be found at height h is
P(h) = constant . exp(-E(h)/kT)
Since the positions of the molecules are independent of one another, the spatial density rho(h) of molecules at height h of the gas is proportional to the probability that any given molecule is at height h. So it is
rho(h) = rho(0) . exp(-E(h)/kT)
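
The step from the single-molecule probability to the density can be illustrated by sampling. The sketch below assumes a linear energy of height, E(h) = SLOPE . h, in units with kT = 1, so that each height is an independent draw from an exponential distribution; the sample size, bin width and seed are arbitrary choices for illustration. The number of molecules in each height bin then falls off as exp(-E(h)/kT), up to sampling noise.

```python
import math
import random

random.seed(0)     # fixed seed so that the run is repeatable

KT = 1.0           # units in which kT = 1 (assumed)
SLOPE = 1.0        # E(h) = SLOPE * h, a hypothetical linear energy of height
N = 200_000        # number of independent molecules (assumed)
BIN = 0.25         # width of each height bin (assumed)

# For E(h) = SLOPE * h, exp(-E(h)/kT) is an exponential law with rate SLOPE / KT.
heights = [random.expovariate(SLOPE / KT) for _ in range(N)]

def count_in_bin(k):
    """Number of sampled molecules with height in [k*BIN, (k+1)*BIN)."""
    lo, hi = k * BIN, (k + 1) * BIN
    return sum(1 for h in heights if lo <= h < hi)

counts = [count_in_bin(k) for k in range(8)]
for k, sampled in enumerate(counts):
    predicted = counts[0] * math.exp(-SLOPE * k * BIN / KT)
    print(f"bin {k}: sampled {sampled}, Boltzmann prediction {predicted:.0f}")
```
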
The density gradient is recovered by differentiation with respect to h
(d/dh) rho(h) = -(1/kT) . (dE(h)/dh) . rho(h)
Gravitational forces are exerted on the gas. The gravitational force density f is given by
f = - (dE(h)/dh) . rho(h)
The gas is also subject to a homogeneous pressure P. The downward action of the gravitational force is balanced by a pressure gradient. At equilibrium
f = dP/dh
Combining, we have
kT . (d/dh) rho(h) = -(dE(h)/dh) . rho(h) = f = dP/dh
Rearranging the terms, we have
(d/dh)(P - rho kT) = 0
which yields on integration
P = rho kT
where the constant of integration has been set to zero on the assumption that the pressure P of the gas vanishes for vanishing density rho.
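
The role of the constant of integration can be made concrete. The sketch below assumes a hypothetical non-linear energy of height and units with kT = 1. It integrates the force balance upward from h = 0, starting from P(0) = rho(0) kT + c, and shows that P - rho kT keeps the value c at every height. Only c = 0 lets the pressure vanish where the density does.

```python
import math

KT = 1.0      # units in which kT = 1 (assumed)
RHO0 = 1.0    # density at h = 0 (assumed)

def energy(h):
    """Hypothetical non-linear energy of height, with E(0) = 0."""
    return 0.8 * h + 0.3 * h * h

def d_energy(h):
    """Derivative dE/dh of the energy above."""
    return 0.8 + 0.6 * h

def density(h):
    return RHO0 * math.exp(-energy(h) / KT)

def pressure(h, p0, steps=20_000):
    """Integrate dP/dh = -(dE/dh) . rho from 0 up to h, starting at P(0) = p0."""
    dh = h / steps
    p = p0
    for i in range(steps):
        mid = (i + 0.5) * dh                      # midpoint rule
        p -= d_energy(mid) * density(mid) * dh
    return p

for c in (0.0, 0.5):
    p0 = RHO0 * KT + c
    offsets = [pressure(h, p0) - density(h) * KT for h in (0.5, 1.0, 3.0, 6.0)]
    print(f"c = {c}: P - rho kT at increasing heights = {offsets}")
```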

This last equation is the ideal gas law in local form. It reverts to the more familiar form if we apply it to the limiting case of a volume V of gas in an infinitely weak gravitational field--i.e. the gravitation free case. Then the gas is homogeneous and its density rho = n/V. Then the law becomes
PV = nkT
Note that gravitation plays an indirect role only in this derivation. It is merely a way to probe the gas pressure and any other field would serve equally well. What has simplified the derivation is that the probe is local and distributed throughout the gas, whereas the more usual way of probing pressure is to determine the forces exerted by the gas on a containing wall.
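
To get a feel for how good the gravitation-free idealization is for an ordinary container, one can compute the height over which the density falls by a factor of e. The sketch below assumes E(h) = mgh with illustrative values for a nitrogen-like molecule at room temperature near the Earth's surface; the numbers are assumptions for illustration, not part of the derivation.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # temperature, K (assumed value)
M = 4.65e-26         # molecular mass, kg, roughly that of N2 (assumed value)
G = 9.81             # gravitational acceleration, m/s^2 (assumed value)

# Height over which rho(h) = rho(0) . exp(-M*G*h/kT) drops by a factor of e.
scale_height = K_B * T / (M * G)

def density_ratio(height):
    """rho(height) / rho(0) for E(h) = M*G*h."""
    return math.exp(-M * G * height / (K_B * T))

print("scale height in metres:", scale_height)
print("fractional density drop over 1 m:", 1.0 - density_ratio(1.0))
```

With these values the density changes by roughly one part in ten thousand over a metre, so treating the gas in a laboratory vessel as homogeneous with rho = n/V is a very mild approximation.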

Macro to micro^*

To reverse the inference, we assume that we have a gas of finitely many, spatially localized molecules that obeys the ideal gas law
PV = nkT
If the gas is at equilibrium in a homogeneous gravitational field, we must use the local form of the ideal gas law
P = rho kT
where rho is the spatial density of molecules.

Differentiating, we recover a relation between the pressure and density gradients.
dP/dh = kT . (d rho/dh)
The gas is subject to a gravitational force density. To determine it, we take the state of the gas at just one instant and consider the energy of a molecule at height h. Its energy will be given by some expression E(h,x_eq) where the vector quantity x_eq represents the positions of all n molecules of the gas at that moment in the equilibrium distribution, excluding the height component of the position of the molecule in question. The presence of this quantity as an argument for E represents the possibility that the energy of the molecule may also depend on the positions of the remaining molecules; that is, that the molecule is not independent of the others.

The gravitational force density at height h at that instant is given by
f = - (dE(h,x_eq)/dh) . rho(h)
This force density is balanced by a gradient in the pressure P satisfying the equilibrium condition
f = dP/dh
Combining the last three equations, we have
(d/dh) rho(h) = - (1/kT) . (dE(h,x_eq)/dh) . rho(h)
The solution of this differential equation is
rho(h) = rho(0) . exp(-E(h,x_eq)/kT)
where by convention E(0,x_eq)=0.
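
That the exponential is the solution of the differential equation can be confirmed numerically. The sketch below assumes a hypothetical energy function standing in for E(h,x_eq) with x_eq held fixed, in units with kT = 1, and integrates the equation with a standard fourth-order Runge-Kutta step. The function names and the choice of energy are illustrative only.

```python
import math

KT = 1.0      # units in which kT = 1 (assumed)
RHO0 = 1.0    # density at h = 0 (assumed)

def energy(h):
    """Hypothetical stand-in for E(h, x_eq) at fixed x_eq, with E(0) = 0."""
    return h + 0.5 * math.sin(h)

def d_energy(h):
    """Derivative dE/dh of the energy above."""
    return 1.0 + 0.5 * math.cos(h)

def rhs(h, rho):
    """Right-hand side of (d/dh) rho = -(1/kT) . (dE/dh) . rho."""
    return -(1.0 / KT) * d_energy(h) * rho

def integrate(h_end, steps=2_000):
    """Classical fourth-order Runge-Kutta integration from h = 0 to h_end."""
    dh = h_end / steps
    h, rho = 0.0, RHO0
    for _ in range(steps):
        k1 = rhs(h, rho)
        k2 = rhs(h + dh / 2, rho + dh * k1 / 2)
        k3 = rhs(h + dh / 2, rho + dh * k2 / 2)
        k4 = rhs(h + dh, rho + dh * k3)
        rho += dh * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        h += dh
    return rho

for h in (0.5, 1.0, 2.0):
    # Numerical solution next to the closed form rho(0) . exp(-E/kT).
    print(h, integrate(h), RHO0 * math.exp(-energy(h) / KT))
```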

Let us presume, as is standard, that the energy of interaction between the molecules is a function only of the displacements between the molecules.** Thus it is independent of the direction in space of these displacements.

To see that there are no such low order interaction terms, consider the density of clusters of m molecules at the same height h, where m is much smaller than n. Since the clusters are only required to be at height h, the molecules forming the clusters may be well separated in space horizontally. Recalling that the gas is homogeneous in the horizontal direction, the ideal gas law, re-expressed in terms of the density rho_m of clusters of size m, is
P = rho_m mkT
Repeating the derivation above, we find that the density at height h of these m-clusters is
rho_m(h) = rho_m(0) . exp(-E_m(h,x_eq)/mkT)
where E_m(h,x_eq) is the energy of each m-cluster of molecules at this same instant in the equilibrium distribution.

Recalling that rho = m . rho_m, we now have
rho(h) = rho(0) . exp(-E_m(h,x_eq)/mkT)
Comparing this expression for rho(h) with the similar one derived earlier, we infer
E_m(h,x_eq) = m . E(h,x_eq)
That is, the energy of a cluster of m molecules at height h is just m times the energy of one molecule at height h, which asserts the independence of the energy of each molecule in the cluster from the others. Since the molecules in the cluster may be widely spaced horizontally and the law of interaction does not distinguish horizontal and vertical directions, it follows that there is no interaction, either short or long range, for m molecules.
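
For readers who want the cluster argument in one place, the chain of steps can be written out as follows. This is my reading of "repeating the derivation above": it assumes that the force density on the clusters is -(dE_m/dh) . rho_m, by analogy with the single-molecule case, and that E_m(0,x_eq) = 0 by the same convention as before.

```latex
\begin{align*}
P = \rho_m\, m k T
  \;&\Rightarrow\; \frac{dP}{dh} = m k T\,\frac{d\rho_m}{dh} \\
f = -\frac{dE_m(h,x_{eq})}{dh}\,\rho_m(h) = \frac{dP}{dh}
  \;&\Rightarrow\; \frac{d\rho_m}{dh}
     = -\frac{1}{m k T}\,\frac{dE_m(h,x_{eq})}{dh}\,\rho_m(h) \\
  \;&\Rightarrow\; \rho_m(h) = \rho_m(0)\,
     \exp\!\left(-\frac{E_m(h,x_{eq})}{m k T}\right) \\
\rho = m\,\rho_m
  \;&\Rightarrow\; \frac{\rho(h)}{\rho(0)} = \frac{\rho_m(h)}{\rho_m(0)}
  \;\Rightarrow\; \exp\!\left(-\frac{E(h,x_{eq})}{k T}\right)
     = \exp\!\left(-\frac{E_m(h,x_{eq})}{m k T}\right) \\
  \;&\Rightarrow\; E_m(h,x_{eq}) = m\,E(h,x_{eq})
\end{align*}
```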

Thus we preclude any interaction between the molecules up to m-fold interactions. That leaves the possibility of interactions that only activate when more than m molecules are present. We can preclude any such higher order interaction being activated and relevant to the equilibrium distribution if we assume that all interactions are short range, for the above argument allows us to set m at least equal to the number of molecules that can cluster together in one small location over which a short range interaction can prevail.

^* In "macro to micro," I try to do compactly what is done more systematically by employing the theory of virial coefficients. In that theory, the ideal gas law P = rho kT is generated from a gas Hamiltonian that has no terms representing interactions between the molecules. Adding interaction terms augments the rho dependence of pressure to P = rho kT (1 + B(T)rho + C(T)rho² + ... ), where the second, third, ... virial coefficients B(T), C(T), ... arise from adding terms to the Hamiltonian that represent pairwise molecular interactions (for B(T)), three-way molecular interactions (for C(T)), and so on. Since the nth virial coefficient appears only if there is an n-fold interaction between molecules, the reversed macro to micro inference is automatic, under the usual assumptions of the theory. (Notably, they include that the interaction terms are functions of the differences of molecular positions only.) Since the second, third and all higher order virial coefficients vanish for the ideal gas law, we infer from the law that the gases governed by it have non-interacting molecules. (I am grateful to George Smith for drawing my attention to the virial coefficients.)
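
The virial form of the pressure is easy to evaluate directly. The sketch below truncates the series after the third coefficient; the function name and the numerical values of the density, temperature and second coefficient are hypothetical, chosen only to show that the ideal gas law is recovered exactly when B and C vanish and that a nonzero coefficient shifts the pressure.

```python
K_B = 1.380649e-23   # Boltzmann constant, J/K

def virial_pressure(rho, temperature, b=0.0, c=0.0):
    """P = rho kT (1 + B rho + C rho^2), truncated after the third coefficient.

    rho is a number density; b and c stand for B(T) and C(T) at this temperature.
    """
    return rho * K_B * temperature * (1.0 + b * rho + c * rho * rho)

RHO = 2.5e25     # number density, per m^3 (assumed value)
TEMP = 300.0     # temperature, K (assumed value)
B_HYP = -7.0e-30 # hypothetical second virial coefficient, m^3 per molecule

ideal = virial_pressure(RHO, TEMP)               # B = C = 0: ideal gas law
corrected = virial_pressure(RHO, TEMP, b=B_HYP)  # pairwise interactions switched on
print(ideal, corrected, corrected / ideal - 1.0)
```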

^**Note that conformity to the ideal gas law does not preclude interactions via the momentum degrees of freedom of the molecules. For the presence of such interactions would not preclude recovery of the ideal gas law. Such interactions could appear as a dependence of the kinetic energy E_KE of a molecule on the canonical momenta of the remaining molecules. In computing the probability that the molecule is at height h, we integrate over all these momenta--those of the molecule in question and all others. The resulting term is absorbed into the constant of P(h) = constant . exp(-E(h)/kT). Here, as a separate assumption, we assume that there are no interactions mediated by the momenta. This is a standard assumption in the classical realm. Without it, the molecules of a gas would be interacting with equal strength with all other molecules, no matter how far away they may be. The result would be a breakdown of the locality of the state of a gas and the possibility of divergences as the size of the system becomes arbitrarily large.