LIOUVILLE’S THEOREM AND THE FOUNDATION OF CLASSICAL MECHANICS

1. Introduction

In this article, the theory of classical mechanics is approached from a different perspective. Its purpose is entirely pedagogical. I have taught classical mechanics in the traditional way, starting with Newton’s laws of motion and following up with Hamilton’s principle, the Euler–Lagrange equations of motion, Hamilton’s equations and Liouville’s theorem. The students have, in general, had problems seeing the relations between the standard mathematical representations of classical mechanics. In each class, there are typically a few students who ask whether there exists a foundational principle of classical mechanics that is independent of the specific mathematical representation chosen. I have never been able to answer this question in a satisfactory manner. This article grew out of the desire to address this question.

Clearly, there is no reason to believe that there exists a unique principle. However, in this article a specific point of departure is identified and shown to lead to the traditional formulations. The suggested principle is that of the conservation of information. It is argued that Liouville’s theorem is the mathematical representation of this principle. Hamilton’s equations, Hamilton’s principle of least action and the invariance of the Poisson algebra are then understood as different manifestations of Liouville’s theorem.

There is nothing new appearing in this article. Everything is known from before. What then, one might ask, is the purpose and use of the article? The answer is threefold. First, and foremost, it suggests an alternative way to teach the subject. Having taught for many years, I find it obvious that it is beneficial to have a diverse repertoire when it comes to presenting and explaining a topic. Personally, I take great pleasure in being able to explain the subject to my students in different ways. Secondly, it provides a different point of view on an old and well-known subject. Even though it might not be of any use in the immediate, or near, future, it is generally good to be aware of a multitude of equivalent perspectives on any given problem. Thirdly, to the best of my knowledge, Hamilton’s principle has never been derived from Liouville’s theorem.

2. Determinism and information

In classical mechanics, it is a fundamental assumption that the evolution of a system is deterministic in both directions of time, i.e. both into the future and into the past. The deterministic evolution of a system means that it is possible, with absolute certainty, to say that any given state of the system evolved from a definite single state in the past and will evolve into a definite single state in the future. There cannot be any ambiguity in the evolutionary history of a system. Thus, the deterministic evolution implies that nowhere on the phase space can states converge or diverge (see Fig. 1).

Fig. 1. Non-deterministic evolution implies that system trajectories would cross each other on the phase space, here at point (q₀, p₀).

Systems that appear to evolve non-deterministically give rise to the appearance of irreversible processes in nature. The reason is that if a system starts out in a given state, reversing its motion in time does not necessarily bring it back to that initial state. An example of a seemingly irreversible process is the sliding of a block of cheese along a table. Due to friction the block will always come to rest, apparently independently of the initial condition of the block. Thus, it appears as though the multitude of possible initial states for the block, given by the possibility of sending off the block with different initial speeds, all converge to the same final state where the block is at rest. Knowing the final state of the system does not help in predicting the initial state of the system. Therefore, the experiment with sending off the block of cheese seems to represent an evolution which is non-deterministic into the past.

The apparent violation of reversibility in physical processes does not stem from a fundamental feature of physical laws, but from the ignorance of the observer. The observer has not taken into account all the details of the system: degrees of freedom of the system have been ignored. In the case of the sliding block of cheese, it is the individual motion of the atoms in the block and the table that has been ignored. If all degrees of freedom of the block and the table are followed in detail as the block slides on the table, it is clear that each unique initial state gives rise to a unique final state, where the final states are distinguished by the final position and velocity of each atom in the block and the table.

A direct consequence of the assumption of deterministic evolution is that distinctions between physical states never disappear. If there is an initial distinction between the states, this distinction will survive throughout the entire motion of the system. That these distinctions seem to disappear as time unfolds is merely a consequence of the difficulty for an observer of keeping perfect track of the motion of all particles. In the case of the sliding block, for a human observer, the distinction between the individual motions of atoms in the block and the table is too small to measure, and therefore it appears as though two distinct initial states, characterized by distinct initial speeds, which are easy to measure, converge to the same final state, i.e. that the block is at rest. In conclusion, the assumption of deterministic evolution can equivalently be stated as follows:

The distinction between the physical states of a closed system is conserved in time.

Due to the conservation of the distinction between physical states, any set of states which lie in the interior of some volume element on the phase space will remain in the interior of this volume element as the system evolves in time.

If a system is followed in detail by an observer as it evolves in time, it means that the observer has perfect and complete knowledge about all the degrees of freedom of the system, i.e. the observer knows, with infinite precision, the exact positions and momenta of all particles within the system. In such an ideal scenario, the observer has no difficulty seeing the distinction between the states of the system. The amount of knowledge, or information, about the system possessed by the observer, at any instant of time, is complete. Since the ideal observer never loses track of the system, the distinction between states is never lost. In other words, the knowledge, or information, that the observer has about the system is not lost as the system evolves in time.

If, however, as is the case in practical reality, the observer has a limited ability to track the motion of individual particles, the observer does not possess complete information about the system. Even worse, the observer may, as is usually the case for complicated systems with many degrees of freedom, find it more and more difficult to track the system as time unfolds. In such a scenario, the amount of information about the system, possessed by the observer, decreases with time. In other words, from the perspective of the ignorant observer, information about the system is lost. However, it is important to emphasize that this apparent loss of information is entirely due to the ignorance of the observer. If all the degrees of freedom were tracked with an infinite precision, information would never be lost. In the case of the sliding block of cheese, the observer has lost information because the system was known to exist in one of two distinct initial states, obtained by measuring the initial speed of the block, whereas it is not possible to distinguish between the two final states.

In conclusion, the loss of the distinction between states implies that information has been lost. Thus, the conservation of distinction between the states can equivalently be stated as an assumption of information conservation:

The information contained within a closed system is conserved in time.

In other words, the assumption that classical systems evolve deterministically, i.e. that the state of the system is perfectly predictable by an observer both into the future and back to the past, is equivalent to the statement that an observer of the system possesses complete information about the system and, assuming that the system is closed, this information is never lost.

3. Liouville’s theorem

Consider the arbitrary region Ω on the 2-dimensional phase space, with the volume V_Ω and the volume element ∆q∆p. The mathematical condition imposing information conservation is

\frac{Δ N}{Δ t} = 0, (1)

where N is the number of states within the phase space volume Ω. The condition states that N can neither increase nor decrease within the time interval ∆t. For this condition to be satisfied, it is necessary that the incoming and outgoing flows of states through Ω within ∆t cancel, i.e. that

Δ (ρ (q, p) \dot{q}) + Δ (ρ (q, p) \dot{p}) = 0, (2)

where ρ(q, p) is the density of states on the phase space, and the flow differences are defined by, respectively,

Δ (ρ (q, p) \dot{q}) \equiv \{ρ (q_{out}, p) {\dot{q}}_{out} - ρ (q_{in}, p) {\dot{q}}_{in}\} Δ p (3)

and

Δ (ρ (q, p) \dot{p}) \equiv \{ρ (q, p_{out}) {\dot{p}}_{out} - ρ (q, p_{in}) {\dot{p}}_{in}\} Δ q . (4)

In differential form the condition, after having been extended to be valid for an arbitrary length of time, reads in vector notation as

\frac{\partial ρ}{\partial t} + \nabla \cdot (ρ v) = 0, (5)

where

\nabla \equiv (\frac{\partial}{\partial q}, \frac{\partial}{\partial p}) (6)

is the differential operator on the phase space, and

v \equiv (\dot{q}, \dot{p}) (7)

is the velocity by which states flow on the phase space. Equation (5) is Liouville’s continuity equation [1] for the density of states on the phase space. It says that the number of states is locally conserved. The term ∇·(ρv) represents the net flow of states through Ω, i.e. the difference between the outflow and inflow of states. The continuity equation can be rewritten as

\frac{d ρ}{d t} + ρ \nabla \cdot v = 0, (8)

by using the total time derivative of the density of states and the product rule applied to the net flow of states. Thus, if the divergence of the phase flow velocity vanishes, i.e. if

\nabla \cdot v = 0, (9)

then, by the continuity equation, the density of states on the phase space is constant in time along the flow on the phase space, i.e.

\frac{d ρ}{d t} = 0. (10)

In such a situation, the flow of the system on the phase space is incompressible: since the density of states carried along with the flow does not change over time, the states within an arbitrary region Ω do not lump together. In other words, a necessary and sufficient condition for the flow of the system on the phase space to conserve information is that the divergence of the phase flow velocity vanishes. This is Liouville’s theorem [1].¹
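The connection between a vanishing divergence and the conservation of phase space volume can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a harmonic oscillator with unit mass and frequency as the divergence-free example, and a block subject to viscous drag (a stand-in for the friction on the sliding block of cheese, with a hypothetical drag coefficient `gamma`) as the compressible one. Three nearby states span a small triangle on the phase space, and the area of the triangle is compared before and after the flow.

```python
import math

def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c in the (q, p) plane."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def oscillator_flow(state, t):
    """Exact flow of H = p^2/2 + q^2/2 (unit mass and frequency): a rotation."""
    q, p = state
    return (q * math.cos(t) + p * math.sin(t), -q * math.sin(t) + p * math.cos(t))

def drag_flow(state, t, gamma=0.5):
    """Exact flow of q_dot = p, p_dot = -gamma * p (unit mass, assumed viscous drag)."""
    q, p = state
    decay = math.exp(-gamma * t)
    return (q + p * (1.0 - decay) / gamma, p * decay)

# Three nearby states spanning a small region of the phase space.
blob = [(1.0, 0.5), (1.1, 0.5), (1.0, 0.6)]
t = 3.0

area_start = triangle_area(*blob)
area_oscillator = triangle_area(*(oscillator_flow(s, t) for s in blob))
area_drag = triangle_area(*(drag_flow(s, t) for s in blob))

print(area_start, area_oscillator, area_drag)
```

For the oscillator, ∇ · v = 0 and the area is unchanged. For the drag flow, ∇ · v = −gamma and the area shrinks by the factor exp(−gamma t), which is the lumping together of states that an observer who ignores the microscopic degrees of freedom would perceive as a loss of information.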

The 2-dimensional form of Liouville’s theorem generalizes straightforwardly to the 6N-dimensional phase space. Each conjugate pair (q_j, p_j), where j ∈ [1, 3N], gives rise to an independent Liouville continuity equation, i.e.

\frac{d ρ_{j}}{d t} + ρ_{j} \nabla \cdot v_{j} = 0, j \in [1, 3 N], (11)

where ρ_j ≡ ρ(q_j, p_j) is the density of states in the 2-dimensional subset (q_j, p_j) of the 6N-dimensional phase space and ${\vec{v}}_{j} \equiv ({\dot{q}}_{j}, {\dot{p}}_{j})$ is the phase flow velocity along this subset. Thus, information is conserved on the 6N-dimensional phase space if the divergence of each phase flow velocity ${\vec{v}}_{j}$ vanishes, i.e. if

\nabla \cdot v_{j} = 0 \forall j \in [1, 3 N] . (12)

4. Hamilton’s equations

The vanishing divergence of the flow velocity v_j for all conjugate pairs (q_j, p_j), j ∈ [1, 3N], written out explicitly in terms of its velocity components q̇_j and ṗ_j, becomes

\frac{\partial {\dot{q}}_{j}}{\partial q_{j}} + \frac{\partial {\dot{p}}_{j}}{\partial p_{j}} = 0 \forall j \in [1, 3 N] . (13)

Let $H$ be a smooth function on the 6N-dimensional phase space with the property that it contains no terms that mix different conjugate pairs, e.g. p_i · p_j, ∀i ≠ j. In this situation, taking into account that the conjugate pairs ${\{(q_{j}, p_{j})\}}_{j = 1}^{3 N}$ are postulated to be independent, the condition of vanishing divergence can equivalently be stated by the set of differential equations known as Hamilton’s equations,

{\dot{q}}_{j} = \frac{\partial H}{\partial p_{j}} \forall j \in [1, 3 N], (14)

{\dot{p}}_{j} = - \frac{\partial H}{\partial q_{j}} \forall j \in [1, 3 N] . (15)

Under these circumstances, Hamilton’s equations are, according to the Liouville–Arnold theorem [4, 5], integrable from a set of known initial conditions. This simply means that they describe an evolution of the system which is unique and deterministic. Thus, given the function $H$, the flow of the system in time is determined by how $H$ changes on the phase space. In this sense, $H$ is said to be the generator for the motion in time of the system on the phase space. The flow of the system on the phase space, described by Hamilton’s equations, is referred to as a Hamiltonian flow.
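That a flow velocity built from Eqs. (14) and (15) satisfies Eq. (13) follows from the equality of the mixed second derivatives of $H$. The sketch below illustrates this numerically with central differences. The pendulum Hamiltonian, the evaluation point and the drag coefficient `gamma` are illustrative assumptions and do not appear in the text above.

```python
import math

def hamiltonian(q, p):
    """Pendulum with unit mass and length (an illustrative choice of H)."""
    return 0.5 * p * p - math.cos(q)

def hamiltonian_velocity(H, q, p, h=1e-3):
    """Phase flow velocity (q_dot, p_dot) from Eqs. (14)-(15), by central differences."""
    q_dot = (H(q, p + h) - H(q, p - h)) / (2.0 * h)
    p_dot = -(H(q + h, p) - H(q - h, p)) / (2.0 * h)
    return q_dot, p_dot

def divergence(velocity, q, p, h=1e-3):
    """Central-difference estimate of d(q_dot)/dq + d(p_dot)/dp at (q, p)."""
    dqdot_dq = (velocity(q + h, p)[0] - velocity(q - h, p)[0]) / (2.0 * h)
    dpdot_dp = (velocity(q, p + h)[1] - velocity(q, p - h)[1]) / (2.0 * h)
    return dqdot_dq + dpdot_dp

def hamiltonian_flow(q, p):
    """Flow velocity generated by the Hamiltonian above."""
    return hamiltonian_velocity(hamiltonian, q, p)

def damped_flow(q, p, gamma=0.3):
    """Same flow with an assumed drag term -gamma * p added to p_dot; not Hamiltonian."""
    q_dot, p_dot = hamiltonian_velocity(hamiltonian, q, p)
    return q_dot, p_dot - gamma * p

div_hamiltonian = divergence(hamiltonian_flow, 0.7, -1.2)
div_damped = divergence(damped_flow, 0.7, -1.2)
print(div_hamiltonian, div_damped)
```

The first divergence vanishes up to rounding error for any smooth choice of `hamiltonian`, while the damped flow has divergence −gamma and therefore cannot be generated by a function $H$ through Eqs. (14) and (15).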

5. The Hamiltonian and Lagrangian

Equation (14), for a specific conjugate pair (q_j, p_j), corresponds to the integral equation

H (p_{j}) = \int d p_{j} {\dot{q}}_{j} (p_{j}) . (16)

The momentum p_j and speed q̇_j are assumed to be in one-to-one correspondence. This means that for each value of q̇_j there is a unique value for p_j, and vice versa. The function $H$ (p_j) is then geometrically interpreted as the unique area under the q̇_j(p_j)-graph, bounded by (0, p_j) and (0, q̇_j(p_j)), see Fig. 2.

Fig. 2. The areas under the q̇_j(p_j) and p_j(q̇_j) graphs define the Hamiltonian and Lagrangian, respectively.

Due to the one-to-one correspondence between p_j and q̇_j it is possible to define a related area, $L$ (q̇_j), given by the unique area under the p_j(q̇_j)-graph,

L ({\dot{q}}_{j}) = \int d {\dot{q}}_{j} p_{j} ({\dot{q}}_{j}) . (17)

This integral equation corresponds to the differential equation

\frac{d L ({\dot{q}}_{j})}{d {\dot{q}}_{j}} = p_{j} . (18)

The total area of the rectangle bounded by (0, p_j) and (0, q̇_j) is given by

L ({\dot{q}}_{j}) + H (p_{j}) = p_{j} \cdot {\dot{q}}_{j} . (19)

It is possible to include a dependence on the generalized coordinate q_j under the constraint that any q_j-dependent terms in the functions $H$ and $L$ cancel, such that the total area is q_j-independent. Thus, in general, the functions $H$ and $L$ , referred to as the Hamiltonian and Lagrangian, respectively, satisfy the so-called Legendre transformation, i.e.

L (q_{j}, {\dot{q}}_{j}) + H (q_{j}, p_{j}) = p_{j} \cdot {\dot{q}}_{j}, (20)

where

L (q_{j}, {\dot{q}}_{j}) = \int_{0}^{{\dot{q}}_{j}} d {\dot{q}}_{j} p_{j} ({\dot{q}}_{j}) - U (q_{j}), (21)

H (q_{j}, p_{j}) = \int_{0}^{p_{j}} d p_{j} {\dot{q}}_{j} (p_{j}) + U (q_{j}) . (22)

The requirement that the total area is q_j-independent causes the Hamiltonian and Lagrangian to have a relative sign difference for the function U(q_j).
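The area interpretation of Eq. (19) can be illustrated for a relation between momentum and speed that is one-to-one but not linear. The sketch below assumes, purely as an example, the relativistic relation p = m q̇ / √(1 − q̇²) in units where the speed of light is 1, with a hypothetical mass `MASS`. It integrates both graphs numerically with the midpoint rule and compares the sum of the two areas with the rectangle p_j · q̇_j.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

MASS = 1.0  # hypothetical value, in units where the speed of light is 1

def momentum_of_speed(v):
    """Assumed one-to-one relation p(q_dot) = m v / sqrt(1 - v^2)."""
    return MASS * v / math.sqrt(1.0 - v * v)

def speed_of_momentum(p):
    """Inverse relation q_dot(p) = p / sqrt(p^2 + m^2)."""
    return p / math.sqrt(p * p + MASS * MASS)

speed = 0.6
momentum = momentum_of_speed(speed)

H_area = midpoint_integral(speed_of_momentum, 0.0, momentum)  # area in Eq. (16)
L_area = midpoint_integral(momentum_of_speed, 0.0, speed)     # area in Eq. (17)
rectangle = momentum * speed                                  # right-hand side of Eq. (19)

print(H_area, L_area, rectangle)
```

The two areas differ from each other, since the graph is not a straight line, but their sum reproduces the area of the rectangle to within the accuracy of the numerical integration.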

For the 6N-dimensional phase space, the Hamiltonian and Lagrangian are defined by

L (q, \dot{q}) = \sum_{j = 1}^{3 N} \int_{0}^{{\dot{q}}_{j}} d {\dot{q}}_{j} p_{j} ({\dot{q}}_{j}) - U (q), (23)

H (q, p) = \sum_{j = 1}^{3 N} \int_{0}^{p_{j}} d p_{j} {\dot{q}}_{j} (p_{j}) + U (q), (24)

where the function U(q), defined by

U (q) \equiv \sum_{j = 1}^{3 N} U (q_{j}), (25)

is referred to as the potential energy of the system.

6. Principle of stationary action

The pair of Hamilton’s equations

- \frac{\partial H}{\partial q_{j}} - {\dot{p}}_{j} = 0, (26)

{\dot{q}}_{j} - \frac{\partial H}{\partial p_{j}} = 0 (27)

is the local differential representation of the principle of information conservation on the phase space. A global, or integral, representation can be obtained by considering the entire evolutionary path from some initial time t_i to some final time t_f, where Hamilton’s equations are integrated over time.² For this purpose, multiply Hamilton’s equations by two independent arbitrary functions of time, δq_j(t) and δp_j(t), representing, respectively, small displacements in q_j and p_j on the phase space, in the following manner:

(- \frac{\partial H}{\partial q_{j}} - {\dot{p}}_{j}) δ q_{j} (t) = 0, (28)

({\dot{q}}_{j} - \frac{\partial H}{\partial p_{j}}) δ p_{j} (t) = 0. (29)

The displacements δq_j(t) and δp_j(t) are pictured as slight variations of the physical path on the phase space, i.e.

q_{j} (t) \to q_{j} (t) + δ q_{j} (t), (30)

p_{j} (t) \to p_{j} (t) + δ p_{j} (t) . (31)

Equations (28) and (29) are equivalent to Hamilton’s equations since they hold for arbitrary variations. The fact that it is necessary to introduce two displacement functions is due to the independence of the state parameters q_j and p_j. The boundary conditions are given by

δ q_{j} (t_{i}) = δ q_{j} (t_{f}) = 0, (32)

δ p_{j} (t_{i}) = δ p_{j} (t_{f}) = 0, (33)

i.e. the variations vanish at the initial and final times. Adding Eqs. (28) and (29) and integrating over time from t_i to t_f gives, to the leading order in the variations,

\int_{t_{i}}^{t_{f}} d t [(- \frac{\partial H}{\partial q_{j}} - {\dot{p}}_{j}) δ q_{j} (t) + ({\dot{q}}_{j} - \frac{\partial H}{\partial p_{j}}) δ p_{j} (t)] = 0. (34)

Integrating by parts, recalling the boundary conditions and using the Legendre transformation (20) gives

δ A (q_{j}, {\dot{q}}_{j}) = 0, (35)

where

A (q_{j}, {\dot{q}}_{j}) \equiv \int_{t_{i}}^{t_{f}} d t L (q_{j}, {\dot{q}}_{j}) (36)

is the action of the system within the subset (q_j, p_j) on the 6N-dimensional phase space. The action on the entire phase space is given by

\begin{array}{rl} A (q, \dot{q}) & \equiv \sum_{j = 1}^{3 N} \int_{t_{i}}^{t_{f}} d t L (q_{j}, {\dot{q}}_{j}) \\ & = \int_{t_{i}}^{t_{f}} d t L (q, \dot{q}) . \end{array} (37)

Equation (35) is Hamilton’s formulation of the principle of stationary action, or briefly, Hamilton’s principle. It is a global representation of information conservation, i.e. a statement on the entire evolutionary path which must be satisfied if the system is to adhere to the principle of information conservation.
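The step from Eq. (34) to Eq. (35) can be spelled out as follows. This is a sketch of the intermediate algebra under the assumption that the variation of the speed is the time derivative of the variation of the coordinate, δq̇_j = d(δq_j)/dt. The symbol δH denotes the first-order change of the Hamiltonian under the displacements (30) and (31).

```latex
\begin{aligned}
0 &= \int_{t_i}^{t_f} dt \left[ \left( -\frac{\partial H}{\partial q_j} - \dot{p}_j \right) \delta q_j
     + \left( \dot{q}_j - \frac{\partial H}{\partial p_j} \right) \delta p_j \right] \\
  &= \int_{t_i}^{t_f} dt \left[ \dot{q}_j \, \delta p_j - \dot{p}_j \, \delta q_j - \delta H \right],
     \qquad \delta H = \frac{\partial H}{\partial q_j} \, \delta q_j + \frac{\partial H}{\partial p_j} \, \delta p_j , \\
  &= \int_{t_i}^{t_f} dt \left[ \dot{q}_j \, \delta p_j + p_j \, \delta \dot{q}_j - \delta H \right]
     - \Big[ p_j \, \delta q_j \Big]_{t_i}^{t_f} \\
  &= \delta \int_{t_i}^{t_f} dt \left( p_j \dot{q}_j - H \right)
   = \delta \int_{t_i}^{t_f} dt \, L (q_j, \dot{q}_j) .
\end{aligned}
```

The boundary term in the third line vanishes by Eq. (32), and the last equality uses the Legendre transformation (20).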

Since Hamilton’s principle can be derived from Hamilton’s equations, which in turn are an immediate consequence of the requirement that the divergence of the Hamiltonian flow velocity vanishes, it should be possible to obtain Hamilton’s principle directly from the requirement that ∇ · v_j = 0 is invariant under the displacements δq_j(t) and δp_j(t). Given that the variations are small, the flow velocity ${\vec{v}}_{j}$ can be expanded as a Taylor series about the state (q_j, p_j), where terms that are of quadratic or higher order in the variations δq_j and δp_j can be ignored. The infinitesimal change in v_j thus becomes

\begin{array}{rl} δ v_{j} & = v_{j} (q_{j} + δ q_{j}, p_{j} + δ p_{j}) - v_{j} (q_{j}, p_{j}) \\ & = δ q_{j} \frac{\partial}{\partial q_{j}} v_{j} + δ p_{j} \frac{\partial}{\partial p_{j}} v_{j} . \end{array} (38)

The divergence of the flow velocity transforms as

\nabla \cdot v_{j} \to \nabla \cdot (v_{j} + δ v_{j}) = \nabla \cdot v_{j} + \nabla \cdot δ v_{j} . (39)

If ∇ · δv_j ≠ 0, information is not conserved for the deviated path. Therefore, it is required that

\nabla \cdot δ v_{j} = 0, (40)

which is equivalent to

δ (\nabla \cdot v_{j}) = 0. (41)

This statement is for a blob of volume dV which encloses the single state (q_j, p_j). Information conservation should hold for all varied states along the evolutionary path of the system, from the initial state (q_j, p_j)_i, at time t_i, to the final state (q_j, p_j)_f, at time t_f. Thus, the above statement should be integrated over all the blobs of volume dV along the path, i.e. the integration is over a tube, with the volume V, the interior of which defines the region of extended phase space where the principle of information conservation is fulfilled. Thus,

δ \int_{t_{i}}^{t_{f}} d t \int_{V} d V \nabla \cdot v_{j} = 0. (42)

Applying the divergence theorem

\int_{V} d V \nabla \cdot v_{j} = \int_{\partial V} d S \cdot v_{j} (43)

gives

δ \int_{t_{i}}^{t_{f}} d t \int_{\partial V} d S \cdot v_{j} = 0. (44)

The integrand dS · v_j represents the density of the net Hamiltonian flow out of the tube. The surface area element dS is given by

d S = d S n, (45)

where n = (p_j, q_j) is the normal vector to the surface of the tube, i.e. n gives the direction in the phase space in which the system has to flow if it is to eventually reach a region where the principle of the conservation of information no longer holds. Thus, with v_j = (q̇_j, ṗ_j), the integrand becomes

(p_{j}, q_{j}) \cdot ({\dot{q}}_{j}, {\dot{p}}_{j}) = p_{j} {\dot{q}}_{j} + q_{j} {\dot{p}}_{j} . (46)

Using that q_j = ∫dq_j and Hamilton’s equation ${\dot{p}}_{j} = - \frac{\partial H}{\partial q_{j}}$, the integrand can be written as

p_{j} {\dot{q}}_{j} - \int d q_{j} \frac{\partial H}{\partial q_{j}} = p_{j} {\dot{q}}_{j} - \int d H = p_{j} {\dot{q}}_{j} - H . (47)

Equivalently, the integrand could have been written as

q_{j} {\dot{p}}_{j} + H, (48)

by using that p_j = ∫dp_j and the other Hamilton equation ${\dot{q}}_{j} = \frac{\partial H}{\partial p_{j}}$. However, the form p_j q̇_j − $H$ is the preferred choice due to the fact that it is equal to the Lagrangian $L$(q_j, q̇_j). Thus, on the 6N-dimensional phase space it is obtained that

δ \int_{t_{i}}^{t_{f}} d t \int d S L = 0. (49)

The equality must hold independently of the surface area of the tube, i.e. the principle of information conservation should hold true independently of the number of states in which the system can exist. Therefore, the integration over the surface area can be taken outside of the infinitesimal variation, giving that

δ \int_{t_{i}}^{t_{f}} d t L = 0, (50)

which is, again, Hamilton’s principle. Thus, Hamilton’s principle can be derived directly from Liouville’s theorem.
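The content of Eq. (50) can be illustrated numerically: the first-order change of the action vanishes for a path that solves the equations of motion and does not vanish for a generic path with the same endpoints. The sketch below assumes a harmonic oscillator with unit mass and frequency, $L$ = q̇²/2 − q²/2, a simple discretization of the action, and a variation `bump` that vanishes at both endpoints. All of these are illustrative choices rather than part of the derivation above.

```python
import math

def action(path, dt):
    """Discretized action for L = q_dot^2/2 - q^2/2 (unit mass and frequency)."""
    total = 0.0
    for q_now, q_next in zip(path, path[1:]):
        q_dot = (q_next - q_now) / dt
        q_mid = 0.5 * (q_now + q_next)
        total += dt * (0.5 * q_dot * q_dot - 0.5 * q_mid * q_mid)
    return total

def first_variation(base, bump, dt, eps=1e-3):
    """Central-difference estimate of dA/d(eps) for the path base + eps * bump."""
    plus = [b + eps * d for b, d in zip(base, bump)]
    minus = [b - eps * d for b, d in zip(base, bump)]
    return (action(plus, dt) - action(minus, dt)) / (2.0 * eps)

steps = 2000
t_final = 1.0
dt = t_final / steps
times = [i * dt for i in range(steps + 1)]

physical = [math.sin(t) for t in times]                      # solves q_ddot = -q
straight = [math.sin(t_final) * t / t_final for t in times]  # same endpoints, not a solution
bump = [math.sin(math.pi * t / t_final) for t in times]      # vanishes at both endpoints

dA_physical = first_variation(physical, bump, dt)
dA_straight = first_variation(straight, bump, dt)
print(dA_physical, dA_straight)
```

Up to discretization error, the first variation is zero for the physical path, whereas for the straight-line path it is close to −sin(1)/π, the value obtained by evaluating the integral of q̇ δq̇ − q δq analytically.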

7. Invariance of the Poisson algebra

Given that the divergence of the Hamiltonian flow velocity vanishes, the Liouville equation can be written as

\frac{\partial ρ}{\partial t} + \nabla ρ \cdot v = 0. (51)

The Poisson bracket {ρ, $H$ } between the density of states ρ and the Hamiltonian $H$ is defined by

\{ρ, H\} \equiv \nabla ρ \cdot v = \frac{\partial ρ}{\partial q} \frac{\partial H}{\partial p} - \frac{\partial ρ}{\partial p} \frac{\partial H}{\partial q} . (52)

In general, the Poisson bracket {A, B} between any two arbitrary functions A and B on the phase space is defined by

\{A, B\} \equiv \frac{\partial A}{\partial q} \frac{\partial B}{\partial p} - \frac{\partial A}{\partial p} \frac{\partial B}{\partial q} . (53)

In this notation, Hamilton’s equations are written as

\dot{q} = \{q, H\}, (54)

\dot{p} = \{p, H\} . (55)

The Poisson bracket satisfies a set of algebraic properties. It is antisymmetric, i.e.

\{A, B\} = - \{B, A\} . (56)

It satisfies linearity, i.e.

\{a A + b B, C\} = a \{A, C\} + b \{B, C\} . (57)

Furthermore, it satisfies the product rule and the Jacobi identity, i.e.

\{A B, C\} = A \{B, C\} + \{A, C\} B, (58)

\{A, \{B, C\}\} + \{B, \{C, A\}\} + \{C, \{A, B\}\} = 0. (59)

These properties define the Poisson algebra of classical mechanics. Since Liouville’s equation for the incompressible Hamiltonian flow can be expressed in terms of the Poisson bracket, Liouville’s theorem can equivalently be stated by saying that the evolution in time of any given system conserves information if it leaves the Poisson algebra invariant.
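A small numerical sketch may help to make the bracket concrete. It evaluates Eq. (53) by central differences, checks Eqs. (54) and (55) for an assumed pendulum Hamiltonian, and then compares the bracket of the time-evolved coordinates for two flows: the exact oscillator flow, which is Hamiltonian, and a flow with viscous drag, which is not. Reading "invariance of the Poisson algebra" as the preservation of the bracket {q, p} = 1 under the flow is an interpretation adopted here for illustration. The elapsed time `T` and the drag coefficient `GAMMA` are hypothetical values.

```python
import math

def poisson_bracket(A, B, q, p, h=1e-5):
    """Central-difference estimate of {A, B} from Eq. (53) at the point (q, p)."""
    dA_dq = (A(q + h, p) - A(q - h, p)) / (2.0 * h)
    dA_dp = (A(q, p + h) - A(q, p - h)) / (2.0 * h)
    dB_dq = (B(q + h, p) - B(q - h, p)) / (2.0 * h)
    dB_dp = (B(q, p + h) - B(q, p - h)) / (2.0 * h)
    return dA_dq * dB_dp - dA_dp * dB_dq

def H(q, p):
    """Pendulum Hamiltonian with unit mass and length (illustrative choice)."""
    return 0.5 * p * p - math.cos(q)

def coord_q(q, p):
    return q

def coord_p(q, p):
    return p

T, GAMMA = 2.0, 0.5  # hypothetical elapsed time and drag coefficient

def rotated_q(q, p):
    """Coordinate after time T under the oscillator flow (Hamiltonian)."""
    return q * math.cos(T) + p * math.sin(T)

def rotated_p(q, p):
    """Momentum after time T under the oscillator flow (Hamiltonian)."""
    return -q * math.sin(T) + p * math.cos(T)

def dragged_q(q, p):
    """Coordinate after time T under viscous drag (not Hamiltonian)."""
    return q + p * (1.0 - math.exp(-GAMMA * T)) / GAMMA

def dragged_p(q, p):
    """Momentum after time T under viscous drag (not Hamiltonian)."""
    return p * math.exp(-GAMMA * T)

q0, p0 = 0.4, 1.3
q_dot = poisson_bracket(coord_q, H, q0, p0)  # Eq. (54)
p_dot = poisson_bracket(coord_p, H, q0, p0)  # Eq. (55)
bracket_hamiltonian = poisson_bracket(rotated_q, rotated_p, q0, p0)
bracket_drag = poisson_bracket(dragged_q, dragged_p, q0, p0)
print(q_dot, p_dot, bracket_hamiltonian, bracket_drag)
```

For the pendulum, the brackets reproduce q̇ = p and ṗ = −sin q. The Hamiltonian flow keeps the bracket of the evolved coordinates equal to 1, whereas the drag flow reduces it to exp(−GAMMA · T), the same factor by which it contracts phase space areas.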

8. Conclusions

Liouville’s theorem is interpreted as the mathematical condition representing the physical conservation of information in classical mechanics. Hamilton’s equations, Hamilton’s principle and the invariance of the Poisson algebra are distinct, but equivalent, manifestations of the theorem.

References

[1] F. Bloch, Fundamentals of Statistical Mechanics, Manuscript and Notes of Felix Bloch, 3rd ed. (Imperial College Press and World Scientific Publishing, London, 2000).

[2] J.W. Gibbs, Elementary Principles in Statistical Mechanics (Charles Scribner’s Sons, New York, 1902).

[3] J. Liouville, Note sur la Théorie de la Variation des constantes arbitraires, J. Math. Pures Appl. 3(1), 342–349 (1838).

[4] J. Liouville, Note sur l’intégration des équations différentielles de la Dynamique, J. Math. Pures Appl. 20(1), 137–138 (1855).

[5] V.I. Arnold, Mathematical Methods of Classical Mechanics, 2nd ed. (Springer-Verlag, New York, 1989).

[6] H. Jeffreys and B.S. Jeffreys, Methods of Mathematical Physics, 3rd ed. (Cambridge University Press, 1956).

¹ To the best of the author’s knowledge, the physical formulation and relevance of the Liouville theorem was first stated by J.W. Gibbs in 1902 [2]. There it was referred to as the ‘Principle of Conservation of Density-in-phase’ or equivalently as the ‘Principle of Conservation of Extension-in-phase’. However, the mathematical background for the theorem dates back to J. Liouville in 1838 [3].

² For the derivation of an integral representation on the configuration space starting from Newton’s second law of motion, see Chapter 10 in Ref. [6].

A. Henriksson

Stavanger Cathedral School, Stavanger, Norway