Asymptotic equipartition property
In information theory, the asymptotic equipartition property is a general property of the output samples of a stochastic source. It is fundamental to the concept of typical set used in theories of data compression.
Roughly speaking, the theorem states that although there are many series of results that may be produced by a random process, the one actually produced is most probably from a loosely defined set of outcomes that all have approximately the same chance of being the one actually realized. Although there are individual outcomes which have a higher probability than any outcome in this set, the vast number of outcomes in the set almost guarantees that the outcome will come from the set. One way of intuitively understanding the property is through Cramér's large deviation theorem, which states that the probability of a large deviation from mean decays exponentially with the number of samples. Such results are studied in large deviations theory; intuitively, it is the large deviations that would violate equipartition, but these are unlikely.
In the field of pseudorandom number generation, a candidate generator of undetermined quality whose output sequence lies too far outside the typical set by some statistical criteria is rejected as insufficiently random. Thus, although the typical set is loosely defined, practical notions arise concerning sufficient typicality.
Definition
Given a discrete-time stationary ergodic stochastic process on the probability space, the asymptotic equipartition property is an assertion thatwhere or simply denotes the entropy rate of, which must exist for all discrete-time stationary processes including the ergodic ones. The asymptotic equipartition property is proved for finite-valued stationary ergodic stochastic processes in the Shannon–McMillan–Breiman theorem using the ergodic theory and for any i.i.d. sources directly using the law of large numbers in both the discrete-valued case and the continuous-valued case. The definition of the asymptotic equipartition property can also be extended for certain classes of continuous-time stochastic processes for which a typical set exists for long enough observation time. The convergence is proven almost sure in all cases.
Discrete-time i.i.d. sources
Given is an i.i.d. source which may take values in the alphabet, its time series is i.i.d. with entropy. The weak law of large numbers gives the asymptotic equipartition property with convergence in probability,since the entropy is equal to the expectation of
The strong law of large numbers asserts the stronger almost sure convergence,
Discrete-time finite-valued stationary ergodic sources
Consider a finite-valued sample space, i.e., for the discrete-time stationary ergodic process defined on the probability space. The asymptotic equipartition property for such stochastic source is known as the Shannon–McMillan–Breiman theorem, due to Claude Shannon, Brockway McMillan, and Leo Breiman.- Let x denote some measurable set for some
- Parameterize the joint probability by n and x as
- Parameterize the conditional probability by i, k and x as
- Take the limit of the conditional probability as k → ∞ and denote it as
- Argue the two notions of entropy rate
- Argue that both
- Define the k-th order Markov approximation to the probability as
- Argue that is finite from the finite-value assumption.
- Express in terms of the sample mean of and show that it converges almost surely to Hk
- Define the probability measure
- Express in terms of the sample mean of and show that it converges almost surely to H∞.
- Argue that as k → ∞ using the stationarity of the process.
- Argue that H = H∞ using the Lévy's martingale convergence theorem and the finite-value assumption.
- Show that
- Show that
- Show that
- Similarly, show that
- Show that limsup of
- Show that liminf and limsup of
- Complete the proof by pointing out that the upper and lower bounds are shown previously to approach H as k → ∞.
Non-stationary discrete-time source producing independent symbols
We assume that the source is producing independent symbols, with possibly different output statistics at each instant. We assume that the statistics of the process are known completely, that is, the marginal distribution of the process seen at each time instant is known. The joint distribution is just the product of marginals. Then, under the condition that for all i, for some M > 0, the following holds :
where
It is obvious that the proof holds if any moment is uniformly bounded for r > 1.
Even this condition is not necessary, but given a non-stationary random process, it should not be difficult to test whether the asymptotic equipartition property holds using the above method.
Applications
The asymptotic equipartition property for non-stationary discrete-time independent process leads us to the source coding theorem for non-stationary source and noisy-channel coding theorem for non-stationary memoryless channels.Continuous-time stationary ergodic sources
Discrete-time functions can be interpolated to continuous-time functions. If such interpolation f is measurable, we may define the continuous-time stationary process accordingly as. If the asymptotic equipartition property holds for the discrete-time process, as in the i.i.d. or finite-valued stationary ergodic cases shown above, it automatically holds for the continuous-time stationary process derived from it by some measurable interpolation. i.e.where n corresponds to the degree of freedom in time. and are the entropy per unit time and per degree of freedom respectively, defined by Shannon.
An important class of such continuous-time stationary process is the bandlimited stationary ergodic process with the sample space being a subset of the continuous functions. The asymptotic equipartition property holds if the process is white, in which case the time samples are i.i.d., or there exists T > 1/2W, where W is the nominal bandwidth, such that the T-spaced time samples take values in a finite set, in which case we have the discrete-time finite-valued stationary ergodic process.
Any time-invariant operations also preserves the asymptotic equipartition property, stationarity and ergodicity and we may easily turn a stationary process to non-stationary without losing the asymptotic equipartition property by nulling out a finite number of time samples in the process.
Category theory
A category theoretic definition for the equipartition property is given by Gromov. Given a sequence of Cartesian powers of a measure space P, this sequence admits an asymptotically equivalent sequence HN of homogeneous measure spaces .The above requires a definition of asymptotic equivalence. This is given in terms of a distance function, giving how much an injective correspondence differs from an isomorphism. An injective correspondence is a partially defined map that is a bijection; that is, it is a bijection between a subset and. Then define
where |S| denotes the measure of a set S. In what follows, the measure of P and Q are taken to be 1, so that the measure spaces are probability spaces. This distance is commonly known as the earth mover's distance or Wasserstein metric.
Similarly, define
with taken to be the counting measure on P. Thus, this definition requires that P be a finite measure space. Finally, let
A sequence of injective correspondences are then asymptotically equivalent when
Given a homogenous space sequence HN that is asymptotically equivalent to PN, the entropy H of P may be taken as
Journal articles
- Claude E. Shannon. "A Mathematical Theory of Communication". Bell System Technical Journal, July/October 1948.
- Sergio Verdu and Te Sun Han. "The Role of the Asymptotic Equipartition Property in Noiseless Source Coding." IEEE Transactions on Information Theory, 43: 847–857, 1997.
Textbooks