# Empirical process

In probability theory, an **empirical process** is a stochastic process that describes the proportion of objects in a system in a given state. For a process on a discrete state space, a **population continuous time Markov chain**^{[1]}^{[2]} or **Markov population model**^{[3]} is a process that counts the number of objects in a given state (without rescaling). In mean field theory, limit theorems are considered as the number of objects becomes large; they generalise the central limit theorem for empirical measures. Applications of the theory of empirical processes arise in non-parametric statistics.^{[4]}

## Definition

For *X*_{1}, *X*_{2}, ..., *X*_{n} independent and identically distributed random variables in **R** with common cumulative distribution function *F*(*x*), the empirical distribution function is defined by

$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{(-\infty, x]}(X_i),$$

where I_{C} is the indicator function of the set *C*.

For every (fixed) *x*, *F*_{n}(*x*) is a sequence of random variables that converges to *F*(*x*) almost surely by the strong law of large numbers. That is, *F*_{n} converges to *F* pointwise. Glivenko and Cantelli strengthened this result by proving that *F*_{n} converges to *F* uniformly in *x*; this is the Glivenko–Cantelli theorem.^{[5]}
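The uniform convergence can be checked numerically. The following is a minimal sketch, assuming Uniform(0, 1) data so that *F*(*x*) = *x*; the helper names `ecdf` and `sup_distance_uniform`, the seed, and the sample sizes are illustrative choices, not part of the theory.

```python
import bisect
import random


def ecdf(sample):
    """Return the empirical distribution function F_n of the sample."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(x):
        # bisect_right counts the observations X_i <= x.
        return bisect.bisect_right(ordered, x) / n

    return F_n


def sup_distance_uniform(sample):
    """sup_x |F_n(x) - F(x)| for F(x) = x, the Uniform(0, 1) CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    # F_n is a step function, so the supremum is attained at a jump:
    # compare F with F_n at each order statistic and just before it.
    return max(
        max((i + 1) / n - x, x - i / n) for i, x in enumerate(ordered)
    )


rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
for n in (100, 1_000, 10_000, 100_000):
    sample = [rng.random() for _ in range(n)]
    # The printed distance should shrink roughly like 1 / sqrt(n).
    print(n, sup_distance_uniform(sample))
```

The supremum is computed exactly rather than on a grid, because for a continuous *F* the largest gap between *F*_{n} and *F* always occurs at one of the sample points.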

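A centered and scaled version of the empirical measure is the signed measure

$$G_n(A) = \sqrt{n}\,\bigl(P_n(A) - P(A)\bigr), \qquad P_n(A) = \frac{1}{n} \sum_{i=1}^{n} I_A(X_i),$$

where *P* is the common distribution of the *X*_{i} and *P*_{n} is the empirical measure. It induces a map on measurable functions *f* given by

$$f \mapsto G_n f = \sqrt{n}\,(P_n - P) f = \sqrt{n} \left( \frac{1}{n} \sum_{i=1}^{n} f(X_i) - \mathbb{E} f \right).$$

By the central limit theorem, *G*_{n}(*A*) converges in distribution to a normal random variable *N*(0, *P*(*A*)(1 − *P*(*A*))) for a fixed measurable set *A*. Similarly, for a fixed function *f*, *G*_{n}*f* converges in distribution to a normal random variable *N*(0, E(*f* − E*f*)^{2}), provided that E*f* and E*f*^{2} exist.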
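The two limiting variances can be illustrated by simulation. This is a rough sketch, assuming *X*_{i} ~ Uniform(0, 1); the set *A* = [0, 0.3], the function *f*(*x*) = *x*², and the helper names `G_n` and `simulate` are hypothetical choices made only for the example.

```python
import math
import random
import statistics


def G_n(f, sample, mean_f):
    """Evaluate sqrt(n) * (P_n f - P f) for one sample."""
    n = len(sample)
    return math.sqrt(n) * (sum(f(x) for x in sample) / n - mean_f)


def simulate(f, mean_f, n=500, replications=2000, seed=0):
    """Draw independent copies of G_n f under X_i ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    return [
        G_n(f, [rng.random() for _ in range(n)], mean_f)
        for _ in range(replications)
    ]


def indicator_A(x):
    # Indicator of A = [0, 0.3], so P(A) = 0.3.
    return 1.0 if x <= 0.3 else 0.0


def square(x):
    # f(x) = x^2, with E f = 1/3 and E f^2 = 1/5 under Uniform(0, 1).
    return x * x


draws_A = simulate(indicator_A, mean_f=0.3)
draws_sq = simulate(square, mean_f=1 / 3)

# Limiting variance for the set: P(A) * (1 - P(A)) = 0.21.
print(statistics.fmean(draws_A), statistics.pvariance(draws_A))
# Limiting variance for f: E f^2 - (E f)^2 = 1/5 - 1/9 = 4/45.
print(statistics.fmean(draws_sq), statistics.pvariance(draws_sq))
```

In both cases the sample mean of the draws should be near 0 and the sample variance near the stated limit, up to Monte Carlo error.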

**Definition**

- $(G_n(c))_{c \in \mathcal{C}}$ is called an *empirical process* indexed by $\mathcal{C}$, a collection of measurable subsets of *S*.
- $(G_n f)_{f \in \mathcal{F}}$ is called an *empirical process* indexed by $\mathcal{F}$, a collection of measurable functions from *S* to **R**.

A significant result in the area of empirical processes is Donsker's theorem. It has led to a study of Donsker classes: sets of functions with the useful property that empirical processes indexed by these classes converge weakly to a certain Gaussian process. While it can be shown that Donsker classes are Glivenko–Cantelli classes, the converse is not true in general.

## Example

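As an example, consider empirical distribution functions. For real-valued iid random variables *X*_{1}, *X*_{2}, ..., *X*_{n} they are given by

$$F_n(x) = P_n((-\infty, x]) = P_n I_{(-\infty, x]}.$$

In this case, empirical processes are indexed by the class $\mathcal{C} = \{(-\infty, x] : x \in \mathbb{R}\}$. It has been shown that $\mathcal{C}$ is a Donsker class; in particular,

- $\sqrt{n}\,(F_n(x) - F(x))$ converges weakly in $\ell^\infty(\mathbb{R})$ to a Brownian bridge *B*(*F*(*x*)).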
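One consequence of the Brownian bridge limit is its covariance structure: for a standard Brownian bridge, Cov(*B*(*s*), *B*(*t*)) = min(*s*, *t*) − *st*. The sketch below checks this by simulation, assuming Uniform(0, 1) data so that *F*(*x*) = *x*; the evaluation points 0.3 and 0.7, the sample sizes, and the helper names are illustrative assumptions.

```python
import math
import random


def empirical_process_at(points, sample):
    """Return sqrt(n) * (F_n(x) - x) at each x, for Uniform(0, 1) data."""
    n = len(sample)
    return [
        math.sqrt(n) * (sum(1 for v in sample if v <= x) / n - x)
        for x in points
    ]


def covariance(us, vs):
    """Population covariance of two equal-length lists."""
    mu, mv = sum(us) / len(us), sum(vs) / len(vs)
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / len(us)


rng = random.Random(1)  # arbitrary fixed seed
s, t = 0.3, 0.7
values_s, values_t = [], []
for _ in range(2000):
    sample = [rng.random() for _ in range(400)]
    g_s, g_t = empirical_process_at([s, t], sample)
    values_s.append(g_s)
    values_t.append(g_t)

# Brownian bridge covariance: min(s, t) - s * t = 0.3 - 0.21 = 0.09.
print(covariance(values_s, values_t), min(s, t) - s * t)
# Brownian bridge variance at s: s * (1 - s) = 0.21.
print(covariance(values_s, values_s), s * (1 - s))
```

Matching second moments at two points is only a partial check; weak convergence in $\ell^\infty(\mathbb{R})$ is a statement about the whole path, which a finite simulation cannot verify.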

## Further reading

- Billingsley, P. (1995). *Probability and Measure* (Third ed.). New York: John Wiley and Sons. ISBN 0471007102.
- Dudley, R. M. (1999). *Uniform Central Limit Theorems*. Cambridge Studies in Advanced Mathematics. **63**. Cambridge, UK: Cambridge University Press.
- van der Vaart, Aad W.; Wellner, Jon A. (2000). *Weak Convergence and Empirical Processes: With Applications to Statistics* (2nd ed.). Springer. ISBN 978-0-387-94640-5.

## External links

- Empirical Processes: Theory and Applications, by David Pollard, a textbook available online.
- Introduction to Empirical Processes and Semiparametric Inference, by Michael Kosorok, another textbook available online.