In probability theory, Chebyshev’s inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. Specifically, no more than 1/k^2 of the distribution’s values can be more than k standard deviations away from the mean. The inequality has great utility because it can be applied to any probability distribution in which the mean and variance are defined. For example, it can be used to prove the weak law of large numbers.
Proposition Let X be a random variable having finite mean µ and finite variance σ^2. Let k > 0 (i.e., k is a strictly positive real number). Then, the following inequality, called Chebyshev’s inequality, holds:
P(|X - µ| >= k) <= σ^2 / k^2
The proof is a straightforward application of Markov’s inequality:
Since (X-µ)^2 is a non-negative random variable, we can apply Markov’s inequality to it:
P((X-µ)^2 >= c) <= E[(X-µ)^2] / c   for every c > 0
By setting c=k^2, we obtain
P((X-µ)^2 >= k^2) <= E[(X-µ)^2] / k^2
But (X-µ)^2 >= k^2 if and only if |X-µ| >= k (because k > 0), so we can write
P(|X-µ| >= k) <= E[(X-µ)^2] / k^2
Furthermore, by the very definition of variance,
E[(X-µ)^2] = Var[X] = σ^2
Therefore,
P(|X-µ| >= k) <= σ^2 / k^2
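As a quick numerical sanity check of the inequality, we can compare the empirical tail probability with the Chebyshev bound. This is only a sketch using the Python standard library; the exponential distribution is an arbitrary illustrative choice.

```python
# Empirical check of Chebyshev's inequality: for any distribution,
# the fraction of values with |x - mu| >= k is at most var / k^2.
import random
import statistics

random.seed(0)
# Exponential(1) sample: skewed, far from normal, but mean and variance exist.
sample = [random.expovariate(1.0) for _ in range(100_000)]
mu = statistics.fmean(sample)
var = statistics.pvariance(sample, mu)

for k in (1.5, 2.0, 3.0):
    frac = sum(abs(x - mu) >= k for x in sample) / len(sample)
    bound = var / k**2
    print(f"k={k}: empirical tail {frac:.4f} <= Chebyshev bound {bound:.4f}")
```

Note that the bound holds exactly for the empirical distribution when we use the sample’s own mean and variance, which is why the check cannot fail here.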
Setting k = Kσ, i.e., measuring the distance from the mean in standard deviations, gives P(|X - µ| >= Kσ) <= 1/K^2, or equivalently P(|X - µ| < Kσ) >= 1 – 1/K^2. To illustrate the inequality, we will look at it for a few values of K:
- For K = 2 we have 1 – 1/K^2 = 1 – 1/4 = 3/4 = 75%. So Chebyshev’s inequality says that at least 75% of the data values of any distribution must be within two standard deviations of the mean.
- For K = 3 we have 1 – 1/K^2 = 1 – 1/9 = 8/9 ≈ 89%. So Chebyshev’s inequality says that at least about 89% of the data values of any distribution must be within three standard deviations of the mean.
- For K = 4 we have 1 – 1/K^2 = 1 – 1/16 = 15/16 = 93.75%. So Chebyshev’s inequality says that at least 93.75% of the data values of any distribution must be within four standard deviations of the mean.
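The calculations above can be tabulated in a couple of lines; this is a minimal sketch of the 1 – 1/K^2 bound for a few values of K.

```python
# Chebyshev's lower bound on the fraction of values within K standard
# deviations of the mean: 1 - 1/K^2.
for K in (2, 3, 4):
    bound = 1 - 1 / K**2
    print(f"at least {bound:.2%} of values lie within {K} standard deviations of the mean")
```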
Use of the Inequality
If we know more about the distribution that we’re working with, then we can usually guarantee that more of the data lies within a certain number of standard deviations of the mean. For example, if we know that we have a normal distribution, then about 95% of the data lies within two standard deviations of the mean. Chebyshev’s inequality says that in this situation at least 75% of the data lies within two standard deviations of the mean. As this case shows, the true proportion can be much larger than the 75% the bound guarantees.
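We can make this comparison concrete with the standard library’s `statistics.NormalDist`; a sketch comparing the exact normal coverage against the distribution-free Chebyshev bound:

```python
# Exact two-sided coverage of a normal distribution vs. the Chebyshev
# lower bound, which holds for any distribution with finite variance.
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
for K in (2, 3):
    exact = std_normal.cdf(K) - std_normal.cdf(-K)  # P(|X| < K) for the normal
    chebyshev = 1 - 1 / K**2                        # guaranteed for any distribution
    print(f"K={K}: normal coverage {exact:.4f} vs Chebyshev bound {chebyshev:.4f}")
```

For K = 2 the normal coverage is about 95.45%, comfortably above Chebyshev’s 75% floor.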
The value of the inequality is that it gives us a “worst-case” scenario in which the only things we know about our sample data (or probability distribution) are the mean and standard deviation. When we know nothing else about our data, Chebyshev’s inequality still provides some insight into how spread out the data set is.