A Guide to Continuous Probability Distributions
A continuous probability distribution is a function $F$ which describes a random variable $X$. When a sample of $X$ is taken, the probability of $X$ lying below a given value $x$ is governed by the distribution, according to the law $F(x) = P(X \le x)$.
Specifically, this means that $F$ satisfies:
- Non-decreasing: $F(x) \le F(y)$ whenever $x \le y$.
- Continuous: $\lim_{y \to x} F(y) = F(x)$ for all $x$.
- Limits at infinity: $\lim_{x \to -\infty} F(x) = 0$ while $\lim_{x \to +\infty} F(x) = 1$.
This $F$ is called the cumulative distribution function, or CDF. When its derivative exists, it is given the label $f = F'$ and called the probability density function, or PDF.
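As a quick numerical illustration of the relationship $f = F'$, here is a minimal sketch using scipy.stats, with the standard normal standing in for a generic distribution (the grid and tolerance are arbitrary choices):

```python
import numpy as np
from scipy import stats

# Use the standard normal as a stand-in distribution.
dist = stats.norm()

x = np.linspace(-4.0, 4.0, 2001)

# The PDF is the derivative of the CDF: f = F'.
pdf_numeric = np.gradient(dist.cdf(x), x)

# Agreement up to finite-difference error.
print(np.max(np.abs(pdf_numeric - dist.pdf(x))))  # ~1e-6
```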
On this page, I catalogue every probability density function which is important or famous enough to deserve a name, and provide a set of summary statistics and properties.
The Uniform Distribution
The uniform distribution is extremely simple: heuristically, every same-sized subinterval within a finite support interval has an equal probability of occurring. It was first named by James Victor Uspensky in his 1937 book Introduction to Mathematical Probability, though it had been in use for centuries before that.
The distribution is parameterised by two real numbers:
- $a$ is the start. It controls the lower bound of the distribution.
- $b$ is the end ($b > a$). It controls the upper bound of the distribution.
A standard uniform distribution (usually implemented as the basic “random” function on computers) takes $a = 0$ and $b = 1$.
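In other words, any uniform variate can be built from the standard one by shifting and scaling; a minimal sketch (the endpoints passed in are arbitrary):

```python
import random

def uniform_sample(a: float, b: float) -> float:
    """Draw from Uniform(a, b) by rescaling a standard uniform variate."""
    u = random.random()      # standard uniform: a = 0, b = 1
    return a + (b - a) * u   # shift and scale onto [a, b)

print(uniform_sample(2.0, 5.0))  # some value in [2, 5)
```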
The moments of the uniform distribution are:

$$\mathbb{E}[X] = \frac{a + b}{2}, \qquad \operatorname{Var}[X] = \frac{(b - a)^2}{12}, \qquad \mathbb{E}[X^n] = \frac{b^{n+1} - a^{n+1}}{(n + 1)(b - a)},$$

with the quantiles given by

$$Q(p) = a + p\,(b - a), \qquad p \in [0, 1].$$
The uniform distribution has a few basic properties:
- Uniformity: the PDF is constant across the support, so any subinterval of length $\ell$ has probability $\ell / (b - a)$.
- Symmetry: as a result, the PDF is symmetrical about the mean of the distribution.
- Coincident Averages: the mean and median of the distribution coincide; this value is also a mode (as is every value within the support).
- Linearity: the CDF of the uniform distribution is linear on the entire support. In fact, this is the only distribution with this property.
- Invariances: if $X$ is uniformly distributed, so is $cX + d$ for any $c \ne 0$ and any real $d$.
Suppose $U_1, U_2, \ldots, U_n$ is an i.i.d. sequence of standard uniform random variables. Then:
- The exponentiation $U_1^{1/\alpha}$ follows a Beta distribution with shape parameters $\alpha$ and $1$.
- In fact, the uniformly distributed $U_1$ is itself a special case of the Beta distribution, with shape parameters $\alpha = 1$ and $\beta = 1$.
- The average $\bar{U} = \frac{1}{n}(U_1 + \cdots + U_n)$ follows a standard Bates distribution with parameter $n$.
- The sum $U_1 + U_2$ follows a triangular distribution with start $0$, end $2$, and mode $1$ (verified numerically in the sketch after this list).
- The absolute difference $|U_1 - U_2|$ follows another triangular distribution with start $0$, end $1$, and mode $0$.
- The scaling $cU_1$ for $c > 0$ is itself uniformly distributed with start $0$ and end $c$.
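The triangular-distribution facts above are easy to check by simulation; a minimal numpy sketch (the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
u1 = rng.random(1_000_000)
u2 = rng.random(1_000_000)

# U1 + U2 ~ Triangular(0, 2, 1): mean 1, variance 2 * (1/12) = 1/6.
s = u1 + u2
print(s.mean(), s.var())       # ~1.0, ~0.1667

# |U1 - U2| ~ Triangular(0, 1, 0): mean 1/3.
print(np.abs(u1 - u2).mean())  # ~0.3333
```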
The uniform distribution frequently arises in situations where there’s no bias towards any particular value. For example, in a well-mixed thermal bath, the starting phase of oscillating particles is uniformly distributed in the interval $[0, 2\pi)$.
The Exponential Distribution
The exponential distribution models the time between independent events occurring at a constant average rate. It was first derived by Agner Krarup Erlang in 1909 while studying telephone call arrivals, though similar work appeared earlier in kinetic theory. The distribution is characterized by the memoryless property: knowing that an exponential process has already lasted a time $s$ doesn’t change the distribution of the remaining time.
The distribution is parameterised by a single positive real number:
- $\lambda > 0$ is the rate parameter. It controls how quickly the exponential decay occurs.
Sometimes the distribution is instead parameterised by $\beta = 1/\lambda$, called the scale parameter. The standard exponential distribution takes $\lambda = 1$.
The moments of the exponential distribution are:

$$\mathbb{E}[X] = \frac{1}{\lambda}, \qquad \operatorname{Var}[X] = \frac{1}{\lambda^2}, \qquad \mathbb{E}[X^n] = \frac{n!}{\lambda^n},$$

with the quantiles given by

$$Q(p) = -\frac{\ln(1 - p)}{\lambda}, \qquad p \in [0, 1).$$
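The closed-form quantile function makes inverse-transform sampling immediate; a minimal numpy sketch (the rate is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # rate parameter

# Push standard uniforms through the quantile function Q(p) = -ln(1 - p) / lam.
u = rng.random(1_000_000)
x = -np.log(1.0 - u) / lam

print(x.mean())  # ~1 / lam = 0.5
print(x.var())   # ~1 / lam^2 = 0.25
```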
The exponential distribution has several important properties:
- Memorylessness: for any $s, t \ge 0$, $P(X > s + t \mid X > s) = P(X > t)$ (demonstrated numerically in the sketch after this list).
- Constant Hazard Rate: the hazard function $h(t) = \frac{f(t)}{1 - F(t)} = \lambda$ is constant.
- Maximum Entropy: among all continuous distributions with support $[0, \infty)$ and mean $1/\lambda$, the exponential distribution has maximum entropy.
- Scaling: if $X$ is exponential with rate $\lambda$, then $cX$ is exponential with rate $\lambda / c$ for $c > 0$.
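A minimal sketch of the memorylessness check (the values of $\lambda$, $s$, and $t$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, s, t = 1.5, 0.8, 1.2

x = rng.exponential(scale=1.0 / lam, size=2_000_000)

# Memorylessness: P(X > s + t | X > s) should equal P(X > t).
print((x > s + t).mean() / (x > s).mean())  # ~exp(-lam * t) ≈ 0.165
print((x > t).mean())
```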
Suppose $X_1, X_2, \ldots, X_n$ is an i.i.d. sequence of exponential random variables with rate $\lambda$. Then:
- The sum $X_1 + \cdots + X_n$ follows a Gamma distribution with shape $n$ and rate $\lambda$.
- The minimum $\min(X_1, \ldots, X_n)$ is exponential with rate $n\lambda$ (verified in the sketch after this list).
- The maximum $\max(X_1, \ldots, X_n)$ has CDF $(1 - e^{-\lambda x})^n$; shifted by $\ln(n)/\lambda$, it converges to a Gumbel distribution as $n$ grows.
- If $N$ counts how many of the partial sums $X_1 + \cdots + X_k$ fall below a time $t$, then $N$ is Poisson with rate $\lambda t$.
- $X_1$ is a special case of the Gamma distribution, with shape parameter $1$ and rate $\lambda$.
- $\lambda X_1$ follows the standard exponential distribution with rate $1$.
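The minimum property in particular is easy to confirm by simulation; a minimal numpy sketch (the rate and $n$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n = 0.7, 5

# Row-wise minima of n i.i.d. exponentials with rate lam.
x = rng.exponential(scale=1.0 / lam, size=(1_000_000, n))
minima = x.min(axis=1)

# Should match an exponential with rate n * lam, i.e. mean 1 / (n * lam).
print(minima.mean(), 1.0 / (n * lam))       # both ~0.286
print(minima.var(), 1.0 / (n * lam) ** 2)   # both ~0.082
```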
The exponential distribution models many real-world phenomena:
- Time between arrivals in a Poisson process, such as radioactive decay events
- Time until failure of electronic components, given a constant failure rate
- Length of telephone calls in a call center
The Poisson Distribution
The Poisson distribution models the number of events occurring in a fixed interval when these events happen at a constant average rate and independently of each other. It was introduced by Siméon Denis Poisson in 1838 in his work “Research on the Probability of Judgments in Criminal and Civil Matters”.
Note that the Poisson distribution is discrete: it takes only integer values. This means it has a probability mass function, rather than a density.
The distribution is parameterised by a single positive real number:
- $\lambda > 0$ is the rate parameter. It represents both the expected number of events and the variance.
A standard Poisson distribution takes $\lambda = 1$.
The moments of the Poisson distribution are:

$$\mathbb{E}[X^n] = B_n(\lambda),$$

where $B_n$ is the $n$-th Bell (Touchard) polynomial evaluated at $\lambda$; in particular, $\mathbb{E}[X] = \operatorname{Var}[X] = \lambda$. The quantiles have no closed form, and are given by inverting the CDF:

$$Q(p) = \min\{k \in \mathbb{N} : F(k) \ge p\}.$$
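The first few Bell polynomial moments can be checked against simulation; a minimal numpy sketch (the rate is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 3.0
x = rng.poisson(lam=lam, size=2_000_000).astype(float)

# First few Bell (Touchard) polynomials evaluated at lam:
#   B_1(x) = x,  B_2(x) = x^2 + x,  B_3(x) = x^3 + 3x^2 + x.
print(x.mean(), lam)                             # ~3
print((x**2).mean(), lam**2 + lam)               # ~12
print((x**3).mean(), lam**3 + 3 * lam**2 + lam)  # ~57
```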
The Poisson distribution has several important properties:
- Additivity: if $X \sim \mathrm{Poisson}(\lambda)$ and $Y \sim \mathrm{Poisson}(\mu)$ are independent, then $X + Y \sim \mathrm{Poisson}(\lambda + \mu)$.
- Infinite Divisibility: for any $n$, a Poisson($\lambda$) random variable can be represented as the sum of $n$ independent Poisson($\lambda/n$) random variables.
- Law of Small Numbers: the limit of Binomial($n, p$) as $n \to \infty$ and $p \to 0$ with $np = \lambda$ held fixed is Poisson($\lambda$) (compared numerically in the sketch after this list).
- Maximum Entropy: among all distributions that arise as sums of independent Bernoulli trials with fixed mean $\lambda$, the Poisson has maximum entropy.
- Independent Increments: the numbers of events occurring in disjoint intervals are independent (the Poisson analogue of memorylessness).
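A minimal sketch of the law of small numbers, comparing the two pmfs with scipy.stats (the values of $\lambda$ and $n$ are arbitrary):

```python
import numpy as np
from scipy import stats

lam, n = 2.0, 10_000
p = lam / n

k = np.arange(8)
# Binomial(n, p) pmf versus the limiting Poisson(lam) pmf.
print(np.abs(stats.binom.pmf(k, n, p) - stats.poisson.pmf(k, lam)).max())  # ~1e-4
```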
Suppose we have a Poisson process with rate $\lambda$. Then:
- The number of events in time $t$ follows Poisson($\lambda t$) (simulated in the sketch after this list).
- The time between successive events follows an Exponential distribution with rate $\lambda$.
- Given that $n$ events occurred in time $t$, their occurrence times are distributed as the order statistics of $n$ i.i.d. Uniform($0, t$) samples.
- The sum of independent Poisson($\lambda_i$) variables is Poisson($\sum_i \lambda_i$).
- If $N \sim \mathrm{Poisson}(\lambda)$ and each event is independently kept with probability $p$, the kept count follows Poisson($\lambda p$).
- The waiting time until the $n$th event follows a Gamma distribution with shape $n$ and rate $\lambda$.
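A minimal sketch of the first two facts: simulate the process via exponential inter-arrival times and check that the count in $[0, t]$ behaves like Poisson($\lambda t$) (the rate, horizon, and trial count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, trials = 3.0, 2.0, 100_000

counts = np.empty(trials, dtype=np.int64)
for i in range(trials):
    # Accumulate exponential inter-arrival times until time t is passed.
    elapsed, events = 0.0, 0
    while True:
        elapsed += rng.exponential(scale=1.0 / lam)
        if elapsed > t:
            break
        events += 1
    counts[i] = events

# The count should follow Poisson(lam * t): mean = variance = 6.
print(counts.mean(), counts.var())
```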
The Poisson distribution models many real-world phenomena:
- Number of radioactive decay events in a fixed time interval
- Number of mutations in a DNA sequence of fixed length
- Number of typing errors per page in a manuscript
- Number of cosmic rays hitting a detector in a fixed time
The Normal Distribution
The normal (or Gaussian) distribution is arguably the most important probability distribution in statistics. It was first introduced by Abraham de Moivre in 1733, and later popularised by Carl Friedrich Gauss in the early 1800s while studying astronomical measurement errors. The distribution’s ubiquity is explained by the Central Limit Theorem: the sum of many independent random variables almost always tends towards normality.
The distribution is parameterised by two real numbers:
- $\mu$ is the mean parameter. It controls the location of the distribution’s peak.
- $\sigma > 0$ is the standard deviation parameter. It controls the spread of the distribution.
The standard normal distribution takes $\mu = 0$ and $\sigma = 1$, denoted $N(0, 1)$.
The moments of the normal distribution are:

$$\mathbb{E}[X] = \mu, \qquad \operatorname{Var}[X] = \sigma^2, \qquad \mathbb{E}[(X - \mu)^n] = \begin{cases} \sigma^n (n - 1)!! & n \text{ even}, \\ 0 & n \text{ odd} \end{cases}$$

(note the double factorial), with the quantiles given by

$$Q(p) = \mu + \sigma \sqrt{2} \, \operatorname{erf}^{-1}(2p - 1).$$
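The double-factorial pattern is easy to see empirically; a minimal numpy sketch for the standard normal (the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(4_000_000)

# Even central moments of N(0, 1) are double factorials; odd moments vanish.
print((z**2).mean())  # ~1  = 1!!
print((z**4).mean())  # ~3  = 3!!
print((z**6).mean())  # ~15 = 5!!
print((z**3).mean())  # ~0
```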
The normal distribution has several important properties:
- Symmetry: the distribution is symmetric about its mean.
- Unimodality: it has a single peak at $x = \mu$.
- Maximum Entropy: among all distributions with given mean and variance, the normal has maximum entropy.
- 68-95-99.7 Rule: approximately 68% of values lie within $\sigma$ of $\mu$, 95% within $2\sigma$, and 99.7% within $3\sigma$.
- Stability: linear combinations of independent normal variables are also normal.
- Reproductive Property: if $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, then $X + Y \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.
- Ubiquity: by the Central Limit Theorem, for any sequence of i.i.d. random variables $X_1, X_2, \ldots$ with finite mean $\mu$ and variance $\sigma^2$, the standardised sum $\frac{1}{\sigma\sqrt{n}}\left(\sum_{i=1}^{n} X_i - n\mu\right)$ converges in distribution to a standard normal random variable (illustrated in the sketch after this list).
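A minimal sketch of the Central Limit Theorem in action, standardising sums of uniforms (the choices of $n$, trial count, and test points are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, trials = 100, 100_000

# Standardise sums of i.i.d. Uniform(0, 1) variables (mean 1/2, variance 1/12).
u = rng.random((trials, n))
z = (u.sum(axis=1) - n * 0.5) / np.sqrt(n / 12.0)

# The empirical CDF should be close to the standard normal CDF.
for x in (-1.0, 0.0, 1.0):
    print((z <= x).mean(), stats.norm.cdf(x))
```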
Suppose $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \sigma^2)$ random variables. Then:
- The sample mean $\bar{X} = \frac{1}{n} \sum_{i} X_i$ follows $N(\mu, \sigma^2 / n)$.
- The sum $\sum_{i} X_i$ follows $N(n\mu, n\sigma^2)$.
- If these are standard normal random variables, the sum of squares $\sum_{i} X_i^2$ follows a Chi-squared distribution with $n$ degrees of freedom.
- The ratio $X_1 / X_2$ of two standard normals follows a standard Cauchy distribution (checked numerically in the sketch after this list).
- The maximum $\max(X_1, \ldots, X_n)$, suitably rescaled, follows a Gumbel distribution asymptotically.
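The Cauchy fact can be checked against the known quartiles of the standard Cauchy; a minimal numpy sketch (the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(1_000_000)
b = rng.standard_normal(1_000_000)
ratio = a / b

# A standard Cauchy has median 0 and quartiles at -1 and +1.
print(np.median(ratio))                # ~0
print(np.percentile(ratio, [25, 75]))  # ~[-1, 1]
```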
The normal distribution appears frequently in the real world:
- Heights and weights in populations
- Measurement errors in scientific experiments
- IQ scores (by design)
- Manufacturing variations in industrial processes
- Noise in electronic systems