## Binomial distribution

### Probability mass function

In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function:

$f(k;n,p) = \Pr(K = k) = {n\choose k}p^k(1-p)^{n-k}$

for k = 0, 1, 2, ..., n, where

${n\choose k}=\frac{n!}{k!(n-k)!}$

is the binomial coefficient (hence the name of the distribution) "n choose k", also denoted $C(n, k)$, ${}_nC_k$, or ${}^nC_k$. The formula can be understood as follows: we want k successes (probability $p^k$) and n − k failures (probability $(1-p)^{n-k}$), so any one particular sequence with exactly k successes has probability $p^k(1-p)^{n-k}$. However, the k successes can occur anywhere among the n trials, and there are $C(n, k)$ different ways of distributing k successes in a sequence of n trials.
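As an illustration, the probability mass function can be evaluated directly from the formula. The following is a minimal sketch using only the Python standard library; the function name `binom_pmf` is chosen here for illustration and is not part of any particular library.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    # comb(n, k) counts the arrangements of k successes among n trials;
    # p**k * (1 - p)**(n - k) is the probability of one such arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: exactly 2 heads in 4 tosses of a fair coin.
print(binom_pmf(2, 4, 0.5))  # 6 * 0.25 * 0.25 = 0.375
```

Summing `binom_pmf(k, n, p)` over k = 0, ..., n should give 1 up to floating-point rounding, which is a quick sanity check on any implementation.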

Reference tables of binomial probabilities are usually filled in only up to n/2 values. This is because for k > n/2, the probability can be obtained by exchanging the roles of successes and failures:

$f(k;n,p)=f(n-k;n,1-p).$
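The identity can be checked numerically. This is a small sketch, not a proof; the helper `binom_pmf` and the values of n and p are chosen here purely for illustration.

```python
from math import comb, isclose

def binom_pmf(k, n, p):
    """Binomial probability mass function f(k; n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def symmetry_holds(n, p):
    """Check that k successes at rate p match n - k successes at rate 1 - p."""
    return all(
        isclose(binom_pmf(k, n, p), binom_pmf(n - k, n, 1 - p))
        for k in range(n + 1)
    )

print(symmetry_holds(10, 0.3))
```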

Looking at the expression f(k; n, p) as a function of k, there is a k value that maximizes it. This k value can be found by calculating

$\frac{f(k+1;n,p)}{f(k;n,p)}=\frac{(n-k)p}{(k+1)(1-p)}$

and comparing it to 1. There is always an integer M that satisfies

$(n+1)p-1 < M \leq (n+1)p. \,$

f(k; n, p) is monotone increasing for k < M and monotone decreasing for k > M, with the exception of the case where (n + 1)p is an integer. In this case, there are two values for which f is maximal: (n + 1)p and (n + 1)p − 1. M is the most probable (most likely) outcome of the Bernoulli trials and is called the mode. Note that the probability of it occurring can be fairly small.
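The rule for the mode can be sketched in code. The example below assumes 0 < p < 1 and takes p as a `fractions.Fraction` so that the test "(n + 1)p is an integer" is exact; the function names are illustrative only.

```python
from fractions import Fraction
from math import comb, floor

def binom_pmf(k, n, p):
    """Binomial probability mass function; exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_modes(n, p):
    """Mode(s) of B(n, p) for 0 < p < 1, with p given as a Fraction."""
    m = (n + 1) * p
    M = floor(m)  # the integer with (n + 1)p - 1 < M <= (n + 1)p
    if m == M:
        # (n + 1)p is an integer: two adjacent values share the maximum.
        return [M - 1, M]
    return [M]

print(binom_modes(9, Fraction(1, 2)))    # (n + 1)p = 5 is an integer: [4, 5]
print(binom_modes(10, Fraction(3, 10)))  # (n + 1)p = 3.3: [3]
```

For B(10, 3/10) the mode is 3, yet `float(binom_pmf(3, 10, Fraction(3, 10)))` is only about 0.27, which illustrates that the most likely outcome can still be fairly improbable.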

### Cumulative distribution function

The cumulative distribution function can be expressed as:

$F(x;n,p) = \Pr(X \le x) = \sum_{i=0}^{\lfloor x \rfloor} {n\choose i}p^i(1-p)^{n-i}$

where $\lfloor x\rfloor$ is the "floor" under x, i.e. the greatest integer less than or equal to x.
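The sum translates directly into code. The sketch below is one straightforward way to evaluate it for real-valued x; the name `binom_cdf` and the handling of x outside [0, n] are choices made for this example.

```python
from math import comb, floor

def binom_cdf(x: float, n: int, p: float) -> float:
    """Pr(X <= x) for X ~ B(n, p), summing the mass function up to floor(x)."""
    if x < 0:
        return 0.0
    top = min(floor(x), n)  # no probability mass lies above n
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(top + 1))

# Pr(X <= 2.7) = Pr(X <= 2) for 4 fair coin tosses: (1 + 4 + 6) / 16.
print(binom_cdf(2.7, 4, 0.5))  # 0.6875
```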

It can also be represented in terms of the regularized incomplete beta function, as follows:

$\begin{aligned} F(k;n,p) & = \Pr(X \le k) = I_{1-p}(n-k, k+1) \\ & = (n-k) {n \choose k} \int_0^{1-p} t^{n-k-1} (1-t)^k \, dt. \end{aligned}$
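The integral form can be compared against the direct sum for integer 0 ≤ k < n. In the sketch below the integrand is expanded with the binomial theorem and integrated term by term, and p is taken as a `fractions.Fraction` so that both sides are computed exactly. This is an illustrative check with made-up function names, not how the incomplete beta function is evaluated in numerical libraries.

```python
from fractions import Fraction
from math import comb

def cdf_by_sum(k, n, p):
    """F(k; n, p) as the sum of the mass function over i = 0, ..., k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def cdf_by_integral(k, n, p):
    """(n - k) C(n, k) * integral of t^(n-k-1) (1-t)^k over [0, 1-p], 0 <= k < n."""
    x = 1 - p
    a = n - k
    # Expand (1 - t)^k = sum_j C(k, j) (-t)^j and integrate each power of t.
    integral = sum(
        comb(k, j) * (-1) ** j * x ** (a + j) / (a + j) for j in range(k + 1)
    )
    return a * comb(n, k) * integral

p = Fraction(1, 2)
print(cdf_by_sum(2, 4, p), cdf_by_integral(2, 4, p))  # 11/16 11/16
```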

For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

$F(k;n,p) \leq \exp\left(-2 \frac{(np-k)^2}{n}\right),$

and Chernoff's inequality can be used to derive the bound

$F(k;n,p) \leq \exp\left(-\frac{1}{2\,p} \frac{(np-k)^2}{n}\right). \!$

Moreover, these bounds are reasonably tight when p = 1/2, since the following expression holds for all k ≥ 3n/8 [1]:

$F(k;n,1/2) \geq \frac{1}{15} \exp\left(- \frac{16 (n/2 - k)^2}{n}\right). \!$
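These tail bounds can be compared with the exact distribution function. The sketch below uses the Hoeffding bound in the form exp(−2(np − k)²/n), the Chernoff bound exp(−(np − k)²/(2pn)), and the lower bound for p = 1/2 given above; the function names and the choice n = 40 are assumptions made for this example.

```python
from math import comb, exp

def binom_cdf(k, n, p):
    """Exact F(k; n, p) for integer k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def hoeffding_bound(k, n, p):
    """Upper bound on F(k; n, p) for k <= np."""
    return exp(-2 * (n * p - k) ** 2 / n)

def chernoff_bound(k, n, p):
    """Upper bound on F(k; n, p) for k <= np."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))

def lower_bound_half(k, n):
    """Lower bound on F(k; n, 1/2) for k >= 3n/8."""
    return exp(-16 * (n / 2 - k) ** 2 / n) / 15

n = 40
for k in range(15, 21):  # 3n/8 <= k <= n/2
    exact = binom_cdf(k, n, 0.5)
    upper = min(hoeffding_bound(k, n, 0.5), chernoff_bound(k, n, 0.5))
    print(k, lower_bound_half(k, n) <= exact <= upper)
```

Every line of the loop should print `True`, since the exact value lies between the lower bound and both upper bounds over this range.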

## Mean and variance

If X ~ B(n, p) (that is, X is a binomially distributed random variable), then the expected value of X is

$\operatorname{E}[X] = np$

and the variance is

$\operatorname{Var}[X] = np(1 - p).$
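Both formulas can be confirmed by computing the moments directly from the probability mass function. The sketch below uses exact rational arithmetic via `fractions.Fraction`; the function name `binom_moments` and the parameter values are chosen for illustration.

```python
from fractions import Fraction
from math import comb

def binom_moments(n, p):
    """Mean and variance of B(n, p), computed from the mass function."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * f for k, f in enumerate(pmf))
    var = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))
    return mean, var

n, p = 10, Fraction(3, 10)
mean, var = binom_moments(n, p)
print(mean, var)               # 3 21/10
print(n * p, n * p * (1 - p))  # 3 21/10
```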

