Discrete Distributions

Distribution	Formula	Mean μ	Variance σ²	Moment Generating Function
Bernoulli Distribution	f(k; p) = p if k = 1, 1 - p if k = 0, else 0	p	p(1 - p)	1 - p + pe^t
Binomial Distribution	f(k; n, p) = [n! / (k!(n - k)!)] * p^k (1 - p)^{n - k}	np	np(1 - p)	(1 - p + pe^t)^n
Negative Binomial Distribution	f(n; k, p) = [(n - 1)! / ((k - 1)!(n - k)!)] * p^k (1 - p)^{n - k}	k/p	k(1 - p)/p²	[pe^t / (1 - (1 - p)e^t)]^k
Geometric Distribution	P(X = n) = p(1 - p)^{n - 1}	1/p	(1 - p)/p²	pe^t / (1 - (1 - p)e^t)
Poisson Distribution	P(k; λ) = λ^k e^{-λ} / k!	λ	λ	e^{λ(e^t - 1)}
Hypergeometric Distribution	P(x; n, N, k) = (_kC_x)(_{N - k}C_{n - x}) / (_NC_n)	nk/N	nk(N - k)(N - n) / (N²(N - 1))	N/A
Multinomial Distribution	f(x₁, x₂, ..., x_m; n, θ₁, θ₂, ..., θ_m) = [n! / (x₁! · x₂! · ... · x_m!)] * θ₁^{x₁} · θ₂^{x₂} · ... · θ_m^{x_m}	nθ_i	nθ_i(1 - θ_i)	(Σθ_i e^{t_i})^n
Uniform Distribution (continuous, on [a, b])	f(x) = 1/(b - a) for a ≤ x ≤ b, else 0	(a + b)/2	(b - a)²/12	(e^{tb} - e^{ta}) / (t(b - a))
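As a sanity check on the mean and variance columns, the closed forms can be compared against sums taken directly over each probability mass function. The sketch below is illustrative only: the function names, the parameter values, and the truncation points for the infinite supports are assumptions chosen for the example, not part of the table.

```python
import math

def binomial_pmf(k, n, p):
    """P(k successes in n trials)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def negative_binomial_pmf(n, k, p):
    """P(the k-th success occurs on trial n)."""
    return math.comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(k events) when the expected count is lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

def hypergeometric_pmf(x, n, N, k):
    """P(x successes in n draws without replacement; k successes among N items)."""
    return math.comb(k, x) * math.comb(N - k, n - x) / math.comb(N, n)

def mean_and_variance(pmf, support):
    """Sum x*f(x) and (x - mean)^2 * f(x) over the given support."""
    mean = sum(x * pmf(x) for x in support)
    variance = sum((x - mean) ** 2 * pmf(x) for x in support)
    return mean, variance

# Infinite supports are truncated where the remaining tail is negligible (an assumption).
binom = mean_and_variance(lambda k: binomial_pmf(k, 10, 0.3), range(0, 11))
negbin = mean_and_variance(lambda n: negative_binomial_pmf(n, 3, 0.4), range(3, 400))
poisson = mean_and_variance(lambda k: poisson_pmf(k, 2.5), range(0, 100))
hyper = mean_and_variance(lambda x: hypergeometric_pmf(x, 5, 20, 8), range(0, 6))

print("binomial       ", binom, "closed form:", (10 * 0.3, 10 * 0.3 * 0.7))
print("neg. binomial  ", negbin, "closed form:", (3 / 0.4, 3 * 0.6 / 0.4**2))
print("poisson        ", poisson, "closed form:", (2.5, 2.5))
print("hypergeometric ", hyper, "closed form:", (5 * 8 / 20, 5 * 8 * 12 * 15 / (20**2 * 19)))
```

Each printed pair should agree with its closed form up to floating-point rounding.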

Test Statistic Decision Tree:

Type	Keywords	Sample Size	Use Test
Z-score	Mean, Average	Greater than 30 or population σ is known	z = (x̄ - μ) / (σ/√n)
t-score	Mean, Average	30 or less and population σ is not known	t = (x̄ - μ) / (s/√n)
proportion-score	Proportion (Test p), Fraction, Percentage, Rate, Probability	More than 30	z = (p̂ - p₀) / √(p₀q₀/n), where q₀ = 1 - p₀
Variance σ²	Variance, Variability, Spread	N/A	χ² = (n - 1)s² / σ²
Equal Variances σ₁² = σ₂²	Equal Variances, Ratio or Difference in Variances	N/A	F = s₁² / s₂²
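The test statistics in this decision tree are short enough to write as one-line functions. The following is a minimal sketch; the function and argument names are hypothetical, and the F statistic is computed from the two sample variances.

```python
import math

def z_statistic(sample_mean, mu, sigma, n):
    """z = (x-bar - mu) / (sigma / sqrt(n)); large n or known population sigma."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))

def t_statistic(sample_mean, mu, s, n):
    """t = (x-bar - mu) / (s / sqrt(n)); small n and unknown population sigma."""
    return (sample_mean - mu) / (s / math.sqrt(n))

def proportion_z_statistic(p_hat, p0, n):
    """z = (p-hat - p0) / sqrt(p0 * q0 / n), with q0 = 1 - p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def chi_square_statistic(s_squared, sigma_squared, n):
    """chi-square = (n - 1) * s^2 / sigma^2, with sigma^2 the hypothesized variance."""
    return (n - 1) * s_squared / sigma_squared

def f_statistic(s1_squared, s2_squared):
    """F = s1^2 / s2^2, the ratio of the two sample variances."""
    return s1_squared / s2_squared

# Made-up inputs: sample mean 52 against mu = 50, sigma = 10, n = 100.
print(z_statistic(52, 50, 10, 100))
print(proportion_z_statistic(0.56, 0.50, 100))
print(chi_square_statistic(4.0, 3.0, 25))
```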

Confidence Interval Decision Tree

Type	Keywords	Sample Size	Confidence Interval
Test the Mean	Confidence Interval, Mean, Average	Greater than 30	x̄ - z_{α/2} * s/√n < μ < x̄ + z_{α/2} * s/√n
Test the Mean	Confidence Interval, Mean, Average	30 or less	x̄ - t_{α/2} * s/√n < μ < x̄ + t_{α/2} * s/√n (t with n - 1 degrees of freedom)
Test the Variance	Confidence Interval, Variance	Greater than 30	(n - 1)s²/χ²_{α/2} < σ² < (n - 1)s²/χ²_{1 - α/2}
Test the Standard Deviation	Confidence Interval, Standard Deviation	Greater than 30	√((n - 1)s²/χ²_{α/2}) < σ < √((n - 1)s²/χ²_{1 - α/2})
Test the Proportion	Confidence Interval, Proportion, percentage, rate, Population	Greater than 30	p̂ - z_{α/2} * √(p̂(1 - p̂)/n) < p < p̂ + z_{α/2} * √(p̂(1 - p̂)/n)
Test the Difference of Means	Confidence Interval, Difference of Means	Greater than 30	(x̄₁ - x̄₂) - z_{α/2} * √a < μ₁ - μ₂ < (x̄₁ - x̄₂) + z_{α/2} * √a
Test the Difference of Means	Confidence Interval, Difference of Means	30 or less	(x̄₁ - x̄₂) - t_{α/2} * √a < μ₁ - μ₂ < (x̄₁ - x̄₂) + t_{α/2} * √a
p̂ Confidence Interval	Confidence Interval (test p), criteria, characteristic, proportion	30 or less	p̂ - z_{α/2} * √(p̂(1 - p̂)/n) < p < p̂ + z_{α/2} * √(p̂(1 - p̂)/n)
Note: in the Difference of Means rows, √a is the standard error of x̄₁ - x̄₂; for independent samples this is typically a = s₁²/n₁ + s₂²/n₂.
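The large-sample interval for the mean and the p̂ interval can be computed with the standard library alone, since `statistics.NormalDist().inv_cdf` gives the z critical value. This is a sketch under the normal approximation; the function names and the default 95% confidence level are assumptions made for the example.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_interval(sample_mean, s, n, confidence=0.95):
    """x-bar -/+ z_{alpha/2} * s / sqrt(n), the large-sample interval for mu."""
    margin = z_critical(confidence) * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def proportion_interval(p_hat, n, confidence=0.95):
    """p-hat -/+ z_{alpha/2} * sqrt(p-hat * (1 - p-hat) / n)."""
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up inputs: x-bar = 50, s = 10, n = 100; p-hat = 0.5, n = 100.
print(mean_interval(50, 10, 100))
print(proportion_interval(0.5, 100))
```

The small-sample rows need a t critical value instead, which the standard library does not provide; a t-table or a statistics package would supply it.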

Sample Size Decision Tree:

Type	Keywords	Use Test
Sample Size for μ	Sample Size, average, mean	n = z_{α/2}² * σ² / ME²
Proportion Sample Size	Sample Size, Proportion, Population, Percentage, Rate	n = z_{α/2}² * p(1 - p) / ME²
μ₁ - μ₂ Sample Size	Sample Size, Difference of Means, μ₁ - μ₂	n = z_{α/2}²(σ₁² + σ₂²) / ME²
p₁ - p₂ Sample Size	Sample Size, Difference of p, p₁ - p₂	n = z_{α/2}²(p₁q₁ + p₂q₂) / ME²
Note: ME is the margin of error (also called the sampling error, SE). In the two-sample rows, n is the size of each sample and q_i = 1 - p_i. Round n up to the next whole number.
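A short sketch of the sample-size formulas follows. It rounds up, because a fractional observation is not possible; the function names and the default 95% confidence level are assumptions for illustration.

```python
import math
from statistics import NormalDist

def _z(confidence):
    """Two-sided critical value z_{alpha/2}."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """n = z^2 * sigma^2 / ME^2, rounded up."""
    return math.ceil(_z(confidence) ** 2 * sigma**2 / margin**2)

def sample_size_for_proportion(p, margin, confidence=0.95):
    """n = z^2 * p * (1 - p) / ME^2, rounded up; p = 0.5 is the most conservative guess."""
    return math.ceil(_z(confidence) ** 2 * p * (1 - p) / margin**2)

def sample_size_for_mean_difference(sigma1, sigma2, margin, confidence=0.95):
    """n per group = z^2 * (sigma1^2 + sigma2^2) / ME^2, rounded up."""
    return math.ceil(_z(confidence) ** 2 * (sigma1**2 + sigma2**2) / margin**2)

# Made-up planning values: sigma = 15, margin of error 2; p = 0.5, margin 0.03.
print(sample_size_for_mean(15, 2))
print(sample_size_for_proportion(0.5, 0.03))
print(sample_size_for_mean_difference(15, 15, 2))
```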

Hypothesis Testing Decision Tree

p-value Significance Test (observed level of significance):

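Compute the test statistic (for example, the z-score), then use the z-table to find the tail probability beyond that score (double it for a two-tailed test). This probability is the p-value. If α > p-value, reject H₀; otherwise, fail to reject H₀.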
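The z-table lookup can be replaced by the normal CDF from the standard library. A minimal sketch, assuming a z test; the `tail` argument and the function names are hypothetical conveniences, not standard terminology.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """Tail probability of a standard normal beyond the observed z-score."""
    cdf = NormalDist().cdf
    if tail == "upper":
        return 1 - cdf(z)
    if tail == "lower":
        return cdf(z)
    if tail == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two-sided'")

def reject_null(p_value, alpha=0.05):
    """Reject H0 when alpha exceeds the p-value."""
    return alpha > p_value

p = p_value_from_z(2.5)
print(p, reject_null(p))
```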

Hypothesis Testing Errors:

Type I error - Reject null hypothesis H₀ when H₀ is TRUE: Probability = α
Type II error - Fail to reject null hypothesis H₀ when H₀ is FALSE: Probability = β
Power of the Test = Probability that you reject null hypothesis H₀ when H₀ is FALSE = 1 - β
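Note: Which error is the bigger mistake depends on the cost of each in context. By convention α is fixed in advance, so the Type I error rate is controlled directly; for a fixed sample size, lowering α raises β.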
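To make α, β, and power concrete, the sketch below computes them for one specific case: an upper-tailed z test of H₀: μ = μ₀ against a single alternative value μ₁ > μ₀ with known σ. That setup, the function name, and the numbers are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def upper_tailed_z_test_errors(mu0, mu1, sigma, n, alpha=0.05):
    """Return (beta, power) for an upper-tailed z test when the true mean is mu1."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha)           # rejection cutoff on the z scale
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))  # true mean, in standard-error units
    beta = normal.cdf(z_alpha - shift)            # P(fail to reject H0 | mean is mu1)
    return beta, 1 - beta

beta, power = upper_tailed_z_test_errors(mu0=50, mu1=52, sigma=10, n=100)
print(beta, power)

# Lowering alpha with n fixed raises beta.
beta_strict, _ = upper_tailed_z_test_errors(mu0=50, mu1=52, sigma=10, n=100, alpha=0.01)
print(beta_strict > beta)
```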

Finite Population Correction Factor:

If n/N > 0.05, multiply the standard error (and therefore the margin of error of your confidence interval) by the following factor:

√((N - n)/N)
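A small sketch of applying the correction only when the sample is more than 5% of the population. The function names and the `threshold` parameter are hypothetical.

```python
import math

def finite_population_correction(n, N):
    """sqrt((N - n) / N) for a sample of size n from a population of size N."""
    return math.sqrt((N - n) / N)

def corrected_standard_error(s, n, N, threshold=0.05):
    """s / sqrt(n), multiplied by the correction factor when n / N exceeds the threshold."""
    standard_error = s / math.sqrt(n)
    if n / N > threshold:
        standard_error *= finite_population_correction(n, N)
    return standard_error

print(corrected_standard_error(10, 100, 1000))   # n/N = 0.10, correction applied
print(corrected_standard_error(10, 100, 10000))  # n/N = 0.01, no correction
```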

Regression Testing and Correlation Coefficients:

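Cov(X,Y) = Σ(X_i - X̄)(Y_i - Ȳ) / (n - 1)

Correlation Coefficient (r) = Cov(X,Y) / (s_x s_y), where s_x and s_y are the sample standard deviations (same n - 1 divisor as the covariance)

β = Σ(X_i - X̄)(Y_i - Ȳ) / Σ(X_i - X̄)²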

Least Squares Regression Line: ŷ = α + βx, with α = Ȳ - βX̄
Here α is the y-intercept and β is the slope of the line fitted to the points (X_i, Y_i).
α and β are chosen so that they produce the smallest possible SSE, defined below.
Sum of Squares about the Mean (SSM) = Σ(y_i - ȳ)²
Sum of Squared Residuals (SSE) = Σ(y_i - ŷ_i)²
SSE measures how far the plotted data points fall from the fitted straight line.

Coefficient of Determination (r²) = (SSM - SSE) / SSM
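The regression quantities above fit in one small function. The sketch below uses a made-up five-point data set, and it computes the covariance and standard deviations with an n - 1 divisor so that r is consistent; the function name and dictionary keys are hypothetical.

```python
import math

def least_squares_summary(xs, ys):
    """Slope, intercept, r, and r^2 for a simple linear regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    ssm = sum((y - y_bar) ** 2 for y in ys)  # sum of squares about the mean

    beta = sxy / sxx                         # slope
    alpha = y_bar - beta * x_bar             # y-intercept
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))

    covariance = sxy / (n - 1)
    s_x = math.sqrt(sxx / (n - 1))
    s_y = math.sqrt(ssm / (n - 1))
    r = covariance / (s_x * s_y)
    r_squared = (ssm - sse) / ssm
    return {"alpha": alpha, "beta": beta, "r": r, "r_squared": r_squared}

# Made-up data for illustration.
result = least_squares_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```

For simple linear regression, the square of r should match (SSM - SSE) / SSM up to rounding, which gives a quick check on a hand calculation.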

Large Sample Condition Requirement:

1. A random sample is selected from the target population.
2. The sample size n is large (rule of thumb: n ≥ 30). By the Central Limit Theorem, the test statistic is then approximately normal for most population shapes; strongly skewed or heavy-tailed populations may need a larger n.
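The Central Limit Theorem condition can be seen in a quick simulation: means of n = 30 draws from a skewed population cluster around the population mean with spread close to σ/√n. The exponential population, the seed, and the repetition count below are arbitrary choices for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(draw, n, repetitions, seed=0):
    """Return the means of `repetitions` independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(repetitions)
    ]

# Exponential population with rate 1: mean 1, standard deviation 1, strongly right-skewed.
means = simulate_sample_means(lambda rng: rng.expovariate(1.0), n=30, repetitions=5000)

center = statistics.fmean(means)
spread = statistics.stdev(means)
print("mean of sample means:", center, "population mean:", 1.0)
print("sd of sample means:  ", spread, "sigma / sqrt(n):", 1 / math.sqrt(30))
```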
