Discrete Distributions
Distribution | Formula | Mean μ | Variance $\sigma^2$ | Moment Generating Function
Bernoulli Distribution | $f(k;p) = p$ if $k = 1$, $1 - p$ if $k = 0$, else $0$ | $p$ | $p(1 - p)$ | $1 - p + pe^t$
Binomial Distribution | $f(k;n,p) = \frac{n!}{k!(n - k)!}\,p^k (1 - p)^{n - k}$ | $np$ | $np(1 - p)$ | $(1 - p + pe^t)^n$
Negative Binomial Distribution | $f(n;k,p) = \frac{(n - 1)!}{(k - 1)!(n - k)!}\,p^k (1 - p)^{n - k}$ | $k/p$ | $k(1 - p)/p^2$ | $\left(\frac{pe^t}{1 - (1 - p)e^t}\right)^k$
Geometric Distribution | $P(x = n) = p(1 - p)^{n - 1}$ | $1/p$ | $(1 - p)/p^2$ | $\frac{pe^t}{1 - (1 - p)e^t}$
Poisson Distribution | $f(k;\lambda) = \frac{\lambda^k e^{-\lambda}}{k!}$ | $\lambda$ | $\lambda$ | $e^{\lambda(e^t - 1)}$
Hypergeometric Distribution | $P(x;n,N,k) = \frac{\binom{k}{x}\binom{N - k}{n - x}}{\binom{N}{n}}$ | $\frac{nk}{N}$ | $\frac{nk(N - k)(N - n)}{N^2(N - 1)}$ | N/A
Multinomial Distribution | $f(x_0,\ldots,x_m;n,\theta_0,\ldots,\theta_m) = \frac{n!}{x_0!\,x_1!\cdots x_m!}\,\theta_0^{x_0}\theta_1^{x_1}\cdots\theta_m^{x_m}$ | $n\theta_i$ | $n\theta_i(1 - \theta_i)$ | $\left(\sum_i \theta_i e^{t_i}\right)^n$
Uniform Distribution | | $\frac{1}{2}(a + b)$ | |
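As a quick sanity check on the table, the sketch below computes the binomial mean and variance directly from the pmf and compares them with $np$ and $np(1 - p)$. The function names are illustrative only and are not part of the original sheet.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """f(k; n, p) = n! / (k! (n - k)!) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def moments_from_pmf(n: int, p: float):
    """Mean and variance obtained by summing over k = 0..n."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return mean, var

mean, var = moments_from_pmf(n=10, p=0.3)
print(mean, "vs np =", 10 * 0.3)
print(var, "vs np(1 - p) =", 10 * 0.3 * 0.7)
```

The two printed pairs should agree up to floating-point rounding, which is what the Mean and Variance columns claim for the binomial row.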
Test Statistic Decision Tree:
Type | Keywords | Sample Size | Use Test
Z-score | Mean, Average | Greater than 30, or population σ is known |
t-score | Mean, Average | 30 or less and population σ is not known |
proportion-score | Proportion (Test p), Fraction, Percentage, Rate, Probability | Greater than 30 |
Variance $\sigma^2$ | Variance, Variability, Spread | N/A |
Equal Variances $\sigma_1^2 = \sigma_2^2$ | Equal Variances, Ratio or Difference in Variances | N/A |
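The decision rules above can be sketched in code. The table does not give the test statistic formulas themselves, so the forms below are the usual textbook ones and should be read as assumptions, as should all function and parameter names.

```python
from math import sqrt

def mean_test_statistic(xbar, mu0, sd, n, sigma_known=False):
    """Return (name, value) for a test about a mean.

    Follows the table: z when n > 30 or the population sigma is known,
    otherwise t. Assumed form: (xbar - mu0) / (sd / sqrt(n)).
    """
    value = (xbar - mu0) / (sd / sqrt(n))
    name = "z" if (n > 30 or sigma_known) else "t"
    return name, value

def proportion_test_statistic(p_hat, p0, n):
    """Assumed form: z = (p_hat - p0) / sqrt(p0 (1 - p0) / n)."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def variance_test_statistic(s_sq, sigma0_sq, n):
    """Assumed form: chi-square = (n - 1) s^2 / sigma0^2."""
    return (n - 1) * s_sq / sigma0_sq

def variance_ratio_statistic(s1_sq, s2_sq):
    """Assumed form: F = s1^2 / s2^2."""
    return s1_sq / s2_sq

print(mean_test_statistic(52, 50, 8, 64))   # large sample, so a z statistic
print(mean_test_statistic(52, 50, 8, 16))   # small sample, sigma unknown, so t
```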
Confidence Interval Decision Tree
Type | Keywords | Sample Size | Use Test
Test the Mean | Confidence Interval, Mean, Average | Greater than 30 | $\bar{X} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{n}}$
Test the Mean | Confidence Interval, Mean, Average | 30 or less | $\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$
Test the Variance | Confidence Interval, Variance | Greater than 30 | $\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1 - \alpha/2}}$
Test the Standard Deviation | Confidence Interval, Standard Deviation | Greater than 30 | $\sqrt{\frac{(n - 1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_{1 - \alpha/2}}}$
Test the Proportion | Confidence Interval, Proportion, Percentage, Rate, Population | Greater than 30 | $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Test the Difference of Means | Confidence Interval, Difference of Means | Greater than 30 | $(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{a} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{a}$
Test the Difference of Means | Confidence Interval, Difference of Means | 30 or less | $(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{a} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{a}$
$\hat{p}$ Confidence Interval | Confidence Interval (test p), Criteria, Characteristic, Proportion | 30 or less | $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
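A minimal sketch of the z-based intervals, using only the Python standard library. Two assumptions are made here: the proportion interval uses the usual large-sample form $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$, and the quantity $a$ in the difference-of-means rows, which the table does not define, is taken to be $s_1^2/n_1 + s_2^2/n_2$. The t and chi-square rows are left out because the standard library has no quantile function for those distributions.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """z_{alpha/2} for a two-sided interval, e.g. about 1.96 at 95%."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_interval(xbar, s, n, confidence=0.95):
    """xbar -/+ z_{alpha/2} * s / sqrt(n)."""
    margin = z_critical(confidence) * s / sqrt(n)
    return xbar - margin, xbar + margin

def proportion_interval(p_hat, n, confidence=0.95):
    """p_hat -/+ z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def mean_difference_interval(x1, s1, n1, x2, s2, n2, confidence=0.95):
    """(x1 - x2) -/+ z_{alpha/2} * sqrt(a), assuming a = s1^2/n1 + s2^2/n2."""
    a = s1**2 / n1 + s2**2 / n2
    margin = z_critical(confidence) * sqrt(a)
    return (x1 - x2) - margin, (x1 - x2) + margin

print(mean_interval(xbar=100, s=15, n=36))
print(proportion_interval(p_hat=0.4, n=200))
```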
Sample Size Decision Tree:
Type | Keywords | Use Test
Sample Size for μ | Sample Size, Average, Mean |
Proportion Sample Size | Sample Size, Proportion, Population, Percentage, Rate | $n = \frac{z_{\alpha/2}^2\,p(1 - p)}{SE^2}$
$\mu_1 - \mu_2$ Sample Size | Sample Size, Difference of Means, $\mu_1 - \mu_2$ | $n = \frac{z_{\alpha/2}^2(\sigma_1^2 + \sigma_2^2)}{ME^2}$
$p_1 - p_2$ Sample Size | Sample Size, Difference of p, $p_1 - p_2$ | $n = \frac{z_{\alpha/2}^2(p_1 q_1 + p_2 q_2)}{ME^2}$
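The three formulas given in the table can be sketched as follows. Rounding the result up to the next whole observation is an assumption (the table does not say how to round), as is the use of the two-sided critical value $z_{\alpha/2}$; the function names are illustrative.

```python
from math import ceil
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2}."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_proportion(p, se, confidence=0.95):
    """n = z^2 * p * (1 - p) / SE^2, rounded up."""
    z = z_critical(confidence)
    return ceil(z**2 * p * (1 - p) / se**2)

def n_for_mean_difference(sigma1, sigma2, me, confidence=0.95):
    """n = z^2 * (sigma1^2 + sigma2^2) / ME^2, rounded up (per sample)."""
    z = z_critical(confidence)
    return ceil(z**2 * (sigma1**2 + sigma2**2) / me**2)

def n_for_proportion_difference(p1, p2, me, confidence=0.95):
    """n = z^2 * (p1 q1 + p2 q2) / ME^2 with q = 1 - p, rounded up."""
    z = z_critical(confidence)
    return ceil(z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / me**2)

# Worst-case proportion p = 0.5 with a 5-point margin at 95% confidence.
print(n_for_proportion(p=0.5, se=0.05))
```

Using $p = 0.5$ when nothing is known about the proportion gives the largest, and therefore most conservative, sample size.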
Hypothesis Testing Decision Tree
p-value Significance Test (observed level of significance):
Find your z-score, then find the probability in the z-table associated with that score (the p-value). If α > p-value, reject $H_0$.
Hypothesis Testing Errors:
Type I error - Reject null hypothesis $H_0$ when $H_0$ is TRUE: Probability = α
Type II error - Fail to reject (accept) null hypothesis $H_0$ when $H_0$ is FALSE: Probability = β
Power of the Test = Probability you reject null hypothesis $H_0$ when $H_0$ is FALSE: 1 - β
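Note: Which error is the bigger mistake depends on the problem. By convention α is fixed in advance, so the Type I error rate is controlled directly, while β depends on the sample size and on the true parameter value.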
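The p-value rule and the definition of power can be illustrated with a short sketch. It assumes a z-test with known σ, and the power calculation assumes an upper-tailed alternative; the `tail` argument and all function names are hypothetical conveniences, not part of the original sheet.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def p_value_from_z(z, tail="two-sided"):
    """Probability of a z-score at least this extreme when H0 is true."""
    if tail == "upper":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "lower":
        return STD_NORMAL.cdf(z)
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))

def reject_h0(z, alpha=0.05, tail="two-sided"):
    """Decision rule from the text: reject H0 when alpha > p-value."""
    return alpha > p_value_from_z(z, tail)

def power_upper_tail(mu0, mu_true, sigma, n, alpha=0.05):
    """Power = 1 - beta for an upper-tailed z-test of H0: mu = mu0."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    beta = STD_NORMAL.cdf(z_crit - shift)  # P(fail to reject | H0 false)
    return 1 - beta

print(p_value_from_z(1.96), reject_h0(1.96))
print(power_upper_tail(mu0=50, mu_true=55, sigma=10, n=25))
```

When `mu_true` equals `mu0` the computed power collapses to α, which matches the definitions: rejecting a true $H_0$ is a Type I error.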
Finite Population Correction Factor:
If n/N > 0.05, multiply the standard error used in your confidence interval by the finite population correction factor.
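A minimal sketch of applying the correction. The factor itself is not shown in the text above, so the form $\sqrt{(N - n)/(N - 1)}$ used here is an assumption; some textbooks use $\sqrt{(N - n)/N}$ instead. The function names are illustrative.

```python
from math import sqrt

def fpc(n: int, N: int) -> float:
    """Assumed correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def standard_error(s: float, n: int, N: int) -> float:
    """s / sqrt(n), corrected only when the sample exceeds 5% of the population."""
    se = s / sqrt(n)
    if n / N > 0.05:
        se *= fpc(n, N)
    return se

print(standard_error(s=10, n=100, N=1000))    # n/N = 0.10, correction applied
print(standard_error(s=10, n=100, N=10000))   # n/N = 0.01, no correction
```

Because the factor is below 1, the corrected interval is narrower than the uncorrected one.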
$\mathrm{Cov}(X,Y) = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{n}$
Correlation Coefficient $(r) = \frac{\mathrm{Cov}(X,Y)}{s_x s_y}$
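The two formulas above translate directly into code. One assumption is made explicit: since the covariance divides by $n$, the standard deviations $s_x$ and $s_y$ are also computed with divisor $n$, so that $r$ stays between -1 and 1. Function names are illustrative.

```python
from math import sqrt

def covariance(xs, ys):
    """Sum of (X_i - Xbar)(Y_i - Ybar), divided by n."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / n

def std_dev(values):
    """Standard deviation with divisor n, matching covariance() above."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / n)

def correlation(xs, ys):
    """r = Cov(X, Y) / (s_x * s_y)."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]   # exactly linear in xs, so r should be very close to 1
print(covariance(xs, ys), correlation(xs, ys))
```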
$\beta = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}$
$\alpha = \bar{Y} - \beta\bar{X}$
Least Squares Regression Line: $\hat{y} = \alpha + \beta x$, where α is the y-intercept and β is the slope of the line fitted to the points in X & Y.
α & β are chosen so that they produce the smallest possible SSE, defined below.
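The slope and intercept can be computed in a few lines. This is a sketch with a small made-up data set chosen only for illustration; the function name is hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (alpha, beta) for the line y_hat = alpha + beta * x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    s_xx = sum((x - xbar) ** 2 for x in xs)
    beta = s_xy / s_xx            # slope
    alpha = ybar - beta * xbar    # intercept
    return alpha, beta

xs = [1, 2, 3, 4, 5]   # illustrative data, not from the original sheet
ys = [2, 4, 5, 4, 5]
alpha, beta = least_squares_fit(xs, ys)
print(alpha, beta)     # approximately 2.2 and 0.6
```

Since $\alpha = \bar{Y} - \beta\bar{X}$, the fitted line always passes through the point $(\bar{X}, \bar{Y})$.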
Sum of Squares about the Mean (SSM) = $\sum_i (y_i - \bar{y})^2$
Sum of Squares of the Residuals (SSE) = $\sum_i (y_i - \hat{y}_i)^2$
SSE measures the total squared vertical distance between the straight line that we create and the plotted points from our data.
Coefficient of Determination $(r^2) = \frac{SSM - SSE}{SSM}$
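A short sketch of the SSM, SSE and $r^2$ calculations for a line that has already been fitted. The data and the coefficients `alpha = 2.2`, `beta = 0.6` are illustrative values, not taken from the original sheet.

```python
def sums_of_squares(xs, ys, alpha, beta):
    """Return (SSM, SSE) for the line y_hat = alpha + beta * x."""
    ybar = sum(ys) / len(ys)
    ssm = sum((y - ybar) ** 2 for y in ys)
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    return ssm, sse

def coefficient_of_determination(xs, ys, alpha, beta):
    """r^2 = (SSM - SSE) / SSM."""
    ssm, sse = sums_of_squares(xs, ys, alpha, beta)
    return (ssm - sse) / ssm

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(sums_of_squares(xs, ys, alpha=2.2, beta=0.6))
print(coefficient_of_determination(xs, ys, alpha=2.2, beta=0.6))
```

An $r^2$ near 1 means the line removes most of the variation about the mean; near 0 means it explains little more than $\bar{y}$ alone.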
Large Sample Condition Requirement:
1. A random sample is selected from the target population.
2. The sample size n is large (i.e., n ≥ 30). (Due to the Central Limit Theorem, this condition guarantees that the test statistic will be approximately normal regardless of the shape of the underlying probability distribution of the population.)
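The Central Limit Theorem claim in condition 2 can be checked with a small simulation. This sketch draws samples of size 30 from a strongly skewed exponential population (mean 1, standard deviation 1) and looks at the distribution of the sample means; the population, seed and number of trials are arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n=30, trials=2000, seed=0):
    """Sample means of `trials` samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(trials)]

means = simulate_sample_means()
se = 1 / sqrt(30)  # theoretical standard error sigma / sqrt(n)
within = sum(abs(m - 1.0) <= 1.96 * se for m in means) / len(means)

print("mean of sample means:", mean(means))        # should be near 1
print("std dev of sample means:", stdev(means))    # should be near 1/sqrt(30)
print("share within 1.96 standard errors:", within)
```

If the normal approximation holds, the last figure should land close to 0.95 even though the underlying population is far from normal.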