
Stats Exam 1 Cheat Sheet

The document discusses the Central Limit Theorem, emphasizing that the sampling distribution of the mean becomes normally distributed with a sufficiently large sample size (n>30). It outlines the steps for calculating the mean, standard deviation, Z-scores, and probabilities, as well as the concepts of null and alternate hypotheses, significance, and confidence intervals. Additionally, it covers methods for selecting a simple random sample and identifying outliers using interquartile ranges.

Uploaded by

J. Malinn
Copyright
© All Rights Reserved

Central Limit Theorem - the sampling distribution of the mean is approximately normally distributed, whatever the shape of the population, as long as the sample size is large enough (rule of thumb: n > 30).

Mean of sampling distribution of mean - equal to the population mean (μ)

Standard deviation of sampling distribution of mean
● called the standard error of the mean
● calculated by dividing the population standard deviation by the square root of the sample size
● formula: σ/√n

Probability that average score is > value of x

1: Identify parameters
● Population mean (μ)
● Population standard deviation (σ)
● Sample size (n)
● Value of x

2: Calculate standard deviation of sampling distribution of mean
● formula: σ/√n

3: Calculate Z-score
● z = (x - μ) / (σ/√n)

4: Look up z-score on normal distribution table

5: Find the probability above x
● Since the probability should be for values greater than x, subtract the value from the z-table from 1.

Selecting a random sample from SRS Table
● Population is labeled from 1 - 50
● Starting on line 2 of the table, select the first 3 people (reading two-digit numbers between 1 - 50)

Line 1: 45624 56918 34901 43899 28790
Line 2: 52311 43620 61099 23404 48910
Line 3: 31900 75762 78017 23412 44876

● The first two digits in line 2 are 52, which is outside of the range, so it can’t be used
● The next two-digit numbers are 31 (within range), 14 (within range), and 36 (within range)
● The first 3 random people selected would be 31, 14, and 36

Median
● Arrange the data points from smallest to largest.
● If the number of data points is odd, the median is the middle data point in the list.
● If the number of data points is even, the median is the average of the two middle data points in the list.

Quartile 1 (Q1):
● the value below which 25% of the data falls
● Q1 represents the median of the lower half of the data
● Position in the sorted data: (n + 1) / 4

Quartile 3 (Q3):
● the value below which 75% of the data falls
● Q3 represents the median of the upper half of the data
● Position in the sorted data: 3(n + 1) / 4
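
A minimal Python sketch of the median and quartile rules above, extended with the 1.5 * IQR outlier fences that this sheet also uses. It follows the "median of the lower/upper half" definition and assumes the middle value is left out of both halves when n is odd; textbooks differ on that convention, and the position formulas can give slightly different answers. The function names and the data are hypothetical.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]           # values below the middle position
    upper = values[half + n % 2:]   # assumption: skip the middle value when n is odd
    return median(lower), median(values), median(upper)


def find_outliers(data):
    """Return the values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in data if v < low or v > high]


scores = [2, 4, 5, 7, 8, 9, 11, 12, 30]  # hypothetical data
print(quartiles(scores))      # (4.5, 8, 11.5)
print(find_outliers(scores))  # [30]
```

Here IQR = 11.5 - 4.5 = 7, so the fences are -6 and 22, and only 30 falls outside them.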

Outliers
● Identify outliers through the Interquartile Range (IQR)
● Find IQR = Q3 - Q1
● Find lower range # = Q1 - (1.5 * IQR)
● Find higher range # = Q3 + (1.5 * IQR)
● Outliers will be any numbers below the lower range # or higher than the higher range #

Box Plot

Probability of selecting an individual from the population is < value of x

1: Identify parameters
● Population mean (μ)
● Population standard deviation (σ)
● Sample size (n): n = 1 for a single individual
● Value of x

2: Calculate Z-score
● z = (x - μ) / σ (with n = 1, σ/√n is just σ)

3: Look up z-score on normal distribution table
● The table value P(Z ≤ z) is the probability below x
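
The two z-score procedures in this sheet (sample mean above x, single individual below x) can be sketched in Python. This is an illustration rather than the sheet's own method: it assumes Python 3.8+ and uses `statistics.NormalDist` in place of a printed z-table. The function names and example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_sample_mean_above(x, mu, sigma, n):
    """P(sample mean > x) for samples of size n."""
    standard_error = sigma / sqrt(n)        # step 2: sigma / sqrt(n)
    z = (x - mu) / standard_error           # step 3: z-score
    return 1 - STANDARD_NORMAL.cdf(z)       # steps 4-5: 1 - table value


def prob_individual_below(x, mu, sigma):
    """P(one individual < x); no sqrt(n) because n = 1."""
    z = (x - mu) / sigma
    return STANDARD_NORMAL.cdf(z)


# Hypothetical population: mu = 100, sigma = 15
print(round(prob_sample_mean_above(105, mu=100, sigma=15, n=36), 4))  # about 0.0228 (z = 2)
print(round(prob_individual_below(85, mu=100, sigma=15), 4))          # about 0.1587 (z = -1)
```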

Labeling to select a Simple Random Sample (SRS)
● assign each individual in the population a unique numerical label
● labels typically start from 1 and go up to the total number of individuals in the population
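
Once individuals are labeled, the table-reading steps from "Selecting a random sample from SRS Table" can be sketched in Python. The helper name `select_from_table` is hypothetical, and skipping repeated labels is an assumption (the usual practice when sampling without replacement) that the sheet does not state.

```python
def select_from_table(line, population_size, how_many, width=2):
    """Read fixed-width labels from one line of a random digit table."""
    digits = line.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Keep labels inside 1..population_size; skip repeats (assumption).
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == how_many:
            break
    return chosen


line_2 = "52311 43620 61099 23404 48910"  # line 2 of the sheet's table
print(select_from_table(line_2, population_size=50, how_many=3))  # [31, 14, 36]
```

The first pair, 52, is rejected because it is above 50, which matches the worked example.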

Histogram Intervals
● R (range) = Max. # - Min. #
● # of intervals = square root of # of data points
● Interval width = R / # of intervals

Null Hypothesis (H0):
● a statement that there is no significant difference or effect between variables
● represented with an equal sign (=)

Alternate Hypothesis (Ha or H1)
● the opposing claim that there is a significant difference or effect
● denoted by symbols like <, >, or ≠
● Symbol depends on the direction of the test (increases, decreases)

Dice - complete table using fractions w/ common denominator for probability of spots showing (example table)

# spots   2      4      6      8      10
Prob      1/36   6/36   13/36  12/36  4/36
or        .028   .167   .361   .333   .111

Dice - expected value and variance of distribution
Expected Value = Sum of (x * p(x)) from the table, or 2(.028) + 4(.167) + 6(.361) + 8(.333) + 10(.111) = 6.664
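
A short Python sketch of the dice calculations, using exact fractions from the table instead of rounded decimals. The variance line uses the standard identity Var(X) = Sum of (x^2 * p(x)) - (Expected Value)^2; the sheet's own variance working is not shown, so that part is an added illustration. Variable names are hypothetical.

```python
from fractions import Fraction

# Probability table from the sheet: spots -> probability
table = {
    2: Fraction(1, 36),
    4: Fraction(6, 36),
    6: Fraction(13, 36),
    8: Fraction(12, 36),
    10: Fraction(4, 36),
}

total_probability = sum(table.values())                 # must equal 1
expected_value = sum(x * p for x, p in table.items())   # 20/3
variance = sum(x**2 * p for x, p in table.items()) - expected_value**2  # 34/9
p_2_or_10 = table[2] + table[10]                        # 5/36

print(total_probability)                  # 1
print(round(float(expected_value), 3))    # 6.667
print(round(float(variance), 3))          # 3.778
print(round(float(p_2_or_10), 3))         # 0.139
```

The exact expected value is 6.667; the 6.664 in the hand calculation comes from rounding each probability to three decimals first.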

Variance of the distribution

Probability that you roll a 2 or 10
P(x=2) + P(x=10) = .028 + .111 = .139

Graph on interval 0 ≤ x ≤ 3 and function f(x) = x^2 / 9

Does graph represent legitimate probability model
● Total area under f(x) on the interval = 1/9 * 9 = 1

Mean of the model
= 2.25

Calculate z value for test and determine 1-sided p-value for test:
● z = (sample mean - population mean) / (population standard deviation / sqrt(sample size)). To find the one-sided p-value, look up the calculated z-value on a standard normal distribution table (z-table) and, depending on whether the test is left-tailed or right-tailed, use the corresponding area under the curve as the p-value.
● Calculating z-value:
○ Formula: z = (x̄ - μ) / (σ / √n)
○ Where:
■ x̄ is the sample mean
■ μ is the population mean
■ σ is the population standard deviation
■ n is the sample size
● Finding the p-value:
○ One-tailed test:
■ For a right-tailed test: p-value = 1 - P(Z ≤ z)
■ For a left-tailed test: p-value = P(Z ≤ z)
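
The z statistic and one-sided p-value rules from "Calculate z value for test" can be sketched in Python. This assumes Python 3.8+ and substitutes `statistics.NormalDist` for the z-table lookup. The function name, the `tail` argument, and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, mu, sigma, n, tail="right"):
    """Return (z, p_value) for a one-tailed z test."""
    z = (sample_mean - mu) / (sigma / sqrt(n))
    area_below = NormalDist().cdf(z)    # P(Z <= z), the z-table value
    if tail == "right":
        p_value = 1 - area_below
    elif tail == "left":
        p_value = area_below
    else:
        raise ValueError("tail must be 'right' or 'left'")
    return z, p_value


# Hypothetical test: H0 says mu = 50; observed sample mean 52, sigma = 6, n = 36
z, p = one_sided_z_test(52, mu=50, sigma=6, n=36, tail="right")
print(z, round(p, 4))  # z = 2.0, p about 0.0228
```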

Significance - is there evidence
● Interpreting p-value: A smaller p-value indicates stronger evidence against the null hypothesis.

Variance of the model
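
The probability-model checks (total area, mean, variance) can be reproduced exactly in Python. The sketch assumes the density is f(x) = x^2 / 9 on 0 ≤ x ≤ 3, which is what the sheet's results (area 1/9 * 9 = 1, mean 2.25) imply. The helper `moment` is hypothetical and relies on the closed form of the integral of x^k * x^2 / 9 from 0 to 3; the variance value is an added illustration, since the sheet's own working is not shown.

```python
from fractions import Fraction


def moment(k, upper=3):
    """Integral of x**k * (x**2 / 9) over 0..upper, i.e. upper**(k+3) / (9*(k+3))."""
    return Fraction(upper ** (k + 3), 9 * (k + 3))


total_area = moment(0)               # 1, so the model is legitimate (f(x) >= 0 as well)
mean = moment(1)                     # 9/4 = 2.25
variance = moment(2) - mean ** 2     # 27/5 - 81/16 = 27/80

print(total_area, float(mean), float(variance))  # 1 2.25 0.3375
```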

Power of the test if we want to detect a difference of 2 from the null hypothesis
● Power = 1 - Beta, where Beta is the probability of making a Type II error; Beta is calculated based on the desired effect size (in this case, a difference of 2)
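
A hedged Python sketch of one way to compute Beta and power for a right-tailed z test. The sheet does not give the significance level, σ, or n, so `alpha`, `sigma`, and `n` below are assumptions for illustration, as is the function name. It uses the standard result Beta = P(Z ≤ z_critical - difference / (σ/√n)).

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed(difference, sigma, n, alpha=0.05):
    """Power of a right-tailed z test to detect a true shift of `difference`."""
    standard_error = sigma / sqrt(n)
    z_critical = NormalDist().inv_cdf(1 - alpha)   # rejection cutoff under H0
    # Beta: chance the test statistic stays below the cutoff when the shift is real
    beta = NormalDist().cdf(z_critical - difference / standard_error)
    return 1 - beta


# Assumed values: sigma = 6, n = 36, alpha = 0.05, difference to detect = 2
print(round(power_right_tailed(2, sigma=6, n=36), 2))  # about 0.64
```

A larger sample size or a larger difference shrinks Beta and raises the power.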

99% confidence level
● Sample mean ± (2.576 * Standard Error), where "2.576" is the critical z-score for a 99% confidence level and "Standard Error" is calculated as the population standard deviation divided by the square root of the sample size (σ/√n)
● Formula for margin of error: 2.576 * σ/√n
● Sample mean - margin of error ≤ μ ≤ Sample mean + margin of error
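
The 99% interval can be sketched in Python as follows. The function name and example numbers are hypothetical, and the critical value is taken from `statistics.NormalDist().inv_cdf` (Python 3.8+) rather than typed in as 2.576.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sigma, n, level=0.99):
    """Return (lower, upper) for a z-based confidence interval on the mean."""
    z_critical = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 2.576 for 99%
    margin = z_critical * sigma / sqrt(n)                   # margin of error
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 100, sigma = 15, n = 36
lower, upper = confidence_interval(100, sigma=15, n=36)
print(round(lower, 2), round(upper, 2))  # about 93.56 and 106.44
```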
