Hypothesis Testing Using Minitab
A hypothesis test examines two opposing hypotheses about a population: the null hypothesis
and the alternative hypothesis. The null hypothesis is the statement being tested. Usually the
null hypothesis is a statement of "no effect" or "no difference". The alternative hypothesis is
the statement you want to be able to conclude is true based on evidence provided by the
sample data.
Based on the sample data, the test determines whether to reject the null hypothesis. You use a
p-value to make the determination. If the p-value is less than the significance level (denoted as
α or alpha), then you can reject the null hypothesis.
The null hypothesis is: The population mean of all the observations is equal to a fixed value (say
10 cm). Formally, this is written as: H0: μ = 10
The alternative hypothesis is one of the following:
The population mean is not equal to the target. two sided: μ ≠ 10
The population mean is less than the target. one sided: μ < 10
The population mean is greater than the target. one sided: μ > 10
The most commonly used significance level is 0.05. However, depending
on the criticality of the process, other values of α may be chosen.
After performing the hypothesis test, Minitab gives a p-value. Compare the p-value against
the significance level α (0.05).
If p-value > α, then fail to reject the null hypothesis (often loosely described as "accepting" it);
otherwise, reject the null hypothesis.
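As a minimal sketch of this decision rule, the comparison can be written in Python. The function name `decide` and its return strings are illustrative assumptions, not part of Minitab; "fail to reject" is the more precise wording for what is often called accepting the null hypothesis.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 unless the p-value exceeds alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value > alpha:
        return "fail to reject H0"
    return "reject H0"


print(decide(0.32))  # p-value above alpha
print(decide(0.01))  # p-value below alpha
```

Passing a different `alpha` (for example 0.01 for a more critical process) changes only the threshold, not the p-value itself.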
T-test
A t-test is used to compare means, either between two samples or between one sample and a known
value. A t-test assumes that the data are approximately normally distributed. A t-test is used when the
population parameters (mean and standard deviation) are not known. There are three versions of the
t-test:
1. Independent samples t-test which compares the means of two independent groups
2. Paired sample t-test which compares means from the same group at different times
3. One sample t-test which tests the mean of a single group against a known mean.
ANOVA
ANOVA, also known as analysis of variance, is used to compare multiple (three or more) samples with
a single test. There are 2 major flavors of ANOVA:
1. One-way ANOVA: It is used to compare the means of three or more samples/groups of
a single independent variable.
2. MANOVA: MANOVA allows us to test the effect of one or more independent variables on two or more
dependent variables. In addition, MANOVA can also detect differences in the correlation between
dependent variables across the groups of the independent variables.
Chi-Square Test
Chi-square test is used to compare categorical variables. There are two types of chi-square test:
1. A chi-square goodness-of-fit test, which determines whether sample data match a hypothesized
distribution.
2. A chi-square test of independence, which compares two categorical variables in a contingency
table to check whether they are related.
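To illustrate the contingency-table case, the sketch below computes the chi-square statistic for a 2 x 2 table. The counts are hypothetical and the helper name `chi_square_statistic` is an assumption made for this illustration. Turning the statistic into a p-value additionally requires the chi-square distribution, which is not shown here.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows = two vendors, columns = pass / fail.
observed_counts = [[10, 20],
                   [30, 40]]
chi2, df = chi_square_statistic(observed_counts)
print(f"chi-square = {chi2:.4f}, df = {df}")
```

A small statistic means the observed counts are close to what would be expected if the two variables were unrelated.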
One sample t-test using Minitab
Assumptions
A one-sample t-test has four assumptions.
Assumptions 1 and 2 relate to the study design. They cannot be tested using the software and have to
be ensured by the researcher.
Assumptions 3 and 4 relate to the nature of your data and can be checked using Minitab.
o Assumption 3: There should be no significant outliers. It is possible to check this assumption using
a simple box plot (see chapter on Graphs using Minitab). A box plot in Minitab flags those
observations that lie more than 1.5 times the interquartile range beyond the edges of the box.
These are potential outliers and, once investigated, can be removed from the dataset before
proceeding with the test.
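A common box-plot convention, and the one assumed in this sketch, flags points lying more than 1.5 times the interquartile range (IQR) beyond the first or third quartile. The function name and the weights are hypothetical, and the quartile interpolation used by Python's `statistics.quantiles` may differ slightly from the one Minitab applies.

```python
import statistics


def box_plot_outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # default "exclusive" method
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical weights in mg; 54.2 lies far from the rest.
weights = [49.3, 49.8, 49.6, 49.2, 49.5, 49.9, 50.1, 48.9, 50.3, 54.2]
print(box_plot_outliers(weights))
```

Flagged values are candidates for investigation, not automatic deletions.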
Example
An order was placed for a component that was to weigh 50 milligrams with a tolerance of +/- 1.5
milligrams. 2 vendors were asked to provide 50 samples of the components so that they could be
compared and the order could be placed on one of them. The data from the samples is shown:
Vendor A                        Vendor B
49.26 51.07 48.05 49.56 46.41   47.33 52.84 47.13 49.09 51.13
49.75 48.95 49.70 48.32 48.16   49.31 51.08 50.35 52.24 50.16
49.57 49.98 49.34 48.07 50.72   54.19 50.79 50.35 52.44 49.44
49.17 51.17 49.22 49.35 50.12   51.34 51.27 49.09 47.59 49.42
49.53 48.40 50.27 48.60 50.57   51.06 48.96 51.09 49.67 48.61
49.87 48.97 49.36 48.50 49.79   48.29 47.11 52.20 51.32 48.65
49.92 47.32 47.99 48.08 48.36   48.70 47.11 46.69 50.04 50.19
48.10 49.87 48.33 48.37 49.27   51.97 52.09 52.04 49.30 49.07
50.90 49.55 49.37 48.39 50.34   52.08 48.50 50.92 51.07 50.06
49.76 49.67 50.16 48.89 49.39   52.03 48.33 50.55 46.20 49.55
Analyse the data and determine whether they conform to the target value of 50 mg.
Setup in Minitab
In Minitab, we set up the variable, Vendor A (say), under column “C1”. Then, we enter the scores on the
dependent variable (i.e., the weight of components) into the column.
5. Click the Graphs button, and then select Histogram. Click OK in each dialog box.
Note: By default, Minitab uses 95% confidence intervals, which equates to declaring statistical
significance at the p < .05 level. If you want to change this, click the “Options” button, which
opens the 1-Sample t - Options dialogue box, where the confidence level can be set to the required
value.
Output of the one-sample t-test in Minitab
The Minitab output for the one-sample t-test is shown below:
Minitab will present the descriptive statistics including the sample size (the "N" column), mean (the
"Mean" column), standard deviation (the "StDev" column) and the standard error of the mean ("SE
Mean" column), as well as the 95% confidence interval (CI) of the mean ("95% CI").
Finally, the results of the one-sample t-test include the value of the known or hypothesized population
mean you are comparing your sample data to (the Test of mu = 50 vs not = 50 row), the observed t-
value (the "T" column) and the statistical significance (2-tailed p-value) of the one-sample t-test (the "P"
column).
As brought out earlier, if p-value > α (0.05, in this case), then fail to reject the null hypothesis;
otherwise reject the null hypothesis. Here, the p-value is “0.000”, which means it is smaller than
0.0005 and rounds to zero at three decimal places. In other words, if the true average weight were
50 mg, the probability of getting a sample mean at least as far from 50 mg as the one observed for
Vendor A would be extremely low. Hence we reject the null hypothesis that the average weight of the
components is 50 mg.
Just as an exercise, we will repeat the same steps for “Vendor B”. The results for Vendor B are shown
below:
The p-value in this case is greater than α (0.05, in this case); hence, we fail to reject the null
hypothesis that the average weight of the components is 50 mg. The data for Vendor B are consistent
with the target.
Also, the histogram clearly shows that the observations are concentrated around the required
target value.
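The quantities in the output described above (N, Mean, StDev, SE Mean, T and P) can be reproduced with a short Python sketch using the Vendor A and Vendor B data from the example. This is an illustration under stated assumptions, not Minitab's implementation: the function names are made up for this sketch, and the p-value is obtained by numerically integrating the t density with Simpson's rule in place of a statistical library, so results should agree with Minitab only up to rounding.

```python
import math
import statistics

VENDOR_A = [
    49.26, 49.75, 49.57, 49.17, 49.53, 49.87, 49.92, 48.10, 50.90, 49.76,
    51.07, 48.95, 49.98, 51.17, 48.40, 48.97, 47.32, 49.87, 49.55, 49.67,
    48.05, 49.70, 49.34, 49.22, 50.27, 49.36, 47.99, 48.33, 49.37, 50.16,
    49.56, 48.32, 48.07, 49.35, 48.60, 48.50, 48.08, 48.37, 48.39, 48.89,
    46.41, 48.16, 50.72, 50.12, 50.57, 49.79, 48.36, 49.27, 50.34, 49.39,
]
VENDOR_B = [
    47.33, 49.31, 54.19, 51.34, 51.06, 48.29, 48.70, 51.97, 52.08, 52.03,
    52.84, 51.08, 50.79, 51.27, 48.96, 47.11, 47.11, 52.09, 48.50, 48.33,
    47.13, 50.35, 50.35, 49.09, 51.09, 52.20, 46.69, 52.04, 50.92, 50.55,
    49.09, 52.24, 52.44, 47.59, 49.67, 51.32, 50.04, 49.30, 51.07, 46.20,
    51.13, 50.16, 49.44, 49.42, 48.61, 48.65, 50.19, 49.07, 50.06, 49.55,
]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2.0 * total * h / 3.0  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_area)


def one_sample_t(sample, mu0):
    """Return the main quantities of a two-sided one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    stdev = statistics.stdev(sample)      # sample standard deviation (n - 1)
    se_mean = stdev / math.sqrt(n)        # standard error of the mean
    t = (mean - mu0) / se_mean
    return {"N": n, "Mean": mean, "StDev": stdev, "SE Mean": se_mean,
            "T": t, "P": two_sided_p(t, n - 1)}


for name, sample in (("Vendor A", VENDOR_A), ("Vendor B", VENDOR_B)):
    r = one_sample_t(sample, mu0=50.0)
    print(f"{name}: N={r['N']} Mean={r['Mean']:.3f} StDev={r['StDev']:.3f} "
          f"SE Mean={r['SE Mean']:.3f} T={r['T']:.2f} P={r['P']:.3f}")
```

The test statistic is the distance between the sample mean and the hypothesized mean, measured in standard errors; the p-value is the two-tailed area beyond that distance.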
2-Sample t-test using Minitab
Example
An order was placed for a component that was to weigh 50 milligrams with a tolerance of +/- 1.5
milligrams. 2 vendors were asked to provide 50 samples of the components so that they could be
compared and the order could be placed on one of them. The data from the samples is shown:
Vendor A                        Vendor B
49.26 51.07 48.05 49.56 46.41   47.33 52.84 47.13 49.09 51.13
49.75 48.95 49.70 48.32 48.16   49.31 51.08 50.35 52.24 50.16
49.57 49.98 49.34 48.07 50.72   54.19 50.79 50.35 52.44 49.44
49.17 51.17 49.22 49.35 50.12   51.34 51.27 49.09 47.59 49.42
49.53 48.40 50.27 48.60 50.57   51.06 48.96 51.09 49.67 48.61
49.87 48.97 49.36 48.50 49.79   48.29 47.11 52.20 51.32 48.65
49.92 47.32 47.99 48.08 48.36   48.70 47.11 46.69 50.04 50.19
48.10 49.87 48.33 48.37 49.27   51.97 52.09 52.04 49.30 49.07
50.90 49.55 49.37 48.39 50.34   52.08 48.50 50.92 51.07 50.06
49.76 49.67 50.16 48.89 49.39   52.03 48.33 50.55 46.20 49.55
Analyse the data and determine whether the average weight of components from each vendor is equal.
Setup in Minitab
We will be comparing the weight of components between Vendor A and Vendor B. We assume that each
sample comes from a normally distributed population and that the two populations have equal
variances. The hypotheses will be:
Null Hypothesis (H0): μA = μB
Alternative Hypothesis (Ha): μA ≠ μB
where μA is the mean weight of the Vendor A population and μB is the mean weight of the Vendor B
population.
In Minitab, we set up one variable, Vendor A (say), under column “C1” and the other Variable “Vendor
B” under column “C2”. Then, we enter the scores on both the variables into the respective columns.
Test in Minitab
In this example, we will run the 2-Sample t test on the two columns set up above.
1. Click Stat → Basic Statistics → 2-Sample t.
2. From the drop-down list, select “Each sample is in its own column”. Click in the blank box next to
“Samples”, and “Vendor A” and “Vendor B” appear in the list box on the left.
3. Select “Vendor A” and “Vendor B” as the “Samples”.
4. Click “Options”. Set the required confidence level and check the box that says “Assume Equal
Variances”.
5. Click “OK” to save, and click “OK” again to run the test.
Take note of a few important pieces of information provided by the output: the mean of Vendor A
and Vendor B, the number of data points for each vendor (represented by ‘N’), and each standard
deviation.
The key statistical output provided by Minitab when running a 2-sample t-test is the p-value. Since the
p-value of the t-test (assuming equal variances) is 0.01, which is less than the alpha level of 0.05,
we reject the null hypothesis (H0): μA = μB.
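The pooled (equal-variance) calculation behind this result can be sketched in Python with the same Vendor A and Vendor B data. The function names are assumptions made for this illustration, and the p-value comes from a simple Simpson's-rule integration of the t density, not from Minitab's own routine, so it should match the reported value only up to rounding.

```python
import math
import statistics

VENDOR_A = [
    49.26, 49.75, 49.57, 49.17, 49.53, 49.87, 49.92, 48.10, 50.90, 49.76,
    51.07, 48.95, 49.98, 51.17, 48.40, 48.97, 47.32, 49.87, 49.55, 49.67,
    48.05, 49.70, 49.34, 49.22, 50.27, 49.36, 47.99, 48.33, 49.37, 50.16,
    49.56, 48.32, 48.07, 49.35, 48.60, 48.50, 48.08, 48.37, 48.39, 48.89,
    46.41, 48.16, 50.72, 50.12, 50.57, 49.79, 48.36, 49.27, 50.34, 49.39,
]
VENDOR_B = [
    47.33, 49.31, 54.19, 51.34, 51.06, 48.29, 48.70, 51.97, 52.08, 52.03,
    52.84, 51.08, 50.79, 51.27, 48.96, 47.11, 47.11, 52.09, 48.50, 48.33,
    47.13, 50.35, 50.35, 49.09, 51.09, 52.20, 46.69, 52.04, 50.92, 50.55,
    49.09, 52.24, 52.44, 47.59, 49.67, 51.32, 50.04, 49.30, 51.07, 46.20,
    51.13, 50.16, 49.44, 49.42, 48.61, 48.65, 50.19, 49.07, 50.06, 49.55,
]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def pooled_two_sample_t(a, b):
    """Return (t, df) for a two-sample t-test that assumes equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df


t_stat, df = pooled_two_sample_t(VENDOR_A, VENDOR_B)
print(f"T = {t_stat:.2f}, DF = {df}, P = {two_sided_p(t_stat, df):.3f}")
```

A negative t-value here simply reflects that the Vendor A mean is below the Vendor B mean; the two-sided p-value depends only on its magnitude.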
One-way ANOVA using Minitab
Introduction
The one-way analysis of variance (ANOVA) is used to determine whether the mean of a dependent
variable is the same in two or more unrelated, independent groups of an independent variable.
However, it is typically only used when you have three or more independent, unrelated groups, since
an independent t-test is more commonly used when you have just two groups.
Assumptions
The one-way ANOVA has six assumptions. You cannot test the first three of these assumptions with
Minitab because they relate to your study design and choice of variables. However, you should check
whether your study meets these three assumptions before moving on. If these assumptions are not met,
there is likely to be a different statistical test that you can use instead. Assumptions 1, 2 and 3 are
explained below:
Assumptions 1, 2 and 3 cannot be tested using the software and have to be ensured by the researcher.
Assumptions 4, 5 and 6 relate to the nature of your data and can be checked using Minitab.
o Assumption 4: There should be no significant outliers. It is possible to check this assumption using
a simple box plot (see chapter on Graphs using Minitab). A box plot in Minitab flags those
observations that lie more than 1.5 times the interquartile range beyond the edges of the box.
These are potential outliers and, once investigated, can be removed from the dataset before
proceeding with the test.
o Assumption 6: There needs to be homogeneity of variances. You can test this assumption in
Minitab using Levene's test for homogeneity of variances.
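To show what a test for homogeneity of variances measures, the sketch below computes one common form of Levene's statistic, the Brown-Forsythe variant based on absolute deviations from each group median. Whether this matches the exact variant Minitab reports is an assumption; the function name and the two small groups are hypothetical.

```python
import statistics


def levene_statistic(groups):
    """One-way ANOVA F-statistic computed on |x - group median|."""
    deviations = []
    for group in groups:
        med = statistics.median(group)
        deviations.append([abs(x - med) for x in group])

    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for d in deviations:
        d_mean = statistics.mean(d)
        ss_between += len(d) * (d_mean - grand_mean) ** 2
        ss_within += sum((z - d_mean) ** 2 for z in d)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical weights in mg: one tightly clustered group, one widely spread group.
tight = [49.8, 50.0, 50.1, 49.9, 50.2]
wide = [47.0, 52.5, 49.0, 51.8, 50.2]
print(f"Levene statistic = {levene_statistic([tight, wide]):.2f}")
```

A statistic near zero indicates similar spread in all groups; a large value, judged against the F distribution with (k - 1, N - k) degrees of freedom, suggests that the variances differ.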
Example:
An order was placed for a component that was to weigh 50 milligrams with a tolerance of +/- 1.5
milligrams. 4 vendors were asked to provide 50 samples of the components so that they could be
compared and the order could be placed on one of them. The data from the samples is shown:
Vendor A      Vendor B      Vendor C      Vendor D
(Each vendor's 50 values are shown in two adjacent columns.)
49.26 49.36 47.33 52.20 51.24 50.70 48.96 48.87
49.75 47.99 49.31 46.69 49.98 50.97 48.78 49.47
49.57 48.33 54.19 52.04 50.75 51.06 49.44 49.59
49.17 49.37 51.34 50.92 50.60 51.63 49.36 48.93
49.53 50.16 51.06 50.55 50.67 51.97 49.24 48.17
49.87 49.56 48.29 49.09 50.68 50.48 49.24 49.04
49.92 48.32 48.70 52.24 51.18 50.43 49.56 49.30
48.10 48.07 51.97 52.44 50.84 51.31 48.53 49.74
50.90 49.35 52.08 47.59 52.56 51.07 49.35 48.56
49.76 48.60 52.03 49.67 51.66 50.16 48.66 48.65
51.07 48.50 52.84 51.32 51.14 50.80 49.30 48.73
48.95 48.08 51.08 50.04 51.76 50.21 48.61 49.53
49.98 48.37 50.79 49.30 51.19 50.03 49.51 48.59
51.17 48.39 51.27 51.07 51.47 50.58 48.69 48.94
48.40 48.89 48.96 46.20 51.10 51.60 48.93 49.07
48.97 46.41 47.11 51.13 50.68 51.42 49.34 49.11
47.32 48.16 47.11 50.16 50.91 51.70 49.20 48.61
49.87 50.72 52.09 49.44 51.27 49.98 49.03 48.49
49.55 50.12 48.50 49.42 51.26 51.30 48.60 49.60
49.67 50.57 48.33 48.61 50.98 50.76 49.67 49.42
48.05 49.79 47.13 48.65 51.11 51.50 47.84 47.82
49.70 48.36 50.35 50.19 51.03 50.84 48.87 48.72
49.34 49.27 50.35 49.07 51.52 51.21 48.43 48.59
49.22 50.34 49.09 50.06 51.14 50.92 49.22 48.92
50.27 49.39 51.09 49.55 50.62 51.64 48.59 48.68
Analyse the data and determine whether the average weight of components from each vendor is equal
to each other.
Note: This problem is similar to the previous problem shown for the 2-sample t-test. However, the
t-test is limited to a maximum of 2 samples. In this case, since there are 4 samples, ANOVA will be
used for hypothesis testing.
Setup in Minitab
We will be comparing the weight of components between Vendor A, Vendor B, Vendor C and Vendor D.
We assume that each sample comes from a normally distributed population and that the populations
have equal variances. The hypotheses will be:
Null Hypothesis (H0): μA = μB = μC = μD
Alternative Hypothesis (Ha): At least one mean is different from the others.
In Minitab, we set up each variable under a separate column (“C1”, “C2”, “C3” and “C4”). Then, we
enter the scores on all the variables into the respective columns.
Test in Minitab
1. Click Stat → ANOVA → One-Way…
2. From the drop-down list, select “Response data are in a separate column for each factor level”.
Click in the blank box under “Responses”, and “Vendor A”, “Vendor B”, “Vendor C” and
“Vendor D” appear in the list box on the left.
3. Select “Vendor A”, “Vendor B”, “Vendor C” and “Vendor D” as the “Responses”.
4. Click “Options”. Set the required confidence level and check the box that says “Assume Equal
Variances”.
5. Click “OK” to save, and click “OK” again to run the test.
Results of ANOVA:
The results of the ANOVA appear automatically in the session window after clicking “OK”. Minitab’s
output is below.
The initial part of the output gives information about the Method used (the two hypotheses,
significance level, etc.) and the factors and levels. Here we see that there is only one factor and 4 levels.
The “Analysis of Variance” section gives us the p-value, which can be compared against the level of
significance (alpha). Here, we see that the p-value is 0.000, which is less than alpha. As brought out
earlier, when p-value < alpha, we reject the null hypothesis. In other words, if the 4 vendors really had
the same average weight, data like these would be very unlikely (p-value below 0.0005), so the sample
gives strong evidence that at least one vendor's average weight differs from the others.
To determine how well the model fits your data, examine the goodness-of-fit statistics in the model
summary table.
S is measured in the units of the response variable and represents how far the data values fall from
the fitted values. The lower the value of S, the better the model describes the response.
R² is the percentage of variation in the response that is explained by the model. The higher the R²
value, the better the model fits your data. R² is always between 0% and 100%.
However, even a low value of S and a high value of R² do not indicate that the model
meets the model assumptions. You should check the residual plots to verify the assumptions.
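The F-value in the Analysis of Variance table and the S and R² values in the model summary all come from the same two sums of squares. The sketch below recomputes them from the four-vendor table in the example. The function name and dictionary keys are assumptions made for this illustration, and the p-value is left out because it needs the F distribution, which Python's standard library does not provide.

```python
import math
import statistics

# Weights in mg, copied from the example table: two columns per vendor (A, B, C, D).
TABLE = """
49.26 49.36 47.33 52.20 51.24 50.70 48.96 48.87
49.75 47.99 49.31 46.69 49.98 50.97 48.78 49.47
49.57 48.33 54.19 52.04 50.75 51.06 49.44 49.59
49.17 49.37 51.34 50.92 50.60 51.63 49.36 48.93
49.53 50.16 51.06 50.55 50.67 51.97 49.24 48.17
49.87 49.56 48.29 49.09 50.68 50.48 49.24 49.04
49.92 48.32 48.70 52.24 51.18 50.43 49.56 49.30
48.10 48.07 51.97 52.44 50.84 51.31 48.53 49.74
50.90 49.35 52.08 47.59 52.56 51.07 49.35 48.56
49.76 48.60 52.03 49.67 51.66 50.16 48.66 48.65
51.07 48.50 52.84 51.32 51.14 50.80 49.30 48.73
48.95 48.08 51.08 50.04 51.76 50.21 48.61 49.53
49.98 48.37 50.79 49.30 51.19 50.03 49.51 48.59
51.17 48.39 51.27 51.07 51.47 50.58 48.69 48.94
48.40 48.89 48.96 46.20 51.10 51.60 48.93 49.07
48.97 46.41 47.11 51.13 50.68 51.42 49.34 49.11
47.32 48.16 47.11 50.16 50.91 51.70 49.20 48.61
49.87 50.72 52.09 49.44 51.27 49.98 49.03 48.49
49.55 50.12 48.50 49.42 51.26 51.30 48.60 49.60
49.67 50.57 48.33 48.61 50.98 50.76 49.67 49.42
48.05 49.79 47.13 48.65 51.11 51.50 47.84 47.82
49.70 48.36 50.35 50.19 51.03 50.84 48.87 48.72
49.34 49.27 50.35 49.07 51.52 51.21 48.43 48.59
49.22 50.34 49.09 50.06 51.14 50.92 49.22 48.92
50.27 49.39 51.09 49.55 50.62 51.64 48.59 48.68
"""

VENDORS = ["Vendor A", "Vendor B", "Vendor C", "Vendor D"]
rows = [[float(v) for v in line.split()] for line in TABLE.strip().splitlines()]
groups = {
    name: [row[2 * i] for row in rows] + [row[2 * i + 1] for row in rows]
    for i, name in enumerate(VENDORS)
}


def one_way_anova(samples):
    """Return degrees of freedom, F, S and R-sq (%) for a one-way ANOVA."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total

    ss_between = 0.0  # variation of the group means around the grand mean
    ss_within = 0.0   # variation of the observations around their group mean
    for s in samples:
        group_mean = statistics.mean(s)
        ss_between += len(s) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in s)

    df_between, df_within = k - 1, n_total - k
    ms_within = ss_within / df_within
    return {
        "DF": (df_between, df_within),
        "F": (ss_between / df_between) / ms_within,
        "S": math.sqrt(ms_within),
        "R-sq": 100.0 * ss_between / (ss_between + ss_within),
    }


result = one_way_anova(list(groups.values()))
print(f"DF = {result['DF']}, F = {result['F']:.2f}, "
      f"S = {result['S']:.3f}, R-sq = {result['R-sq']:.2f}%")
```

S is the square root of the within-group mean square, and R² is the share of the total sum of squares that lies between the groups, which is why a model can have a significant F-value and still a modest R².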
Use predicted R² to determine how well your model predicts the response for new observations. Models
that have larger predicted R² values have better predictive ability.
A predicted R² that is substantially less than R² may indicate that the model is over-fit.
Use the interval plot to display the mean and confidence interval for each group.