Edur 8131 Notes 5 T Test

The document discusses the one sample t-test, including its formulas, hypotheses, critical values, assumptions, and an example. Key points: - The one sample t-test is similar to the Z-test but uses the sample standard deviation to estimate the standard error of the mean. - Hypotheses for a one sample t-test follow the same format as a Z-test, comparing the sample mean to a hypothesized population mean. - Critical t-values are found using degrees of freedom (df = n - 1) and significance level. Decision rules depend on whether the test is one-tailed or two-tailed. - In an example, a class's average weight is compared

Uploaded by

Nazia Syed

Notes 5: t tests

1. The one sample t-test

(a) Formulas
Recall the Z test formula:

$$ z_{\bar{X}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} $$

The one sample t-test, which is very similar to the Z test, has the following formula:

$$ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{\bar{X} - \mu}{s_{\bar{X}}} $$

where the only difference is the sample SD in place of σ from the Z test. That is, the standard error of the mean is now
estimated by the formula:

$$ s_{\bar{X}} = \frac{SD}{\sqrt{n}} $$

where the symbol $s_{\bar{X}}$ is used to indicate that the standard error of the mean is being estimated with the
sample SD. Recall that the standard error of the mean for the Z test was calculated as:

$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} $$

(b) Hypotheses for the one sample t-test

Hypotheses for the one sample t-test are formulated in exactly the same manner as in the one
sample Z test. Using SAT as an example (note, μ = 1000), the one sample t-test non-directional
hypothesis is symbolized as:

Non-directional:
H1: μ ≠ 1000
H0: μ = 1000

Directional (one-tail tests):

Lower-tailed test
H1: μ < 1000
H0: μ ≥ 1000 or
H0: μ = 1000 (this one is preferred)

Upper-tailed test
H1: μ > 1000
H0: μ ≤ 1000 or
H0: μ = 1000 (this one is preferred)

Note that for the directional hypotheses, the alternative, H1, states what one expects to find (as long as a
relationship, or difference, is expected). For example, if one expects that a sample of students will have a
higher than average SAT score, then H1: μ > 1000. Similarly, if one expects that a given sample of students will
have a lower than average SAT score, then H1: μ < 1000.

(c) Critical t-values, (tcrit)

Like the Z test, one may use critical values for hypothesis testing. Critical t-values are obtained from a t-
table (see text). Note that t-values have distributions that are similar to normal distributions, but they are
slightly fatter in the tails. Finding t-values in the t table is similar to the z table. To find the correct
critical t-values (denoted as tcrit), one must first calculate the degrees of freedom (df or ν). For the one-
sample t-test, degrees of freedom are defined as:

df (or ν) = n - 1

where, as before, n is the sample size. Degrees of freedom can be described as the amount of information
available in the sample after certain mathematical restrictions are applied to the data.

(d) Statistical significance (α)

Next, one must determine the level of statistical significance for the analysis. As before, alpha is usually
set at .10, .05, or .01. Once the alpha level is determined, critical values for the one sample t-test can be
found.

(e) Decision rules

Deciding whether to reject or fail to reject the null can be determined by decision rules. The decision
rules are:

Two-tailed tests:
If t ≤ -tcrit or t ≥ tcrit, then reject H0; otherwise, fail to reject H0

One-tailed (upper-tailed) test

If t ≥ tcrit, then reject H0; otherwise, fail to reject H0

One-tailed (lower-tailed) test

If t ≤ -tcrit, then reject H0; otherwise, fail to reject H0

Note that tcrit symbolizes the critical t-value found in t tables and is different from t, which is the
calculated t-ratio obtained from sample data.

(f) An example
A physical education teacher wishes to know whether his class of students is statistically above or below
the national average in weight. The national average for eighth graders is μ = 100. The student weights
for his class are: 99, 98, 105, 110, 115, 103, 88, 125, 130, and 115. For this sample, n = 10, $\bar{X}$ = 108.8,
SD = 12.839, so

$$ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{108.8 - 100}{12.839 / \sqrt{10}} = \frac{8.8}{12.839 / 3.162} = \frac{8.8}{4.06} = 2.167. $$
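The hand calculation above can be cross-checked with a few lines of Python. This is only a sketch using the standard library; the variable names (`weights`, `mu_0`, `t_ratio`) are illustrative choices and not part of the notes.

```python
import math
import statistics

# Student weights and national average, taken from the example above.
weights = [99, 98, 105, 110, 115, 103, 88, 125, 130, 115]
mu_0 = 100

n = len(weights)
mean = statistics.mean(weights)   # sample mean
sd = statistics.stdev(weights)    # sample SD (n - 1 in the denominator)
se = sd / math.sqrt(n)            # estimated standard error of the mean
t_ratio = (mean - mu_0) / se      # one sample t-ratio

print(f"n = {n}, mean = {mean:.1f}, SD = {sd:.3f}")
print(f"SE = {se:.3f}, t = {t_ratio:.3f}")
```

Rounded to three decimals, the SD and t-ratio agree with the values 12.839 and 2.167 used in the text.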

Version: 3/1/2012

The df (or ν) = n - 1 = 10 - 1 = 9, and set α = .05.

The goal of the test is to determine whether the sample has an average weight that is statistically different
from the national average. This calls for a non-directional test because the specific direction of the mean
difference (higher or lower) was not indicated. Therefore,

H0: μ = 100, and

H1: μ ≠ 100.

The critical value is tcrit = 2.262. The decision rule is:

If 2.167 ≤ -2.262 or 2.167 ≥ 2.262, then reject H0; otherwise, FTR H0.

Since 2.167 is neither less than -2.262 nor greater than 2.262, the null is not rejected (i.e., fail to reject)
and one concludes that the sample does not have a statistically different mean from the national average.

What happens if instead it was hypothesized that the sample would have a lower than average weight,
i.e.,

H1: μ < 100, and

H0: μ = 100?

This is a lower-tailed test since the sample is expected to have a lower mean score. If α = .05, the
corresponding critical t-value is tcrit = -1.833. The decision rule states:

If 2.167 ≤ -1.833, then reject H0; otherwise, fail to reject H0.

Again, the null is not rejected, i.e., fail to reject.

Finally, had one hypothesized that the sample would be above average, then

H1: μ > 100, and

H0: μ = 100.

This is an upper-tailed test. The critical t-value, for an alpha of .05, is tcrit = 1.833, and this time the null
is rejected, so the sample can be said to have a statistically higher average weight than normal.
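The three decision rules can be written as one small function, which makes it easy to see why the same t-ratio leads to different decisions. This is a sketch only: the function name `decide` and the tail labels are assumptions made for illustration, and the critical values are the table values quoted in the example.

```python
def decide(t, t_crit, tail):
    """Apply the decision rules; t_crit is the positive table value."""
    if tail == "two":
        reject = t <= -t_crit or t >= t_crit
    elif tail == "upper":
        reject = t >= t_crit
    elif tail == "lower":
        reject = t <= -t_crit
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return "reject H0" if reject else "fail to reject H0"

t = 2.167  # calculated t-ratio from the weight example

two_tailed = decide(t, 2.262, "two")      # alpha = .05, df = 9
lower_tailed = decide(t, 1.833, "lower")  # alpha = .05, df = 9
upper_tailed = decide(t, 1.833, "upper")  # alpha = .05, df = 9

print("Two-tailed:  ", two_tailed)
print("Lower-tailed:", lower_tailed)
print("Upper-tailed:", upper_tailed)
```

Only the upper-tailed test rejects H0, matching the conclusions reached above.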

It is also possible to perform hypothesis testing with the t-test without using critical t-values. Recall that
the Z test had decision rules for p-values. Calculated probability values, p-values, are usually reported
with statistical software, and the decision rules for the Z test also apply to the t test. See earlier notes for
these decision rules. (This section to be discussed further in class.)

(g) Assumptions
The assumptions for the one sample t-test are identical to the Z test: normality and independence.


(h) Exercises

(1) Raw MAT scores are 31, 38, 27, 41, 39, and 36. Is this sample statistically different from the national
average MAT of 30? Set α = .01.

(2) Same scores, but α = .05.

(3) Same scores and α = .05, but this time hypothesize that the sample average will be greater than
population average.

(4) Fifteen students have a sample MAT mean of 32.3 with a sample standard deviation of 4.73. Does
this sample of students have a mean MAT score that is statistically different from the national average at
the .01 level with a two tailed test? What about α = .10 and a two tailed test? What about α = .05 and an
upper-tailed test?

(5) Suppose the following random sample of ITBS-math scores are observed in your middle school: 45,
58, 65, 63, 35, 43, 78, 55, 58, 69, 81, and 49. Is this evidence that your middle school has a student
population above average in terms of mathematics skills (national average ITBS-math is 50)? Set alpha at
.05.

(6) You wish to determine whether you are getting cheated every time you buy a bag of apples. The
standard bag of apples that you buy states that it contains one pound (16 ounces) of apples. After you get
home you notice that the bag only contains 15.5 ounces, not the stated 16 ounces. To determine whether
or not the company is systematically cheating the consumer, you decide to buy every 16 ounce bag of
apples in the three local grocery stores. After weighing each bag you find the following weights: 14.3,
15.5, 16.3, 17.0, 15.2, 15.9, 14.8, 15.0, 15.2, 15.9, 15.7, 15.6, and 16.1. Setting the significance level at
.05, does it seem the company is systematically cheating the consumer? Which should you perform, an
upper-, lower-, or two-tailed test? Why?

(7) Ford claims that its new car, Aspire, gets 39 mpg on the highway. Consumer Reports magazine
wishes to test this claim, so they hire you for $1500 to perform the statistical testing. They buy 10
Aspires and road test each. They find the following mpg estimates for the cars: 32, 43, 39, 38, 34, 36, 35,
38, 39, and 36. Their question to you is: Does our sample of Aspires have an estimated mpg that is
different from Ford's claim? Set alpha at .05 and give them an answer.

Computer output for exercises 1 through 7

Exercises 1 through 3
. ttest mat=30

Variable | Obs Mean Std. Dev.

---------+---------------------------------
mat | 6 35.33333 5.316641

Ho: mean = 30
t = 2.46 with 5 d.f.
Pr > |t| = 0.0574


Exercise 4
. ttesti 15 32.3 4.73 30

Variable | Obs Mean Std. Dev.

---------+---------------------------------
x | 15 32.3 4.73

Ho: mean = 30
t = 1.88 with 14 d.f.
Pr > |t| = 0.0806

Exercise 5
. ttest itbs=50

Variable | Obs Mean Std. Dev.

---------+---------------------------------
ITBS | 12 58.25 13.93573

Ho: mean = 50
t = 2.05 with 11 d.f.
Pr > |t| = 0.0649

Exercise 6
. ttest weight = 16

Variable | Obs Mean Std. Dev.

---------+---------------------------------
weight | 13 15.57692 .7013722

Ho: mean = 16
t = -2.17 with 12 d.f.
Pr > |t| = 0.0503

Exercise 7
. ttest mpg = 39

Variable | Obs Mean Std. Dev.

---------+---------------------------------
mpg | 10 37 3.091206

Ho: mean = 39
t = -2.05 with 9 d.f.
Pr > |t| = 0.0711
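The t-ratios in the output above can be reproduced from the summary statistics alone. The sketch below is not the Stata procedure itself; `t_from_summary` is a hypothetical helper, and the numbers are copied from the listings for exercises 1 through 7.

```python
import math

def t_from_summary(n, mean, sd, mu_0):
    """One sample t-ratio from summary statistics."""
    return (mean - mu_0) / (sd / math.sqrt(n))

# (n, mean, SD, hypothesized mean) copied from the output above.
exercises = {
    "1-3": (6, 35.33333, 5.316641, 30),
    "4": (15, 32.3, 4.73, 30),
    "5": (12, 58.25, 13.93573, 50),
    "6": (13, 15.57692, 0.7013722, 16),
    "7": (10, 37, 3.091206, 39),
}

t_values = {}
for label, (n, mean, sd, mu_0) in exercises.items():
    t_values[label] = t_from_summary(n, mean, sd, mu_0)
    print(f"Exercise {label}: t = {t_values[label]:.2f} with {n - 1} d.f.")
```

The p-values in the output require the t distribution and are not computed here.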

2. Confidence Intervals (CI) for Means

When estimating a parameter, one typically uses a point estimate like $\bar{X}$, s, or s². Using these point
estimates, one may construct an interval, that is, a range of values that is likely to
include the parameter being estimated.

A confidence interval (CI) for μ is found by:

$$ (1 - \alpha)\text{CI} = \bar{X} \pm \left( {}_{1-\alpha/2}t_{df} \right) \left( s_{\bar{X}} \right) $$

which, stated differently, is

$$ (1 - \alpha)\text{CI} = \bar{X} \pm \left( {}_{1-\alpha/2}t_{critical} \right) \left( s_{\bar{X}} \right) $$

which is

$$ (1 - \alpha)\text{CI} = \bar{X} \pm \left( {}_{1-\alpha/2}t_{critical} \right) \left( \frac{SD}{\sqrt{n}} \right) $$

or simply

$$ (1 - \alpha)\text{CI} = \left( \bar{X} - {}_{1-\alpha/2}t_{critical} \left( \frac{SD}{\sqrt{n}} \right), \; \bar{X} + {}_{1-\alpha/2}t_{critical} \left( \frac{SD}{\sqrt{n}} \right) \right) $$

This is a 100(1 - α)% confidence interval. That is, if α = .05, then this is a 100(1 - .05) = 100(.95) = 95%
confidence interval, or .95CI. A .95CI means that, in the long run, 95% of the intervals constructed
like this from random samples will contain the population value μ. This means that if
100 such intervals were constructed, on average the population value of μ would be correctly included in
95 of those intervals, while 5 would fail to include μ.

To calculate this CI, choose α, say at .05, then construct the interval by simply finding the critical value
associated with α = .05, and filling in the rest of the formula.

Example:
Construct .95CI for a class of high school students (n = 12) with a mean IQ of 120 and a standard
deviation of 16.5.

$$ \left( 120 - 2.201 \frac{16.5}{\sqrt{12}}, \; 120 + 2.201 \frac{16.5}{\sqrt{12}} \right) $$

$$ = (120 - 2.201 \times 4.763, \; 120 + 2.201 \times 4.763) $$

$$ = (120 - 10.483, \; 120 + 10.483) $$

$$ = (109.517, 130.483) $$

With such an interval, one may state that one is 95% confident that this interval contains the true μ for all
students who are like the students in the particular high school class (apparently smart students).
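The interval above can be checked with a short Python sketch. The helper name `mean_ci` is an assumption made for illustration, and the critical value 2.201 (df = 11, two-tailed α = .05) is taken from the example rather than computed.

```python
import math

def mean_ci(mean, sd, n, t_crit):
    """Confidence interval for a mean, given a critical t-value from a table."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Values from the IQ example: n = 12, mean = 120, SD = 16.5, tcrit = 2.201.
lower, upper = mean_ci(120, 16.5, 12, 2.201)
print(f".95CI = ({lower:.3f}, {upper:.3f})")
```

Because the notes round the standard error to 4.763 before multiplying, the limits printed here may differ from 109.517 and 130.483 in the third decimal place.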

Based upon this confidence interval, it seems that this high school class is quite different from the mean
score typically found for IQ tests in the population. How does one know this?

The CI may also be used as a non-directional hypothesis test. If the hypothesized population value of μ is
not within the CI, then H0: μ = 100 may be rejected. Since the value 100 is not within the interval
constructed, which ranges from 109.5 to 130.5, one may conclude that the sample data appear to differ,
statistically, from the hypothesized value of 100. In this particular case, the sample data show a mean that
is higher than the expected value of 100.


Exercises

(1) Construct a .99CI for the following scores: 120, 123, 125, 101, 98, 101. Test the hypothesis, using the
.99CI, that H0: μ = 100.

(2) Same as (1), but use a .95CI.

(3) Same as (1), but use a .90CI.

(4) Fifteen students have an SAT mean of 1200 with a standard deviation of 150. Does this sample of
students have a mean SAT score statistically different from the population mean of 1000 at α = .05?
Use a CI to answer this question. Is the mean statistically different if a .99CI is used?

3. The Two-Independent Samples t test (also called the Two Group t test)

(a) Situation
Both the Z test and the one sample t-test allow one to statistically compare the mean of one sample of
observations with a given population value (e.g., μ). If one is interested in comparing two independent
groups, then the two independent sample t-test may be appropriate.

For example, suppose one is using a posttest only control group design to examine the effect of computer
assisted learning in geography achievement among third graders. The control (or comparison) group is
taught U.S. geography with the traditional methods using maps, textbooks, and workbooks. The
experimental group uses the computer game Where in the U.S. is Carmen SanDiego. At the end of the
lesson, both groups are given the same posttest. A two group independent t-test would be appropriate for
determining statistical difference between the control and experimental groups.

(b) Hypothesis formulation:

One may formulate three different research hypotheses for the above example.

Non-directional:
The experimental and control group will have different levels of achievement in US geography.

H0: μ1 = μ2 and H1: μ1 ≠ μ2, or

H0: μ1 - μ2 = 0.00 and H1: μ1 - μ2 ≠ 0.00

where μ1 represents the population mean for group 1 (experimental group) and μ2 represents the population mean for group 2 (control group).

Directional (group 1 has higher mean than group 2):

The experimental group will show a higher level of achievement.

H0: μ1 ≤ μ2 and H1: μ1 > μ2, or

H0: μ1 - μ2 ≤ 0.00 and H1: μ1 - μ2 > 0.00

Directional (group 2 has higher mean than group 1):

The experimental group will show a lower level of achievement.

H0: μ1 ≥ μ2 and H1: μ1 < μ2, or

H0: μ1 - μ2 ≥ 0.00 and H1: μ1 - μ2 < 0.00

(c) Formulas for calculating the t ratio

To test the above hypotheses, the two sample independent t statistic is calculated as:

$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} $$

Since it is usually assumed that μ1 - μ2 = 0.00 (no difference in the population values), the t formula can
be simplified to

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{X}_1 - \bar{X}_2}{SE_d} $$

where

$$ SE_d = s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

Note that SEd represents the standard error of the difference, which, like the standard error of the mean,
represents the standard deviation of the sampling distribution for $\bar{X}_1 - \bar{X}_2$. The symbols $s_1^2$ and $s_2^2$
represent the variances for group 1 and group 2, respectively.

Recall that the sampling distribution of the sample mean has a known distribution that approaches the
normal distribution when sample sizes are large. The sampling distribution for $\bar{X}_1 - \bar{X}_2$ also follows the
central limit theorem. Note that the mean of the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is equal to μ1 - μ2. The
standard error for $\bar{X}_1 - \bar{X}_2$ is SEd = $s_{\bar{X}_1 - \bar{X}_2}$.

(d) Degrees of Freedom

Degrees of freedom for the two independent sample t-test are:

df (or ν) = n1 + n2 – 2

where the n1 is the sample size for group 1 (experimental group) and n2 is the sample size for group 2
(control group).


(e) Decision Rules

The decision rules are the same as for the one sample t-test.

Two-tailed test
If t ≤ -tcrit or t ≥ tcrit, then reject H0; otherwise, fail to reject H0

One-tailed (upper-tailed, group 1 anticipated to have higher mean than group 2) test
If t ≥ tcrit, then reject H0; otherwise, fail to reject H0

One-tailed (lower-tailed, group 1 anticipated to have lower mean than group 2) test
If t ≤ -tcrit, then reject H0; otherwise, fail to reject H0

(f) Assumptions
The two independent samples t-test requires that the raw scores in both populations be normally
distributed and independent. Also, the two populations should have equal (homogeneous) variances. The
two group t-test is generally robust to non-normality and unequal variance (provided n1 ≈ n2), but is not
robust to dependence of observations.

(g) An Example
Recall the geography experiment. The scores for both groups are:

Experimental Group      Control Group

88                      79
89                      75
91                      86
95                      91
86                      92
87                      82
88                      80
79                      82
88                      81

$\bar{X}_e$ = 87.889    $\bar{X}_c$ = 83.111
s = 4.256               s = 5.578
n = 9                   n = 9

The experimental group has a mean of 87.889 and a standard deviation of 4.256, and the control group
had a mean of 83.111 and a standard deviation of 5.578. There were 9 students in the experimental group
and 9 students in the control group. So the two independent group t-test, with α = .05 and a non-
directional test would be:

$$ t = \frac{\bar{X}_e - \bar{X}_c}{s_{\bar{X}_e - \bar{X}_c}} = \frac{87.889 - 83.111}{\sqrt{\frac{18.114}{9} + \frac{31.114}{9}}} = \frac{4.778}{2.339} = 2.043 $$

and the degrees of freedom are df = n1 + n2 - 2 = 9 + 9 - 2 = 16. The critical t is: tcrit = 2.120. The
rejection regions are: t ≤ -2.120, and t ≥ 2.120, and the decision rule is:

If 2.043 ≤ -2.120 or 2.043 ≥ 2.120, then reject H0; otherwise, FTR H0

The correct decision is fail to reject H0. One would therefore conclude the following:

There is not a statistically significant difference in geography achievement between the
experimental and control group for this sample at the .05 level of significance. This finding
indicates achievement scores for geography students do not appear to differ between those who
do and do not use the software Carmen SanDiego.

Note, however, what would happen if one hypothesized that the experimental group would have higher
scores than the control group. If α = .05, the critical value for an upper-tailed test would be 1.746, so the
decision rule would be:

If 2.043 ≥ 1.746, then reject H0; otherwise, fail to reject H0

Now H0 is rejected, and one could conclude the following:

The data indicate that students who learn with the computer program Carmen SanDiego show a
statistically significant, at the .05 level, higher achievement score in U.S. geography. Thus, use of
the software appears to benefit students.
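The geography example can be reproduced from the raw scores with the following Python sketch. The function name `two_sample_t` is illustrative only, and the critical values 2.120 and 1.746 are the table values quoted above.

```python
import math
import statistics

# Posttest scores from the geography example.
experimental = [88, 89, 91, 95, 86, 87, 88, 79, 88]
control = [79, 75, 86, 91, 92, 82, 80, 82, 81]

def two_sample_t(group1, group2):
    """Two independent samples t-ratio, SEd, and degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance of group 1
    var2 = statistics.variance(group2)  # sample variance of group 2
    se_d = math.sqrt(var1 / n1 + var2 / n2)
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_d
    return t, se_d, n1 + n2 - 2

t, se_d, df = two_sample_t(experimental, control)
print(f"t = {t:.3f}, SEd = {se_d:.3f}, df = {df}")

# Decision rules with the table values used in the text.
reject_two_tailed = t <= -2.120 or t >= 2.120
reject_upper_tailed = t >= 1.746
print("Two-tailed reject H0?  ", reject_two_tailed)
print("Upper-tailed reject H0?", reject_upper_tailed)
```

As in the text, the two-tailed test fails to reject H0 while the upper-tailed test rejects it.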

(h) Confidence Intervals About Mean Differences

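Recall the CI for a sample mean:

$$ (1 - \alpha)\text{CI} = \bar{X} \pm \left( {}_{1-\alpha/2}t_{critical} \right) \left( s_{\bar{X}} \right) $$

One may similarly compute a CI for the difference between two means. The formula is:

$$ (1 - \alpha)\text{CI} = (\bar{X}_1 - \bar{X}_2) \pm \left( {}_{1-\alpha/2}t_{critical} \right) \left( s_{\bar{X}_1 - \bar{X}_2} \right) $$

The .95CI for the above example is:

$$ .95\text{CI} = (\bar{X}_1 - \bar{X}_2) \pm \left( {}_{.975}t_{critical} \right) \left( s_{\bar{X}_1 - \bar{X}_2} \right) $$

= (87.889 - 83.111) ± 2.12(2.339)
= 4.778 ± 4.959, or between -0.181 and 9.737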

Since 0 is within this interval, H0 will not be rejected.
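The same interval can be checked in Python. This is a sketch that simply plugs in the rounded values from the example (means, SEd = 2.339, tcrit = 2.120); the variable names are illustrative.

```python
# Rounded values taken from the geography example.
mean_diff = 87.889 - 83.111   # difference between the two sample means
se_d = 2.339                  # standard error of the difference
t_crit = 2.120                # two-tailed critical t, alpha = .05, df = 16

margin = t_crit * se_d
lower, upper = mean_diff - margin, mean_diff + margin
contains_zero = lower <= 0 <= upper

print(f".95CI = ({lower:.3f}, {upper:.3f})")
print("Interval contains 0?", contains_zero)
```

An interval that contains 0 corresponds to failing to reject H0 in the two-tailed test.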


Computer Analysis of Above Example

. ttest scores, by(group)

Variable | Obs Mean Std. Dev.

---------+---------------------------------
0 | 9 83.11111 5.577734
1 | 9 87.88889 4.255715
---------+---------------------------------
combined | 18 85.5 5.404247

Ho: mean(x) = mean(y) (assuming equal variances)

t = -2.04 with 16 d.f.
Pr > |t| = 0.0579

(i) Strength of Association for Two Group t-test (effect size)

While a statistically significant t-test indicates that the two groups are probably not equal, the t-test does
not indicate the strength of the association between the independent variable and the dependent variable.
In the study just discussed, the independent variable (IV) is the presence or absence of the treatment, and
the dependent variable (DV) is the posttest achievement score.

The question one may ask after rejecting H0 is just how strong an impact does the treatment have on
student achievement. One measure of the strength of the association between the treatment and the
outcome is eta squared, η²:

$$ \eta^2 = \frac{t^2}{t^2 + df} $$

For example, the calculated t above was 2.043, so

$$ \eta^2 = \frac{2.043^2}{2.043^2 + 16} = \frac{4.174}{4.174 + 16} = .207 $$

The value obtained for η² may be interpreted in a manner identical to r², such as the variance explained or
predicted in posttest scores by the treatment. In fact, if one calculates a Pearson's correlation between the
two numerical variables listed in the table below (posttest scores and the indicator of treatment
[1=treatment, 0=control]), the obtained r will be equal to .455 and the r² will be .207!


Posttest Scores   Indicator of Treatment   Treatment Condition
88 1 Experimental
89 1 Experimental
91 1 Experimental
95 1 Experimental
86 1 Experimental
87 1 Experimental
88 1 Experimental
79 1 Experimental
88 1 Experimental
79 0 Control
75 0 Control
86 0 Control
91 0 Control
92 0 Control
82 0 Control
80 0 Control
82 0 Control
81 0 Control

This should indicate to you that one may actually use a Pearson correlation to determine whether two
groups are statistically different. For example, using the same experimental data, one could reproduce the
same t value obtained from the two independent groups t-test using only the correlation r:

$$ t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{.455 \sqrt{18 - 2}}{\sqrt{1 - .207}} = \frac{1.82}{.891} = 2.043 $$

In short, the two group independent t-test and the Pearson correlation coefficient provide identical
inferential results. The two group t-test requires the calculation of η2 in order to determine the strength of
the relationship between the IV and DV.
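The equivalence between the t-test, η², and the Pearson correlation can be verified numerically. The sketch below computes r by hand with the standard library; `pearson_r` is a hypothetical helper written for this illustration.

```python
import math

# Posttest scores and treatment indicator (1 = experimental, 0 = control).
scores = [88, 89, 91, 95, 86, 87, 88, 79, 88,
          79, 75, 86, 91, 92, 82, 80, 82, 81]
treated = [1] * 9 + [0] * 9

def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(scores, treated)
n = len(scores)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
eta_sq = t_from_r ** 2 / (t_from_r ** 2 + (n - 2))

print(f"r = {r:.3f}, r squared = {r ** 2:.3f}")
print(f"t from r = {t_from_r:.3f}, eta squared = {eta_sq:.3f}")
```

Here η² and r² coincide, which is the point made in the text.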

(j) Effect Size (ES)

One may choose to relate to the reader the magnitude of the effect of the treatment by providing η².
Another means of relaying this information, which is growing in importance in research today, is the
standardized ES indicator.

ES, denoted in the research literature as d and/or Δ, may be calculated with one of two formulas. First,
d is

$$ d = \frac{\bar{X}_1 - \bar{X}_2}{SD_{within}} $$

where

$$ SD_{within} = \sqrt{\frac{\sum (X - \bar{X}_1)^2 + \sum (X - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}} $$

SDwithin is the pooled within-group SD, which is essentially the average SD for the two groups.

Second, Δ is

$$ \Delta = \frac{\bar{X}_1 - \bar{X}_2}{SD_{control\ group}} $$

where SDcontrol group is simply the SD of the control group (if one is present).

Note that both d and Δ describe the magnitude of the difference between the two group means in standard
deviation units. So, for example, if d or Δ = .2, then this indicates that the two group means differ by .2
standard deviations. The larger either d or Δ, the greater the difference between two groups, and, hence,
the larger the effect of the treatment.

In the example used above the ES is

$$ SD_{within} = \sqrt{\frac{\sum (X - \bar{X}_1)^2 + \sum (X - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}} = \sqrt{\frac{144.908 + 248.913}{(9 - 1) + (9 - 1)}} = \sqrt{\frac{393.821}{16}} = \sqrt{24.614} = 4.961 $$

$$ d = \frac{\bar{X}_1 - \bar{X}_2}{SD_{within}} = \frac{87.889 - 83.111}{4.961} = 0.963 $$

If one wished to calculate Δ, then the corresponding ES is:

$$ \Delta = \frac{\bar{X}_1 - \bar{X}_2}{SD_{control\ group}} = \frac{87.889 - 83.111}{5.578} = 0.857 $$

Either ES is appropriate to use when an experimental group is compared to a control group. When two
groups are compared and the two groups do not represent experimental and control (such as males vs.
females), then one should use d as the measure of ES.
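Both effect sizes can be computed from the raw geography scores as a check on the arithmetic above. This is a minimal sketch; the helper `sum_of_squares` and the variable names are assumptions made for illustration.

```python
import math
import statistics

# Posttest scores from the geography example.
experimental = [88, 89, 91, 95, 86, 87, 88, 79, 88]
control = [79, 75, 86, 91, 92, 82, 80, 82, 81]

def sum_of_squares(xs):
    """Sum of squared deviations from the group mean."""
    m = statistics.mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(experimental), len(control)
mean_diff = statistics.mean(experimental) - statistics.mean(control)

# Pooled within-group SD.
sd_within = math.sqrt(
    (sum_of_squares(experimental) + sum_of_squares(control))
    / ((n1 - 1) + (n2 - 1))
)

d = mean_diff / sd_within                     # ES using the pooled SD
delta = mean_diff / statistics.stdev(control) # ES using the control group SD

print(f"SDwithin = {sd_within:.3f}, d = {d:.3f}, delta = {delta:.3f}")
```

The sums of squares computed from the raw scores differ slightly from the rounded figures in the notes, but the results agree to three decimals.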


(k) Exercises

(1) Determine whether boys have a statistically different, at the 1% level, ITBS math score from girls.
The mean math score for boys is 78 (s = 5.3) and the mean for girls is 73 (s = 6.1). There are 25 boys and
25 girls.

(a) What is the correct H0 and H1 in both written and symbol form?
(b) What are the critical and calculated t-values?

(2) Determine whether a statistical difference exists between men and women in weight:
Men: 156, 158, 175, 203, 252, 195
Women: 149, 119, 168, 123, 155, 126

(a) Test for a non-directional H0 with α = .01; what is the correct H0, H1?
(b) Test for a non-directional H0 with α = .10.
(c) Test the hypothesis that men will have lower weight, and set α = .10. What is the correct H0, H1?

(3) Two classes of educational research were taught with two different methods of instruction, teacher
guided (TG) and self paced (SP). Which had the better student achievement at the end of the quarter?

TG scores: 95, 93, 87, 88, 82, 92

SP scores: 78, 89, 83, 90, 78, 86

(a) Test for a non-directional H0 with α = .01; what is the correct H0, H1?
(b) Test for a non-directional H0 with α = .10.
(c) Test the hypothesis that TG will have higher scores, and set α = .05. What is the correct H0, H1?

(l) Computer answers to exercises

Example 1
. ttesti 25 78 5.3 25 73 6.1

Variable | Obs Mean Std. Dev.

---------+---------------------------------
x | 25 78 5.3
y | 25 73 6.1
---------+---------------------------------
combined | 50 75.5 6.193644

Ho: mean(x) = mean(y) (assuming equal variances)

t = 3.09 with 48 d.f.
Pr > |t| = 0.0033


Example 2
. ttest weight, by(sex)

Variable | Obs Mean Std. Dev.

---------+---------------------------------
0 | 6 140 20.07984
1 | 6 189.8333 35.89661
---------+---------------------------------
combined | 12 164.9167 38.02979

Ho: mean(x) = mean(y) (assuming equal variances)

t = -2.97 with 10 d.f.
Pr > |t| = 0.0141

Example 3
. ttest scores, by(groups)

Variable | Obs Mean Std. Dev.

---------+---------------------------------
0 | 6 84 5.25357
1 | 6 89.5 4.764452
---------+---------------------------------
combined | 12 86.75 5.57796

Ho: mean(x) = mean(y) (assuming equal variances)

t = -1.90 with 10 d.f.
Pr > |t| = 0.0867

4. Two Correlated Group t test (also called dependent samples t test)

The correlated t test allows the researcher to consider differences between two groups or sets of scores
that are related to one-another. Under what conditions is one likely to find correlated or dependent
samples or groups?

Condition 1
Before/After Studies; Multiple Measures on the Same Subject = This type of data occurs most often with
pretest-treatment-posttest experimental designs. These types of designs are used to determine whether
some treatment will change posttest scores relative to the pretest score. The pretest and posttest scores
are related because the scores are taken from the same individuals, i.e., each person is measured twice.

Examples:
(a) A student takes the SAT, enrolls in an SAT enhancement class, and then retakes the SAT. Two scores
from the same student exist.

(b) A teacher measured the reading performance of a third-grader, presented some treatment designed to
increase reading performance, then remeasured the student's reading performance (two scores from
same individual).

(c) A PE teacher measures the vertical jumping ability of his class, provides his class a weight training
program for one month, then remeasures vertical jumping ability of each student (two scores from same
students).


Condition 2
Matched-Subjects = Two groups are involved in the study (experimental and control); and they are
matched on some extraneous variable(s) that is likely to be related to the dependent variable being
examined.

Examples:
(a) A teacher is interested in determining whether "Hooked on Phonics" increases third-grade students'
reading performance. Using two groups of students, group A (the experimental group) will use "Hooked on
Phonics" for one month, and group B (the control) will be exposed to the usual reading lessons during the
month. The teacher knows that IQ influences reading performance, so to control for the effects of IQ on
the dependent variable (which is a posttest on reading performance), the teacher matches students in
the two groups on their IQ levels in a fashion similar to the schematic below:

IQ score          Group A (treatment)   Group B (control)   IQ score

High (110+)       Beth and Sue          John and Ann        High (110+)
Middle (90-110)   Bob and Susan         Fred and Bill       Middle (90-110)
Low (<90)         Bryan and Bill        Josh and Walt       Low (<90)

In this scheme, students from both groups are matched according to their IQ levels. It is important to
match on IQ since we would expect students with higher IQs to perform better on a reading test than
students with lower IQs.

(b) As another example, one might make a comparison of faculty salary between men and women to
determine whether sexual discrimination exists. It would be important to match men and women on
academic rank since we know that assistant professors, on average, make less than associate and full
professors.

Condition 3
Naturally occurring pairs = Natural pairs, such as husbands and wives, twins, brothers, sisters, brothers
and sisters, parents and their children, etc. With naturally occurring pairs, one would expect the pairs to
hold similar feelings, beliefs, attitudes, etc., so their scores will generally be related to one-another.

Examples:
(a) Determining whether husbands' attitudes toward politics are similar to their wives'. Since people tend
to marry others like themselves, one would expect most husbands and wives to hold similar political
views.

(b) Determining whether boys' IQ differs from girls' IQ. Since brothers and sisters are similar genetically,
one might anticipate the two to have similar IQs, that is, their IQs are likely to be related; therefore,
brothers and sisters need to be matched.

Hypothesis Formulation:
The hypothesis tested with the correlated t-test is the same as in the independent t-test.

For example, suppose one is interested in determining whether boys or girls get higher math scores on the ITBS.
Clearly, intelligence plays an important part in determining mathematics performance, so this is a factor
that needs to be controlled through matching. One may formulate several hypotheses, as demonstrated
below.

Non-directional:
The average ITBS math scores will differ between boys and girls.

H0: μ1 = μ2 and H1: μ1 ≠ μ2, or

H0: μ1 - μ2 = 0.00 and H1: μ1 - μ2 ≠ 0.00

where μ1 represents the population mean for group 1 (boys) and μ2 represents the population mean for group 2 (girls).

Directional (group 1 has higher mean than group 2):

Boys will score higher, on average, than girls.

H0: μ1 ≤ μ2 and H1: μ1 > μ2, or

H0: μ1 - μ2 ≤ 0.00 and H1: μ1 - μ2 > 0.00

Directional (group 1 has lower mean than group 2):

Boys will score lower, on average, than girls.

H0: μ1 ≥ μ2 and H1: μ1 < μ2, or

H0: μ1 - μ2 ≥ 0.00 and H1: μ1 - μ2 < 0.00

Theoretical Formula for Correlated t test

The t ratio for the correlated t test can be calculated as:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{X}_1 - \bar{X}_2}{SE_d} $$

where $\bar{X}_1 - \bar{X}_2$ is the difference between the two sample means, and the denominator is the standard error
of the difference, SEd.

Note that this is identical to the formula for the two independent sample t test. The difference between
the formulas for the independent and the correlated t test occurs in the calculation of the standard error of
the difference.

For the correlated t test the standard error of the difference is calculated as:

$$ SE_d = s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} - 2 r_{12} \left( \frac{s_1}{\sqrt{n}} \right) \left( \frac{s_2}{\sqrt{n}} \right)} $$

but in the independent t test it is assumed that the groups are not related (scores between groups are not
correlated), so the standard error loses the correlation term in the formula, i.e.:

$$ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} - 2 r_{12} \left( \frac{s_1}{\sqrt{n}} \right) \left( \frac{s_2}{\sqrt{n}} \right)} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} - 2(0) \left( \frac{s_1}{\sqrt{n}} \right) \left( \frac{s_2}{\sqrt{n}} \right)} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} - 0} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

If there is no correlation, then the SEd formula reduces to the SEd formula given in the independent
samples t-test. In short, the primary difference between the two t tests is the calculation of the standard
error of the difference, SEd.
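The effect of the correlation term can be seen in a short Python sketch. The function `se_diff` is a hypothetical helper that assumes both sets of scores have the same size n; the SDs are borrowed from the geography example purely for illustration, and r12 = 0.5 is a made-up value, not a result from the notes.

```python
import math

def se_diff(s1, s2, n, r12=0.0):
    """Standard error of the difference for two sets of n scores each."""
    independent_part = s1 ** 2 / n + s2 ** 2 / n
    correlated_part = 2 * r12 * (s1 / math.sqrt(n)) * (s2 / math.sqrt(n))
    return math.sqrt(independent_part - correlated_part)

# With r12 = 0 the formula reduces to the independent samples SEd.
se_r0 = se_diff(4.256, 5.578, 9)

# r12 = 0.5 is a hypothetical correlation used only to show the direction of the effect.
se_r5 = se_diff(4.256, 5.578, 9, r12=0.5)

print(f"SEd with r12 = 0:   {se_r0:.3f}")
print(f"SEd with r12 = 0.5: {se_r5:.3f}")
```

A positive correlation between the paired scores shrinks SEd, which in turn makes the t-ratio larger for the same mean difference.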

Practical Formula for Correlated t test

To calculate the correlated t statistic, the following formula is easier to use:

d  d d d
t = = =
sd2 / n sd2 / n SEd

where d is the mean of the differences between pairs of scores, i.e.,

d=
d
n

and SEd is the standard error of the differences:

$$SE_d = \sqrt{s_d^2 / n}$$

where $s_d^2$ is the variance of the difference scores, and is calculated like a regular variance, i.e.,

$$s_d^2 = \frac{\sum (d - \bar{d})^2}{n - 1}$$

In short, the correlated t test may be viewed as the mean of the differences, $\bar{d}$, divided by the standard
error of the difference, SEd.

$$t = \frac{\bar{d}}{SE_d}$$

Degrees of Freedom:
The df for the correlated t test is calculated as:

df = n - 1

where n represents the number of pairs across the two groups.
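
The practical formula can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed procedure: the function name `correlated_t` and the four pairs of scores are hypothetical, and `statistics.variance` is used because it divides by n - 1, matching the variance formula above.

```python
import math
from statistics import mean, variance

def correlated_t(group1, group2):
    """Return (d_bar, se_d, t, df) for two lists of paired scores."""
    if len(group1) != len(group2):
        raise ValueError("every score needs a partner in the other group")
    # Difference score for each pair.
    d = [float(x1) - float(x2) for x1, x2 in zip(group1, group2)]
    n = len(d)                      # number of pairs
    d_bar = mean(d)                 # mean of the differences
    se_d = math.sqrt(variance(d) / n)   # sqrt(s_d^2 / n)
    return d_bar, se_d, d_bar / se_d, n - 1

# Hypothetical paired scores, for illustration only.
d_bar, se_d, t, df = correlated_t([12, 15, 11, 14], [10, 14, 12, 11])
print(d_bar, round(se_d, 3), round(t, 3), df)
```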

Decision Rules:
The decision rules are the same as in the independent two-sample t test:


Two-tailed tests:
If t ≤ -tcrit or t ≥ tcrit, then reject H0; otherwise, fail to reject H0

One-tailed (upper-tailed) test

If t ≥ tcrit, then reject H0; otherwise, fail to reject H0

One-tailed (lower-tailed) test

If t ≤ -tcrit, then reject H0; otherwise, fail to reject H0

Note that tcrit symbolizes the critical t-value found in t tables and is different from t, which is the
calculated t-ratio obtained from sample data.
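
The three decision rules can be written as one small function. This is only a sketch: the name `reject_h0` and the `tail` labels are assumptions, and `t_crit` is taken to be the positive value read from a t table. The t values in the example calls are made up.

```python
def reject_h0(t, t_crit, tail="two"):
    """Apply the decision rules; t_crit is the positive table value."""
    if tail == "two":
        return t <= -t_crit or t >= t_crit
    if tail == "upper":
        return t >= t_crit
    if tail == "lower":
        return t <= -t_crit
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

# Made-up calculated t values, for illustration only.
print(reject_h0(2.50, 2.447))            # two-tailed
print(reject_h0(1.00, 2.447))            # two-tailed
print(reject_h0(-2.00, 1.796, "lower"))  # lower-tailed
print(reject_h0(-2.00, 1.796, "upper"))  # upper-tailed
```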

Example 1:
Suppose we are interested in determining whether salary differs between men and women faculty at
GSU. When randomly selecting subjects for the study, it is important that we take into consideration their
academic rank since full professors make more money than associate professors, and associates make
more money than assistant professors, on average. Test the hypothesis of no difference between men and
women, H0: μ1 = μ2, at the 5% significance level.

Rank        Men's Income        Difference (d)    Women's Income      Rank
Full        Bill = 48,000           -3000         Beth = 51,000       Full
Full        Bob = 51,000             6000         Bertha = 45,000     Full
Associate   Billy = 43,000          -1000         Bobby = 44,000      Associate
Associate   Burt = 38,500            2500         Bonnie = 36,000     Associate
Assistant   Brando = 24,500          -500         Brenda = 25,000     Assistant
Assistant   Bart S. = 28,000         5000         Bette = 23,000      Assistant
Assistant   Brent = 33,000           7000         Beulah = 26,000     Assistant

$$\bar{d} = \frac{\sum d}{n} = \frac{16000}{7} = 2285.714$$

Difference    Mean of Differences    Deviation            Deviation Squared
$d$           $\bar{d}$              $(d - \bar{d})$      $(d - \bar{d})^2$
-3000         2285.714               -5285.714            27938772.49
 6000         2285.714                3714.286            13795920.49
-1000         2285.714               -3285.714            10795916.49
 2500         2285.714                 214.286               45918.49
 -500         2285.714               -2785.714             7760202.49
 5000         2285.714                2714.286             7367348.49
 7000         2285.714                4714.286            22224492.49

$$SE_d = \sqrt{s_d^2 / n}$$

where $s_d^2$ is the variance of the difference scores, and is calculated like a regular variance, i.e.,

$$s_d^2 = \frac{\sum (d - \bar{d})^2}{n - 1}$$

$$
\begin{aligned}
SE_d = \sqrt{s_d^2 / n} &= \sqrt{\frac{\sum (d - \bar{d})^2 / (n - 1)}{n}} = \sqrt{\frac{89928571.43 / (7 - 1)}{7}} \\
&= \sqrt{\frac{89928571.43 / 6}{7}} = \sqrt{\frac{14988095.24}{7}} = 1463.269
\end{aligned}
$$

so the t value will be:

$$t = \frac{\bar{d}}{\sqrt{s_d^2 / n}} = \frac{\bar{d}}{SE_d} = \frac{2285.714}{1463.269} = 1.562$$

The critical values at the .05 level for df = n - 1 = 6 are ±2.447, so fail to reject H0 and conclude that
salaries do not appear to differ between men and women faculty at GSU even after controlling for
academic rank.
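
The hand calculation for Example 1 can be checked with a short Python sketch. The variable names are assumptions; the incomes and the critical value 2.447 are taken from the example above.

```python
import math
from statistics import mean, variance

# Incomes from Example 1, listed pair by pair (matched on rank).
men = [48000, 51000, 43000, 38500, 24500, 28000, 33000]
women = [51000, 45000, 44000, 36000, 25000, 23000, 26000]

d = [float(m - w) for m, w in zip(men, women)]  # difference scores
n = len(d)                                      # number of pairs
d_bar = mean(d)                                 # mean difference
se_d = math.sqrt(variance(d) / n)               # sqrt(s_d^2 / n)
t = d_bar / se_d

t_crit = 2.447  # two-tailed, .05 level, df = 6 (value quoted in the notes)
reject = t <= -t_crit or t >= t_crit
print(f"d_bar = {d_bar:.3f}, SEd = {se_d:.3f}, t = {t:.3f}, reject H0: {reject}")
```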

What do you think would happen if an independent samples t test were used to analyze the above data?

Calculate the regular independent t test and see: Mmen = 38000, SDmen = 10012.49, Mwomen = 35714.29,
and SDwomen = 11250.4.

Which is more powerful (recall that power represents the probability of rejecting a false H0), the
independent or correlated t test? Why?
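
One way to explore this question is to run both calculations on the Example 1 incomes. The sketch below uses the uncorrelated standard error formula shown earlier, sqrt(s1²/n1 + s2²/n2); the variable names are assumptions, and df = n1 + n2 - 2 is assumed to be the independent-samples degrees of freedom.

```python
import math
from statistics import mean, variance

# Incomes from Example 1, treated here as two unrelated groups.
men = [48000, 51000, 43000, 38500, 24500, 28000, 33000]
women = [51000, 45000, 44000, 36000, 25000, 23000, 26000]
n1, n2 = len(men), len(women)

# Standard error without the correlation term.
se_ind = math.sqrt(variance(men) / n1 + variance(women) / n2)
t_ind = (mean(men) - mean(women)) / se_ind
df_ind = n1 + n2 - 2  # assumed df for the independent test

print(f"SEd = {se_ind:.3f}, t = {t_ind:.3f}, df = {df_ind}")
```

The mean difference in the numerator is the same 2285.714 as in the correlated test, so any change in t comes entirely from the standard error in the denominator.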


Example 2:
A researcher wishes to discover whether or not the intake of orange juice increases the potassium level in
the bloodstream. A group of 12 elderly patients are selected from those in a nursing home, where
previous diet has been controlled. Potassium blood levels are measured for each subject. Next, each
subject is given a quart of orange juice, and, two hours later, potassium levels are again measured. Test
the difference in potassium levels at the 5% level. The data are as follows (the scaled scores represent
potassium blood levels):

Subject   Before      After       Difference   Mean of       Deviation          Deviation Squared
          Potassium   Potassium   $d$          Differences   $(d - \bar{d})$    $(d - \bar{d})^2$
          Level       Level                    $\bar{d}$
1         26          25           1           -2             3                 9
2         25          28          -3           -2            -1                 1
3         24          27          -3           -2            -1                 1
4         23          26          -3           -2            -1                 1
5         23          25          -2           -2             0                 0
6         21          23          -2           -2             0                 0
7         19          21          -2           -2             0                 0
8         17          19          -2           -2             0                 0
9         17          16           1           -2             3                 9
10        16          19          -3           -2            -1                 1
11        15          18          -3           -2            -1                 1
12        14          17          -3           -2            -1                 1

$$\bar{d} = \frac{\sum d}{n} = \frac{-24}{12} = -2.00$$

and the standard error of the difference is:

$$SE_d = \sqrt{s_d^2 / n} = \sqrt{\frac{\sum (d - \bar{d})^2 / (n - 1)}{n}} = \sqrt{\frac{24 / (12 - 1)}{12}} = \sqrt{\frac{24 / 11}{12}} = \sqrt{\frac{2.182}{12}} = .426$$

so the calculated t value will be:

$$t = \frac{\bar{d}}{\sqrt{s_d^2 / n}} = \frac{\bar{d}}{SE_d} = \frac{-2}{.426} = -4.695$$

The hypothesis was that orange juice will increase potassium in the blood stream, i.e., the pretest scores
will be lower than the posttest scores. This hypothesis indicates that a lower-tailed test is needed since
H0: μ1 ≥ μ2 and H1: μ1 < μ2.

The critical value at the .05 level for df = 12 - 1 = 11 is -1.796, so we reject H0, and conclude that orange
juice does appear to increase the amount of potassium in the blood stream for elderly people.
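
A short Python sketch of the same lower-tailed test is given below; the variable names are assumptions, and the data and critical value come from the example. Carried at full precision, SEd is about .4264 and t is about -4.690; the -4.695 shown above results from rounding SEd to .426 before dividing. The decision is the same either way.

```python
import math
from statistics import mean, variance

# Potassium levels from Example 2 (before and after orange juice).
before = [26, 25, 24, 23, 23, 21, 19, 17, 17, 16, 15, 14]
after = [25, 28, 27, 26, 25, 23, 21, 19, 16, 19, 18, 17]

d = [float(b - a) for b, a in zip(before, after)]  # before minus after
n = len(d)
d_bar = mean(d)
se_d = math.sqrt(variance(d) / n)
t = d_bar / se_d

t_crit = 1.796           # one-tailed, .05 level, df = 11 (from the notes)
reject = t <= -t_crit    # lower-tailed rule
print(f"d_bar = {d_bar:.2f}, SEd = {se_d:.4f}, t = {t:.3f}, reject H0: {reject}")
```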


Exercises:

(1) A researcher is interested in determining whether typing speed is affected by the kind of typewriter
(electric versus manual) used. A group of student typists, equally experienced on both types of machines,
are randomly selected and are matched on the basis of their typing speed (error-free words per minute).
One group is then tested on an electric machine and the other group on a manual machine. Test H0 at the
1% significance level. The data are as follows:

Pair   Typing Speed   Electric   Manual
1      High           50         42
2      High           65         60
3      Middle         72         65
4      Middle         90         85
5      Middle         48         50
6      Low            62         60
7      Low            75         60
8      Low            50         51
9      Low            68         59

(a) What are the correct H0 and H1 in both written and symbolic form?
(b) What is (are) the critical value(s)?
(c) What is the obtained (calculated) t value?
(d) Did you reject or fail to reject H0?
(e) Write your conclusion as if explaining the results to non-statisticians.


(2) A psychologist wishes to look at the relationship between frustration and positive attitude. He
hypothesizes that frustration affects attitude. Students are given an "Attitude Toward Psychologists"
(ATP) instrument prior to taking their first exam in an introductory psychology course. After completing
the ATP instrument, the students are then administered their course exam. The teacher, a psychologist,
made the exam especially difficult in an attempt to frustrate his students. After completing the exam, all
students were asked to fill out the ATP instrument again. High scores on the ATP instrument indicate
more positive attitudes toward psychologists. The data are as follows:

Subject Before Exam Scores After Exam Scores

1 44 20
2 20 10
3 35 30
4 42 26
5 35 30
6 30 20
7 34 30
8 30 22
9 19 21
10 17 20
11 25 17
12 30 15
13 32 25
14 31 26
15 34 30
16 20 25
17 31 24
18 37 19
19 32 30
20 33 28
21 16 15

(a) What are the correct H0 and H1 in both written and symbolic form?
(b) What is (are) the critical value(s)?
(c) What is the obtained (calculated) t value?
(d) Did you reject or fail to reject H0?
(e) Did frustration influence the students' attitude toward psychologists? Write your conclusion as if
explaining the results to non-statisticians.

For additional examples, see chapter exercises in book and notes on course web page.

Version: 3/1/2012
