The t-test:  a simple hypothesis test for equality of two mean values

A hypothesis test that is frequently used in practice is the t-test, one of several "difference-of-means" tests.  In the t-test, two sample mean values, or a sample mean and a theoretical mean value, are compared as follows:

  1. the null hypothesis is that the two mean values are equal, while the alternative hypothesis is that the means are not equal (or that one is greater than or less than the other)
  2. the test statistic is the t-statistic
  3. the significance level or p-value is determined using the t-distribution
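
The three steps above can be sketched by hand and checked against t.test().  This is a minimal illustration, not the course's own example:  the data are simulated (the values, sample sizes and the names x and y are made up), and the unequal-variance (Welch) form of the statistic is used because that is what t.test() computes by default.

```r
# Simulated data for illustration only (not one of the course data sets)
set.seed(1)
x <- rnorm(20, mean = 10, sd = 2)
y <- rnorm(25, mean = 11, sd = 2)

# Step 1: H0: mean(x) == mean(y);  Ha: the means are not equal

# Step 2: the t-statistic = difference of means / standard error of the difference
vx <- var(x) / length(x)
vy <- var(y) / length(y)
t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation for the degrees of freedom
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Step 3: two-sided p-value from the t-distribution
p_value <- 2 * pt(-abs(t_stat), df)

# Compare the hand calculation with t.test()
res <- t.test(x, y)
c(by_hand = t_stat, t.test = unname(res$statistic))
c(by_hand = p_value, t.test = res$p.value)
```

The two columns printed for the statistic and for the p-value should agree, which shows that t.test() is doing nothing more than the three steps listed above.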

The t-test

Details of the t-test

Example data sets:  [ttestdat.csv] [foursamples.csv]

# t-tests
# (assumes ttestdat.csv has already been read into a data frame named ttestdat)
attach(ttestdat)
boxplot(Set1 ~ Group1)

# two- and one-tailed tests
t.test(Set1 ~ Group1)
t.test(Set1 ~ Group1, alternative = "less")
t.test(Set1 ~ Group1, alternative = "greater")

# a second example
boxplot(Set2 ~ Group2)
t.test(Set2 ~ Group2)

detach(ttestdat)
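
The examples above all compare two groups.  The other case mentioned earlier, a sample mean compared with a theoretical mean value, might look like the following sketch.  The sample obs and the theoretical mean mu0 are hypothetical values invented for illustration; only t.test() and its mu and alternative arguments are standard R.

```r
# Made-up sample and hypothetical theoretical mean (illustration only)
set.seed(2)
obs <- rnorm(15, mean = 5.5, sd = 1)
mu0 <- 5

# One-sample, two-tailed test of H0: the population mean equals mu0
one <- t.test(obs, mu = mu0)
one

# The same t-statistic by hand: (sample mean - mu0) / standard error of the mean
t_by_hand <- (mean(obs) - mu0) / (sd(obs) / sqrt(length(obs)))
c(by_hand = t_by_hand, t.test = unname(one$statistic))

# One-tailed versions of the same test; the two p-values sum to 1
p_less    <- t.test(obs, mu = mu0, alternative = "less")$p.value
p_greater <- t.test(obs, mu = mu0, alternative = "greater")$p.value
p_less + p_greater
```

Because the two one-tailed p-values are complementary, the choice of alternative should be made before looking at the data rather than picked to give the smaller value.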

Differences in group variances

One assumption that underlies the basic t-test is that the variances (or dispersions) of the two samples are equal.  A modification of the basic test (the Welch approximation, which R's t.test() applies by default) allows cases when the variances are not equal to be handled, but large differences in variability between the two groups can still affect the interpretability of the test results:  

Example data:  [foursamples.csv]

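# t-tests among groups with different variances
# (assumes foursamples.csv has already been read into a data frame named foursamples)
attach(foursamples)

# histograms with common bins and axis limits, so the four samples can be compared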
cutpts <- seq(0.0, 20.0, by=1)
par(mfrow=c(2,2))
hist(Sample1, breaks=cutpts, xlim=c(0,20))
hist(Sample2, breaks=cutpts, xlim=c(0,20))
hist(Sample3, breaks=cutpts, xlim=c(0,20))
hist(Sample4, breaks=cutpts, xlim=c(0,20))
par(mfrow=c(1,1))

boxplot(Sample1, Sample2, Sample3, Sample4)
mean(Sample1)-mean(Sample2)
t.test(Sample1, Sample2)

mean(Sample3)-mean(Sample4)
t.test(Sample3, Sample4)

mean(Sample1)-mean(Sample3)
t.test(Sample1, Sample3)

mean(Sample2)-mean(Sample4)
t.test(Sample2, Sample4)

detach(foursamples)
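
To see what the equal-variance assumption does to the result, the classical (pooled-variance) test can be run next to the default Welch test.  This is a sketch on simulated data, not on foursamples.csv:  the groups a and b, their sizes and their standard deviations are made up so that the variances clearly differ.

```r
# Simulated groups with the same mean but very different spread and size
set.seed(3)
a <- rnorm(10, mean = 10, sd = 1)   # small group, low variability
b <- rnorm(40, mean = 10, sd = 5)   # large group, high variability

welch  <- t.test(a, b)                     # default: var.equal = FALSE
pooled <- t.test(a, b, var.equal = TRUE)   # assumes equal variances

# Degrees of freedom: pooled uses n1 + n2 - 2, Welch uses an adjusted (smaller) value
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))

# The t-statistics and p-values can differ when the variances are unequal
c(welch = unname(welch$statistic), pooled = unname(pooled$statistic))
c(welch = welch$p.value, pooled = pooled$p.value)
```

When the two versions disagree noticeably, that is a signal that the difference in variability, and not only the difference in means, is influencing the test.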

There is a formal test for equality of group variances that will be described with analysis of variance.

The shape of the t distribution can be visualized as follows (for df=30):

# density of the t distribution with 30 degrees of freedom
x <- seq(-3, 3, by=.1)
pdf.t <- dt(x, df=30)
plot(pdf.t ~ x, type="l")
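
To see how the degrees of freedom change the shape, the sketch below overlays t densities for two illustrative values of df on a standard normal curve, and then lists the corresponding two-tailed 5% critical values.  The particular df values, line types and colors are arbitrary choices made for this illustration.

```r
# Compare t densities (df = 3 and df = 30) with the standard normal density
x <- seq(-4, 4, by = 0.1)
plot(x, dnorm(x), type = "l", lty = 2, ylab = "density")
lines(x, dt(x, df = 3))
lines(x, dt(x, df = 30), col = "gray40")
legend("topright", legend = c("normal", "t, df = 3", "t, df = 30"),
       lty = c(2, 1, 1), col = c("black", "black", "gray40"))

# Two-tailed 5% critical values: the heavier the tails, the larger the value
crit <- c(df3 = qt(0.975, df = 3),
          df30 = qt(0.975, df = 30),
          normal = qnorm(0.975))
crit
```

With few degrees of freedom the t distribution has heavier tails, so a larger t-statistic is needed to reach the same significance level; as df grows the curve approaches the normal distribution.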