Characterization of samples (e.g. standard error of the mean)

Once a sample has been obtained, and descriptive statistics calculated, attention may then turn to the significance (representativeness as opposed to unusualness) of the sample or of the statistics. This information may be gained by comparing the specific value of a statistic with an appropriate reference distribution, and by the calculation of additional statistics that describe the level of uncertainty a particular statistic may have.

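In the case of the sample mean, the appropriate reference distribution is the normal distribution, as implied by the central limit theorem: for sufficiently large samples, the sample mean is approximately normally distributed even when the underlying data are not.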
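
A minimal simulation sketch of this idea, using made-up data rather than the course data set: repeated samples are drawn from a clearly non-normal (exponential) population, and the distribution of their means is compared with the normal distribution. The sample size, number of samples, seed, and object names are arbitrary choices for illustration.

```r
# Simulated illustration (not the course data): means of samples from a
# skewed population are approximately normally distributed.
set.seed(42)                      # arbitrary seed, for reproducibility
n.samples <- 1000                 # number of repeated samples (assumed)
n.obs <- 50                       # observations per sample (assumed)

# each column holds one sample from an exponential population with mean 1
samples <- matrix(rexp(n.samples * n.obs, rate = 1), nrow = n.obs)
sample.means <- colMeans(samples)

# the means cluster around the population mean (1), with a spread close to
# the theoretical standard error, sigma/sqrt(n) = 1/sqrt(50)
mean(sample.means)
sd(sample.means)
1 / sqrt(n.obs)

# histogram of the means with the implied normal curve overlaid
hist(sample.means, breaks = 30, freq = FALSE, main = "Means of 1000 samples")
curve(dnorm(x, mean = 1, sd = 1 / sqrt(n.obs)), add = TRUE, col = "red")
```

The exponential population is used only because it is strongly skewed; any population with finite variance should show the same behavior as the sample size grows.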

Standard error of the mean and confidence interval for the mean

Uncertainty in the mean can be described by the standard error of the mean or by the confidence interval for the mean. The standard error of the mean can be thought of as the standard deviation of a set of mean values from repeated samples. The confidence interval provides a verbal or graphical characterization, based on the information in a sample, of the likely range of values within which the "true" or population mean lies. [cidat.csv]
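
As a sketch of how these two quantities are computed for a single sample, the following uses simulated values (not cidat.csv); the object names x, se.mean, t.crit, and ci are illustrative. The interval uses the t distribution with n - 1 degrees of freedom, for which a multiplier of about 2 is a common approximation at the 95% level.

```r
set.seed(1)                         # arbitrary seed, for reproducibility
x <- rnorm(30, mean = 10, sd = 1)   # simulated sample (assumed population mean 10)

n <- length(x)
se.mean <- sd(x) / sqrt(n)          # standard error of the mean

# 95% confidence interval: mean +/- t * SE
t.crit <- qt(0.975, df = n - 1)
ci <- mean(x) + c(-1, 1) * t.crit * se.mean
ci

# the same interval as reported by t.test()
t.test(x)$conf.int
```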

definitions

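# confidence intervals
# (assumes cidat.csv has already been read into a data frame named cidat,
# e.g. cidat <- read.csv("cidat.csv"), with the path adjusted as needed)
attach(cidat)
summary(cidat)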

# get the mean, standard deviation, and number of points for each group,
# then the standard error of the mean (sd / sqrt(n))
group.means <- tapply(NormDat, Group, mean)
group.sd <- tapply(NormDat, Group, sd)
group.npts <- tapply(NormDat, Group, length)
group.semean <- group.sd / sqrt(group.npts)

# plot means and data
par(mfrow=c(2,1))

plot(Group, NormDat)
points(group.means, col="red", pch=16)

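# mean and standard deviation of the group means; the standard deviation
# can be compared with the standard errors in group.semean
mean(group.means)
sd(group.means)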

# plot means with approximate 95% confidence limits (mean +/- 2 standard errors)
plot(group.means, ylim=c(9, 11), col="red", pch=16)
points(group.means + 2.0*group.semean, pch="-")
points(group.means - 2.0*group.semean, pch="-")
# horizontal reference line at 10
abline(h=10)

par(mfrow=c(1,1))
detach(cidat)

Simple inferences based on the standard error of the mean

The standard error of the mean, along with the knowledge that the sample mean is normally distributed, allows inferences about the mean to be made.
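
One way to sketch such an inference, again with simulated values rather than the course data: the difference between the sample mean and a hypothesized population mean is expressed in standard-error units and referred to a reference distribution. The hypothesized value mu0 and the sample itself are assumptions made for illustration; the t distribution is used here because the population standard deviation is estimated from the sample.

```r
set.seed(2)                          # arbitrary seed, for reproducibility
x <- rnorm(25, mean = 10.5, sd = 1)  # simulated sample
mu0 <- 10                            # hypothesized population mean (assumed)

se.mean <- sd(x) / sqrt(length(x))
t.stat <- (mean(x) - mu0) / se.mean  # distance from mu0 in standard errors

# two-sided p-value from the t distribution with n - 1 degrees of freedom
p.value <- 2 * pt(-abs(t.stat), df = length(x) - 1)
c(t = t.stat, p = p.value)

# t.test() carries out the same calculation
t.test(x, mu = mu0)
```

A small p-value indicates that a sample mean this far from mu0 would be unusual if mu0 were the true population mean.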

simple inferential statistics