SE of Average

The Standard Error of the Average

Class experiment: Close your eyes and count off 30 seconds, timing yourself with an online stopwatch, to see how accurate your count is.
An experiment consists of one person repeating this 30-second count five times and recording the actual time shown by the stopwatch for each count.
The average of these five timings is 32.28 seconds. For this experiment, we can compare this average to the true value of 30. However, for most experiments, we don't know the true value, so we would like a way to estimate the accuracy of our average x̄ = 32.28.
timings30.xls (Dataset 4 on the Datasets Page) contains the results from repeating this experiment 8 times (Batches A through H).
The standard error of the average, abbreviated SE_ave, is an estimate of the accuracy of the average of an experiment.
We discuss two methods for estimating SE_ave.

Method 1, the long method, replicates the original experiment. timings30.xls contains the results of repeating the experiment 7 more times, for 8 batches in all. The individual averages for all 8 batches are listed in timings30.xls.
Now compute the standard deviation SD⁺ of these averages: 1.52.
This gives the average timing with its error estimate: 32.28±1.52.
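The replication approach can be sketched in Python as follows. This is only an illustration: the function name `method1_se` and the example timings are hypothetical and are not the values in timings30.xls. It assumes SD⁺ means the standard deviation with the n - 1 divisor, which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

def method1_se(batches):
    """Return the batch averages and their SD+ (n - 1 divisor).

    The SD+ of the averages is the replication-based estimate of SE_ave.
    """
    averages = [mean(batch) for batch in batches]
    return averages, stdev(averages)

# Hypothetical timings in seconds, invented for illustration only.
example_batches = [
    [31.2, 29.8, 33.1, 30.5, 32.4],
    [28.9, 30.2, 29.5, 31.0, 30.4],
    [33.0, 34.1, 31.8, 32.6, 33.5],
]

averages, se_ave = method1_se(example_batches)
print("batch averages:", [round(a, 2) for a in averages])
print("SE_ave from replications:", round(se_ave, 3))
```

With the real data, each of the 8 batches from the spreadsheet would take the place of one inner list.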
Remarks:
1. To use Method 1, several replications of the original experiment are required. This is usually expensive.
2. Compute the SD⁺s of the 8 batches and compare them with the SD⁺ of the averages, 1.52.
3. The individual observations have more variability than the average of the measurements, so we would normally expect the SD⁺ of the averages, 1.52, to be smaller than the SD⁺s of the individual batches.
4. One reason for the variability in SD⁺ from batch to batch is that each batch of timings is recorded by a different person. Some persons were more accurate in counting off 30 seconds than others.

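Method 2, the short method, uses a remarkable formula for the standard error of the average:
SE_ave = SD⁺ / √n
where SD⁺ is the standard deviation of the n measurements in a single experiment.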
The remarkable part is that extra replications of the experiment are not required.
SE_ave indicates how far the average of n measurements can be expected to vary from the true value if the experiment were repeated, assuming that the measurements are unbiased and that the SD remains the same for each experiment.
Again, consider the n = 5 timings from Batch A, which have average = 32.28 and SD⁺ = 4.594. To estimate the standard error of the average, use the formula:
SE_ave = SD⁺ / √n = 4.594 / √5 = 2.054
Compare this value 2.054 from Method 2 with the value 1.524 obtained from Method 1.
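The single-batch calculation can be checked with a short Python sketch. It assumes the formula is SD⁺ divided by the square root of n, which reproduces the 2.054 quoted above for SD⁺ = 4.594 and n = 5. The function names `se_of_average` and `se_from_timings` are illustrative, not part of the course materials.

```python
from math import sqrt
from statistics import stdev

def se_of_average(sd_plus, n):
    """Standard error of the average from one batch: SD+ / sqrt(n)."""
    return sd_plus / sqrt(n)

def se_from_timings(timings):
    """Same estimate computed directly from raw timings.

    statistics.stdev uses the n - 1 divisor, assumed here to match SD+.
    """
    return se_of_average(stdev(timings), len(timings))

# Batch A summary values taken from the text: SD+ = 4.594, n = 5.
print(round(se_of_average(4.594, 5), 3))  # 2.054
```

Given the five raw timings from Batch A, `se_from_timings` would give the same estimate without computing SD⁺ separately.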
Method 1 is often more accurate, but Method 2 is easier because extra replications of the experiment are not needed.
Method 2 is usually accurate enough for most data analyses.
Note that we were unlucky in our computation of SE_ave. The SD⁺ of Batch A is 4.594, whereas it is much smaller for most of the other batches. This inflates the value of SE_ave = 2.054.
Even if the data in an experiment are not normally distributed, if the sample size n is large enough, the sample averages from replicated experiments will be approximately normally distributed.
Usually statisticians take "large enough" to be about n = 25.
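A small simulation can illustrate this behavior. The sketch below is not from the course materials: it assumes exponentially distributed data (strongly skewed, with mean 1 and SD 1) as a stand-in for non-normal measurements, and the helper names `simulate_sample_means` and `skewness` are hypothetical. With n = 25, the SD of the simulated averages should come out near 1/√25 = 0.2, and their skewness should be much smaller than the value of 2 that holds for the raw exponential data.

```python
import random
from statistics import mean, pstdev, stdev

def simulate_sample_means(n, replications, seed=0):
    """Draw `replications` samples of size n from an exponential
    distribution (mean 1, SD 1) and return each sample's average."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric curve)."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

means = simulate_sample_means(n=25, replications=10000)
print("SD of sample averages:", round(stdev(means), 3))
print("skewness of sample averages:", round(skewness(means), 3))
```

Rerunning with a smaller n, such as 5, should show noticeably more skew in the averages, which is one way to see why "large enough" matters.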