Class experiment:
Close your eyes and count off 30 seconds while timing yourself with this
online stopwatch, to see how accurate your count is.
An experiment consists of the same person repeating this 30-second count
five times and recording the actual times shown by the stopwatch.
Here are the timings, in seconds, from one such experiment (Batch A):
28.8
40.3
29.9
31.2
31.2
The average of these five timings is 32.28 seconds. For this experiment, we
can compare this average to the true value of 30. However, for most
experiments we don't know the true value, so we would like a way to
estimate the accuracy of our average x̄ = 32.28.
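The arithmetic above can be checked with a short Python sketch. The variable names (batch_a, true_value) are chosen here for illustration and are not part of the original notes.

```python
from statistics import mean

# Batch A timings in seconds, copied from the list above.
batch_a = [28.8, 40.3, 29.9, 31.2, 31.2]

# The true value is known here only because the target count was 30 seconds.
true_value = 30.0

average = mean(batch_a)
error = average - true_value

print(f"average        = {average:.2f}")  # 32.28
print(f"average - true = {error:.2f}")    # 2.28
```

In most experiments the second line cannot be computed, which is why an estimate such as the standard error is needed.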
timings30.xls
(Dataset 4 on the Datasets Page)
contains the results from repeating this experiment 8 times
(Batches A through H).
The standard error of the average, abbreviated
SEave, is an estimate of the accuracy
of the average of an experiment.
We discuss two methods for estimating SEave.
Method 1: Long Method
timings30.xls contains
the results of repeating the experiment 7 more times.
Here are the individual averages for all 8 batches:
Now compute the standard deviation SD+ of these averages: 1.52.
This gives the average timing with its error estimate: 32.28±1.52.
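The Method 1 calculation can be sketched in Python as follows. The batches used here are simulated stand-ins (normal data with an assumed mean of 30 and SD of 3), not the class timings from timings30.xls, and the function name se_long_method is hypothetical.

```python
import random
from statistics import mean, stdev


def se_long_method(batches):
    """Method 1: SD+ (n - 1 divisor) of the batch averages."""
    averages = [mean(batch) for batch in batches]
    return stdev(averages)


# Simulated stand-in data, NOT the class timings: 8 batches of 5 counts,
# drawn from a normal distribution with assumed mean 30 and SD 3.
rng = random.Random(0)
batches = [[rng.gauss(30, 3) for _ in range(5)] for _ in range(8)]

se = se_long_method(batches)
print(f"SD+ of the 8 batch averages: {se:.3f}")
```

To reproduce the value quoted above, replace the simulated batches with the eight columns of timings from the spreadsheet.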
Remarks:
To use Method 1, several replications of the original
experiment are required. This is usually expensive.
Compute the SD+s of the 8 batches:
4.594 1.967 1.883 1.165
5.009 1.265 3.196 5.227
and compare them with the SD+ of the averages, 1.524.
The individual observations have more variability
than the average of the measurements, so we would normally expect
the SD+ of the averages to be smaller than the
SD+s of the individual batches.
One reason for the variability in
SD+ from batch to batch is that
each batch of timings was recorded by a different person. Some people
were more accurate in counting off 30 seconds than others.
Method 2: Short Method
The short method uses a remarkable formula for the
standard error of the average:
SEave = SDx / √n
The remarkable part is that
extra replications of the experiment are not required.
SEave indicates how far the average of n measurements
can be expected to vary from the true value if the experiment
were repeated, assuming that the measurements
are unbiased and that the SE remains the same for each experiment.
Again, here are the timings from Batch A:
28.8
40.3
29.9
31.2
31.2
with the average = 32.28 and SD+ = 4.594. To estimate
the standard error of the average, use the formula:
SEave = SDx / √n = 4.594 / √5 = 2.055
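The same calculation can be sketched in Python. The function name se_short_method is hypothetical; statistics.stdev uses the n - 1 divisor, which matches SD+ as used in these notes.

```python
from math import sqrt
from statistics import mean, stdev


def se_short_method(timings):
    """Method 2: SD+ of one batch divided by the square root of its size."""
    return stdev(timings) / sqrt(len(timings))


# Batch A timings in seconds.
batch_a = [28.8, 40.3, 29.9, 31.2, 31.2]

print(f"average = {mean(batch_a):.2f}")             # 32.28
print(f"SD+     = {stdev(batch_a):.3f}")            # 4.594
print(f"SEave   = {se_short_method(batch_a):.3f}")  # 2.055
```

Only the five timings of a single batch are needed, which is what makes this method short.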
Compare this value, 2.055, from Method 2 with the value 1.524
obtained from Method 1.
Method 1 is often more accurate, but Method 2 is easier because extra
replications of the experiment are not needed.
Method 2 is usually accurate enough for most data analyses.
Note that we were unlucky in our computation of
SEave. The SD+ of Batch A
is 4.594, whereas it is much smaller for most of the other batches.
This inflates the value of SEave = 2.055.
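To see how much the choice of batch matters, the following Python sketch applies the Method 2 formula to each of the eight batch SD+ values listed in the remarks under Method 1. Only the first value (4.594) is identified in the notes as Batch A; the sketch makes no assumption about which batch the other values belong to.

```python
from math import sqrt

# SD+ values for the 8 batches, as listed in the Method 1 remarks.
# The first value is Batch A; labels for the others are not assumed.
batch_sds = [4.594, 1.967, 1.883, 1.165, 5.009, 1.265, 3.196, 5.227]
n = 5  # timings per batch

se_values = [sd / sqrt(n) for sd in batch_sds]
for sd, se in zip(batch_sds, se_values):
    print(f"SD+ = {sd:.3f}  ->  SEave = {se:.3f}")

# Count the batches whose SD+ is smaller than Batch A's.
smaller = sum(sd < batch_sds[0] for sd in batch_sds)
print(f"{smaller} of the other 7 batches have a smaller SD+ than Batch A")  # 5
```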
Even if the data in an experiment are not normally distributed,
if the sample size n is large enough, the sample means
of the replicated experiments will be approximately
normally distributed.
Statisticians usually take "large enough" to be about n = 25.
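A small simulation can illustrate this claim. The sketch below is an assumption-laden example, not part of the class data: it draws strongly skewed exponential data (mean 1, SD 1), takes samples of size n = 25, and checks that the sample means center on 1 with a spread close to SD / √n = 0.2, and that roughly 68% of them fall within one standard error of the true mean, as a normal curve would predict.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 25                  # sample size per replicated experiment
replications = 5000     # number of simulated experiments

# Exponential data with mean 1 and SD 1: skewed, far from normal.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(replications)
]

center = mean(sample_means)
spread = stdev(sample_means)
predicted_se = 1.0 / sqrt(n)  # SD / sqrt(n) = 0.2

# Fraction of sample means within one standard error of the true mean.
within_one_se = sum(abs(m - 1.0) <= predicted_se for m in sample_means) / replications

print(f"mean of sample means = {center:.3f} (true mean 1)")
print(f"SD of sample means   = {spread:.3f} (predicted {predicted_se:.3f})")
print(f"fraction within 1 SE = {within_one_se:.3f} (normal curve: about 0.68)")
```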