Sampling Animation

View the animation

This animation allows you to create a new sample with the push of a button. In some real-world studies you might take only one sample. However, in order to understand the meaning of the sample results, you must understand the possible outcomes of the sampling process.

In order to understand the statement, "There is a 30% chance of rain tomorrow," it is necessary to consider lots of "tomorrows". More precisely, we must consider lots of days with meteorological conditions identical to those anticipated for tomorrow. Then we would expect it to rain on about 30 out of every 100 such days.

The circumstances of sampling are similar. If we take two different samples, we would not expect to get the same results twice. Similarly, if we were to take a sample and get an average of 59.2 inches, we would not automatically assume that the average of the population was exactly 59.2 inches.

On the other hand, 59.2 inches is probably close to correct. Describing the concept of "close" is where "lots of samples" comes into play.

The Animation

I usually present the population as a barrel of numbers and the process of sampling as dipping into that barrel with a measuring cup. The animation depicts this action. A cup dips into a barrel of numbers and then "pours" thirty numbers into an array. Then the mean and standard deviation of the numbers are computed and displayed below the array. A histogram of the data is also created and displayed above the array.

The controls are simple. There is one button, labeled "New Sample"; click it whenever you want to see a new sample.

The Sampling Process

The distribution in this sampling animation is the same as the distribution of IQ scores. For the purpose of intuition, I will speak of the data as if they were IQ scores.

In a sample we would expect to occasionally see scores of over 130. Of course, we would also see some low scores. When we take the average of the sample, we would expect these extremes to be somewhat "balanced" and to average out. Thus, averages of samples are likely to be much closer to 100, the population mean, than individual scores are. After all, a sample mean of 130 would be like having an entire class of geniuses.

What kind of sample means should we expect? The average of these sample averages will again be 100, but how far away from 100 might we expect to find them? Do you remember the quantity that measures "how far" from the mean one should expect to find values? The standard deviation measures how far values vary from the mean, on the average. You may observe that we already have a population standard deviation of 15. That figure refers to individuals, which is like looking at samples of size one. This animation generates samples of size thirty.
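The dip of the cup can be imitated in a few lines of Python. This is only a minimal sketch, not the animation's own code: it assumes the barrel holds normally distributed scores with mean 100 and standard deviation 15, and the names draw_sample, POP_MEAN, POP_SD and SAMPLE_SIZE are made up for the illustration. I also do not know whether the animation divides by n or by n - 1 when it computes the standard deviation; the sketch uses n - 1.

```python
import random
import statistics

POP_MEAN = 100     # population mean used in the text
POP_SD = 15        # population standard deviation used in the text
SAMPLE_SIZE = 30   # the cup "pours" thirty numbers into the array


def draw_sample(rng, n=SAMPLE_SIZE):
    """Return n simulated scores (hypothetical helper; assumes a normal population)."""
    return [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]


rng = random.Random(1)  # fixed seed so the run can be repeated
sample = draw_sample(rng)

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

print(f"mean = {sample_mean:.2f}, standard deviation = {sample_sd:.2f}")
```

Changing the seed plays the role of clicking the button again: each seed gives a different sample and a different mean.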

You will learn in class a surprisingly simple formula for the standard deviation of the means of samples. All that you have to do is divide the population standard deviation by the square root of the sample size. For a sample size of 30, we divide 15 by the square root of 30, which is 5.47723. The result is 2.73861. Thus we would expect to find the means of about 95% of such samples within two standard deviations of the population mean, that is, within 2 x 2.73861 = 5.47723 of 100, or between 94.52277 and 105.47723.
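The arithmetic is easy to check. The short Python sketch below simply repeats the calculation with the numbers from the text; the variable names are my own choice for the illustration.

```python
import math

pop_mean = 100  # population mean
pop_sd = 15     # population standard deviation
n = 30          # sample size

# Standard deviation of the sample means: population SD / sqrt(sample size)
standard_error = pop_sd / math.sqrt(n)

# About 95% of sample means fall within two such standard deviations of 100
low = pop_mean - 2 * standard_error
high = pop_mean + 2 * standard_error

print(f"square root of n   = {math.sqrt(n):.5f}")     # 5.47723
print(f"SD of sample means = {standard_error:.5f}")   # 2.73861
print(f"interval           = {low:.5f} to {high:.5f}")  # 94.52277 to 105.47723
```

With a sample size of 30 the half-width of the interval happens to equal the square root of 30, because 2 x 15 / sqrt(30) = 30 / sqrt(30) = sqrt(30). That is a coincidence of these particular numbers, not a general rule.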

In many studies it is not possible to consider a lot of samples, but in a simulation like this it is very easy. Therefore, we will test these numbers with a web exercise.

Web Exercise

We will analyze the results of twenty simulated samples. When you start the animation, a sample is taken. Write down the mean of the sample. Click "New Sample" nineteen more times, writing down the sample mean each time. When you have twenty means, create a histogram of those means. Also, find the standard deviation of those means.
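If you would like to check your hand-recorded results against a computer run, the exercise can be imitated in Python. This is a sketch under the same assumption as before, a normal population with mean 100 and standard deviation 15, and it is not the animation's own code. The helper name one_sample_mean, the seed, and the bin width of 2 are arbitrary choices for the illustration.

```python
import math
import random
import statistics
from collections import Counter

rng = random.Random(2024)  # arbitrary seed; change it to get a different run


def one_sample_mean(rng, n=30, mu=100, sigma=15):
    """Draw one simulated sample of size n and return its mean (hypothetical helper)."""
    return statistics.mean([rng.gauss(mu, sigma) for _ in range(n)])


# Twenty samples, one mean recorded per sample
means = [one_sample_mean(rng) for _ in range(20)]

# Standard deviation of the twenty means; compare it with 15 / sqrt(30)
sd_of_means = statistics.stdev(means)
print(f"smallest mean = {min(means):.2f}, largest mean = {max(means):.2f}")
print(f"standard deviation of the means = {sd_of_means:.2f}")

# Crude text histogram with bins 2 units wide, labeled by their left edge
bins = Counter(2 * math.floor(m / 2) for m in means)
for left in sorted(bins):
    print(f"{left:3d} to {left + 2:3d} | {'*' * bins[left]}")
```

With only twenty means, the standard deviation you get will bounce around from run to run, so do not expect it to match 2.73861 exactly.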

Naturally, I have done this as well. The smallest sample mean that I got among the twenty was 92.72 and the largest was 105.73. I used a variation of our histogram program to produce a histogram of these means. It is displayed below.
