

The calculations are similar but not identical. The procedure to calculate the standard deviation depends on whether the numbers are the entire population or are data from a sample. For sample data, in symbols, a deviation is x – x ¯ x ¯. If the numbers belong to a population, in symbols, a deviation is x – μ. The deviations are used to calculate the standard deviation. In a data set, there are as many deviations as there are items in the data set. If x is a number, then the difference x – mean is called its deviation. The symbol x ¯ x is the sample mean, and the Greek symbol The lowercase letter s represents the sample standard deviation and the Greek letter σ (lower case) represents the population standard deviation. Population: x = μ + ( # o f S T D E V ) ( σ ).Sample: x = x ¯ + ( # o f S T D E V ) ( s ) x = x ¯ + ( # o f S T D E V ) ( s ).The equation value = mean + (#ofSTDEVs)(standard deviation) can be expressed for a sample and for a population as follows: One is two standard deviations less than the mean of five because 1 = 5 + (–2)(2).#ofSTDEV does not need to be an integer.where #ofSTDEVs = the number of standard deviations.In general, a value = mean + (#ofSTDEV)(standard deviation).If one were also part of the data set, then one is two standard deviations to the left of five because 5 + (–2)(2) = 1. We say, then, that seven is one standard deviation to the right of five because 5 + (1)(2) = 7. If we were to put five and seven on a number line, seven is to the right of five. The number line may help you understand standard deviation. You will learn more about this in later chapters. In general, the shape of the distribution of the data affects how much of the data is farther away than two standard deviations. Considering data to be far from the mean if they are more than two standard deviations away is more of an approximate rule of thumb than a rigid rule. The z-score of −2 tells us that Binh’s wait time is two standard deviations below the mean wait time of five minutes.Ī data value that is two standard deviations from the average is just on the borderline for what many statisticians would consider to be far from the average.

We can use the given information to create the table below. It tells us how many standard deviations a data value is from the mean and is calculated as the ratio of the difference in a particular score and the population mean to the population standard deviation. A z-score is a standardized score that lets us compare data sets. The standard deviation can be used to determine whether a data value is close to or far from the mean. At Supermarket A, the mean waiting time is five minutes, and the standard deviation is two minutes. Rosa waits at the checkout counter for seven minutes, and Binh waits for one minute. Suppose that both Rosa and Binh shop at Supermarket A. Overall, wait times at Supermarket B are more spread out from the average whereas wait times at Supermarket A are more concentrated near the average.

At Supermarket A, the standard deviation for the wait time is two minutes at Supermarket B, the standard deviation for the wait time is four minutes.īecause Supermarket B has a higher standard deviation, we know that there is more variation in the wait times at Supermarket B. The average wait time at both supermarkets is five minutes. Suppose that we are studying the amount of time customers wait in line at the checkout at Supermarket A and Supermarket B. The standard deviation is larger when the data values are more spread out from the mean, exhibiting more variation. The standard deviation is small when all the data are concentrated close to the mean, exhibiting little variation or spread. The standard deviation is always positive or zero. The standard deviation provides a measure of the overall variation in a data set.

In some data sets, the data values are concentrated closely near the mean in other data sets, the data values are more widely spread out from the mean. An important characteristic of any set of data is the variation in the data.
