


Biostatistics for the Clinician


University of Texas-Houston
Health Science Center

Lesson 1.4

Variability

Lesson 1: Summary Measures of Data


1.4 Variability

1.4.1 Why Important

Why do you need to know about measures of variability? You need to understand how simple measures can summarize the degree to which data values are spread out in a distribution. Measures of variability, like measures of central tendency, appear very frequently in the medical research literature. You cannot understand, let alone critically evaluate, medical research studies unless you understand the appropriate use of such measures.

Variability
Practice
Exercise 1:
You need to understand the measures of variability to:

Evaluate medical research studies
Know biostatistical vocabulary
Compute statistics
None of the above


Let's look at the concept of variability. It is a fairly simple concept: variability describes how scattered, dispersed, or spread out the data are. It has to do with the width of a distribution. Other things being equal, the wider the distribution, the greater the variability (see Figure 2.5 below).

Figure 2.5 Normal Distribution

Variability, then, refers to how dispersed the data values are or, from another point of view, how wide the distribution is when it is graphed. If all data values are the same, there is zero variability, and the graph of the distribution has zero width. If the values all lie very close to each other, there is little variability and the graph is quite narrow. If the values are widely spread, there is more variability and the graph is wider.
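
The three cases can be illustrated with a minimal sketch. The samples and the helper name `width` are hypothetical, chosen so that all three samples share the same mean and differ only in spread.

```python
# Three hypothetical samples with the same mean (70) but different spread.
identical = [70, 70, 70, 70, 70]   # all values the same
narrow = [68, 69, 70, 71, 72]      # values close together
wide = [50, 60, 70, 80, 90]        # values spread far apart

def width(values):
    """Distance between the largest and smallest value (hypothetical helper)."""
    return max(values) - min(values)

for name, sample in [("identical", identical), ("narrow", narrow), ("wide", wide)]:
    mean = sum(sample) / len(sample)
    print(f"{name:>9}: mean = {mean}, width = {width(sample)}")
```

The means are identical, so a measure of central tendency alone cannot tell these samples apart; only the width differs.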

Variability
Practice
Exercise 2:
Variability has to do with the:

Height of a distribution
Width of a distribution
Curvature of a distribution
Lopsidedness of a distribution


Again, as was the case with measures of central tendency, there are many measures of variability. In the medical research literature, some of the most frequently used are the standard deviation, the interquartile range, and the range (see Figure 2.5).

Variability
Practice
Exercise 3:
Variability is measured by:

The standard deviation
The interquartile range
The range
All of the above



1.4.2 Standard Deviation

One way to measure the spread of data is the standard deviation. Roughly speaking, it is the typical distance of the values from the mean (see the standard deviation formula below).

Standard Deviation Formula
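
Written out in symbols, the computation described in this section can be sketched as follows. The symbols are assumptions chosen for illustration: x_i stands for the individual values, \mu for their mean, N for the number of values, and \sigma for the standard deviation.

```latex
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
```

This version divides by N, the number of differences, as the text describes. Many statistical packages divide by N - 1 instead when the data are a sample drawn from a larger population.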

To get the standard deviation, as the formula shows, first you subtract the mean from each value and square the resulting difference. Then you sum those squared differences and divide the sum by the number of differences. Finally, you take the square root of that quotient. The reason for squaring is straightforward: whether a value lies above or below the mean, its squared difference from the mean is positive, so the sign of the difference no longer matters. Without squaring, the positive and negative differences would cancel each other out (the differences from the mean always sum to zero). Dividing by the number of values gives an average, but an average of squared measures. Taking the square root returns the result to the original, more intuitive units, such as feet or cubic inches. The standard deviation can therefore be thought of, loosely, as the average distance of the values from the mean of the distribution (see the standard deviation formula above).
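
The steps can be followed one at a time in a short sketch. The blood pressure readings are hypothetical, and the code divides by the number of values, as the text describes, rather than by one less than that number.

```python
import math

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
values = [110, 120, 130, 140, 150]

mean = sum(values) / len(values)            # 130.0
deviations = [x - mean for x in values]     # signed distances from the mean
squared = [d ** 2 for d in deviations]      # squaring removes the sign
mean_square = sum(squared) / len(squared)   # divide by the number of differences
std_dev = math.sqrt(mean_square)            # square root returns to mmHg

print("sum of unsquared deviations:", sum(deviations))  # they cancel out
print("standard deviation:", round(std_dev, 2))
```

The unsquared deviations sum to zero, which is why they cannot be averaged directly.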

Variability
Practice
Exercise 4:
The standard deviation measures:

Average distance from the mean
Distance between top and bottom values
Distance between 1st and 3rd quartiles
None of the above


Of course, given the formula, to compute a standard deviation you must be able to compute a meaningful mean. Consequently, computation of the standard deviation requires interval or ratio variables. Furthermore, in a distribution having a bell (normal) curve, once you know the standard deviation you also know that approximately 68% of the values lie within 1 standard deviation of the mean. You also know that approximately 2.1% of the values lie in each tail of the distribution between 2 and 3 standard deviations from the mean, and only about 0.1% lie beyond 3 standard deviations, so roughly 2.3% lie in each tail beyond 2 standard deviations (again see Figure 2.5).
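
The percentages for a normal curve can be checked numerically. This is a minimal sketch using the error function from Python's standard library; the function names are hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def fraction_in_each_tail_beyond(k):
    """Fraction in one tail, more than k standard deviations from the mean."""
    return (1 - fraction_within(k)) / 2

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}, "
          f"each tail beyond {k} SD: {fraction_in_each_tail_beyond(k):.2%}")

# Fraction in one tail lying between 2 and 3 standard deviations from the mean.
between_2_and_3 = fraction_in_each_tail_beyond(2) - fraction_in_each_tail_beyond(3)
print(f"each tail between 2 and 3 SD: {between_2_and_3:.2%}")
```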

Variability
Practice
Exercise 5:
In a normal distribution the percentage of scores within 1 standard deviation of the mean is approximately:

2.1%
5%
68%
95.8%



1.4.3 Interquartile Range

Recall that the median is the point in the distribution with 50% of the sample below it and 50% above it. In other words, the median is at the 50th percentile. Quartiles are defined in the same way: the 1st quartile is at the 25th percentile, the 2nd quartile is at the 50th percentile (the median), the 3rd quartile is at the 75th percentile, and the 4th quartile is at the 100th percentile.

The interquartile range extends from the 25th percentile to the 75th percentile, so it contains the middle 50% of the values in the sample. Its size, the distance between the 25th and 75th percentiles, is another measure of variability. Unlike the standard deviation, it can be appropriately applied to ordinal variables, so it is used especially in conjunction with nonparametric statistics (see the interquartile range in the figure below).
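
As a minimal sketch, the quartiles and the interquartile range can be computed with `statistics.quantiles` from Python's standard library. The lengths of stay are hypothetical, and different software uses slightly different interpolation rules for percentiles, so quartile values can differ a little between packages.

```python
import statistics

# Hypothetical lengths of hospital stay in days, for illustration only.
stays = [2, 3, 3, 4, 5, 6, 7, 8, 9, 12, 30]

# n=4 asks for the three cut points that divide the data into quarters.
q1, q2, q3 = statistics.quantiles(stays, n=4)
iqr = q3 - q1

print("1st quartile (25th percentile):", q1)
print("2nd quartile (median):", q2)
print("3rd quartile (75th percentile):", q3)
print("interquartile range:", iqr)
```

The single long stay of 30 days has no effect on the interquartile range, because only the positions of the 25th and 75th percentiles matter.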

Ranges

Exploratory data analysis offers another way to display data: rank the data from low to high, then find the median and the quartile values, between which one half of the data lies. You can then draw a box plot whose box contains that middle half of the data (see the figure below). The rest of the data lies outside the box, out in the wings. The box spans the interquartile range, the values between the lower and upper quartiles. You'll see more explicit clinical medicine examples of this in Lesson 1.5.
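
The numbers a box plot is built from can be gathered in a few lines. This is a sketch under stated assumptions: the ages are hypothetical, `five_number_summary` is a hypothetical helper name, and the quartile rule is the default of `statistics.quantiles`, which may differ slightly from other software.

```python
import statistics
from collections import namedtuple

Summary = namedtuple("Summary", "minimum q1 median q3 maximum")

def five_number_summary(values):
    """Rank the data, then return the values that define a box plot."""
    ranked = sorted(values)
    q1, median, q3 = statistics.quantiles(ranked, n=4)
    return Summary(ranked[0], q1, median, q3, ranked[-1])

# Hypothetical patient ages in years, for illustration only.
ages = [37, 21, 52, 30, 75, 25, 45, 28, 60, 34, 41]

summary = five_number_summary(ages)
print(summary)
print("box spans", summary.q1, "to", summary.q3)
print("wings reach down to", summary.minimum, "and up to", summary.maximum)
```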

Exploratory Data Analysis

Variability
Practice
Exercise 6:
Is it appropriate to compute the standard deviation when the data consists of rankings?

Yes
No



1.4.4 Range

The range is simply the difference between the highest and lowest values in the sample (see the figure below). It is simple to compute and to understand. Unfortunately, it is particularly sensitive to extreme scores and, at the same time, insensitive to how the values vary between those extremes. Still, you come across it fairly frequently in the literature.
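
Both weaknesses show up in a small sketch. The samples and the helper name `value_range` are hypothetical.

```python
def value_range(values):
    """Difference between the highest and lowest value (hypothetical helper)."""
    return max(values) - min(values)

# Hypothetical samples, for illustration only.
evenly_spread = [10, 20, 30, 40, 50]
clustered = [10, 30, 30, 30, 50]       # same extremes, different middle
with_outlier = [10, 20, 30, 40, 500]   # one extreme score

print("evenly spread:", value_range(evenly_spread))
print("clustered:", value_range(clustered))
print("with outlier:", value_range(with_outlier))
```

The first two samples have the same range even though their middle values differ, while a single extreme score in the third sample inflates the range more than tenfold.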

Ranges

Variability
Practice
Exercise 7:
The range measures:

Average distance from the mean
Distance between maximum and minimum values
Distance between 1st and 3rd quartiles
None of the above




End Lesson 1.4
Variability

