NUMBAT OER - Open Educational Resources

1. Introduction

Besides arriving at an appropriate expression of an average or consensus value for the observations in a sample, it is important to quantify the amount of variation within that sample, commonly called either 'variability' or 'dispersion'. Consider three datasets:

Set 1 3.0 4.2 4.4 4.5 4.7 4.9 5.1
Set 2 2.3 2.5 3.4 4.5 6.7 7.4 9.3
Set 3 2.3 3.4 6.7 6.9 7.4 8.1 9.3

You will see that Sets 1 and 2 have identical median values of 4.5 (the fourth value in each row). However, the lowest value in Set 1 is 3.0 and the highest value is 5.1 (a difference of 2.1), whilst for Set 2 the corresponding values are 2.3 and 9.3 (a difference of 7.0). Set 2 is therefore more variable than Set 1, even though the median value is the same for each.

Set 3 has the same minimum and maximum values as Set 2, so by this simple measure the variability of the two datasets is the same. However, their median values differ (4.5 for Set 2 and 6.9 for Set 3). If we were comparing the two datasets, we would probably conclude that the variability within each is too large for the difference in median values to be considered meaningful.
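The comparisons above can be checked with a few lines of code. The following is a minimal sketch in Python, using only the standard library and the values exactly as listed in the three datasets. The names `datasets`, `summarise` and `summary` are illustrative choices made for this example and are not part of the helpsheet.

```python
from statistics import median

# Values copied from the three example datasets listed above.
datasets = {
    "Set 1": [3.0, 4.2, 4.4, 4.5, 4.7, 4.9, 5.1],
    "Set 2": [2.3, 2.5, 3.4, 4.5, 6.7, 7.4, 9.3],
    "Set 3": [2.3, 3.4, 6.7, 6.9, 7.4, 8.1, 9.3],
}


def summarise(values):
    """Return (median, minimum, maximum, maximum - minimum) for a list of numbers."""
    low, high = min(values), max(values)
    # Round the difference to one decimal place, matching the precision of
    # the data and hiding floating-point noise.
    return median(values), low, high, round(high - low, 1)


# Hypothetical helper structure: one summary tuple per dataset.
summary = {name: summarise(values) for name, values in datasets.items()}

for name, (mid, low, high, spread) in summary.items():
    print(f"{name}: median={mid}, min={low}, max={high}, max-min={spread}")
```

Running this shows the pattern described in the text: Sets 1 and 2 share a median but differ in the gap between their lowest and highest values, whilst Sets 2 and 3 share that gap but differ in their medians.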

This helpsheet contains references to the companion helpsheet on descriptive statistics, which covers different measures of average, such as means and medians. Please refer to that helpsheet if you do not understand these concepts. All measures of variability covered here are appropriate for scale data (continuous or discrete); some are also appropriate for ordinal data; none can be used with nominal data.