This module allows you to calculate quantiles on a batch of numbers. The quantiles of principal interest are the median and the quartiles. These summary measures are most commonly used for distributions which are skewed. Skewness can be assessed visually by histogram views of the data.
The Quantiles module targets the following cognitive tasks:
| Task | Skills | Concepts |
|---|---|---|
| Quan-1: | Calculate and interpret sample quantiles | Understand quantiles |
| Quan-2: | Understand the need for quantile-based measures of center and dispersion | |
| Quan-3: | Calculate and interpret the sample median | Understand the sample median |
| Quan-4: | Calculate and interpret sample quartiles and the IQR | Understand quartiles and the IQR |
| Quan-5: | Identify when to use the sample median and IQR | Understand when to use the sample median and IQR |
| Quan-6: | Calculate and interpret the median-based sample skewness | Understand the median-based sample skewness |
| Quan-7: | Interpret quantile plots | Understand quantile plots |
The f quantile, q(f), of a dataset is a value such that approximately a fraction f of the data is less than or equal to q(f). When a range of values satisfies this definition, an interpolated value is used.
The sample median, corresponding to f = 0.5, is a measure of the center of a distribution. It is the middle value of the ordered data. The lower and upper quartiles correspond to f = 0.25 and f = 0.75, respectively. The interquartile range, the difference between the upper and lower quartiles, is a measure of variation.
A rule is needed to compute the quantiles. Let x(i) be the i-th value when the data are ranked from smallest to largest. For example, x(1) is the smallest value and x(n) is the largest value. Define x(i) as the f_i quantile, where f_i = i / (n + 1). The quantile plot (shown below) is constructed by graphing x(i) versus f_i.
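The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the applet's own code: the function names `plotting_positions` and `quantile` and the sample values are made up for the example, and it assumes that quantiles between the ranked values are obtained by straight-line interpolation between neighbouring (f_i, x(i)) points.

```python
import numpy as np


def plotting_positions(data):
    """Return the ranked values x(i) and their fractions f_i = i / (n + 1)."""
    x = np.sort(np.asarray(data, dtype=float))  # x(1) <= ... <= x(n)
    n = x.size
    f = np.arange(1, n + 1) / (n + 1)
    return x, f


def quantile(data, f):
    """Interpolate q(f) linearly between neighbouring (f_i, x(i)) points.

    Assumption: f outside [1/(n+1), n/(n+1)] is rejected rather than
    extrapolated, since the rule does not define those quantiles.
    """
    x, fi = plotting_positions(data)
    if not (fi[0] <= f <= fi[-1]):
        raise ValueError("f must lie between 1/(n+1) and n/(n+1)")
    return float(np.interp(f, fi, x))


# Hypothetical sample, chosen only to illustrate the rule.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0]

median = quantile(sample, 0.5)
lower = quantile(sample, 0.25)
upper = quantile(sample, 0.75)
iqr = upper - lower
```

With n = 7 the fractions are 1/8, 2/8, ..., 7/8, so f = 0.25, 0.5 and 0.75 fall exactly on x(2), x(4) and x(6). The sketch therefore gives a median of 5.0, quartiles of 4.0 and 9.0, and an interquartile range of 5.0. Plotting the two arrays returned by `plotting_positions` against each other gives the quantile plot.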
Consider the normally distributed data first introduced in the Histograms module. The sample quantiles are displayed numerically and graphically in the Histograms Java applet below.
The Quantiles button in the Report panel is chosen by default. The sample median and quartiles are displayed on the right along with other quantile-based summary statistics. The sample median and quartiles can be visualized by choosing Quantile boxplot from the Options menu. The center line is the sample median and the box ends are the quartiles. Lines extend from the box to the minimum and maximum values. Collectively, the vertical lines define the five-number summary.
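As a rough illustration of the five-number summary behind the boxplot, the sketch below uses Python's standard `statistics.quantiles` function. Its `"exclusive"` method places the i-th ranked value at fraction i / (n + 1) and interpolates linearly in between. Whether the applet computes its box ends in exactly this way is an assumption, and the helper name `five_number_summary` and the data values are invented for the example.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles follow the i / (n + 1) convention, which is
    what statistics.quantiles(..., method="exclusive") implements.
    """
    q1, med, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return (min(data), q1, med, q3, max(data))


# Hypothetical sample, chosen only to illustrate the calculation.
values = [3.0, 5.0, 6.0, 8.0, 9.0, 12.0, 15.0, 20.0]

summary = five_number_summary(values)
box_length = summary[3] - summary[1]  # interquartile range
```

For these eight values the quartile positions are 2.25, 4.5 and 6.75 in the ranked list, so the summary is (3.0, 5.25, 8.5, 14.25, 20.0) and the box length is 9.0.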
The quantile plot can be used to display the quantiles for f values between 1/(n + 1) and n/(n + 1). Linear interpolation is used as required. Select Quantile values from the Options menu. By default, the median is given by the blue lines (corresponding to f = 0.5). Any quantile within the f range can be displayed by clicking the x symbol and dragging it to the desired location.

Example #1 uses the New Jersey county area dataset to compute and interpret quantile-based summary statistics.
Example #2 uses the heights for 40 randomly selected students at Oxford University to calculate various quantiles including the median. The interquartile range is computed as a measure of spread. Quantiles are revealed in the bottom part of the normal quantile plot. The dependence of the height quantiles on gender can be obtained from a conditioning normal quantile plot.
Example #3 numerically describes the distribution of the number of Congressional Representatives in each state of the USA using the median, the interquartile range, and related measures.