The Normal Distribution

Contents

Introduction  Normal Distribution  Normal Moments  Standard Normal Distribution  Normal Probabilities  Z Score Transform  Normal Quantiles  Examples 

Learning Objectives

The Normal module targets the following cognitive tasks:

Task     Skills                                        Concepts
Norm-1   Identify a normal random variable             Understand normal random variables
Norm-2   -                                             Understand the properties of normal distributions
Norm-3   -                                             Understand the equivalence of area and probability of an event
Norm-4   Compute z scores                              Understand the z-score as a measure of relative position
Norm-5   Compute normal probabilities                  Understand the standard normal distribution
Norm-6   Use the empirical rule                        Understand the empirical rule
Norm-7   Compute normal cumulative probabilities       Understand the cumulative normal distribution
Norm-8   Compute normal quantiles                      Understand normal quantiles
Norm-9   Identify outliers                             Understand the formal definition of an outlier

^ Introduction

The normal curve was developed mathematically in 1733 by Abraham de Moivre as an approximation to the binomial distribution. His paper was not rediscovered until 1924, by Karl Pearson. Pierre-Simon Laplace used the normal curve in 1783 to describe the distribution of errors, e.g., measurement error in scientific experiments. Subsequently, in 1809, Gauss used the normal curve to analyze astronomical data. The normal curve is often called the Gaussian distribution. In everyday usage it is known as the bell-shaped curve.

^ Normal Distribution

The normal distribution is the most used statistical distribution. The principal reasons are:

  1. Normality arises naturally in many behavioral, biological, and physical sciences.
  2. Normality is important in statistical inference.

Normality occurs when the observed value of a phenomenon is composed additively of many small effects or components. For example, human height, unlike human weight, is determined by many small effects that add together.
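The additive mechanism can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: each simulated value is the sum of 50 independent small effects drawn uniformly from [-1, 1]. The number of effects, their range, and the function name `simulate_additive_values` are hypothetical choices, not part of this module. If the sums are approximately normal, roughly 68% of them should fall within one standard deviation of their mean.

```python
import random
import statistics

def simulate_additive_values(n_values=20000, n_effects=50, seed=1):
    """Return n_values sums, each built from n_effects small uniform effects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.uniform(-1.0, 1.0) for _ in range(n_effects))
            for _ in range(n_values)]

values = simulate_additive_values()
mean = statistics.fmean(values)
sd = statistics.pstdev(values)

# Fraction of simulated values within one standard deviation of the mean.
within_one_sd = sum(abs(v - mean) <= sd for v in values) / len(values)
print(f"mean = {mean:.3f}, sd = {sd:.3f}, within 1 sd = {within_one_sd:.3f}")
```

No single effect is normally distributed here (each is uniform), yet the sums should behave much like a normal random variable.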

The applet below plots the normal curve.

The displayed curve is called the normal density and is denoted by f(x). The standard normal density curve (see below) is shown by default, but if it is hidden, it can be made visible by selecting the f(x) radio button in the bottom panel. The normal density is used to compute probabilities.
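For readers who want to reproduce the plotted curve, the normal density has the closed form f(x) = exp(-(x - m)^2 / (2 s^2)) / (s sqrt(2 pi)), where m is the mean and s is the standard deviation. The following is a minimal sketch; the function name `normal_density` is illustrative, and Python's standard-library `statistics.NormalDist` is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_density(x, m=0.0, s=1.0):
    """Normal density f(x) for mean m and standard deviation s (s > 0)."""
    z = (x - m) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

# Height of the standard normal curve at a few points.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"f({x:+.1f}) = {normal_density(x):.4f}")

# Cross-check against the standard library's implementation.
assert math.isclose(normal_density(1.0), NormalDist(0.0, 1.0).pdf(1.0))
```

The density values are curve heights, not probabilities; probabilities come from areas under the curve.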

^ Normal Moments

The normal distribution is characterized by two parameters: the mean m and the standard deviation s. The mean is a measure of the location or center of the distribution, and the standard deviation is a measure of scale, dispersion, or spread. The mean can be any real value (between -infinity and +infinity), and the standard deviation must be positive. Each pair of values of m and s defines a specific normal distribution, and collectively all possible normal distributions define the normal family.

Any member of the normal family can be displayed by changing m and s in the above applet. This is done by clicking on the right or left arrow animation buttons of m or s in the bottom panel. Clicking on the right (left) arrow button of m moves the curve to the right (left) without changing the spread. Clicking on the right (left) arrow button of s flattens (contracts) the normal curve without changing the location. As m and s are changed, the curve can march off the display area. The image can be centered by clicking on the Rescale button. Once you are finished experimenting with the effects of changing m and s, click on the Reset button to get the standard normal density back.
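The same effects can be checked numerically without the applet. The sketch below uses `statistics.NormalDist` from the Python standard library; the particular values of m and s are arbitrary choices for illustration. Changing m moves the peak of the curve without changing its height, while increasing s lowers the peak, whose height is 1 / (s sqrt(2 pi)), and spreads the curve out.

```python
from statistics import NormalDist

# (m, s) pairs: the standard normal, a shifted curve, and a flattened curve.
for m, s in [(0.0, 1.0), (3.0, 1.0), (0.0, 2.0)]:
    dist = NormalDist(mu=m, sigma=s)
    peak_height = dist.pdf(m)      # the density is highest at x = m
    one_sd_out = dist.pdf(m + s)   # height one standard deviation from m
    print(f"m = {m}, s = {s}: f(m) = {peak_height:.4f}, f(m + s) = {one_sd_out:.4f}")
```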

^ Standard Normal Distribution

The standard (or canonical) normal distribution is a special member of the normal family that has a mean of 0 and a standard deviation of 1. A standard normal random variable, i.e., a normal random variable with a mean of 0 and a standard deviation of 1, is denoted by Z as will be discussed later.

The standard normal distribution is important since the probabilities and quantiles of any normal distribution can be computed from the standard normal distribution, provided m and s are known.

^ Normal Probabilities

Often we want to compute the probability that our random outcome is within a specified interval, i.e., P(a <= X <= b), where a could be -infinity and/or b could be +infinity. For continuous random variables, this probability corresponds to the area under the curve bounded by a and b. The probability that X equals a specific value, i.e., P(X = x), is 0 since there is no area above a single point. It follows that P(a <= X <= b) = P(a < X <= b) = P(a <= X < b) = P(a < X < b).

Probabilities are computed by selecting the appropriate menu item from the Prob menu. In the menu items, a corresponds to the lower limit and b corresponds to the upper limit. For example, to compute P(0 <= Z <= 1): 1) select a <= x <= b from the Prob menu (assuming m is set to 0 and s to 1); 2) type 0 as the Lower Limit and 1 as the Upper Limit; and 3) click ok. The probability is 0.3413.
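The same calculation can be sketched outside the applet, since the area between a and b equals F(b) - F(a), where F is the cumulative distribution function. The helper name `interval_probability` is hypothetical; `statistics.NormalDist.cdf` is from the Python standard library.

```python
from statistics import NormalDist

def interval_probability(a, b, m=0.0, s=1.0):
    """P(a <= X <= b) for a normal X with mean m and standard deviation s."""
    dist = NormalDist(mu=m, sigma=s)
    return dist.cdf(b) - dist.cdf(a)

p = interval_probability(0.0, 1.0)
print(f"P(0 <= Z <= 1) = {p:.4f}")  # 0.3413, matching the applet
```

An unbounded limit can be passed as `float("-inf")` or `float("inf")`; for example, `interval_probability(float("-inf"), 0.0)` gives the area to the left of the mean.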

^ Z Score Transform

Probabilities can be computed directly in the above applet simply by changing the values of m and s and entering the lower (a) and/or upper (b) limits when requested. For example, if X is normally distributed with a mean of 10 and a standard deviation of 2, then P(9 <= X <= 12) = 0.5328. This is computed by: 1) changing m to 10 and s to 2; 2) selecting a <= x <= b from the Prob menu; 3) entering 9 as the Lower Limit and 12 as the Upper Limit; and 4) clicking ok. The mean and standard deviation can be changed by clicking on the left or right arrow buttons or by typing directly into the text box.

This probability can also be computed from the standard normal distribution as P(-0.5 <= Z <= 1) = 0.5328, since 12 is z = (12 - 10)/2 = 1 standard deviation above m = 10 and 9 is 0.5 standard deviations below m, i.e., z = (9 - 10)/2 = -0.5. More generally, the z-value is computed as z = (x - m)/s, and this is called the Z score transform.
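The equivalence of the two routes can be sketched in a few lines. The function name `z_score` is illustrative, and `statistics.NormalDist` stands in for the applet; both routes should print the same probability.

```python
from statistics import NormalDist

def z_score(x, m, s):
    """Number of standard deviations that x lies from the mean m."""
    return (x - m) / s

m, s = 10.0, 2.0
a, b = 9.0, 12.0

# Route 1: work directly with the normal distribution of X.
x_dist = NormalDist(mu=m, sigma=s)
p_direct = x_dist.cdf(b) - x_dist.cdf(a)

# Route 2: transform the limits to z scores and use the standard normal.
std_normal = NormalDist()  # mean 0, standard deviation 1
z_a, z_b = z_score(a, m, s), z_score(b, m, s)
p_standard = std_normal.cdf(z_b) - std_normal.cdf(z_a)

print(f"z limits: {z_a}, {z_b}")
print(f"direct = {p_direct:.3f}, via Z = {p_standard:.3f}")
```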

The Z score is a standardized value that measures how many standard deviations the measured value is from the mean. For example, suppose an IQ test is designed to have a mean (m) of 100 and a standard deviation (s) of 10. If an individual has an IQ of 120, his or her Z score is (120 - 100)/10 = 2. Using the applet, the proportion of individuals that have an IQ at least as great as this individual's is computed to be 0.0228.
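The IQ calculation can be reproduced as an upper-tail probability, P(X >= x) = 1 - F(x). This is a minimal sketch using the mean, standard deviation, and score from the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Values taken from the IQ example in the text.
m, s, iq = 100.0, 10.0, 120.0

z = (iq - m) / s                        # Z score of the individual
upper_tail = 1.0 - NormalDist().cdf(z)  # P(Z >= z)

print(f"z = {z}, P(X >= {iq}) = {upper_tail:.4f}")  # z is 2.0, probability 0.0228
```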

^ Normal Quantiles

The qth normal quantile is the value xq defined by:

F(xq) = P(X <= xq) = q, for 0 < q < 1,

where F(x) denotes the cumulative normal distribution function (cdf). The cumulative normal distribution can be displayed by clicking on the F(x) radio button in the bottom panel.

Normal quantiles are computed by inverting the normal cdf: xq is the value of x at which F(x) equals q.
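A quantile calculation can be sketched as follows, taking the qth quantile to be the value xq that satisfies F(xq) = q. The helper name `normal_quantile` and the example values (q = 0.975, m = 10, s = 2) are assumptions chosen for illustration; `inv_cdf` is the inverse cdf provided by `statistics.NormalDist`. The sketch finds the standard normal quantile first and then reverses the Z score transform.

```python
from statistics import NormalDist

def normal_quantile(q, m=0.0, s=1.0):
    """Return xq such that F(xq) = q, for 0 < q < 1."""
    z_q = NormalDist().inv_cdf(q)  # quantile of the standard normal
    return m + s * z_q             # undo the Z score transform

x_q = normal_quantile(0.975, m=10.0, s=2.0)
print(f"0.975 quantile = {x_q:.3f}")

# Round trip: the cdf evaluated at the quantile recovers q.
print(f"F(x_q) = {NormalDist(10.0, 2.0).cdf(x_q):.3f}")
```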

^ Examples

Example #1 (LDL Cholesterol) shows how probabilities and quantiles are computed for the LDL cholesterol distribution of male adults.

Example #2 computes probabilities and quantiles for the distribution of male heights, which are assumed to be normally distributed.

Example #3 illustrates computation of probabilities of mean temperatures for various cities in South Carolina.

Self-test