## The Normal Distribution

### Introduction

• A univariate dataset that has an approximately bell-shaped histogram is said to have a normal distribution.

• Here are some examples of datasets that have approximately normal distributions:

Heights of living things, weights of living things, lengths of inert appendages (hair, claws, nails, teeth) of biological specimens in the direction of growth, blood pressure of adult humans of fixed gender, velocities of molecules in an ideal gas, measurement errors, IQ scores, SAT scores.

In finance, changes in the logarithm of exchange rates, price indices, and stock market indices are assumed normal in the Black-Scholes model. The logarithm is used in the Black-Scholes model because these values behave like compound interest and so are multiplicative.

• Abraham de Moivre was the first to write down the equation for a normal histogram with center μ = 0 and spread σ = 1:

$$p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-0.5x^2} \approx 0.39894 \times 2.71828^{-0.5x^2}$$

Recall that π = 3.14159 and e = 2.71828.
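The formula can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `standard_normal_pdf` is an illustrative choice.

```python
import math

def standard_normal_pdf(x: float) -> float:
    """Height of the normal curve with center 0 and spread 1 at the point x."""
    return (1.0 / math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * x * x)

# The constant in front of the exponential is 1 / sqrt(2 * pi).
print(round(standard_normal_pdf(0.0), 5))  # 0.39894
print(round(standard_normal_pdf(1.0), 5))  # 0.24197
```

The curve is symmetric about 0, so `standard_normal_pdf(-1.0)` gives the same height as `standard_normal_pdf(1.0)`.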

• Use SPSS to plot the normal curve for the z-values from -4 to 4 in steps of 0.1.
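For readers without SPSS, the same grid of points can be generated in Python. This is only a sketch of the tabulation step; the function name `normal_curve_points` is hypothetical, and the resulting pairs would still need to be passed to a plotting tool of your choice.

```python
import math

def normal_curve_points(start=-4.0, stop=4.0, step=0.1):
    """Return (z, height) pairs along the standard normal curve."""
    n = round((stop - start) / step)  # number of steps; 80 for the defaults
    points = []
    for i in range(n + 1):
        # Rounding keeps grid values clean (e.g. -3.7 rather than -3.6999999999999997).
        z = round(start + i * step, 10)
        height = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        points.append((z, height))
    return points

points = normal_curve_points()
print(len(points))                   # 81
print(points[0][0], points[-1][0])   # -4.0 4.0
```

The tallest point of the curve is the middle entry, at z = 0.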

• If x̄ and SD really are a parsimonious description of a histogram, then given x̄ and SD, we should be able to reconstruct the histogram; that is, we should be able to predict the proportion of observations in any bin of the form (-∞, a], (a, ∞), or [a, b].

• We start with the simplest case of a normal histogram with center 0 and spread 1.

• Our tool for finding areas under the normal curve is the standard normal table.
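The entries of a standard normal table can also be computed directly. The sketch below assumes the usual identity relating the area to the left of z to the error function, which Python provides as `math.erf`; the function name `standard_normal_cdf` is an illustrative choice, not something defined in these notes.

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Proportion of a standard normal histogram that lies in the bin (-inf, z]."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Half of the observations lie at or below the center of the curve.
print(round(standard_normal_cdf(0.0), 4))  # 0.5
```

Printed tables typically round each entry to four decimal places, so computed values should agree with the table to that precision.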

• Example 1:   What proportion of the observations are in this bin: (-∞, -1.00]?

Solution: The value -1.00 on the x-axis of the normal curve is called a z-value. Use the first part of the normal table, which covers negative z-values. Look in the -1.0 row and the .00 column to find .1587. The answer is 0.1587.

• Example 2:   What proportion of the observations are in this bin: (-∞, 2.00]?

Solution: Look up the z-value 2.00 in the second part of the normal table, which covers positive z-values. Look in the 2.0 row and the .00 column to find .9772. The answer is 0.9772.

• Example 3:   What proportion of the observations are in this bin: (-3.00, 3.00]?

Solution: (-3.00, 3.00] can be written as the set-theoretic difference (-∞, 3.00] - (-∞, -3.00]. Look up these bins in the normal table: (-∞, 3.00] has the proportion 0.9987 and (-∞, -3.00] has the proportion 0.0013. Therefore the proportion of observations in (-3.00, 3.00] = (-∞, 3.00] - (-∞, -3.00] is 0.9987 - 0.0013 = 0.9974.

Note: it does not matter whether we ask for the proportion of observations in (-3.00, 3.00], (-3.00, 3.00), [-3.00, 3.00), or [-3.00, 3.00]. They contain the same proportion of observations because the normal curve is an idealized continuous histogram where the proportion of observations at a single point is always zero.

• Example 4:   What proportion of the observations are in this bin: (1.5, 2.5]?

Solution: (1.50, 2.50] = (-∞, 2.50] - (-∞, 1.50]. The normal table gives the proportion of observations as 0.9938 - 0.9332 = 0.0606.
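Examples 1 through 4 all follow the same pattern: the proportion in a bin is the table value at the right endpoint minus the table value at the left endpoint. A minimal Python sketch of that subtraction follows; it replaces the table lookup with `math.erf`, and the names `phi` and `bin_proportion` are hypothetical helpers introduced for illustration.

```python
import math

def phi(z: float) -> float:
    """Proportion of a standard normal histogram in (-inf, z]."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bin_proportion(a: float, b: float) -> float:
    """Proportion in the bin (a, b]; use -math.inf or math.inf for an open-ended bin."""
    return phi(b) - phi(a)

print(round(bin_proportion(-math.inf, -1.00), 4))  # Example 1: 0.1587
print(round(bin_proportion(-math.inf, 2.00), 4))   # Example 2: 0.9772
print(round(bin_proportion(-3.00, 3.00), 4))       # Example 3: 0.9973
print(round(bin_proportion(1.50, 2.50), 4))        # Example 4: 0.0606
```

The computed value for Example 3 differs from the table-based answer in the last digit because the table entries 0.9987 and 0.0013 are each rounded before they are subtracted.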

• In the case where the normal histogram is not standard, any interval must first be converted to standard units by subtracting the mean and then dividing by the standard deviation.

• The values converted to standard units are called z-scores.

• Example 5: If the mean is 50 and the standard deviation is 10, convert 40 and 70 to z-scores. Solution:

z = (x - μ) / σ = (40 - 50) / 10 = -1

and

z = (x - μ) / σ = (70 - 50) / 10 = 2

Then the proportion of observations in the bin (-1, 2] = (-∞, 2] - (-∞, -1] is 0.9772 - 0.1587 = 0.8185.
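The two steps of Example 5, converting to standard units and then subtracting areas, can be sketched in Python as follows. The helper names `z_score` and `phi` are illustrative assumptions, and `math.erf` stands in for the normal table.

```python
import math

def z_score(x: float, mean: float, sd: float) -> float:
    """Convert x to standard units: subtract the mean, then divide by the SD."""
    return (x - mean) / sd

def phi(z: float) -> float:
    """Proportion of a standard normal histogram in (-inf, z]."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z_low = z_score(40, mean=50, sd=10)
z_high = z_score(70, mean=50, sd=10)
print(z_low, z_high)                          # -1.0 2.0
print(round(phi(z_high) - phi(z_low), 4))     # 0.8186
```

The computed proportion differs from the table-based 0.8185 in the last digit because the table values 0.9772 and 0.1587 are rounded before the subtraction.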