Ans: When the histogram of the dataset is approximately normal, that is, symmetric with no outliers.
Ans: When the histogram is skewed and/or there are outliers.
Bin | Percent |
---|---|
[0,1) | 20 |
[1,2) | 30 |
[2,3) | 20 |
[3,5) | 20 |
[5,9) | 10 |
Since the bin widths are not all equal, the area of each rectangle, not its height, represents the frequency. Now answer these questions about the histogram:
Ans: The median is exactly at 2 (50% of observations to the left, 50% to the right).
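The area-versus-height point and the location of the median can be checked with a short sketch. It assumes only the bins and percentages from the table; the variable names are illustrative, and the median lookup relies on 50% falling exactly on a bin edge, as it does here.

```python
# Bins and percentages copied from the table above.
bins = [(0, 1), (1, 2), (2, 3), (3, 5), (5, 9)]
percents = [20, 30, 20, 20, 10]

# Height = area / width, in percent per unit of the horizontal axis.
heights = [p / (hi - lo) for (lo, hi), p in zip(bins, percents)]

# Running total of area (percent) at each right-hand bin edge.
cumulative = []
total = 0
for p in percents:
    total += p
    cumulative.append(total)

# Right edge of the first bin whose running total reaches 50%.
# This equals the median only when 50% lands exactly on a bin edge;
# otherwise one would have to interpolate inside that bin.
median = next(hi for (lo, hi), c in zip(bins, cumulative) if c >= 50)

print("heights:", heights)
print("cumulative percent:", cumulative)
print("median:", median)
```

The two wide bins, [3,5) and [5,9), come out with heights of 10 and 2.5 percent per unit, even though they hold 20% and 10% of the observations.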
Ans: It is greater than the median. The long tail pulls the mean to the right. The exact value of the mean is computed as this weighted mean:
$$\bar{x} = \frac{0.5(20) + 1.5(30) + 2.5(20) + 4.0(20) + 7.0(10)}{20 + 30 + 20 + 20 + 10} = \frac{255}{100} = 2.55$$
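The weighted mean can be recomputed with a few lines of Python. This is a minimal sketch that assumes each bin is represented by its midpoint and weighted by its percentage; the variable names are illustrative.

```python
# Bins and percentages copied from the table above.
bins = [(0, 1), (1, 2), (2, 3), (3, 5), (5, 9)]
percents = [20, 30, 20, 20, 10]

# Each bin is represented by its midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in bins]

# Weighted mean: sum of midpoint * weight, divided by the total weight.
weighted_sum = sum(m * p for m, p in zip(midpoints, percents))
mean = weighted_sum / sum(percents)

print("midpoints:", midpoints)
print("weighted sum:", weighted_sum)
print("mean:", mean)
```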
Ans: The formula for SD uses n in the denominator before taking the square root; the formula for SD+ uses n - 1. Most statisticians use SD+ because it accounts for the extra variability that results from using x̄ to estimate μ.
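The difference between the two denominators can be seen with Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n - 1. The data values below are made up for illustration and do not come from the course material.

```python
import statistics

# Made-up sample, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sd = statistics.pstdev(data)      # SD: divides the sum of squares by n
sd_plus = statistics.stdev(data)  # SD+: divides the sum of squares by n - 1

print("SD  =", sd)
print("SD+ =", sd_plus)
```

SD+ is always at least as large as SD, and the gap shrinks as n grows.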
Ans: SD = sqrt(SS / n), where SS = sum of squares of deviations. Solve 6.94 = sqrt(SS / 23) for SS: SS = 23 * 6.94^2 = 1107.76. Then SD+ = sqrt[SS / (n - 1)] = sqrt(1107.76 / (23 - 1)) = 7.10.
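The same conversion can be sketched in Python, using the values n = 23 and SD = 6.94 from the answer. The shortcut SD+ = SD * sqrt(n / (n - 1)) follows from the two formulas and is included as a cross-check; the variable names are illustrative.

```python
import math

n = 23
sd = 6.94  # SD, computed with n in the denominator

# Recover the sum of squared deviations from SD = sqrt(SS / n).
ss = n * sd ** 2

# SD+ uses n - 1 in the denominator.
sd_plus = math.sqrt(ss / (n - 1))

# Equivalent shortcut that skips SS entirely.
shortcut = sd * math.sqrt(n / (n - 1))

print("SS  =", round(ss, 2))
print("SD+ =", round(sd_plus, 2))
print("shortcut =", round(shortcut, 2))
```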
Ans: Analyze >> Descriptive Statistics >> Descriptives.
Ans: Graphs >> Chart Builder. Drag a Simple Histogram into the Chart Preview Area.
Ans: Graphs >> Chart Builder. Drag a Simple Scatterplot into the Chart Preview Area.
Ans: Data >> Sort Cases. Set the Sort Order to Ascending or Descending as you prefer.
Biased: The average of the observations changes, depending on which thin vertical strip you pick.
Homoscedastic: The variation (SD+) of the observations is the same in every thin vertical strip all the way across the scatterplot.
Heteroscedastic: The variation (SD+) of the observations in a thin vertical strip changes, depending on which vertical strip you pick.
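One way to apply these definitions is to summarize each vertical strip by its average and its SD+. The sketch below is an illustration under stated assumptions: the points are made up, each distinct x value is treated as one strip, and `strip_summary` is a hypothetical helper, not part of any package. With real data the x axis would usually be cut into narrow intervals instead.

```python
import statistics

# Made-up (x, y) points, for illustration only.
points = [
    (1, 4.8), (1, 5.1), (1, 5.0), (1, 5.1),
    (2, 4.0), (2, 6.0), (2, 5.5), (2, 4.5),
    (3, 2.0), (3, 8.0), (3, 6.5), (3, 3.5),
]

def strip_summary(points):
    """Group points by x (one strip per x value) and return {x: (mean, SD+)}."""
    strips = {}
    for x, y in points:
        strips.setdefault(x, []).append(y)
    return {
        x: (statistics.mean(ys), statistics.stdev(ys))
        for x, ys in sorted(strips.items())
    }

summary = strip_summary(points)
for x, (avg, sd_plus) in summary.items():
    print(f"strip x={x}: mean={avg:.2f}, SD+={sd_plus:.2f}")
```

In this made-up example the strip averages stay the same while SD+ grows from left to right, so the pattern would be described as heteroscedastic but not biased.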
Method 2: Use the formula for the standard error of the mean or standard error of the average: