## Misuses of Statistics

References: Huff, How to Lie with Statistics, W.W. Norton, 1954; Wallis and Roberts, The Nature of Statistics, Free Press, 1962.

1. Shifting or Ambiguous Definitions

• How do you define a resident of the U.S. for the Census?

• How do you define who is employed and who is not?

• Hospitals measure the severity of diseases differently, making collecting national data difficult.

• Comparing crime rates across time is difficult because standards and procedures for recording crimes change.

2. Inaccurate Measurement or Misclassification of Cases

• The answers given in interviews depend on the race of the interviewer, especially when the questions are about racial equality.

3. Improperly Selected Cases

• The result of a census depends on its purpose. A census in China taken for military draft and taxation counted 28 million people; one taken for famine relief counted 105 million.

• The Literary Digest's prediction of the 1936 Roosevelt vs. Landon election: its poll picked Landon, but Roosevelt won in a landslide.

4. Improper Comparisons

• In WW2, 405 thousand people died from accidents at home, while 375 thousand people died on the front lines.

• Joe Twinklebat is batting .750. Are you impressed?

• Out of 20 interviewees, only 13% are happy at their jobs. What is wrong?
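One check worth making on the last example is whether the reported percentage is even possible for the stated sample size. The sketch below is only an illustration; the helper name `achievable_percentages` is made up for this note, and it assumes the percentage was rounded to a whole number.

```python
def achievable_percentages(n):
    """Whole-number percentages that k out of n respondents can produce.

    Assumes the reported figure was rounded to the nearest percent.
    """
    return {round(100 * k / n) for k in range(n + 1)}


possible = achievable_percentages(20)

# With 20 interviewees each person is worth 5 percentage points,
# so 10% (2 people) and 15% (3 people) are possible but 13% is not.
print(13 in possible)
print(sorted(p for p in possible if p <= 20))
```

A figure that cannot arise from the stated sample suggests that the sample size, the percentage, or both were misreported.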

5. Use of Average instead of Median

• Any time you hear reports about the average income in a certain subpopulation, the median should probably be used instead.

• Use the median to describe a population with a nonsymmetric histogram (a longer tail in one direction than in the other).
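A small sketch of why the median is usually the safer summary for incomes. The ten incomes below are invented for illustration (nine ordinary salaries and one very large one); they are not data from the references.

```python
from statistics import mean, median

# Hypothetical incomes: a long right tail caused by one very high earner.
incomes = [28_000, 31_000, 33_000, 35_000, 38_000,
           40_000, 42_000, 45_000, 48_000, 1_200_000]

avg = mean(incomes)    # pulled far upward by the single large value
mid = median(incomes)  # midpoint of the two middle values, 38,000 and 40,000

print(f"mean:   {avg:,.0f}")
print(f"median: {mid:,.0f}")
```

Here the mean is 154,000 while the median is 39,000. Nine of the ten people earn far less than the "average" income, which is why the median describes the typical person better when the histogram is skewed.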

6. Misleading Graphs

• Beware of graphs that show only part of the range of the dependent variable.

[Figure: "Salaries go through the roof"]

[Figure: "Salaries barely increase"]
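The effect of cutting off the vertical axis can be put in numbers. The sketch below is an illustration with invented salary figures; the function `apparent_ratio` is a hypothetical helper that compares how tall two bars (or points) are drawn when the axis starts at `axis_min` rather than at zero.

```python
def apparent_ratio(first, last, axis_min=0.0):
    """Ratio of drawn heights of two values when the axis starts at axis_min."""
    if axis_min >= min(first, last):
        raise ValueError("axis_min must lie below both values")
    return (last - axis_min) / (first - axis_min)


# Hypothetical salaries: a 4% raise from 50,000 to 52,000.
salary_start, salary_end = 50_000, 52_000

full = apparent_ratio(salary_start, salary_end)               # axis starts at 0
cropped = apparent_ratio(salary_start, salary_end, 49_500)    # axis starts at 49,500

print(f"full axis:    second bar is {full:.2f} times as tall")
print(f"cropped axis: second bar is {cropped:.2f} times as tall")
```

With the full axis the second bar is 1.04 times as tall as the first; with the axis starting at 49,500 it is drawn 5 times as tall, even though the data are identical.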

7. The Semiattached Figure

• Does Brand X mouthwash cure sore throats? It kills 5 million bacteria per second in laboratory tests.

• Brand Z juicers extract 26% more juice.

• Compound W causes cancer in laboratory mice.

### How to Talk Back to a Statistic

1. Who says so?

2. How do they know?

3. What's missing?

4. Did somebody change the subject?

5. Does it make sense?

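Applications and admissions by major, for men and women:

| Major | Men: # Apps | Men: % Accept | Men: # Admits | Women: # Apps | Women: % Accept | Women: # Admits |
|-------|-------------|---------------|---------------|---------------|-----------------|-----------------|
| A     | 825         | 62            | 511           | 108           | 82              | 89              |
| B     | 560         | 63            | 353           | 25            | 68              | 17              |
| C     | 325         | 37            | 120           | 593           | 34              | 203             |
| D     | 417         | 33            | 138           | 375           | 35              | 131             |
| E     | 191         | 28            | 53            | 393           | 24              | 94              |
| F     | 373         | 6             | 22            | 341           | 7               | 24              |
| Total | 2691        | 44            | 1197          | 1835          | 30              | 558             |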

The overall admission rate for men is 1197 / 2691 = 44%; for women it is 558 / 1835 = 30%.

However, when we look at the majors separately, women have the higher acceptance rate in four of the six (A, B, D, and F). To combine the per-major rates, we cannot simply average them, because the majors differ in size. Instead, we take a weighted average of the rates, weighting each major by its total number of applicants (men + women).

| Major | Total applicants |
|-------|------------------|
| A     | 933              |
| B     | 585              |
| C     | 918              |
| D     | 792              |
| E     | 584              |
| F     | 714              |
| Total | 4,526            |

The weighted average for men = (933×62 + 585×63 + 918×37 + 792×33 + 584×28 + 714×6) / 4,526 ≈ 39%.

The weighted average for women = (933×82 + 585×68 + 918×34 + 792×35 + 584×24 + 714×7) / 4,526 ≈ 43%.
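The weighted averages above can be reproduced with a few lines of Python. This is a minimal sketch: the applicant counts and acceptance percentages are copied from the admissions table, while the dictionary layout and the function name `weighted_rate` are choices made for this illustration.

```python
# (number of applicants, percent accepted) for each major, from the table.
men = {"A": (825, 62), "B": (560, 63), "C": (325, 37),
       "D": (417, 33), "E": (191, 28), "F": (373, 6)}
women = {"A": (108, 82), "B": (25, 68), "C": (593, 34),
         "D": (375, 35), "E": (393, 24), "F": (341, 7)}

# Weight for each major: total applicants, men plus women.
major_totals = {m: men[m][0] + women[m][0] for m in men}


def weighted_rate(group, weights):
    """Average of the per-major acceptance rates, weighted by major size."""
    total = sum(weights.values())
    return sum(weights[m] * group[m][1] for m in group) / total


men_adjusted = weighted_rate(men, major_totals)
women_adjusted = weighted_rate(women, major_totals)

print(f"men:   {men_adjusted:.1f}%")
print(f"women: {women_adjusted:.1f}%")
```

Because both groups are weighted by the same major sizes, the comparison is no longer affected by which majors each group tended to apply to.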

In summary:

1. Be careful about drawing conclusions from a combined dataset.

2. The combined dataset is misleading because men applied mostly to the majors that are easier to get into (A and B), while women applied mostly to the harder ones (C through F), so the men's overall acceptance rate is higher.
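The second point can be checked directly from the application counts. The sketch below uses the applicant numbers from the admissions table; treating A and B as the "easy" majors is an assumption based on their acceptance rates (above 60% for both sexes), and the helper `share` is a name invented for this illustration.

```python
# Number of applicants per major, from the admissions table.
men_apps = {"A": 825, "B": 560, "C": 325, "D": 417, "E": 191, "F": 373}
women_apps = {"A": 108, "B": 25, "C": 593, "D": 375, "E": 393, "F": 341}

# Assumption: majors A and B count as "easy" because their acceptance
# rates are the highest in the table.
easy_majors = ("A", "B")


def share(apps, majors):
    """Percent of a group's applications that went to the given majors."""
    return 100 * sum(apps[m] for m in majors) / sum(apps.values())


men_share = share(men_apps, easy_majors)
women_share = share(women_apps, easy_majors)

print(f"men applying to A or B:   {men_share:.0f}%")
print(f"women applying to A or B: {women_share:.0f}%")
```

About half of the men's applications (51%) went to the two easiest majors, compared with about 7% of the women's, which is enough to reverse the comparison when the majors are pooled.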