LSP 121 - Practice Midterm

Show your work for each problem.

See the SPSS Tutorial for more information on using SPSS.

  1. Determine Q0, Q1, Q2, Q3, Q4, and IQR for this list by hand:

    11   23   29   31   54   63   71   73   83   238   1197

    Check your results using SPSS.

    Ans: Q0=11, Q1=30, Q2=63, Q3=78, Q4=1197, IQR=48

    To obtain with SPSS: Analyze >> Descriptive Statistics >> Explore. Check the Percentiles box under the Statistics option to obtain Q1, Q2, and Q3.
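
    As a cross-check outside SPSS, the quartiles can also be computed with a short Python sketch. The function name five_number_summary is made up for illustration, and the quartile convention is an assumption: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted list, with the median included in both halves when the list has an odd length. Other conventions can give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (Q0, Q1, Q2, Q3, Q4) for a list of numbers.

    Assumption: Q1 and Q3 are medians of the lower and upper halves,
    and the overall median belongs to both halves when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2        # size of each half
    lower = data[:half]        # smallest `half` values
    upper = data[n - half:]    # largest `half` values
    return data[0], median(lower), median(data), median(upper), data[-1]

scores = [11, 23, 29, 31, 54, 63, 71, 73, 83, 238, 1197]
q0, q1, q2, q3, q4 = five_number_summary(scores)
iqr = q3 - q1
print(q0, q1, q2, q3, q4, iqr)
```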

  2. Describe what a boxplot is and how to construct one by hand.

    Ans: A boxplot shows graphically the locations of Q1, Q2, and Q3. It also shows the positions of mild and extreme outliers. An extreme outlier is a data point that is less than Q1 - 3*IQR or greater than Q3 + 3*IQR (outside of the outer fences). A mild outlier is a data point that is less than Q1 - 1.5*IQR or greater than Q3 + 1.5*IQR (outside of the inner fences), but is not an extreme outlier. Extreme outliers are marked with *; mild outliers are marked with O.
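
    The fence rules above can be sketched in Python as follows. The function name classify and the quartiles Q1 = 10, Q3 = 20 are hypothetical, chosen only to make the fences easy to follow; they are not taken from any problem on this sheet.

```python
def classify(value, q1, q3):
    """Label a data point as 'extreme', 'mild', or 'none' using the fences."""
    iqr = q3 - q1
    # Outer fences: more than 3 IQRs beyond the quartiles.
    if value < q1 - 3 * iqr or value > q3 + 3 * iqr:
        return "extreme"
    # Inner fences: more than 1.5 IQRs beyond the quartiles.
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "mild"
    return "none"

# Hypothetical quartiles: IQR = 10, inner fences at -5 and 35,
# outer fences at -20 and 50.
q1, q3 = 10, 20
labels = {v: classify(v, q1, q3) for v in (12, 40, 60, -25)}
print(labels)
```

    Checking the outer fences first matters: every extreme outlier is also outside the inner fences, so the order of the two tests decides which label it gets.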

  3. Have SPSS construct a boxplot for the list in Problem 1. Determine the outliers from the boxplot.

    Ans: Analyze >> Descriptive Statistics >> Explore. One mild and one extreme outlier are found.

  4. Explain how to sort a list with SPSS.

    Ans: Data >> Sort Cases.

  5. Create a histogram from the list

    1.4   1.9   2.4   2.5   2.7   3.8   4.1   4.9

    using bin boundaries at 1, 2, 3, 4, and 5. Ans:

      3 +       +----+
        |       |    |
        |       |    |
      2 +  +----+    |    +----+
        |  |    |    |    |    |
        |  |    |    |    |    |
      1 +  |    |    +----+    |
        |  |    |    |    |    |
        |  |    |    |    |    |
      0 +  +----+----+----+----+
           1    2    3    4    5
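
    The bar heights can be checked by counting how many values fall in each bin. This is a minimal sketch; the function name bin_counts is made up, and it assumes each bin includes its left boundary but not its right one (no value in this list sits exactly on a boundary, so the convention does not change the result here).

```python
def bin_counts(values, boundaries):
    """Count the values falling in each bin [boundaries[i], boundaries[i+1])."""
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        for i in range(len(counts)):
            # Assumption: bins are closed on the left, open on the right.
            if boundaries[i] <= v < boundaries[i + 1]:
                counts[i] += 1
                break
    return counts

data = [1.4, 1.9, 2.4, 2.5, 2.7, 3.8, 4.1, 4.9]
counts = bin_counts(data, [1, 2, 3, 4, 5])
print(counts)
```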
    
  6. Compute SD+ of the list 1, 2, 6. Verify your answer using SPSS.

    Ans: The average of the list is (1 + 2 + 6) / 3 = 3, SD+ = sqrt(((1 - 3)^2 + (2 - 3)^2 + (6 - 3)^2) / 2) = 2.646.
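
    The same computation can be written out in Python as a check. The function name sd_plus is made up to mirror the SD+ notation; it divides by n - 1, as in the answer above.

```python
from math import sqrt

def sd_plus(values):
    """Sample standard deviation: divide the sum of squared deviations by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return sqrt(squared_deviations / (n - 1))

result = sd_plus([1, 2, 6])
print(round(result, 3))
```

    The standard library function statistics.stdev uses the same n - 1 divisor and can be used instead.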

  7. Explain what standard deviation is to someone who does not understand statistics. See this website for some ideas: www.mathisfun.com/standard-deviation.html.

  8. Download the Excel file chip-thicknesses.xls onto your hard drive (My Documents or memory stick). Then obtain the following with SPSS:

    1. Add a label for x, for example, Chip Thickness.

      Ans: Go to Variable View, type in the label in the Label column. Recall that the label describes the column in more detail than the name and can contain spaces. The name cannot contain spaces. A good label for this dataset would be "Chip Thickness". Also verify that the Measure is set to "Scale".

    2. Find xbar and SD+.

      Ans: Analyze >> Descriptive Statistics >> Descriptives. xbar = 0.0837613, SD+ = 0.0005277.

    3. Find any mild or extreme outliers using the boxplot.

      Ans: Analyze >> Descriptive Statistics >> Explore. There are two mild outliers.

    4. Sort the data points.

      Ans: Data >> Sort Cases.

    5. Find the z-scores of x. Then find the mild and extreme outliers using the z-scores.

      Ans: Transform >> Compute Variable. Enter z for the Target Variable and (x - 0.0837613) / 0.0005277 for the Numeric Expression. Check your z-scores by computing the mean and SD+ of the z-scores. I obtained -0.0009475 and 1.00005873, which are close to 0 and 1 respectively, so they are okay. Sort the z-scores with Data >> Sort Cases. There is one z-score less than -2.0 and one greater than 2.0, which shows there are two mild outliers (same answer as with the boxplot).
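
      The z-score procedure can be sketched in Python as well. The ten thickness values below are hypothetical, made up for illustration (they are not the contents of chip-thicknesses.xls), and the function name z_scores is also made up. The cutoff of 2.0 follows the answer above.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert each value to (value - mean) / SD+."""
    m = mean(values)
    s = stdev(values)   # SD+ (n - 1 divisor)
    return [(v - m) / s for v in values]

# Hypothetical data, NOT the real chip-thickness file.
thicknesses = [0.0835, 0.0836, 0.0837, 0.0837, 0.0838,
               0.0838, 0.0838, 0.0839, 0.0840, 0.0855]
z = z_scores(thicknesses)

# Sanity check: z-scores should have mean 0 and SD+ 1 (up to rounding).
print(mean(z), stdev(z))

# Flag values whose z-score is below -2.0 or above 2.0.
flagged = [v for v, score in zip(thicknesses, z) if abs(score) > 2.0]
print(flagged)
```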

  9. Suppose that LSAT scores are normally distributed with a mean of 150 and an SD of 10.

    1. What proportion of scores are between 145 and 160?

      Ans: z = (x - xbar) / sd = (145 - 150) / 10 = -0.5, z = (160 - 150) / 10 = 1. Look up -0.5 and 1 in the standard normal table: 0.3085 and 0.8413. Subtract: 0.8413 - 0.3085 = 0.5328 = 53%.

    2. What proportion of scores are over 165?

      Ans: z = (x - xbar) / sd = (165 - 150) / 10 = 1.5. Look up 1.5 in the normal table: 0.9332. This is the proportion of scores less than or equal to 165. Subtract from 1.000 to get the proportion of scores greater than 165: 0.0668 = 7%.

    3. What score is at the 97th percentile?

      Ans: Look up 0.97 in the body of the standard normal table: z = 1.88, so x = z * sd + xbar = 1.88 * 10 + 150 = 168.8.

    4. What score is at the 35th percentile?

      Ans: Look up 0.35 in the body of the standard normal table: z = -0.39, so x = z * sd + xbar = -0.39 * 10 + 150 = 146.1.
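
    The four parts of this problem can be checked without a table using NormalDist from Python's standard library. This is only a sketch; the variable names are made up. Because a printed table rounds z to two decimal places, the percentile scores computed here may differ from table-based answers in the first decimal place.

```python
from statistics import NormalDist

# LSAT scores: normal with mean 150 and SD 10 (from the problem statement).
lsat = NormalDist(mu=150, sigma=10)

# Part 1: proportion of scores between 145 and 160.
between = lsat.cdf(160) - lsat.cdf(145)

# Part 2: proportion of scores over 165.
over_165 = 1 - lsat.cdf(165)

# Parts 3 and 4: scores at the 97th and 35th percentiles.
p97 = lsat.inv_cdf(0.97)
p35 = lsat.inv_cdf(0.35)

print(round(between, 4), round(over_165, 4), round(p97, 1), round(p35, 1))
```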

  10. What is the probability that rolling five four-sided dice once results in all ones? What is the probability that all ones are obtained at least once in 1,000 rolls of the five four-sided dice?

    Ans: The probability that rolling five 4-sided dice once results in all ones is 1/4^5 = 0.000977 (1 because there is only one way to get all ones, 4 because the dice are 4-sided, 5 because we are rolling five dice). If this experiment is repeated 1,000 times, the probability of obtaining all ones at least once is 1 - (1 - p)^n = 1 - (1 - 0.000977)^1,000 = 0.623576 = 62%.

  11. What is the probability of obtaining at least one Yahtzee in 5000 rolls? A Yahtzee means that all five dice show the same face when five six-sided dice are rolled.

    Ans: The probability of obtaining a Yahtzee in one roll of five 6-sided dice is 6/6^5 = 0.000772. The 6 in the numerator is because there are 6 ways of obtaining a Yahtzee. The probability of obtaining at least one Yahtzee in 5,000 rolls is 1 - (1 - p)^n = 1 - (1 - 0.000772)^5,000 = 0.978922 = 98%.
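
    The "at least once" formula 1 - (1 - p)^n used in Problems 10 and 11 can be checked with a few lines of Python. The function name at_least_once is made up for illustration.

```python
def at_least_once(p, n):
    """Probability that an event with per-trial probability p
    happens at least once in n independent trials."""
    return 1 - (1 - p) ** n

# Problem 10: all ones on five 4-sided dice, 1,000 rolls.
all_ones = at_least_once(1 / 4**5, 1000)

# Problem 11: a Yahtzee on five 6-sided dice, 5,000 rolls.
yahtzee = at_least_once(6 / 6**5, 5000)

print(round(all_ones, 4), round(yahtzee, 4))
```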

  12. What is wrong with this payoff table:

    Payoff Probability
    50 0.3
    10 1.1
    -20 0.2
    -30 -0.3

    Ans: No probability can be less than 0 or more than 1, and the probabilities must sum to 1. In this table, 1.1 is greater than 1, -0.3 is less than 0, and the probabilities sum to 1.3.
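
    These two rules can be turned into a small checker. This is a sketch only; the function name table_problems is made up, and math.isclose is used for the sum so that ordinary floating-point rounding is not reported as an error.

```python
import math

def table_problems(probabilities):
    """Return a list of messages describing what is wrong with the probabilities."""
    problems = []
    for p in probabilities:
        if p < 0 or p > 1:
            problems.append(f"{p} is not between 0 and 1")
    total = sum(probabilities)
    if not math.isclose(total, 1.0):
        problems.append(f"probabilities sum to {total:g}, not 1")
    return problems

bad_table = [0.3, 1.1, 0.2, -0.3]   # probabilities from the table above
for message in table_problems(bad_table):
    print(message)
```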

  13. A tropical island has the same probabilities of rain every day: 40% chance of 1 inch, 20% chance of 2 inches, 10% chance of 3 inches. It never rains more than 3 inches. What is the expected value of rain tomorrow?

    Ans: Here is the payoff table in terms of rain:

    Payoff Probability
    0 0.3
    1 0.4
    2 0.2
    3 0.1

    The expected value is 0*0.3 + 1*0.4 + 2*0.2 + 3*0.1 = 1.1
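
    The expected value is the sum of each payoff times its probability, which can be sketched in Python as follows. The function name expected_value is made up for illustration.

```python
def expected_value(table):
    """Sum of payoff * probability over all rows of a payoff table."""
    return sum(payoff * probability for payoff, probability in table)

# Rows are (inches of rain, probability), copied from the table above.
rain = [(0, 0.3), (1, 0.4), (2, 0.2), (3, 0.1)]
ev = expected_value(rain)
print(round(ev, 2))
```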

  14. Your house costs $400,000, on which you pay $130 per year in insurance. If the probability that your house is destroyed in a given year is 0.025%, what is the "pure premium" for the insurance company? The pure premium is the premium at which the insurance company's expected profit is zero.

    Ans: Let p be the pure premium. Setting the company's expected profit to zero: (-400,000 + p) * 0.00025 + p * (1 - 0.00025) = 0. Solving for p: p = 400,000 * 0.00025 = $100.
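
    The pure premium is the expected loss, which can be checked with a short Python sketch. The function name pure_premium is made up; the check at the end plugs the premium back into the insurer's expected profit, which should come out to zero up to rounding.

```python
def pure_premium(loss, probability):
    """Premium at which the insurer's expected profit is zero: the expected loss."""
    return loss * probability

house_value = 400_000
p_destroyed = 0.00025          # 0.025% per year

premium = pure_premium(house_value, p_destroyed)

# Expected profit: the insurer collects the premium either way,
# and pays out the house value if the house is destroyed.
expected_profit = (premium - house_value) * p_destroyed + premium * (1 - p_destroyed)

print(round(premium, 2), round(expected_profit, 6))
```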

  15. For the following bivariate dataset, use SPSS to obtain the correlation, the R^2 value, and a scatterplot.

    x:   1   2   3   4   5
    y:   2   4   1   3   5

    Ans: To compute the correlation: Analyze >> Correlate >> Bivariate. To obtain the scatterplot with linear trend line: Graphs >> Chart Builder.
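
    As a cross-check on the SPSS output, the correlation can be computed directly from its definition. This is a minimal sketch; the function name pearson_r is made up, and R^2 is taken here to be the square of the correlation, which holds for a straight-line fit of y on x.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 3, 5]
r = pearson_r(x, y)
r_squared = r ** 2
print(round(r, 3), round(r_squared, 3))
```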

  16. Short Essay: discuss the original and current definitions of the meter, second, and kilogram.