
IT 223 -- 1/26/11

 

Review Questions

  1. What percent of observations are in the following bins of the standard normal distribution:

    Ans: Whether an interval endpoint is included or not does not change the probability. The answer is 0.6827 (68.27%) in each case.

  2. What does the correlation tell you about a bivariate dataset?

    Ans: It tells you the strength and direction of the linear association between the two variables.

  3. If there is a linear relationship between x and y, what is the shape of the point cloud in the scatterplot?

    Ans: Ellipse-shaped.

  4. What is the correlation if there is a perfect linear relationship between the independent variable and the dependent variable?

    Ans: r = 1 for a perfect positive relationship and r = -1 for a perfect negative relationship.

  5. What does the R-squared value tell you?

    Ans: R-squared is the square of the correlation. It tells you the proportion of the variation in the dependent variable that can be explained by the variation in the independent variable.

  6. How do you tell if the correlation is meaningful for explaining the effect of the independent variable on the dependent variable?

    Ans: It depends on the discipline. See this table.
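
    The answers to Questions 1 and 5 can be checked numerically. The sketch below assumes that the interval in Question 1 runs from -1 to 1; the bins themselves are not reproduced in these notes, but 0.6827 is the standard normal probability for that interval. For Question 5 it uses a small made-up dataset. The function names and the data are illustrative only and are not part of the course materials.

```python
import math

def std_normal_cdf(z):
    """Return P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_probability(a, b):
    """Return P(a < Z < b). A single point has probability zero,
    so including either endpoint gives the same value."""
    return std_normal_cdf(b) - std_normal_cdf(a)

def correlation(xs, ys):
    """Pearson correlation r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def explained_proportion(xs, ys):
    """Proportion of the variation in y explained by the
    least-squares regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - sse / syy

# Question 1, with the interval assumed to be -1 to 1.
p = interval_probability(-1.0, 1.0)
print(round(p, 4))  # 0.6827

# Question 5, with made-up data (x independent, y dependent).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r = correlation(xs, ys)
print(round(r ** 2, 4))                       # 0.6
print(round(explained_proportion(xs, ys), 4))  # 0.6
```

    For this dataset the square of r and the explained proportion of variation in y agree, which is the sense in which R-squared measures how much of the dependent variable's variation the independent variable accounts for.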

 

Confidence Intervals for μ

 

Linear Correlation

 

Practice Problems

  1. Estimate the correlation r in these situations:

    1. Height of father, height of son.

        i. -0.30    ii. 0.05    iii. 0.70    iv. 0.99

        Ans: 0.70

    2. IQ of husband, IQ of wife.

        i. -0.70    ii. 0.00    iii. 0.60    iv. 1.00

        Ans: 0.60

    3. Height of husband, height of wife if men always married women who were exactly 6 inches shorter.

        i. -0.60    ii. 0.60    iii. 0.99    iv. 1.00

        Ans: 1.00 (the wife's height is an exact linear function of the husband's height, with positive slope)

    4. Weight of husband, weight of wife if men always married women who weighed 70% of their husband's weight.

        i. 0.00    ii. 0.50    iii. 0.70    iv. 1.00

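        Ans: 1.00 (the wife's weight is an exact positive multiple of the husband's weight, so the points lie on a line with positive slope)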

  2. Match the correlation to the dataset:

    1. GPA in freshman year, GPA in sophomore year.   Ans: 0.70

    2. GPA in freshman year, GPA in senior year.   Ans: 0.30

    3. Length and weight of 2 by 4 boards.   Ans: 0.99

  3. What would happen to the correlation r if

    1. x were replaced by x + 10.

    2. y were replaced by 2 times y.

    3. x and y were interchanged.

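    Ans: In all three cases, the correlation would remain the same. Adding a constant or multiplying by a positive constant does not change the values in standard units, and r is symmetric in x and y.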

  4. Use SPSS to compute the pairwise correlations of the variables in the Nielsen Dataset. Interpret them.

    Ans: There are six pairwise correlations to interpret:

                   Women       Men Teenagers  Children
    Women              1     0.608    -0.191    -0.279
    Men            0.608         1     0.264     0.124
    Teenagers     -0.191     0.264         1     0.789
    Children      -0.279     0.124     0.789         1

    Women and Men: Men and women often watch TV shows together, especially if they are dating or married, so they need to agree on which shows to watch. This results in a positive correlation.

    Women and Teenagers: Teenagers often like to watch action shows or MTV, whereas (at least according to the stereotype) women like to watch soap operas and romantic comedies, resulting in a negative correlation.

    Women and Children: Same comment as women and teenagers. Children also like to watch cartoons.

    Men and Teenagers: Men tend to like action shows more than women do, which matches what teenagers like. (Maybe men are more like teenagers than women are, in general.)

    Men and Children: Same comment as men and teenagers.

    Teenagers and Children: This correlation is quite high. Maybe teenagers are more like children than they would like to admit.

    Note: The preceding remarks are just speculation. One would have to know the names of the TV shows that they were rating to get a better idea of what is happening.
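
    The answers to Problems 1.3, 1.4, and 3 above can be verified with a short computation. The following is a minimal sketch using made-up heights, weights, and (x, y) values; the data and the function name correlation are illustrative assumptions, not part of the Nielsen dataset or of SPSS.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Problem 1.3: every wife is exactly 6 inches shorter (made-up heights, inches).
husband_heights = [66.0, 68.0, 70.0, 71.0, 74.0]
wife_heights = [h - 6.0 for h in husband_heights]
r_heights = correlation(husband_heights, wife_heights)

# Problem 1.4: every wife weighs 70% of her husband's weight (made-up weights, pounds).
husband_weights = [150.0, 165.0, 180.0, 200.0, 220.0]
wife_weights = [0.7 * w for w in husband_weights]
r_weights = correlation(husband_weights, wife_weights)

# Problem 3: shift x, rescale y, and swap x and y (made-up data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r_original = correlation(x, y)
r_shifted = correlation([v + 10.0 for v in x], y)   # x replaced by x + 10
r_scaled = correlation(x, [2.0 * v for v in y])     # y replaced by 2 times y
r_swapped = correlation(y, x)                       # x and y interchanged

print(round(r_heights, 6), round(r_weights, 6))
print(round(r_original, 6), round(r_shifted, 6),
      round(r_scaled, 6), round(r_swapped, 6))
```

    Both perfect linear relationships give r equal to 1 up to rounding error, and the three transformed correlations all match the original one.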

 

Bivariate Normal Datasets

 

Project 3