Ans: A bivariate dataset that is univariate normal in any direction.
Ans: 5: the two means x̄ and ȳ, the two standard deviations SDx and SDy, and the correlation r.
Ans: m means + m SDs + m(m-1)/2 pairwise correlations, for a total of 2m + m(m-1)/2 = m²/2 + 3m/2 parameters.
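As a quick check of that count, here is a minimal Python sketch. The function name `normal_parameter_count` is made up for illustration; it simply adds the three groups of parameters and compares the sum with the closed form m²/2 + 3m/2.

```python
def normal_parameter_count(m: int) -> int:
    """Count parameters of an m-variate normal: means, SDs, pairwise correlations."""
    means = m
    sds = m
    correlations = m * (m - 1) // 2  # one correlation per unordered pair of variables
    return means + sds + correlations


for m in (1, 2, 3, 10):
    closed_form = (m * m + 3 * m) // 2  # m^2/2 + 3m/2, always a whole number
    print(m, normal_parameter_count(m), closed_form)
```

For m = 2 this gives the 5 parameters of the bivariate case.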
Ans: A situation where the response curve is nonlinear, as in the following image. The correlation is zero, yet there is perfect causality: x determines y exactly.
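One way to see this numerically is the sketch below. It assumes the curve y = x² on x-values symmetric about zero; that curve is chosen only for illustration and is not necessarily the one in the image.

```python
import numpy as np

# Assumed example: x-values symmetric about 0 and a perfectly deterministic y = x^2.
x = np.arange(-5, 6, dtype=float)
y = x ** 2

# Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(f"r = {r:.6f}")
```

The correlation comes out as zero (up to rounding) even though y is an exact function of x, because r only measures linear association.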
Ans: It tells you the proportion of variation in the dependent variable that can be explained by x.
x | y |
---|---|
1 | 1 |
2 | 3 |
3 | 2 |
4 | 4 |
Here is a table of the calculations:
x | y | zx | zy | zx zy |
---|---|---|---|---|
1 | 1 | -1.161895 | -1.161895 | 1.35 |
2 | 3 | -0.387298 | 0.387298 | -0.15 |
3 | 2 | 0.387298 | -0.387298 | -0.15 |
4 | 4 | 1.161895 | 1.161895 | 1.35 |
The average of the products is (1.35 + (-0.15) + (-0.15) + 1.35) / 4 = 0.60. Multiply this by the correction factor n / (n-1) = 4/3 to obtain the correlation: r = 0.60 × 4/3 = 0.80.
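The z-score calculation can be reproduced with a short Python sketch. It assumes the z-scores use the sample SD (divisor n-1), which is what the tabulated value -1.161895 implies; the variable names are illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])
n = len(x)

# z-scores based on the sample SD (ddof=1 means divide by n-1).
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

products = zx * zy
average_product = products.mean()        # plain average, dividing by n
r = average_product * n / (n - 1)        # correction factor n / (n-1)

print(np.round(zx, 6), np.round(zy, 6))
print(np.round(products, 2), round(average_product, 2), round(r, 2))
```

The result agrees with `np.corrcoef(x, y)[0, 1]`, which computes the same correlation directly.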
Line of Averages (the line that connects the averages of the y-values in thin vertical rectangles)
Least Squares Line (the line that minimizes the sum of the squares of the residuals)
Linear Trend Line
x | y |
---|---|
1 | 1 |
2 | 3 |
3 | 2 |
4 | 4 |
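As an illustration of the least squares idea, the sketch below fits a line to the four points tabulated above using the standard formulas slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept = ȳ - slope · x̄. The variable names are illustrative, and this is one way to do the computation rather than a prescribed method.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])

# Closed-form least squares estimates for a straight line y = intercept + slope * x.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# Residuals and the quantity the least squares line minimizes.
residuals = y - (intercept + slope * x)
sum_sq_residuals = np.sum(residuals ** 2)

print(f"y = {intercept:.2f} + {slope:.2f} x, sum of squared residuals = {sum_sq_residuals:.2f}")
```

Any other line through these points gives a larger sum of squared residuals; `np.polyfit(x, y, 1)` returns the same slope and intercept.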
Such data is included in the Pendulum Dataset. The time for 10 periods is measured in seconds; the length is measured in centimeters. Transform the data to obtain x as the pendulum length in meters and y as the pendulum period in seconds.
Note: from physics, period = 2π √(length / g), where π = 3.14159265 and g = 9.80665 m/sec² is the acceleration of gravity.
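The transformation and the physics formula can be sketched in Python as follows. The function names and the sample measurement (100 cm, 20.1 seconds for 10 periods) are hypothetical and are not taken from the Pendulum Dataset.

```python
import math

G = 9.80665  # acceleration of gravity in m/sec^2


def length_in_meters(length_cm: float) -> float:
    """Convert a pendulum length from centimeters to meters (x)."""
    return length_cm / 100.0


def period_in_seconds(time_for_10_periods: float) -> float:
    """Convert the time for 10 periods into the time for one period (y)."""
    return time_for_10_periods / 10.0


def theoretical_period(length_m: float, g: float = G) -> float:
    """Period predicted by physics: 2 * pi * sqrt(length / g)."""
    return 2.0 * math.pi * math.sqrt(length_m / g)


# Hypothetical measurement, for illustration only.
x = length_in_meters(100.0)
y = period_in_seconds(20.1)
print(x, y, round(theoretical_period(x), 4))
```

Comparing y with `theoretical_period(x)` for each row shows how closely the measured periods follow the square-root law.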