Programming Assignment 2

CSC 323 Data Analysis and Statistical Software

Due: 2/25/99

Use simple linear regression methods to conduct a thorough analysis of the following dataset.

Benchmark Dataset:

Each observation consists of the following variables:

Consider Benchmark Score to be the dependent variable and Clock Rating the independent variable.

  1. Your program should accomplish the following:
    1. Execute the PRINT procedure.
    2. Produce a scatterplot of the dependent variable vs. the independent variable.
    3. Generate estimates of your slope and intercept using the REG procedure.
    4. Generate residuals using the REG procedure.
    5. Produce a residual plot.
    6. Execute PROC UNIVARIATE for the appropriate variable and with the appropriate options.

    Note: For PROC PRINT, be sure to use labels rather than variable names for column headings. Use meaningful names for datasets and variables. Give the output of these procedures an appropriate title.
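
    The six steps above could be laid out roughly as follows. This is only a minimal sketch, not a model solution. The dataset name (benchmark), the variable names (clock, score), the output dataset (fitted), and the title text are all hypothetical, and the data rows are made-up placeholders that stand in for the assignment's dataset. PROC PLOT is used here as one possible way to draw the two plots, and residual is assumed to be the variable examined by PROC UNIVARIATE.

    ```sas
    /* Placeholder rows only: these values are invented for illustration
       and are NOT the assignment's benchmark dataset. */
    data benchmark;
       input clock score;
       label clock = 'Clock Rating'
             score = 'Benchmark Score';
       datalines;
    100 21
    133 27
    166 35
    200 40
    233 49
    ;
    run;

    title 'Benchmark Score vs. Clock Rating';

    /* Step 1: list the data, with labels as column headings */
    proc print data=benchmark label;
    run;

    /* Step 2: scatterplot of the dependent vs. the independent variable */
    proc plot data=benchmark;
       plot score*clock;
    run;

    /* Steps 3 and 4: slope and intercept estimates, plus residuals
       saved to a new dataset */
    proc reg data=benchmark;
       model score = clock;
       output out=fitted p=predicted r=residual;
    run;
    quit;

    /* Step 5: residual plot with a reference line at zero */
    proc plot data=fitted;
       plot residual*clock / vref=0;
    run;

    /* Step 6: normality tests and plots for the residuals */
    proc univariate data=fitted normal plot;
       var residual;
    run;
    ```

    The OUTPUT statement in PROC REG is what makes the residuals available to the later procedures. Without it, the residuals appear only in the printed output.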

  2. Write a short analysis (no more than 2 pages) of the output of your SAS program. Your analysis should address at least the following:
    1. State the regression model, including symbols for all of its parameters.
    2. Give the estimates of the model parameters.
    3. State the regression equation.
    4. Interpret the coefficients of the regression equation.
    5. Interpret the R-square value from the 'Analysis of Variance' section of your output.
    6. Determine the correlation between your dependent and independent variables.
    7. Comment on the model assumptions.
    Note: A sample analysis is available.
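
    For reference, one common way to write the quantities asked for in items 1, 3, and 6 is sketched below. The symbols are assumptions, since the assignment does not fix a notation: Y is Benchmark Score, X is Clock Rating, beta_0 and beta_1 are the intercept and slope, sigma^2 is the error variance, and b_0 and b_1 are the estimates reported by PROC REG.

    ```latex
    % Simple linear regression model with normal, independent errors
    Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad
    \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2), \quad i = 1, \dots, n

    % Fitted regression equation using the estimates b_0 and b_1
    \hat{Y} = b_0 + b_1 X

    % In simple linear regression, the correlation follows from R-square
    % and the sign of the estimated slope
    r = \operatorname{sign}(b_1)\,\sqrt{R^2}
    ```

    The last line holds only when the model has a single independent variable, as it does here.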