Programming Assignment 3

CSC 323 Data Analysis and Statistical Software

Due: Section 101: 11/22/2002 & Section 401: 11/26/2002

 

A colleague has developed a new document compression algorithm and wants a model that predicts Processing Time from Document Size. She has tested the algorithm on documents of varying sizes and asks you to help complete the analysis.

She presents you with the data from her experiment and explains that each observation consists of the following values:

  1. Write a SAS program to analyze this dataset. Your program should accomplish the following: (40%)
    Note: Code and output for the income/gpa problem discussed in class are available.
    1. Read your data from an external file.
    2. Execute the PRINT procedure.
    3. Produce a scatterplot of the dependent variable vs. the independent variable.
    4. Generate estimates of the slope and intercept using PROC REG.
      Note: Do not compute these values by hand.
    5. Generate residuals using PROC REG.
      Hint: To generate residuals use the output statement with the out= option. See the code example.
    6. Execute PROC UNIVARIATE with the normal option. Use the variable that contains the residuals generated by PROC REG.
      Hint: See the code example.

    Note: For PROC PRINT, use labels rather than variable names for the column headings. Give data sets and variables meaningful names. Give the output of each procedure an appropriate title.
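    The steps above could be organized along the following lines. This is only a rough sketch, not the class code example: the file name compression.dat, the assumed record layout (document size followed by processing time on each line), and the names compression, regout, DocSize, ProcTime, and Resid are all hypothetical and must be adapted to the actual data.

```sas
/* Hypothetical file name, data set names, and variable names.        */
/* Assumes each record holds two numbers: document size, then time.   */
data compression;
    infile 'compression.dat';
    input DocSize ProcTime;
    label DocSize  = 'Document Size'
          ProcTime = 'Processing Time';
run;

/* The LABEL option makes PROC PRINT use labels as column headings. */
title 'Compression Experiment: Raw Data';
proc print data=compression label;
run;

/* Scatterplot: dependent variable (vertical) vs. independent variable. */
title 'Processing Time vs. Document Size';
proc plot data=compression;
    plot ProcTime*DocSize;
run;
quit;

/* Slope and intercept estimates; residuals are saved to regout. */
title 'Regression of Processing Time on Document Size';
proc reg data=compression;
    model ProcTime = DocSize;
    output out=regout residual=Resid;
run;
quit;

/* NORMAL requests tests of normality for the saved residuals. */
title 'Normality Check for the Residuals';
proc univariate data=regout normal;
    var Resid;
run;
```

    PROC PLOT is used here only because it is part of base SAS; another plotting procedure would serve equally well if it is available.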

  2. Write a report to summarize your findings. Your report should address the following items. An example of the required report format is provided. (60%)
    1. State the regression model. Use appropriate symbols for all of its parameters.
    2. Provide estimates of the model parameters and state the regression equation. You must interpret the coefficients of the regression equation.
    3. Interpret the correlation coefficient.
    4. Assess normality for the residuals.
      Hint: See the corresponding section in the example report.
    5. If normality is reasonable, complete the following problems: