Programming Assignment 2
CSC 323 Data Analysis and Statistical Software
Due: Section 702: 10/21/98 & Section 403: 10/22/98.
A colleague has developed a new algorithm for encrypting
documents and wants a model that predicts Processing Time
from Document Size. She has tested the algorithm on documents
of varying sizes and recorded the following measurements for
each document:
- Document Size (# of words)
- Processing Time (ms)
The encryption link contains her data.
- Write a SAS program to analyze this dataset. Your program
should do the following:
- Execute the PRINT procedure.
- Produce a scatterplot of the dependent variable
vs. the independent variable.
- Generate estimates of the slope and intercept using
the REG procedure.
- Generate residuals using the REG
procedure.
- Produce a residual plot.
- Execute PROC UNIVARIATE with the appropriate
options for residual analysis.
Note: For PROC PRINT, be sure to use labels rather than
variable names for the column headings. Use meaningful names
for data sets and variables. Give the output of each
procedure an appropriate title.
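The steps listed above might be laid out along the following lines. This is only a skeleton under stated assumptions, not a required solution: the file name encryption.dat, the column order (size first, then time), and the names encrypt, fitted, docsize, proctime, pred and resid are all hypothetical, and PROC PLOT is used only as one possible way to draw the plots.

```sas
/* Hypothetical file name and layout: two whitespace-separated */
/* columns, document size first, then processing time.         */
data encrypt;
    infile 'encryption.dat';
    input docsize proctime;
    label docsize  = 'Document Size (# of words)'
          proctime = 'Processing Time (ms)';
run;

title 'Encryption Algorithm: Processing Time vs. Document Size';

/* LABEL makes PROC PRINT use the labels as column headings. */
proc print data=encrypt label;
run;

/* Scatterplot: dependent variable on the vertical axis. */
proc plot data=encrypt;
    plot proctime*docsize;
run;

/* Slope and intercept estimates; OUTPUT saves the predicted */
/* values and residuals in a new data set.                   */
proc reg data=encrypt;
    model proctime = docsize;
    output out=fitted p=pred r=resid;
run;
quit;

/* Residual plot with a reference line at zero. */
proc plot data=fitted;
    plot resid*docsize / vref=0;
run;

/* NORMAL requests tests for normality; PLOT requests the */
/* stem-and-leaf, box and normal probability plots.       */
proc univariate data=fitted normal plot;
    var resid;
run;
```

A TITLE statement stays in effect until it is replaced, so a separate title can be placed before each procedure if the output should be labeled step by step.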
- Write a short analysis (no more than two pages) of
the output of your SAS program. Your analysis should address
at least the following:
- State the regression model including symbols for
all of its parameters.
- Give the estimates of the model parameters.
- State the regression equation.
- Interpret the coefficients of the regression
equation.
- Interpret the R-square value from the Analysis
of Variance section of your output.
- Determine the correlation between your dependent
and independent variables.
- Comment on the model assumptions.
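As a reminder of the notation these items refer to, the simple linear regression model can be written as below. The symbols are an assumption (one common textbook convention) and may differ from those used in class: Y is Processing Time, X is Document Size, b_0 and b_1 are the estimates of the intercept and slope, and R^2 is the R-square value reported by PROC REG.

```latex
% Model: one response, one predictor, independent normal errors
\[
Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2) \ \text{independently}
\]
% Fitted regression equation using the estimates
\[
\hat{Y} = b_0 + b_1 X
\]
% With a single predictor, the correlation follows from R-square
\[
r = \mathrm{sign}(b_1)\,\sqrt{R^2}
\]
```

The last relation holds only when there is a single independent variable, as in this assignment.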