Programming Assignment 3
CSC 323 - Data Analysis and Statistical Software
Due: 6/12/2001
Your colleague is interested in developing a model to predict the
performance of workstations and believes that the Tower of Hanoi (TOH)
benchmark is an appropriate measure of performance. She decides to
conduct an experiment to develop this model and selects a sample of
workstations, based on different CPUs, from different manufacturers.
She notes the Clock Rating and Matrix Inversion (MI) score reported by
each manufacturer and then executes the TOH benchmark on each
workstation.
Note: The TOH benchmark is the time in microseconds (μs) to make 25 TOH
moves. The MI score is the time in microseconds (μs) to complete a
standard matrix inversion.
You have been presented with the data collected for this experiment.
Each observation consists of the following values, in the column
positions shown:
- CPU Type; columns 1-8
- Architecture; columns 9-12
- TOH Benchmark (μs); columns 13-15
- Clock Rating (MHz); columns 16-18
- MI Score (μs); columns 19-21
Consider TOH Benchmark to be the response variable. Clock Rating and
MI Score are candidate explanatory variables.
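
The layout above maps naturally onto SAS column input. The following is a minimal sketch, not a required solution: the file name workstations.dat, the data set name, and the variable names are all assumptions made for illustration, as is the choice to read CPU Type and Architecture as character values.

```sas
/* Sketch only: file, data set, and variable names are hypothetical. */
data workstations;
    infile 'workstations.dat';          /* assumed external data file   */
    input cpu_type     $ 1-8            /* CPU Type (read as character) */
          architecture $ 9-12           /* Architecture (character)     */
          toh            13-15          /* TOH Benchmark                */
          clock          16-18          /* Clock Rating (MHz)           */
          mi             19-21;         /* MI Score                     */
    label cpu_type     = 'CPU Type'
          architecture = 'Architecture'
          toh          = 'TOH Benchmark'
          clock        = 'Clock Rating (MHz)'
          mi           = 'MI Score';
run;
```

Assigning labels in the DATA step means that any later procedure run with the LABEL option can show readable column headings without repeating the LABEL statement.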
- Your program should accomplish the following: (30%)
Note: Code and output for the income/gpa problem discussed in class
are available.
  - Read your data from an external file.
  - Execute the PRINT procedure.
  - Produce a scatterplot of the response variable vs. each of the candidate explanatory variables.
  - Execute PROC CORR for the response variable and each of the candidate explanatory variables.
  - For the best explanatory variable:
    - Generate estimates of your slope and intercept using PROC REG.
    - Execute PROC UNIVARIATE with the appropriate options for your residuals.
Note: For PROC PRINT, be sure to use labels for column headings rather
than variable names. Use meaningful names for data sets and variables.
Generate an appropriate title for the output of these procedures.
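
The sequence of procedures listed above might be laid out roughly as follows. This is a sketch under stated assumptions: it presumes a data set named workstations already exists with labeled numeric variables toh, clock, and mi (all hypothetical names), it uses clock only as a stand-in for whichever explanatory variable proves best, and it treats NORMAL and PLOT as one reasonable reading of "the appropriate options".

```sas
/* Sketch only. Assumes data set WORKSTATIONS already exists with numeric
   variables toh, clock, and mi (hypothetical names) and labels assigned. */
title 'Workstation Benchmark Study';      /* hypothetical title */

proc print data=workstations label;       /* LABEL: headings use labels */
run;

proc plot data=workstations;              /* response on vertical axis  */
    plot toh*clock toh*mi;
run;

proc corr data=workstations;              /* pairwise correlations      */
    var toh clock mi;
run;

/* CLOCK is a placeholder for the best explanatory variable. */
proc reg data=workstations;
    model toh = clock;
    output out=fit r=residual;            /* save residuals to FIT      */
run;
quit;

proc univariate data=fit normal plot;     /* NORMAL: normality tests    */
    var residual;
run;
```

Saving the residuals with the OUTPUT statement keeps the normality check separate from the regression step, so the same PROC UNIVARIATE call works unchanged if MI Score is used in the MODEL statement instead.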
- Your analysis should address the following. An example of the format
required for your analysis is provided. Note that your analysis must
include a section that identifies the best explanatory variable. (70%)
  - Identify the best explanatory variable, that is, the explanatory variable that does the best job of explaining variability in the response variable.
  - For the best explanatory variable:
    - State the regression model. Use appropriate symbols for all of its parameters.
    - Provide estimates of the model parameters and state the regression equation. You must interpret the coefficients of the regression equation.
    - Interpret the correlation coefficient.
    - Assess the normality assumption. You must state the appropriate hypotheses to assess this assumption. If normality is reasonable:
      - Predict the TOH Benchmark score for:
        - a Clock Rating of 650 MHz if you determine that Clock Rating is the best explanatory variable OR
        - an MI Score of 500 if you determine that MI Score is the best explanatory variable
      - Determine the proportion of workstations that obtained a TOH Benchmark score greater than 300 for:
        - a Clock Rating of 650 MHz if you determine that Clock Rating is the best explanatory variable OR
        - an MI Score of 500 if you determine that MI Score is the best explanatory variable
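
One common way to approach the last two items, assuming the normality check passes, is to treat the response at a fixed value of the explanatory variable as normally distributed around the fitted value, with the root mean squared error as the estimated standard deviation. The proportion above 300 is then 1 minus the standard normal CDF evaluated at (300 - predicted) / root MSE. The sketch below illustrates that idea only. It assumes Clock Rating was chosen, that a data set named workstations with variables toh and clock exists (hypothetical names), and that this is the method intended in class, which the assignment does not state.

```sas
/* Sketch only. Assumes data set WORKSTATIONS with variables toh and clock
   (hypothetical names) and that Clock Rating is the chosen predictor.    */
proc reg data=workstations outest=est noprint;
    model toh = clock;                    /* estimates written to EST     */
run;
quit;

data prediction;
    set est;                              /* one row: Intercept, clock, _RMSE_ */
    x          = 650;                     /* Clock Rating of interest (MHz)    */
    predicted  = Intercept + clock * x;   /* in EST, clock holds the slope     */
    z          = (300 - predicted) / _RMSE_;
    prop_above = 1 - probnorm(z);         /* estimated P(TOH > 300 at x)       */
run;

proc print data=prediction;
    var x predicted z prop_above;
run;
```

If MI Score is the better variable, the same steps apply with mi in the MODEL statement and in the predicted-value expression, and x set to 500.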