Programming Assignment 3

CSC 323 - Data Analysis and Statistical Software

Due: 6/12/2001

Your colleague is interested in developing a model to predict the performance of workstations and believes that the Tower of Hanoi (TOH) benchmark is an appropriate measure of performance. She decides to conduct an experiment to develop this model and selects a sample of workstations, based on different CPUs, from different manufacturers. She notes the Clock Rating and Matrix Inversion (MI) score reported by each manufacturer and then executes the TOH benchmark on each workstation.

Note: The TOH benchmark is the time in microseconds (µs) to make 25 TOH moves. The MI score is the time in microseconds (µs) to complete a standard matrix inversion.

You have been presented with the data collected for this experiment. Each observation consists of the following values:

Consider TOH Benchmark to be the response variable. Clock Rating and MI Score are candidate explanatory variables.

  1. Your program should accomplish the following: (30%)
    Note: Code and output for the income/gpa problem discussed in class are available.
    1. Read your data from an external file.
    2. Execute the PRINT procedure.
    3. Produce a scatterplot of the response variable vs. each of the candidate explanatory variables.
    4. Execute PROC CORR for the response variable and each of the candidate explanatory variables.
    5. For the best explanatory variable:
      1. Generate estimates of your slope and intercept using PROC REG.
      2. Execute PROC UNIVARIATE with the appropriate options for your residuals.

      Note: For PROC PRINT, be sure to use labels for column headings rather than variable names. Use meaningful names for data sets and variables. You should generate an appropriate title for the output of each of these procedures.
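The quantities requested in items 4 and 5 above can be cross-checked by hand outside of SAS. The following is a minimal sketch in Python (standard library only), not a substitute for the required SAS program. The helper names `pearson_r` and `least_squares` and all data values are hypothetical and chosen only for illustration; they are not the assignment data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical illustration values only; NOT the assignment data.
clock = [400, 500, 600, 700, 800]
mi = [620, 540, 470, 430, 380]
toh = [410, 350, 300, 260, 230]

# Correlation of the response with each candidate explanatory variable.
candidates = {"Clock Rating": clock, "MI Score": mi}
correlations = {name: pearson_r(values, toh) for name, values in candidates.items()}

# The candidate with the largest absolute correlation explains the most
# variability in the response in a simple linear regression.
best = max(correlations, key=lambda name: abs(correlations[name]))

# Slope, intercept, and residuals for the best candidate.
b0, b1 = least_squares(candidates[best], toh)
residuals = [y - (b0 + b1 * x) for x, y in zip(candidates[best], toh)]

print(best, correlations[best], b0, b1)
print(residuals)
```

The residuals computed this way are the values whose distribution is examined in item 5.2. With an intercept in the model they sum to zero up to rounding, which is a quick sanity check on any fit.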

  2. Your analysis should address the following. An example of the required format is provided; note that your analysis must include a section that identifies the best explanatory variable. (70%)
    1. Identify the best explanatory variable, that is, the explanatory variable that does the best job of explaining variability in the response variable.
    2. For the best explanatory variable:
      1. State the regression model. Use appropriate symbols for all of its parameters.
      2. Provide estimates of the model parameters and state the regression equation. You must interpret the coefficients of the regression equation.
      3. Interpret the correlation coefficient.
      4. Assess the normality assumption. You must state the appropriate hypotheses to assess this assumption. If normality is reasonable:
        1. Predict TOH Benchmark score for:
          • a Clock Rating of 650 MHz if you determine that Clock Rating is the best explanatory variable OR
          • an MI Score of 500 if you determine that MI Score is the best explanatory variable
        2. Determine the proportion of workstations that obtained a TOH Benchmark score greater than 300 for:
          • a Clock Rating of 650 MHz if you determine that Clock Rating is the best explanatory variable OR
          • an MI Score of 500 if you determine that MI Score is the best explanatory variable
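The arithmetic behind the prediction and the proportion can be sketched as follows, assuming the normality assumption holds. This is a minimal Python illustration (standard library only) of one reading of the two tasks: the predicted score is the fitted mean at the given x, and the proportion above 300 is the upper-tail probability of a normal distribution centered at that fitted mean, with the root mean squared error from the regression standing in for the error standard deviation. The function names and the values of `b0`, `b1`, and `root_mse` are hypothetical placeholders; the actual estimates come from your own regression output.

```python
from statistics import NormalDist


def predict(b0, b1, x):
    """Fitted mean response at x for the line y = b0 + b1 * x."""
    return b0 + b1 * x


def proportion_above(threshold, b0, b1, x, root_mse):
    """Share of responses above threshold at x, assuming normal errors.

    root_mse is used as the estimate of the error standard deviation.
    """
    mean_at_x = predict(b0, b1, x)
    return 1.0 - NormalDist(mu=mean_at_x, sigma=root_mse).cdf(threshold)


# Hypothetical estimates for illustration only; NOT results for the assignment data.
b0, b1, root_mse = 120.0, 0.4, 25.0

predicted = predict(b0, b1, 500)                       # fitted TOH score at MI Score = 500
share = proportion_above(300, b0, b1, 500, root_mse)   # P(TOH > 300 | MI Score = 500)

print(predicted, share)
```

With these placeholder values the fitted mean is 320, so the threshold of 300 lies 0.8 standard deviations below the mean and roughly 79% of scores fall above it. The same steps apply with a Clock Rating of 650 MHz if Clock Rating is the better explanatory variable.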