Sample Analysis
Programming Assignment #3

Model:

We wish to model the relationship between starting income, measured in dollars, and grade point average (GPA) for a population of recent non-computer science graduates (class of 1996). Let y denote starting income and x denote GPA. The simple linear regression model for this problem is:

y = a + bx + e

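where, for fixed x, the error term e is assumed to be normally distributed, to be unbiased (mean zero), and to have a constant standard deviation that does not depend on x.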

The model parameters are: a - population intercept; b - population slope; sy|x - standard deviation about the regression line
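As an illustration only, the model can be sketched as a small simulation. It assumes, as the residual analysis in this report does, a normal error with mean zero and constant standard deviation. The function name simulate_income and the parameter values below are hypothetical and are not taken from the assignment.

```python
import random

def simulate_income(gpas, a, b, sigma, seed=0):
    """Draw one income y = a + b*x + e per GPA x, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [a + b * x + rng.gauss(0.0, sigma) for x in gpas]

# Hypothetical parameter values, chosen only for illustration.
gpas = [2.0, 2.5, 3.0, 3.5, 4.0]
incomes = simulate_income(gpas, a=500.0, b=8000.0, sigma=2000.0)
for x, y in zip(gpas, incomes):
    print(f"GPA {x:.1f}: simulated income {y:.2f}")
```

Setting sigma to zero removes the error term, so every simulated point falls exactly on the line a + bx.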

Residual Analysis:

To assess these assumptions, we conducted a residual analysis. The hypotheses for testing the normality assumption are:

H0: Residuals are drawn from a normal distribution
Ha: Residuals are not drawn from a normal distribution

The p-value is 0.6960, which gives insufficient reason to doubt the null hypothesis, so the normality assumption is reasonable. The box plot of residuals shows no outliers, so the unbiased assumption is also reasonable. Since the residual plot shows no systematic dependence on the independent variable, the constant standard deviation assumption is also reasonable.

These findings support the assumptions of the regression model and indicate that the simple linear regression model is appropriate for this problem.
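The residual checks can be sketched in a few lines. This is a minimal illustration, not the SAS procedure used for the report: the data are hypothetical, the helper names are invented here, and the 1.5 x IQR fences are the usual box-plot convention rather than a setting confirmed by the report. A formal normality test would need a statistics package and is not attempted.

```python
import statistics

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def residuals(xs, ys, a, b):
    """Observed value minus fitted value for each point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def boxplot_outliers(values):
    """Values lying outside the 1.5 * IQR fences (assumed convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical (GPA, income) pairs, NOT the assignment's data.
gpas = [1.6, 2.0, 2.4, 2.8, 3.2, 3.6, 4.0]
incomes = [13500, 16000, 19800, 22100, 26300, 28900, 32500]

a, b = fit_line(gpas, incomes)
res = residuals(gpas, incomes, a, b)
print("sum of residuals:", round(sum(res), 6))     # zero up to rounding
print("box-plot outliers:", boxplot_outliers(res))  # empty list means none
```

Plotting res against gpas would give the residual plot used to judge the constant standard deviation assumption.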

Discussion:

From the SAS Analysis of Variance report, the parameter estimates are:

a = 271.54
b = 7994.98
sy|x = 2239.25

Hence, the regression equation is y = 271.54 + 7994.98x, which indicates that for a unit increase in GPA we can expect an increase in starting income of about $7995. The equation also indicates an expected starting income of $271.54 for a GPA of zero. Since the minimum observed GPA is 1.6, a GPA of zero lies well outside the observed data and this interpretation is probably not meaningful.
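The interpretation of the slope and intercept can be checked with a short sketch that evaluates the fitted equation. The estimates are those reported above; the function name predicted_income is only illustrative.

```python
A_HAT = 271.54    # estimated intercept from the report
B_HAT = 7994.98   # estimated slope from the report

def predicted_income(gpa):
    """Predicted starting income in dollars for a given GPA."""
    return A_HAT + B_HAT * gpa

# A one-point rise in GPA changes the prediction by the slope.
difference = predicted_income(3.0) - predicted_income(2.0)
print(round(difference, 2))             # 7994.98
print(round(predicted_income(3.0), 2))  # 24256.48
```

Predictions are only trustworthy within the range of observed GPAs, which starts at 1.6.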

The r-square value of 0.8667 indicates that about 87% of the variance in starting income is explained by the regression model. Since the model is a simple linear regression model and the estimated slope is positive, we can also conclude that the correlation coefficient is +sqrt(0.8667) (i.e. 0.931), indicating a strong positive correlation between starting income and GPA.
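The link between r-square and the correlation coefficient in simple linear regression can be verified numerically. The sketch below uses hypothetical data, not the assignment's, and the function names are invented for illustration. It computes r-square as 1 - SSE/SST and compares it with the square of the Pearson correlation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared(xs, ys):
    """Share of variance explained by the least-squares line: 1 - SSE/SST."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - sse / sst

# Hypothetical (GPA, income) pairs, NOT the assignment's data.
gpas = [1.6, 2.0, 2.4, 2.8, 3.2, 3.6, 4.0]
incomes = [15000, 14500, 21000, 20500, 27500, 26000, 33000]

r = pearson_r(gpas, incomes)
r2 = r_squared(gpas, incomes)
print(abs(r ** 2 - r2) < 1e-9)  # the two quantities agree
print(r > 0)                    # r takes the sign of the slope
```

The sign of r always matches the sign of the fitted slope, which is why the positive square root is taken in this report.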

Summary:

Simple linear regression is appropriate for this problem. Furthermore, the model explains about 87% of the variance in starting income, indicating that, for these non-computer science graduates, GPA is a good predictor of starting income.

The regression equation for this relationship is y = 271.54 + 7994.98x. Hence, an increase in starting income of about $7995 can be expected for a unit increase in GPA. However, since the minimum observed GPA is 1.6, the intercept is probably not meaningful.