CSC423/324 - Data Analysis
Quiz #7
There are two questions for a total of 40 points. Plan on spending about
20 minutes on all questions.
You may use your notes to answer the questions. Please submit answers by email to
jmorgan1@condor.depaul.edu no later than Monday, 3/24. Do not send attachments.
Embed your answers in the body of your email.
- Consider the Microsoft Word training problem. That is, a training consultant
believes that training can improve the efficiency of casual Microsoft Word users and
so decides to conduct an experiment to investigate. She randomly selects two groups
of Microsoft Word users from her organization. One group is the Control group and
the other is the Treatment group (i.e., the group that receives training). She
assigns a suite of tasks to each individual in each group and records the time
required to complete the suite. (20 pts)
  - Given the problem statement above, identify and state the null and alternative hypotheses.
  - Examine the proc reg output below. Let us say that x=0 identifies times for the treatment group and x=1 identifies times for the control group. Given this output, and your hypotheses, complete the following:
    - Derive the sample means.
    - Conduct a test of hypotheses. Remember to state all necessary assumptions.
    - Comment on the consultant's point of view.
Parameter Estimates

Variable     Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept               6.35112          1.08170   5.87142     <.0001
x                       2.01050          1.52975   1.31426     0.0950
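
As a hedged illustration of how sample means relate to the estimates when x is a 0/1 indicator: in a least-squares fit, the intercept equals the mean of the x=0 group and the slope equals the difference between the two group means. The sketch below checks this on a small set of invented times (the data and the helper name fit_on_indicator are hypothetical, not part of the quiz), then applies the same relationship to the printed estimates.

```python
from statistics import mean

def fit_on_indicator(x, y):
    """Least-squares intercept and slope for y regressed on a single predictor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical completion times, invented for illustration only.
x = [0, 0, 0, 1, 1, 1]
y = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
b0, b1 = fit_on_indicator(x, y)

mean_x0 = mean(yi for xi, yi in zip(x, y) if xi == 0)
mean_x1 = mean(yi for xi, yi in zip(x, y) if xi == 1)
# The intercept is the x=0 group mean; intercept + slope is the x=1 group mean.
assert abs(b0 - mean_x0) < 1e-9
assert abs(b0 + b1 - mean_x1) < 1e-9

# Same relationship applied to the printed estimates (x=0 treatment, x=1 control).
treatment_mean = 6.35112
control_mean = 6.35112 + 2.01050

# The t statistic for the slope is the estimate divided by its standard error.
t_slope = 2.01050 / 1.52975
```

Whether the printed Pr > |t| is one-sided or two-sided is not stated in the output, so it should be checked against the alternative hypothesis before drawing a conclusion.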
- Consider the quality score prediction problem. That is, you are interested in
deriving a multiple regression model to predict quality score from two independent
variables. Quality score is the time to the first failure in hours. However, in this
case, the independent variables are all-uses coverage, x1 (i.e., a coverage measure
that is similar to decision coverage), and programmer experience, x2 (i.e.,
programming experience in years). (20 pts)
  Examine the proc reg output below. Given this output, complete the following:
  - Comment on the estimates of the slope parameters. That is, interpret the estimates and comment on the importance of each parameter to the model.
  - Your colleagues argue that a unit increase in all-uses coverage will lead to an increase in execution time to first failure of 240 minutes. You believe that the benefit of this additional coverage is better than they contend. Formulate the necessary hypotheses to resolve this issue. Given the proc reg output, conduct a test of hypotheses.
Parameter Estimates

Variable     Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept              17.33146          5.94571      8.50     0.0172
x1                      5.42188          1.87820      8.33     0.0180
x2                      3.00957          1.47685      4.15     0.0720
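
A minimal sketch of testing a slope against a nonzero hypothesized value, assuming the colleagues' 240 minutes must first be converted to hours to match the units of quality score. It uses only the Estimate and Standard Error columns for x1; the helper name t_statistic is hypothetical. The error degrees of freedom are not shown in the excerpt, so the sketch stops at the test statistic and does not compute a p-value.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t statistic for H0: beta = null_value, given an estimate and its standard error."""
    return (estimate - null_value) / std_error

# Quality score is measured in hours, so the claimed 240 minutes becomes 4 hours.
null_hours = 240 / 60

# H0: beta1 = 4 versus Ha: beta1 > 4 (the benefit is better than claimed).
t_x1 = t_statistic(5.42188, 1.87820, null_hours)
print(round(t_x1, 3))
```

The resulting statistic would be compared with an upper-tail critical value from the t distribution on the model's error degrees of freedom.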