Programming Assignment 1
CSC 323 - Data Analysis and Statistical Software
Due: 7/2/2003
The new director of software development at a large local company has decided to investigate software quality. The director is particularly interested in the C++ portfolio. She decides to assess quality by analyzing several quality characteristics for each program in a randomly selected sample of programs taken from the portfolio. The CIO is prepared to support and fund a quality assurance program if software quality is unacceptably low.
The director has recruited you to help with data analysis. You have been asked to analyze fault density. The director explains that fault density is the number of faults discovered during testing and operational use, per line of code (LOC). The director also indicates that only programs with more than 1000 hours of operational use were considered for this experiment. Fault density is a well-known quality measure that is appropriate for this software portfolio.
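The following details are available for each program in the sample (the numbers after each item give the columns the field occupies in each data record):
- Program Name; columns 1-6
- Cyclomatic Complexity; columns 7-8
- Program Size (LOC); columns 9-12
- Number of Faults (Testing); columns 13-15
- Number of Faults (Operations); columns 16-19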
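As a cross-check outside SAS, the record layout and the fault density definition above can be sketched in Python. This is a minimal sketch under stated assumptions, not the required SAS program: it assumes the listed positions are 1-based, inclusive columns, and that "number of faults" means testing faults plus operational faults. The names `ProgramRecord` and `parse_record` and the sample record are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProgramRecord:
    name: str
    complexity: int
    loc: int
    faults_testing: int
    faults_operations: int

    @property
    def fault_density(self) -> float:
        # Faults found in testing plus operational use, per line of code.
        return (self.faults_testing + self.faults_operations) / self.loc


def parse_record(line: str) -> ProgramRecord:
    # Assumption: columns are 1-based and inclusive, so columns 7-8
    # correspond to the Python slice [6:8].
    return ProgramRecord(
        name=line[0:6].strip(),
        complexity=int(line[6:8]),
        loc=int(line[8:12]),
        faults_testing=int(line[12:15]),
        faults_operations=int(line[15:19]),
    )


# Made-up record for illustration only; it is not from the assignment data.
sample_line = "abcsrt12 850  7   3"
record = parse_record(sample_line)
print(record.name, round(record.fault_density, 4))
```

In SAS, the same layout would normally be read in the DATA step and the density computed with an assignment statement; the sketch is only a way to sanity-check a few records by hand.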
You have been asked to help with the analysis of these data. The requirements for your analysis are detailed below:
- Write a SAS program to analyze these data. In particular, your program should analyze fault density. (50%)
  Your program should accomplish the following:
  - Access your data from an external file.
  - Compute fault density.
    Note: You will need an assignment statement in your DATA step. If necessary, see "DATA step statements", point 6, SAS Review.
  - Execute the PRINT and MEANS procedures with appropriate options.
    - For PROC PRINT, be sure to use your defined labels as column headings. That is, use the LABEL statement to define "Number of Faults", "Program Name", "Program Size", "Cyclomatic Complexity", and "Fault Density" as labels, and then make sure that these labels are printed as column headings. Also, generate an appropriate title for your output.
    Note: If necessary, see the SAS program used for the in-class SAS demonstration.
- Write a short report (no more than a couple of paragraphs) discussing your findings. (50%)
  Your report should address the following:
  - Provide estimates of the population mean and population standard deviation to four decimal places (i.e., the mean fault density and the standard deviation of fault density for the portfolio).
  - The CIO has asked you to address the following in your report (include all necessary computations and diagrams to support your answer):
    - Program prtprn is considered to be problematic by several staff members. That is, they contend that it has a higher fault density than the average program in the portfolio. Twenty faults were discovered during the testing of program prtprn and nine more during operational use. Program prtprn has 640 lines of code. Determine its percentile rank with respect to fault density and comment on the point of view of the staff members.
    - Determine the fault density that distinguishes high-quality programs from others in the portfolio. The CIO considers high-quality programs to be the best 22% of the portfolio with respect to fault density.
    Note: To address these questions, assume that fault density is normally distributed. Also, use your estimates of the relevant population parameters.
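The two normal-distribution questions above can be sketched in Python with the standard library's `statistics.NormalDist`. This is a hedged illustration, not a solution: the mean and standard deviation below are invented placeholders that must be replaced with the estimates from PROC MEANS, and the sketch assumes that "best" means lowest fault density, so the high-quality cutoff is taken from the lower tail.

```python
from statistics import NormalDist

# Placeholder estimates, invented for illustration. Replace them with the
# mean and standard deviation reported by PROC MEANS for fault density.
mean_fd = 0.0300
sd_fd = 0.0100

portfolio = NormalDist(mu=mean_fd, sigma=sd_fd)

# Program prtprn, using the figures stated in the assignment:
# 20 testing faults, 9 operational faults, 640 lines of code.
prtprn_fd = (20 + 9) / 640

# Percentile rank: percentage of the portfolio expected to have a fault
# density at or below that of prtprn, under the normal assumption.
prtprn_percentile = 100 * portfolio.cdf(prtprn_fd)

# Assumption: lower fault density means higher quality, so the best 22%
# of programs lie below the 22nd percentile of fault density.
high_quality_cutoff = portfolio.inv_cdf(0.22)

print(f"prtprn fault density:   {prtprn_fd:.4f}")
print(f"prtprn percentile rank: {prtprn_percentile:.1f}")
print(f"high-quality cutoff:    {high_quality_cutoff:.4f}")
```

Only the prtprn fault density (29/640, about 0.0453) comes from the assignment itself; the percentile rank and the cutoff depend entirely on the estimates substituted for the placeholders.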