Programming Assignment 1
CSC 323 - Data Analysis and Statistical Software
Due: Section 101: 10/9/2002 & Section 401: 10/10/2002
A local financial institution has
decided to upgrade its existing proprietary
database management system to a relational database
management system. They are in the process of
deciding which software systems should be rewritten
and which ones should be migrated to
the new database management system. They are particularly interested in
investigating the effort required to convert a large
portfolio of C++ programs.
The CIO would like to know if
the conversion cost will be too high.
A local consulting firm has been brought
in to investigate this issue and has seven days to
submit a report to the CIO.
Given the time constraint, they decide to examine a simple random
sample of these C++ programs. The idea is to determine the effort
required to convert each program in the sample. The conversion effort
for a program is the time, in man-hours, to analyze, code, and test
the program. Since the programs differ in size, conversion effort per
line of code will be needed for each program in the sample. The
information gathered from the sample will be used to make inferences
about the portfolio.
The following details are available for each program in the sample,
along with the columns each field occupies in the data file:
- Programmer; columns 1-2
- Module Name; columns 3-7
- Program Size (Lines of Code); columns 8-11
- Analysis Effort (Man Hours); columns 12-13
- Coding Effort (Man Hours); columns 14-15
- Testing Effort (Man Hours); columns 16-17
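As an illustration of the record layout above, the sketch below slices one fixed-width line into its six fields and derives the conversion effort per line of code. It is written in Python rather than SAS, purely to show how the column ranges map onto a record. The sample record and the names FIELDS, parse_record and effort_per_loc are hypothetical, and treating Programmer as text rather than a number is an assumption.

```python
# Column positions (1-based, inclusive) taken from the field list above.
FIELDS = {
    "programmer": (1, 2),
    "module": (3, 7),
    "loc": (8, 11),
    "analysis": (12, 13),
    "coding": (14, 15),
    "testing": (16, 17),
}

TEXT_FIELDS = ("programmer", "module")  # assumption: these two are character data


def parse_record(line):
    """Slice one fixed-width line into a dict of the six fields."""
    record = {}
    for name, (start, end) in FIELDS.items():
        raw = line[start - 1:end].strip()  # convert 1-based columns to a slice
        record[name] = raw if name in TEXT_FIELDS else int(raw)
    return record


def effort_per_loc(record):
    """Total conversion effort (analysis + coding + testing) per line of code."""
    total = record["analysis"] + record["coding"] + record["testing"]
    return total / record["loc"]


# Hypothetical record, invented for illustration only (not course data).
sample_line = "07abc01 850121809"
sample_record = parse_record(sample_line)
sample_epl = effort_per_loc(sample_record)
```

In a SAS DATA step the same layout would normally be read with column input, and the per-line effort computed with an assignment statement.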
You have been asked to help with the analysis of
these data.
The requirements for your analysis
are detailed below:
- Write a SAS program to analyze these data.
Remember, your program should analyze conversion effort
per line of code.
(50%)
Your program should accomplish the following:
- Access your data from an external
file.
- Compute effort per line of code.
Note: You will need an assignment statement in your DATA step.
If necessary, see "DATA step statements",
point 6, SAS Review.
- Execute the PRINT and MEANS
procedures with appropriate options.
- For PROC PRINT, be sure to use labels for column headings
(e.g. Programmer, Module Name, Program Size, Analysis Effort,
Coding Effort, Testing Effort, Conversion Effort,
Conversion Effort per LOC). Use names that are meaningful.
You should generate an appropriate title for your output.
Note: If necessary,
see the SAS program used for the in-class SAS
demonstration.
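The summary statistics requested above can be cross-checked independently of SAS. The following is a minimal sketch in Python, assuming three made-up programs; the records and variable names are hypothetical and are not course data. Note that statistics.stdev uses the n - 1 divisor, which is also the default for the standard deviation reported by PROC MEANS.

```python
import statistics

# Hypothetical sample: (lines of code, analysis, coding, testing hours).
# These values are invented for illustration only.
records = [
    (850, 12, 18, 9),
    (1200, 20, 40, 25),
    (400, 5, 10, 6),
]

# Conversion effort per line of code for each sampled program.
effort_per_loc = [
    (analysis + coding + testing) / loc
    for loc, analysis, coding, testing in records
]

# Sample mean and sample standard deviation (n - 1 divisor),
# rounded to 3 decimal places as the report requires.
mean_epl = round(statistics.mean(effort_per_loc), 3)
sd_epl = round(statistics.stdev(effort_per_loc), 3)

print(f"mean effort per LOC: {mean_epl:.3f}")
print(f"std dev of effort per LOC: {sd_epl:.3f}")
```

Comparing figures like these against the PROC MEANS output is a quick way to confirm that the assignment statement in the DATA step computes what was intended.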
- Write a short report (no more than a couple of
paragraphs) discussing your findings. (50%)
Your report should address the following:
- Provide estimates of the population mean and population standard
deviation (to 3 decimal places). That is, estimate the mean and the
standard deviation of the effort per line of code required to
convert each program in the portfolio.
- The CIO has asked you to address the following in your report
(include any necessary computations and diagrams to support your
answer):
- Determine the conversion effort per line of code that distinguishes
the most expensive programs from the others in the portfolio. Assume
that "most expensive" refers to the top 5% of the portfolio with
respect to conversion effort per line of code.
- Program uniq2 is considered to be a maintenance nightmare by several
staff members. That is, they contend that it is more difficult to
analyze, modify and test uniq2 than it is to analyze, modify and test
the average program in the portfolio. Program uniq2 is 1200 LOC and
will require 125 hours of conversion effort. Determine its percentile
rank with respect to conversion effort per line of code and comment on
the viewpoint expressed by staff members.
Note: To address these questions, assume that conversion effort per
line of code is normally distributed. Also, use your estimates of the
relevant population parameters.
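Under the normality assumption, both CIO questions reduce to standard normal calculations: the top-5% cutoff is the 95th percentile, mean + z * sd with z approximately 1.645, and the percentile rank of a program is the normal CDF evaluated at its effort per line of code. The sketch below shows the arithmetic in Python. The values of mu_hat and sigma_hat are placeholders chosen for illustration and are not estimates from the course data; only the uniq2 figures (1200 LOC, 125 hours) come from the assignment.

```python
from statistics import NormalDist

# Placeholder estimates, invented for illustration only.
# Substitute the mean and standard deviation from your own PROC MEANS output.
mu_hat = 0.080
sigma_hat = 0.020

effort_dist = NormalDist(mu=mu_hat, sigma=sigma_hat)

# Effort per LOC that separates the top 5% from the rest (95th percentile).
top5_cutoff = effort_dist.inv_cdf(0.95)

# uniq2: 125 hours of conversion effort over 1200 lines of code.
uniq2_epl = 125 / 1200

# Percentile rank: proportion of the portfolio at or below uniq2's effort per LOC.
uniq2_rank = effort_dist.cdf(uniq2_epl)

print(f"top 5% cutoff: {top5_cutoff:.3f} hours per LOC")
print(f"uniq2 effort per LOC: {uniq2_epl:.3f}")
print(f"uniq2 percentile rank: {100 * uniq2_rank:.1f}")
```

With real estimates substituted, comparing uniq2_epl against the mean and against top5_cutoff gives a basis for commenting on the staff members' view.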