Programming Assignment 1
CSC 323 - Data Analysis and Statistical Software
Due: 10/3/2001
A local financial institution is interested in investigating the cost of
maintaining a large portfolio of legacy C++ programs.
The CIO is prepared to fund a project to replace these programs if
the maintenance cost is too high. However, this cost information is
not readily available.
A local consulting firm has been brought in to investigate this issue and
has seven days to submit a report to management.
Given the time constraint, they decide to examine the project management
records of a simple random sample of programs selected from the portfolio.
The idea is to determine the maintenance effort per line of code, in hours,
over the past year, for each program in the sample and to use this
information to make inferences about the portfolio.
The following details are available for each program in the sample, in the
input columns shown:
- Module Name: columns 1-6
- Program Size (Lines of Code): columns 7-10
- Effort (Hours): columns 11-13
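
To make the record layout concrete, here is a minimal sketch of SAS column input. It assumes that the ranges 1-6, 7-10 and 11-13 are column positions within each record. The data set name layout_demo and the variable names module, loc and effort are illustrative choices, not names required by the assignment. The single record uses the figures quoted for program nlnreg in this handout; it is formatted the way a record would look under the assumed layout and is not taken from the course data file.

```sas
/* Sketch only: read one in-line record with column input.            */
/* Assumed layout: name in cols 1-6, LOC in cols 7-10, hours in 11-13. */
data layout_demo;
   input module $ 1-6 loc 7-10 effort 11-13;
   datalines;
nlnreg 900279
;
run;

/* Confirm the fields were split as intended. */
proc print data=layout_demo;
run;
```

With this layout the record should yield module = nlnreg, loc = 900 and effort = 279.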
You have been asked to help with the analysis of these data. The
requirements for your analysis are detailed below:
- Write a SAS program to analyze these data. In particular, your program
should analyze effort per line of code. (50%)
Your program should accomplish the following:
- Access your data from an external file.
- Compute effort per line of code (if necessary, see "DATA step statements",
point 6, SAS Review).
- Execute the PRINT and MEANS procedures with appropriate options.
- For PROC PRINT, be sure to use labels for column headings (e.g. Program
Size, Effort, Module Name, Effort per Line of Code). Use names that are
meaningful. You should generate an appropriate title for your output.
Note: If necessary, see the SAS program used for the in-class SAS
demonstration.
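
The program requirements above can be sketched as follows. This is one possible outline under stated assumptions, not the in-class demonstration program. The file name programs.dat, the fileref progs, the data set name maintenance, the variable names and the title text are all hypothetical. The column positions assume that the ranges 1-6, 7-10 and 11-13 given for the three fields are record columns.

```sas
/* Hypothetical path: point this at the actual course data file. */
filename progs 'programs.dat';

data maintenance;
   infile progs;
   /* Column input, assuming the layout described in this handout. */
   input module $ 1-6 loc 7-10 effort 11-13;
   /* Derived variable: maintenance hours per line of code. */
   effort_per_loc = effort / loc;
   label module         = 'Module Name'
         loc            = 'Program Size (Lines of Code)'
         effort         = 'Effort (Hours)'
         effort_per_loc = 'Effort per Line of Code (Hours)';
run;

title 'Maintenance Effort for Sampled Legacy C++ Programs';

/* The LABEL option makes PROC PRINT use the labels as column headings. */
proc print data=maintenance label;
   var module loc effort effort_per_loc;
run;

/* MAXDEC=3 limits the printed statistics to three decimal places. */
proc means data=maintenance n mean std min max maxdec=3;
   var effort_per_loc;
run;
```

By default PROC MEANS computes STD with the n - 1 divisor, which is the usual sample estimate of a population standard deviation.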
- Write a short report (no more than a couple of paragraphs) discussing your
findings. (50%)
Your report should address the following:
- Provide an estimate of the population mean (to 3 decimal places), that is,
the mean effort per line of code to maintain the portfolio.
- Provide an estimate of the population standard deviation (to 3 decimal
places), that is, the standard deviation of effort per line of code for the
portfolio.
- The CIO has asked you to address the following in your report:
- Program nlnreg is considered to be problematic by several staff members.
That is, they contend that it requires more maintenance than the average
program in the portfolio. Program nlnreg is 900 LOC and has required 279
hours of maintenance effort. Determine its percentile rank with respect to
effort per line of code.
- Determine the effort per line of code that distinguishes the most expensive
programs from others in the portfolio. Assume that "most expensive" refers
to the top 15% of the portfolio with respect to effort per line of code.
Note: Assume that effort per line of code is normally distributed and use
your estimates of the relevant population parameters to address these
questions.
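
Under the normality assumption, the two CIO questions reduce to the standard calculations sketched below. The symbols are notation introduced here for illustration: \bar{x} and s stand for the sample mean and sample standard deviation of effort per line of code taken from your own output, \Phi is the standard normal cumulative distribution function, and z_{0.85} is the standard normal 85th percentile. No numerical values for \bar{x} or s are implied.

```latex
% Effort per line of code for nlnreg, from the figures in this handout
x_{\text{nlnreg}} = \frac{279}{900} = 0.31 \text{ hours per line of code}

% Percentile rank of nlnreg
z = \frac{x_{\text{nlnreg}} - \bar{x}}{s},
\qquad
\text{percentile rank} = 100 \, \Phi(z)

% Cutoff separating the top 15% (the 85th percentile)
x_{0.85} = \bar{x} + z_{0.85} \, s,
\qquad
z_{0.85} \approx 1.036
```

In SAS, the functions PROBNORM and PROBIT return \Phi(z) and the standard normal quantile respectively, so the same quantities can be computed in a DATA step if preferred.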