Programming Assignment 1
CSC 323 - Data Analysis and Statistical Software
Due: 2/8/2001
A local financial institution is interested in investigating the cost of maintaining a large portfolio of legacy C++ programs. The CIO is prepared to fund a project to replace these programs if the maintenance cost is too high. However, this cost information is not readily available.
A local consulting firm has been brought in to investigate this issue and has seven days to submit a report to management. Given the time constraint, the firm decides to examine the project management records of a simple random sample of programs selected from the portfolio. The idea is to determine the maintenance effort per line of code, in hours, over the past year for each program in the sample and to use this information to make inferences about the portfolio.
The following details are available for each program in the sample (the numbers give the column positions of each field in the data file):
- Program Size (Lines of Code): columns 1-4
- Effort (Hours): columns 5-7
- Module Name: columns 8-13
You have been asked to help with the analysis of these data. The requirements for your analysis are detailed below (if necessary, see "DATA step statements", point 6, SAS Review).
- Write a SAS program to analyze these data. In particular, your program should analyze effort per line of code. Your program should accomplish the following:
  - Access your data from an external file.
  - Compute effort per line of code.
  - Execute the PRINT and MEANS procedures with appropriate options.
  - For PROC PRINT, be sure to use labels for column headings (e.g. Program Size, Effort, Module Name, Effort per Line of Code). Use names that are meaningful. You should generate an appropriate title for your output.
  Note: If necessary, see guide 2.
- Write a short report (no more than a couple of paragraphs) discussing your findings. Your report should address the following:
  - Provide an estimate of the population mean, that is, the mean effort per line of code to maintain the portfolio.
  - Provide an estimate of the population standard deviation, that is, the standard deviation of effort per line of code for the portfolio.
  - The CIO is interested in the maintenance cost that distinguishes the most expensive programs from others in the portfolio. Assuming that by "most expensive" the CIO means the top 10% of the portfolio (in terms of effort per line of code), and using your estimates of the relevant population parameters, determine the effort per line of code that makes this distinction. Assume that effort per line of code is normally distributed.
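The requirements above could be approached along the following lines. This is only a minimal sketch, not a model solution. The file name programs.dat, the data set names, and the variable names are all hypothetical, and the column ranges assume that the numbers listed with each field are column positions in the data file. Under the stated normality assumption, the cutoff for the top 10% is taken as the estimated mean plus the 90th percentile of the standard normal distribution (about 1.2816) times the estimated standard deviation.

```sas
/* Sketch only: file name, data set names, and variable names are hypothetical. */
data programs;
    infile 'programs.dat';                     /* assumed external file */
    /* Column input; positions assumed from the field list in the handout */
    input size 1-4 effort 5-7 module $ 8-13;
    effort_per_loc = effort / size;            /* hours per line of code */
    label size           = 'Program Size'
          effort         = 'Effort'
          module         = 'Module Name'
          effort_per_loc = 'Effort per Line of Code';
run;

title 'Maintenance Effort for Sampled Programs';

/* LABEL option makes PROC PRINT use labels as column headings */
proc print data=programs label;
run;

/* Sample mean and standard deviation estimate the population parameters */
proc means data=programs n mean std maxdec=4;
    var effort_per_loc;
    output out=stats mean=mean_epl std=std_epl;
run;

/* 90th percentile of a normal distribution with the estimated parameters */
data cutoff;
    set stats;
    p90_epl = mean_epl + probit(0.90) * std_epl;
    label mean_epl = 'Estimated Mean'
          std_epl  = 'Estimated Standard Deviation'
          p90_epl  = 'Top 10% Cutoff';
run;

proc print data=cutoff label;
    var mean_epl std_epl p90_epl;
run;
```

Programs whose effort per line of code exceeds the printed cutoff would fall in the top 10% under the normal model. The PROBIT function is used here so that the percentile does not have to be looked up in a table.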