Programming Assignment 1

CSC 323 - Data Analysis and Statistical Software

Due: 10/8/2003

A local financial institution has decided to upgrade its existing proprietary database management system to a relational database management system. It is now deciding which software systems should be rewritten and which should be migrated to the new database management system. Management is particularly interested in the effort required to convert a large portfolio of C++ programs. The CIO would like to know whether the conversion cost will be too high.

A local consulting firm has been brought in to investigate this issue and has seven days to submit a report to the CIO. Given the time constraint, the firm has decided to examine a simple random sample of these C++ programs and determine the effort required to convert each program in the sample. The conversion effort for a program is the time, in man-hours, needed to analyze, code, and test the program. Because the programs differ in size, conversion effort per line of code is needed for each program in the sample. The information gathered from the sample will be used to make inferences about the portfolio. The following details are available for each program in the sample:

You have been asked to help with the analysis of this data.

The requirements for your analysis are detailed below:

  1. Write a SAS program to analyze these data. Remember, your program should analyze conversion effort per line of code. (50%)
    Your program should accomplish the following:
    1. Access your data from an external file.
    2. Compute effort per line of code.
      Note: You will need an assignment statement in your data step. If necessary, see "DATA step statements", point 6, SAS Review.
    3. Execute the PRINT and MEANS procedures with appropriate options.
    4. For PROC PRINT, be sure to use labels for column headings (e.g. Programmer, Module Name, Program Size, Analysis Effort, Coding Effort, Testing Effort, Conversion Effort, Conversion Effort per LOC). Use names that are meaningful. You should generate an appropriate title for your output.
    Note: If necessary, see the SAS program used for the in-class SAS demonstration.
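
    The assignment itself calls for a SAS program. As a rough cross-check of the arithmetic that the DATA step assignment statement and PROC MEANS are expected to perform, the computation can be sketched in Python. The rows below are made-up values, not the assignment data, and the column order is an assumption inferred from the suggested labels in item 4.

```python
from statistics import mean, stdev

# Hypothetical sample rows (NOT the assignment data). Assumed layout:
# (programmer, module name, lines of code, analysis hrs, coding hrs, testing hrs)
sample = [
    ("smith", "modA", 1200, 20.0, 40.0, 30.0),
    ("jones", "modB", 800, 10.0, 30.0, 16.0),
    ("lee", "modC", 2500, 45.0, 90.0, 65.0),
]


def effort_per_loc(row):
    """Conversion effort (analysis + coding + testing hours) per line of code."""
    _, _, loc, analysis, coding, testing = row
    return (analysis + coding + testing) / loc


ratios = [effort_per_loc(row) for row in sample]

# Point estimates of the population mean and standard deviation.
# stdev() uses the n - 1 divisor, which matches the default STD in PROC MEANS.
xbar = mean(ratios)
s = stdev(ratios)

print(f"mean = {xbar:.3f}, std dev = {s:.3f}")
```

    If the figures from your SAS output differ from a hand check of this kind, first confirm that the ratio is computed per program before averaging, rather than as total effort divided by total lines of code.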

  2. Write a short report (no more than a couple of paragraphs) discussing your findings. (50%)
    Your report should address the following:
    1. Provide estimates of the population mean and population standard deviation (to three decimal places). That is, estimate the mean and the standard deviation of conversion effort per line of code for the programs in the portfolio.
    2. The CIO has asked you to address the following in your report (include any necessary computations and diagrams to support your answer):
      • Determine the conversion effort per line of code that distinguishes the most expensive programs from others in the portfolio. Assume that most expensive refers to the top 12% of the portfolio with respect to conversion effort per line of code.
      • Program jterm is considered to be a maintenance nightmare by several staff members. That is, they contend that it is more difficult to analyze, modify and test jterm than it is to analyze, modify and test the average program in the portfolio. Program jterm is 2031 LOC and will require 180 hours of conversion effort. Determine its percentile rank with respect to conversion effort per line of code and comment on the viewpoint expressed by staff members.
      Note: To address these questions, assume that conversion effort per line of code is normally distributed. Also, use your estimates of the relevant population parameters.
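
The two questions from the CIO reduce to one normal quantile and one normal cumulative probability. The following Python sketch shows one way to set up both calculations under the normality assumption stated in the note. The function names are illustrative, and the values assigned to `xbar` and `s` are placeholders only; substitute the estimates from your own PROC MEANS output.

```python
from statistics import NormalDist


def top_cutoff(mu, sigma, top_fraction=0.12):
    """Value that separates the top `top_fraction` of a Normal(mu, sigma)
    population from the rest, i.e. the (1 - top_fraction) quantile."""
    return NormalDist(mu, sigma).inv_cdf(1 - top_fraction)


def percentile_rank(x, mu, sigma):
    """Percentage of a Normal(mu, sigma) population that lies below x."""
    return 100 * NormalDist(mu, sigma).cdf(x)


# Placeholder estimates (hypothetical); replace with your sample results.
xbar = 0.075
s = 0.005

# Effort per line of code for jterm, from the figures given in the assignment.
jterm_ratio = 180 / 2031

cutoff = top_cutoff(xbar, s)
rank = percentile_rank(jterm_ratio, xbar, s)

print(f"top 12% cutoff: {cutoff:.3f} hours per LOC")
print(f"jterm: {jterm_ratio:.3f} hours per LOC, percentile rank {rank:.1f}")
```

The printed numbers mean nothing until the placeholders are replaced. The same steps can be done by hand with a standard normal table: convert to a z-score, look up the area, and for the cutoff work backwards from an area of 0.88.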