Simon Dyck, David Sloane
Software Testing, Seng 621, Winter 1999
Abstract |
|
This web document, an extension of a presentation for S. Eng. 623, provides an introduction to software testing. It covers the basic methods of black and white box testing, as well as the different test levels (unit, integration, system, etc.), describing how each level builds on the previous one. A brief discussion of software testing metrics is presented. The challenges facing software testing in an organization are explored, and the question of testing versus software inspections is discussed. Finally, we present a look at fault-based testing methods, a testing strategy that is gaining popularity.
Metrics |
Goals

As stated above, the major goal of testing is to discover errors in the software. A secondary goal is to build confidence that the system will work without error when testing does not reveal any. What, then, does it mean when testing detects no errors? Either the software is of high quality or the testing process is of low quality, and we need metrics on our testing process to tell which is the case. As in all domains of the software process, there is a host of metrics that can be used in testing. Rather than discuss the merits of specific measurements, it is more important to know what they are trying to achieve. Three themes prevail: quality assessment, risk management, and process improvement.
Quality Assessment

An important question in the testing process is "when should we stop?" The answer is when system reliability is acceptable, or when the gain in reliability cannot compensate for the testing cost. To answer either of these concerns we need a measurement of the quality of the system. The most commonly used measure of system quality is defect density:

    defect density = number of known defects / system size

where system size is usually expressed in thousands of lines of code (KLOC). Although it is a useful indicator of quality when used consistently within an organization, there are a number of well documented problems with this metric, most of them relating to inconsistent definitions of defects and of system size. Defect density accounts only for defects that are found in-house or over a given amount of operational field use. Other metrics attempt to estimate how many defects remain undetected. A simple case of such estimation is based on "error seeding". Assume the system has X errors, and artificially seed it with S additional errors. After testing, we have discovered Tr "real" errors and Ts seeded errors. If we assume (a questionable assumption) that the testers find the same percentage of seeded errors as real errors, we can estimate X:

    X = Tr * S / Ts

For example, if we find half the seeded errors, then the number of "real" defects found represents half of the total defects in the system. Estimating the number and severity of undetected defects allows informed decisions on whether the quality is acceptable or whether additional testing is cost-effective. It is very important to consider maintenance costs and redevelopment efforts when deciding on the value of additional testing.
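The seeding estimate above is easy to mechanize. The following is a minimal sketch (the function name and the sample figures are illustrative, not from the original):

```python
def estimate_total_defects(seeded, seeded_found, real_found):
    """Error-seeding estimate: assume testers find the same
    fraction of seeded and real defects, i.e.
    real_found / X == seeded_found / seeded, so X = real_found * seeded / seeded_found.
    """
    if seeded_found == 0:
        raise ValueError("no seeded defects found; estimate is undefined")
    return real_found * seeded / seeded_found

# The case from the text: half the seeded errors were found,
# so the 8 real defects found represent half the total.
total = estimate_total_defects(seeded=20, seeded_found=10, real_found=8)
print(total)            # 16.0 estimated real defects in the system
remaining = total - 8   # 8.0 estimated to remain undetected
```

Note that the estimate degrades badly when few seeded errors are found, which is why the assumption of equal detection rates is flagged as questionable.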
Risk Management

Metrics involved in risk management measure how important a particular defect is (or could be). These measurements allow us to prioritize our testing and repair cycles. A truism is that there is never enough time or resources for complete testing, making prioritization a necessity. One approach is known as Risk Driven Testing, where "risk" has a specific meaning: the failure of each component is rated by impact and likelihood. Impact is a severity rating, based on what would happen if the component malfunctioned. Likelihood is an estimate of how probable it is that the component will fail. Together, impact and likelihood determine the risk for the piece: the higher the rating on each scale, the higher the overall risk associated with defects in the component. With a rating scale, this can be represented visually as a risk matrix, with impact on one axis and likelihood on the other.
The relative importance of likelihood and impact will vary from project to project and company to company. A system-level measurement for risk management is the Mean Time To Failure (MTTF). Test data sampled from realistic beta testing is used to find the average time until system failure, and this data is extrapolated to predict overall uptime and the expected time the system will be operational. Often measured alongside MTTF is the Mean Time To Repair (MTTR), the expected time until the system is repaired and back in use after a failure is observed. Availability, obtained by calculating MTTF / (MTTF + MTTR), is the probability that the system is available when needed. While these are reasonable measures for assessing quality, they are more often used to assess the risk (financial or otherwise) that a failure poses to a customer, and in turn to the system supplier.
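The availability calculation is a one-liner; a minimal sketch with assumed figures:

```python
def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR): the probability that
    the system is operational when needed."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A system that runs 400 hours between failures and takes 8 hours
# to bring back up is available about 98% of the time.
print(round(availability(400.0, 8.0), 3))  # 0.98
```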
Process Improvement

It is generally accepted that to achieve improvement you need a measure against which to gauge performance. To improve our testing processes we need the ability to compare the results of one process with another. Popular measures of the testing process report on its efficiency and effectiveness. It is also important to consider system failures reported from the field by customers: if a high percentage of customer-reported defects were not revealed in-house, that is a significant indicator that the testing process is incomplete. A good defect reporting structure will allow defect types and origins to be identified. We can use this information to improve the testing process by altering and adding test activities to improve our chances of finding the defects that currently escape detection. By tracking our test efficiency and effectiveness, we can evaluate the changes made to the testing process. Testing metrics give us an idea of how reliable our testing process has been at finding defects, and are a reasonable indicator of its performance in the future. It must be remembered that measurement is not the goal; improvement through measurement, analysis and feedback is what is needed.
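The in-house versus field comparison reduces to a simple percentage. A minimal sketch (the function name and the sample counts are illustrative):

```python
def defect_detection_percentage(found_in_house, found_by_customers):
    """Share of all known defects that testing caught before release.
    A low value signals an incomplete testing process."""
    total = found_in_house + found_by_customers
    if total == 0:
        raise ValueError("no defects recorded")
    return 100.0 * found_in_house / total

# 45 defects caught in-house, 15 reported from the field:
print(defect_detection_percentage(45, 15))  # 75.0
```

Tracked per release, a falling value of this metric is exactly the kind of feedback the text describes: it points at the test activities that need to be altered or added.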
Software Testing Organization
Test Groups

There are both pros and cons to maintaining separate test groups. The key to optimizing their use is understanding that developers are able to find certain types of bugs very efficiently, while testers have greater ability to detect others. Important considerations are the size of the organization and the criticality of the product.
Testing Problems

When trying to implement software testing effectively, there are several mistakes that organizations typically make. The errors fall into (at least) four broad classes:

1. Misunderstanding the role of testing. The purpose of testing is to discover defects in the product. Furthermore, it is important to understand the relative criticality of defects when planning tests, reporting status, and recommending actions.

2. Poor planning of the testing effort. Test plans often over-emphasize testing functionality at the expense of potential interactions. This mentality can also lead to incomplete configuration testing and inadequate load and stress testing. Neglecting to test documentation and/or installation procedures is also a risky decision.

3. Using the wrong personnel as testers. The role of testing should not be relegated to junior programmers, nor should it be a place to employ failed programmers. A test group should include domain experts, and need not be limited to people who can program. A test team that lacks diversity will not be as effective.

4. Poor testing methodology. Just as programmers often prefer coding to design, testers can be too focused on running tests at the expense of designing them. The tests must verify that the product does what it is supposed to do, while not doing what it should not. As well, using code coverage as a performance goal for testers, or ignoring coverage entirely, are both poor strategies.
Testing and SQA, Inspections
Inspections are undoubtedly a critical tool to detect and prevent defects. Inspections are strict and close examinations conducted on specifications, design, code, tests, and other artifacts. An important point about inspections is that they can be performed much earlier in the design cycle, well before testing begins. Having said that, testing is something that can be started much earlier than is normally the case: testers can review their test plans with developers as they are creating their designs, so that developers are more aware of the potential defects and can act accordingly. In any case, early detection of defects is critical; the closer to the time of its creation that we detect and remove a defect, the lower the cost, in terms of both time and money. This is illustrated in figure 2.

Evidence of the benefits of inspections abounds; the literature (Humphrey 1989) reports numerous cases of substantial benefit. In the face of this evidence, it has been suggested that "software inspections can replace testing". While the benefits of inspections are real, they are not enough to replace testing. Inspections could replace testing if and only if all information gleaned through testing could be obtained through inspection. This is not true, for several reasons. First, testing can identify defects due to complex interactions in large systems (e.g. timing and synchronization); while inspections can detect such defects, as systems become more complex the chance of one person understanding all the interfaces and being present at all the reviews is quite small. Second, testing can provide a measure of software reliability (i.e. failures per unit of execution time) that is unobtainable from inspections, and this measure can often be a vital input to the release decision. Third, testing identifies system-level performance and usability issues that inspections cannot. Therefore, since inspections and testing provide different, equally important information, one cannot replace the other. However, depending on the product, the optimal mix of inspections and testing may differ.
A Closer Look: Fault Based Methods
The following paragraphs describe some newer techniques in the software testing field. Fault-based methods include error-based testing, fault seeding, mutation testing, and fault injection, among others. After a brief description of each of the four techniques, fault injection is discussed in more detail.

These methods attempt to address the belief that current techniques for assessing software quality are not adequate, particularly in the case of mission-critical systems. Voas et al. suggest that the traditional belief that improving and documenting the software development process will increase software quality is lacking, yet they recognize that the amount of (product-focused) testing required to demonstrate high reliability is impractical. In short, quality processes cannot demonstrate reliability, and the testing necessary to do so is impossible to perform. Fault injection is not a new concept: hardware design has long used inserted fault conditions to test system behavior. It is as simple as pulling the modem out of your PC during use and observing the results to determine whether they are safe and/or desired. The injection of faults into software is not so widespread, though it appears that companies such as Hughes Information Systems, Microsoft, and Hughes Electronics have applied the techniques or are considering them. Properly used, fault injection can give insight into where testing should be concentrated, how much testing should be done, whether or not systems are fail-safe, and so on. As a simple example, consider the following code:
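A minimal sketch of such a perturbation experiment, using the names X, T, and perturb from the text (the computation of T, the noise model, and the trial counts are assumptions for illustration):

```python
import random

def perturb(x):
    """Corrupt an input value to simulate a fault.
    Assumed implementation: add gaussian noise."""
    return x + random.gauss(0, 5.0)

def compute_temperature(x):
    """Stand-in for whatever computation derives T from X."""
    return 2.0 * x + 10.0

# Inject faults repeatedly and count how often a corrupted value
# of X drives T past the catastrophic threshold of 100.
random.seed(1)
trials = 10_000
failures = sum(
    compute_temperature(perturb(40.0)) > 100 for _ in range(trials)
)
print(f"{failures / trials:.1%} of corrupted runs were catastrophic")
```

The same loop can wrap a third-party component: as long as its output is observable, the source need not be.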
In this case it is catastrophic if T > 100. By using perturb(X) to generate changed values of X (e.g. with a random number generator), you can quickly determine how often corrupted values of X lead to undesired values of T. The technique can be applied to internal source code, as well as to third-party software, which may be a "black box".
Conclusion |
Software testing is an important part of the software development process. It is not a single activity that takes place after code implementation, but is part of each stage of the lifecycle. A successful test strategy begins with consideration during requirements specification; testing details are fleshed out through high- and low-level system design, and testing is carried out by developers and by separate test groups after code implementation. As with the other activities in the software lifecycle, testing has its own unique challenges. As software systems become more and more complex, the importance of effective, well-planned testing efforts will only increase.
References

Further Reading

Marick, Brian, "Classic Testing Mistakes", 1997.
"Software Testing Techniques".
"Software Inspections".
Hower, Rick, "Software QA and Testing Frequently-Asked-Questions, Part 1", 1998.