Lecture #5

This is a continuation of the discussion on testing techniques that started with the Basis Path, White Box, testing technique (see Lecture #4). Following, is a detailed discussion of two well known Black Box techniques. That is, Equivalence Partitioning and Boundary Value Analysis. Another technique, Error Guessing is also briefly mentioned.

To motivate the discussion, consider the following functional description of the function, CheckoutCost, that is intended to compute the total cost of a basket of products purchased at an electronic retailer.

Functional Description:

CheckoutCost is a funtion of three variables: nextday, amount, and type. It determines shipping charge, applicable surcharges, applicable discounts, and tax, if any. It returns a single total amount which is the sum of amount, shipping charge, and tax, plus any surcharges and less any discounts.

The input variables are subject to the following constraints:

NextDay: a character value that may be either Y or N
Amount: an integer value that must be greater than zero.
Type: a character value that may be either B, E or H.
Note: These values indicate: B for Business, E for Education, H for Home.

The following requirements apply:

Shipping charge is determined from the input amount by applying the following algorithm:
1. If amount is greater than $1000.00 then shipping charge is 10% of amount.
2. If amount is greater than $200.00 but less than or equal to $1000.00 then shipping charge is 12% of amount.
3. If amount is greater than $0.00 but less than or equal to $200.00 then shipping charge is 15% of amount.
However, a value of Y for the variable NextDay indicates that this is a rush shipment and so an additional shipping charge of $100.00 is incurred.
Tax only applies when the type is B. It is determined by computing 8% of the input amount.
A discount of 5% of the input amount applies when the type is E.

Note: A suitable error message should be returned for an invalid input or combination of invalid inputs. For example, consider the following situations:

Let us say the input variable NextDay contained the invalid value G then the following descriptive text should be returned.
```
    NextDay must be Y or N
```
Let us say the input variable Amount contained the invalid value -50 then the following descriptive text should be returned.
```
    Amount must be greater than zero
```
Let us say the input variable NextDay contained the invalid value T and the input variable Type contained the invalid value N then the following descriptive text should be returned.
```
    NextDay must be Y or N
    Type must be B, E or H
```

Invalid error messages should be produced for all other error conditions.

Issue:

Given this functional description alone, is it possible to derive a thorough set of test cases that may be used to determine if the specified functionality is correctly implemented? The answer is Yes. Several techniques have been developed for accomplishing this. We will examine two of these techniques in some detail. They are the complementary techniques, Equivalence Partitioning and Boundary Value Analysis. We will also briefly mention the Error Guessing technique.

Equivalence Partitioning:

The Equivalence Partitioning technique is usually attributed to Glenford Myers. Myers provided a comprehensive outline of the technique in his classic text, The Art of Software Testing that was first published in 1979. It is based on the idea of Equivalence Classes from Set Theory (Discrete Mathematics). For the curious, the Set Theory aside below provides some details on this idea. However, to understand the technique it is not absolutely necessary to understand the mathematics. The point is that the use of equivalence classes as the basis for deriving test cases has two motivations.

We would like to have a sense of complete testing.
We would like to avoid redundancy.

That is, we would like to use the fewest number of test cases that have maximum effect and yet are complete. We would also like to be able to do this without knowing the internal workings of the software product. We would like to derive these test cases from the functional description alone. The basic idea is that one should try to partition the input domain of a module into a finite number of equivalence classes such that one can reasonably assume that a test of a representative value of each class is equivalent to a test of any other value. That is, if one test case in an equivalence class detects an error, all other test cases in the equivalence class would be expected to find the same error. Conversely, if a test case did not detect an error, we would expect that no other test cases in the equivalence class would find an error.

Following is an outline of the actual technique that may be used to discover these equivalence classes and, as a result, may be used to derive test cases.

Identifying Equivalence Classes:

Equivalence classes may be identified from a functional description by carefully examining the constraints and functional requirements for each input. For each input, the constraints and functional requirements should be sufficient to partition the input domain into two or more groups. The groups are partitions of the input domain such that each partition consists of values that the functional requirements indicate will be processed in the same way (i.e. equivalently). As a result, these partitions are referred to as equivalence classes.

Remember that two types of equivalence classes should be identified. That is, valid equivalence classes for input values that are valid (may be determined from the constraints) and invalid equivalence classes for input values that are invalid (may be determined from the constraints).

Identifying equivalence classes is largely a heuristic process. You may apply the following guidelines based on guidelines first outlined by Myers in the 1979 text:

Range of values: If an input variable may have a range of values, then identify a valid equivalence class for the valid range and one, or more, invalid equivalence classes for the invalid values.
e.g. Consider an integer variable ItemCount that is defined as an input in some functional description. Let us say that, from the functional description, the constraint on this variable is that ItemCount can have values from 1 to 999. You would identify the range 1 to 999 as the valid equivalence class and you would identify two invalid equivalence classes. That is, ItemCount less than 1 and ItemCount greater than 999.
Set of values: If an input variable may have a discrete set of values and the requirements indicate that each is handled differently, then identify a valid equivalence class for each valid discrete value and one invalid equivalence class for the invalid values.
e.g. Consider a character variable VehicleType that is defined as an input in some functional description. Let us say that, from the functional description, the constraint on this variable is that VehicleType may be B, T, M, or C (i.e. Bus, Truck, Motorcycle, or Cab). You would identify four valid equivalence classes each containing a single value, namely {B}, {T}, {M}, and {C} and you would identify one invalid equivalence class that would contain all other possible values.
Must be: If the functional description indicates a must be constraint for that input variable, then identify a valid equivalence class for input values that satisfy the must be constraint and an invalid equivalence class for input values that do not satisfy the must be constraint.
e.g. Consider a string variable PartNumber that is defined as an input in some functional description. Let us say that, from the functional description, the constraint on this variable is that the first character of PartNumber must be a letter. You would identify one valid equivalence class for input values where the first character is a letter and one invalid equivalence class where the first character is not a letter.
Re-partition: If there is any reason to believe that elements in an equivalence class are not handled in an identical manner then split the equivalence class into smaller equivalence classes.

Identifying Test Cases:

Once equivalence classes have been identified then test cases may be derived from the equivalence classes. The procedure described below is more robust than that outlined by Myers in his 1979 text. That is, it results in a more complete set of test cases than Myers would have derived. It is similar to a procedure referred to as Strong Robust Equivalence Class Testing in the second edition of Software Testing by Paul Jorgensen (2002).

For each input, assign a unique label to each equivalence class. Use a labeling technique that makes it easy to identify the valid and invalid equivalence classes and also makes it easy to identify the associated input variable.
e.g. Let us say you have two valid equivalence classes and two invalid equivalence classes for a variable ItemType. A reasonable labeling strategy may be to use a prefix of IT, to identify the variable ItemType, followed by a hyphen, followed by a V to indicate Valid or an I to indicate Invalid, followed by a sequence number. Thus, we would have the following equivalence class labels:
```
  Valid labels:   IT-V1, IT-V2 
  Invalid labels: IT-I1, IT-I2 
```
Treat the labels for each input as a set. Take the Cartesian product of the resulting sets. For a review of Cartesian product, see the Set Theory aside below.
e.g. Let us say the following sets of labels are obtained for the input variable ItemType and the input variable Amount:
```
  IT = {IT-V1, IT-V2, IT-I1, IT-I2}
   A = {A-V1, A-I1, A-I2}
```
Notice that, using our labeling convention, it is evident that there are two valid equivalence classes and two invalid equivalence classes for ItemType. Similarly, there is one valid equivalence class and two invalid equivalence classes for Amount. So, the resulting Cartesian product would be a set of 12 tuples:
```
  IT X A = {(IT-V1, A-V1), (IT-V1, A-I1), (IT-V1, A-I2),
            (IT-V2, A-V1), (IT-V2, A-I1), (IT-V2, A-I2),
            (IT-I1, A-V1), (IT-I1, A-I1), (IT-I1, A-I2),
            (IT-I2, A-V1), (IT-I2, A-I1), (IT-I2, A-I2)}
```
For each tuple in the resulting Cartesian product derive a test case that consists of any value taken from each equivalence class that constitutes the tuple. So, for the example above we would have 12 test cases since there are 12 tuples. Furthermore, each test case would consist of two values, one from each of the classes specified in the tuple. For example, tuple (IT-V2, A-I1) would generate a test case that consists of a value from valid equivalence class IT-V2 for input ItemType and a value from invalid equivalence class A-I1 for input Amount.
Note: Remember to define expected results for each test case.

Example:

We are now ready to derive test cases for the function, CheckoutCost described above. We will do this in two steps. First we will define equivalence classes by referring to the functional description for CheckoutCost and using the heuristics above. Second, we will derive test cases using the procedure above.

Equivalence Classes:
- NextDay:
  Using the set of values heuristic we will have two valid equivalence classes and one invalid equivalence class.
- Amount:
  Using the range of values heuristic we will have a valid equivalence class for the valid range and one invalid equivalence class for the invalid values. However, given the requirements for the amount input, and using the re-partition heuristic, we will further partition the valid equivalence class into three valid equivalence classes:
```
 A-V1 = {0 < Amount =< 200}
 A-V2 = {200 < Amount =< 1000}
 A-V3 = {Amount > 1000}
```
- Type:
  Using the set of values heuristic we will have three valid equivalence classes and one invalid equivalence class.

Test Cases:

The equivalence class labels for NextDay, Amount, and Type are:

 ND = {ND-V1, ND-V2, ND-I1} 
  A = {A-V1, A-V2, A-V3, A-I1} 
  T = {T-V1, T-V2, T-V3, T-I1}

The Cartesian product of ND, A, and T would be the set of forty eight 3-tuples. We will list them all here:

 ND X A X T = {(ND-V1, A-V1, T-V1), (ND-V1, A-V1, T-V2), 
               (ND-V1, A-V1, T-V3), (ND-V1, A-V1, T-I1),
               (ND-V1, A-V2, T-V1), (ND-V1, A-V2, T-V2), 
               (ND-V1, A-V2, T-V3), (ND-V1, A-V2, T-I1),
               (ND-V1, A-V3, T-V1), (ND-V1, A-V3, T-V2), 
               (ND-V1, A-V3, T-V3), (ND-V1, A-V3, T-I1),
               (ND-V1, A-I1, T-V1), (ND-V1, A-I1, T-V2), 
               (ND-V1, A-I1, T-V3), (ND-V1, A-I1, T-I1),
               (ND-V2, A-V1, T-V1), (ND-V2, A-V1, T-V2), 
               (ND-V2, A-V1, T-V3), (ND-V2, A-V1, T-I1),
               (ND-V2, A-V2, T-V1), (ND-V2, A-V2, T-V2), 
               (ND-V2, A-V2, T-V3), (ND-V2, A-V2, T-I1),
               (ND-V2, A-V3, T-V1), (ND-V2, A-V3, T-V2), 
               (ND-V2, A-V3, T-V3), (ND-V2, A-V3, T-I1),
               (ND-V2, A-I1, T-V1), (ND-V2, A-I1, T-V2), 
               (ND-V2, A-I1, T-V3), (ND-V2, A-I1, T-I1),
               (ND-I1, A-V1, T-V1), (ND-I1, A-V1, T-V2), 
               (ND-I1, A-V1, T-V3), (ND-I1, A-V1, T-I1),
               (ND-I1, A-V2, T-V1), (ND-I1, A-V2, T-V2), 
               (ND-I1, A-V2, T-V3), (ND-I1, A-V2, T-I1),
               (ND-I1, A-V3, T-V1), (ND-I1, A-V3, T-V2), 
               (ND-I1, A-V3, T-V3), (ND-I1, A-V3, T-I1),
               (ND-I1, A-I1, T-V1), (ND-I1, A-I1, T-V2), 
               (ND-I1, A-I1, T-V3), (ND-I1, A-I1, T-I1)}

The resulting forty eight test cases with expected results would be as follows:
Note: We will use the acronyms ND for variable NextDay, A for variable Amount, and T for variable Type.

   #  Test Case           Expected Result
  ---------------------------------------------------------
   1. ND=Y, A=100, T=B    $223.00 
   2. ND=Y, A=100, T=E    $210.00 
   3. ND=Y, A=100, T=H    $215.00 
   4. ND=Y, A=100, T=G    Type must be B, E or H
   5. ND=Y, A=500, T=B    $700.00 
   6. ND=Y, A=500, T=E    $635.00 
   7. ND=Y, A=500, T=H    $660.00 
   8. ND=Y, A=500, T=G    Type must be B, E or H
   9. ND=Y, A=5000, T=B   $6000.00  
  10. ND=Y, A=5000, T=E   $5350.00  
  11. ND=Y, A=5000, T=H   $5600.00  
  12. ND=Y, A=5000, T=G   Type must be B, E or H
  13. ND=Y, A=-1, T=B     Amount must be greater than zero
  14. ND=Y, A=-1, T=E     Amount must be greater than zero
  15. ND=Y, A=-1, T=H     Amount must be greater than zero
  16. ND=Y, A=-1, T=G     Amount must be greater than zero
                          Type must be B, E or H
  17. ND=N, A=100, T=B    $123.00
  18. ND=N, A=100, T=E    $110.00 
  19. ND=N, A=100, T=H    $115.00 
  20. ND=N, A=100, T=G    Type must be B, E or H
  21. ND=N, A=500, T=B    $600.00
  22. ND=N, A=500, T=E    $635.00  
  23. ND=N, A=500, T=H    $660.00  
  24. ND=N, A=500, T=G    Type must be B, E or H
  25. ND=N, A=5000, T=B   $5900.00
  26. ND=N, A=5000, T=E   $5250.00 
  27. ND=N, A=5000, T=H   $5500.00 
  28. ND=N, A=5000, T=G   Type must be B, E or H
  29. ND=N, A=-1, T=B     Amount must be greater than zero 
  30. ND=N, A=-1, T=E     Amount must be greater than zero 
  31. ND=N, A=-1, T=H     Amount must be greater than zero 
  32. ND=N, A=-1, T=G     Amount must be greater than zero
                          Type must be B, E or H
  33. ND=Z, A=100, T=B    NextDay must be Y or N
  34. ND=Z, A=100, T=E    NextDay must be Y or N
  35. ND=Z, A=100, T=H    NextDay must be Y or N
  36. ND=Z, A=100, T=G    NextDay must be Y or N
                          Type must be B, E or H
  37. ND=Z, A=500, T=B    NextDay must be Y or N
  38. ND=Z, A=500, T=E    NextDay must be Y or N
  39. ND=Z, A=500, T=H    NextDay must be Y or N
  40. ND=Z, A=500, T=G    NextDay must be Y or N
                          Type must be B, E or H
  41. ND=Z, A=5000, T=B   NextDay must be Y or N 
  42. ND=Z, A=5000, T=E   NextDay must be Y or N
  43. ND=Z, A=5000, T=H   NextDay must be Y or N
  44. ND=Z, A=5000, T=G   NextDay must be Y or N
                          Type must be B, E or H
  45. ND=Z, A=-1, T=B     NextDay must be Y or N
                          Amount must be greater than zero
  46. ND=Z, A=-1, T=E     NextDay must be Y or N
                          Amount must be greater than zero
  47. ND=Z, A=-1, T=H     NextDay must be Y or N
                          Amount must be greater than zero
  48. ND=Z, A=-1, T=G     NextDay must be Y or N
                          Amount must be greater than zero
                          Type must be B, E or H

As mentioned earlier, the procedure used to derive these test cases is robust. Some versions of equivalence partitioning may yield fewer test cases. However, it should be noted that this robust procedure is particularly appropriate when the testing of error conditions is a high priority.

If the testing of error conditions is not a priority then a weaker version of this technique may be used. Rather than identifying both valid and invalid equivalence classes, only valid equivalence classes are defined. The Cartesian product of these valid equivalence classes is then constructed and test cases are derived. Paul Jorgensen identifies such a technique in his Software Testing text but, he uses the phrase Strong Normal Equivalence Class Testing to refer to this approach.

Independence Presumption:

It is also worth mentioning that, like all other equivalence partitioning procedures, the presumption here is that the input variables are independent. If any dependencies occur, the procedure is likely to generate erroneous test cases or may mask important test conditions. An erroneous test case is a test case which suggests an expected result that cannot occur.

Many types of dependencies between input variables are possible. Unfortunately, no single strategy to resolve this dilemma exists. Some practitioners address the issue by first eliminating any erroneous test cases and then supplementing the remaining test cases with new test cases derived using experience, domain knowledge, or the error guessing technique mentioned below. However, the Strong Robust technique discussed above may be readily extended to accommodate dependencies. One approach is to construct the Cartesian product from the equivalence classes in stages. We illustrate the idea with two examples:

An example of this type of dependency would arise in our example above if the functional description specified an additional input that was dependent on a particular value of one of the input variables. Let us say, for example, that for Type=B, we had an additional input variable BusinessSubType that would have the value P indicating a for Profit business and N indicating a Non-Profit business. Treating this as an independent variable would clearly be incorrect since it would lead to many test cases for which expected results could not be defined.
One way of resolving this is to construct the Cartesian product in stages. That is, consider the independent equivalence classes first and construct the Cartesian product for these classes. Then, for each dependent variable (i.e. BusinessSubType for this example), construct a Cartesian product with the related subset of dependent tuples only. If you use this approach for our extended example you will notice that for each tuple that contains T=B, three tuples that contain the variable BusinessSubType will result. So, instead of forty eight test cases, we would have seventy two test cases.
Exercise: Derive the additional test cases for the extended example above using the dependence strategy mentioned.
Another, more insidious, example of how this dependency could arise in our example above is if the set of valid values of one of the input variables was dependent on another input variable. Let us say, for example, that for Type=H, the value of Amount was limited to no more than $10000 but unlimited for other valid settings of Type. Situations like this, particularly for complex functional descriptions, may result in the masking of important test conditions. For this example, the important test condition is the one that should indicate that Type=H and Amount greater than $10000 should result in an error message.
One way of resolving this is, again, to construct the Cartesian product in stages. The trick here is to first derive equivalence classes, construct Cartesian products, and derive test cases for valid values of Type={B, E} only. Next, with Type=H, derive equivalence classes for Amount and then construct the Cartesian product and derive test cases. If you use this approach for our extended example, you will notice that the twelve test cases with T=H will be replaced by fifteen test cases, three of which will result in the required error message.
Exercise: Derive the additional test cases for the extended example above using the dependence strategy mentioned.

Boundary-Value Analysis:

Experience shows that test cases that explore boundary conditions have a higher payoff than test cases that do not. Boris Beizer captured the essence of this principle in the following often quoted statement:

Bugs lurk in corners and congregate at boundaries

Boundary conditions are those situations directly on, above, and below the edges of equivalence classes. In his text, Myers points out that Boundary-value analysis differs from equivalence partitioning in two respects:

Rather than selecting any element in an equivalence class as being representative of the class, boundary-value analysis requires that one or more elements be selected such that each edge of the equivalence class is the subject of a test case.
Rather than just focusing attention on inputs, test cases should also be derived by considering outputs. That is output equivalence classes should be defined and should also be used to derive boundary value test cases.
Note: If curious about this idea, see the Myers text (i.e. The Art of Software Testing). We will not discuss this further in these notes.

The following guidelines (also based on those from Myers) may be used to derive test cases:

Range of values: Given the valid equivalence class, derive test cases for the ends of the range, and invalid test cases for values just beyond the ends. e.g.For the ItemCount example above, where the valid range was 1 to 999, valid test cases would be the values 1 and 999 and invalid test cases would be the values 0 and 1000.
Note: We assume that ItemCount is an integer. If the variable is not an integer then fractional values just beyond the range would be used. That is, 0.99 and 999.01 could be used as invalid test cases.
Ordered set: Many sets of values may have an intrinsic ordering. Usually, we would think of sets of numbers or letters. However, we may easily define an ordering on any set of values based on some criteria. For such ordered sets, we should derive two valid test cases. One for the minimum item and the other for the maximum item in the set.
Note: Such a case would apply if the functional description stipulated that a data structure, such as an array, is an input, or it may be that a sequential file is specified as an input.

Exercise: Derive additional test cases for the forty eight test case example above using the boundary-value analysis heuristics.

Error Guessing:

Error Guessing is the process of using intuition and past experience to augment the set of test cases derived using any technique. There are no rules. The tester should review the set of test cases with an eye toward recognizing important missing test cases.

Error Guessing is considered by many practitioners to be a legitimate technique that is as important as more structured techniques because it is intended to compensate for the inherent limitations of any structured technique (see the section titled Independence Presumption above).

To illustrate the spirit of error guessing, Myers proposes a sorting example. Let us say you were testing a sorting function where the input is either a list or a file. Myers suggests that you may want to explore the following situations:

An empty list or file.
A list or file with a single item.
Note: See the Ordered set boundary-value analysis heuristic.
All items in the list or file have the same value.
The input list or file is already sorted.

In other words, one enumerates those special cases that may have been overlooked when the program was coded.

Exercise: See if your intuition, or experience, suggests any additional test cases for the forty eight test case example above.

Aside - Set Theory:

This aside is a review of two topics that are usually covered in a first Discrete Mathematics course. The first topic is a brief high level review of Partitions and Equivalence Classes. The second topic is a review of the idea of a Cartesian Product.

Partitions & Equivalence Classes:
Given a set B, and subsets A₁, A₂,..., A_n of B, the subsets are said to be partitions of B iff:
```
  A₁ U A₂ U ... U A_n = B 
```
where, i not equal to j implies A_i intersection A_j is the empty set.
The two parts of this definition are of particular importance to testers. The first part indicates that every element of B is in some subset, while the second part guarantees that no element of B is in two of the subsets. This leads to two important notions. That is, completeness (everything is somewhere) and nonredundancy.
Now, suppose we have some partition A₁, A₂,..., A_n of a set B. Let us say that two elements b₁ and b₂ of B are related if b₁ and b₂ are in the same partition (say A_j). As it turns out, the relation in defined from the partition has special properties (i.e. it is reflexive, symmetric, and transitive) and is called the equivalence relation induced by the partition. The partitions are then referred to as equivalence classes.
Note: For more details see any introductory Discrete Mathematics text (e.g. Chapter 2 of Richard Johnsonbaugh's Discrete Math text).
Recall that the two properties of a partition are the notions of completeness and nonredundancy. When translated into testing situations, these notions allow teters to make powerful statements about the extent to which a software product has been tested. Furthermore, considerable effort is saved by testing just one element of an equivalence class and assuming that the remaining elements will behave similarly.
Cartesian Product:
The Cartesian product of two sets A₁ and A₂ is the set of tuples:
```
  A₁ X A₂ = {(x, y): x is in A₁ and y is in A₂}
```
For example, let us say that A₁ = {1, 2, 3} and A₂ = {w, x, y, z} then the Cartesian product would be:
```
  A₁ X A₂ = {(1, w), (1, x), (1, y), (1, z),
            (2, w), (2, x), (2, y), (2, z),
            (3, w), (3, x), (3, y), (3, z)}
```
Notice that the resulting set contains twelve tuples. Recall that the cardinality of a set A is the number of elements in A and is denoted |A|. Thus, we see that for sets A₁ and A₂, the cardinality of the Cartesian product is the multiplicative product of the cardinalities. That is:
```
  |A₁ X A₂| = |A₁| x |A₂|
```
We can easily see that in general, if we have k sets then the Cartesian product of the k sets is the set of k-tuples where the cardinality of the Cartesian product is the multiplicative product of the cardinalities of each of the individual k sets.