Structured Query Language (SQL) contd.

Arithmetic

The SELECT clause may contain arithmetic expressions involving column names and/or constants and the arithmetic operators +, -, *, and /. Consider the following relation:

ASSISTANT (SID, ClassName, Hours)

This relation represents hours worked by teaching assistants for duties pertaining to the classes they have been assigned. An instance of this relation may be:

         SID  ClassName  Hours
         ---------------------
         100  CS150        20
         200  BA200        10
         200  BF410         5
         200  CS250        15

Consider the following SQL statement:

         SELECT SID, Hours*20
           FROM ASSISTANT
The resulting table would contain SID and earnings (assuming an hourly rate of $20.00) for each student in the ASSISTANT table. Note that an arithmetic expression will evaluate to null if the value of an attribute happens to be null.
         SID  
         --------
         100  400
         200  200
         200  100
         200  300

 

Product

Produces a table that is the Cartesian product of the tuples of the tables specified in the FROM clause.

         SELECT *
           FROM STUDENT, ENROLLMENT

Union

SQL provides the keyword UNION for this operation. Two queries connected by the UNION keyword are required.

         SELECT SID FROM ENROLLMENT
         UNION
         SELECT SID FROM STUDENT

Using the relations defined on page 222 (Fig 8-11), another example would be:

         SELECT Snum, Name, Major FROM JUNIOR
         UNION
         SELECT Number, Name, Interest FROM HONOR-STUDENT

This query would produce the table presented in Fig 8-12 of the text.

Intersection

SQL provides the keyword INTERSECT for this operation. As for UNION, two queries connected by the INTERSECT keyword are required.

         SELECT SID FROM ENROLLMENT
         INTERSECT
         SELECT SID FROM STUDENT

Using the relations defined on page 222 (Fig 8-11), another example would be:

         SELECT Snum, Name, Major FROM JUNIOR
         INTERSECT
         SELECT Number, Name, Interest FROM HONOR-STUDENT

This query would produce the table presented in Fig 8-14 of the text.

Difference

SQL provides the keyword EXCEPT for this operation. As for UNION, two queries connected by the EXCEPT keyword are required.

         SELECT SID FROM ENROLLMENT
         EXCEPT
         SELECT SID FROM STUDENT

Using the relations defined on page 222 (Fig 8-11), another example would be:

         SELECT Snum, Name, Major FROM JUNIOR
         EXCEPT
         SELECT Number, Name, Interest FROM HONOR-STUDENT

This query would produce the table presented in Fig 8-13 of the text.

Note:

Each of the above operations automatically eliminates duplicates. In order to retain duplicates the keyword ALL as a suffix is required (i.e. UNION ALL, INTERSECT ALL, EXCEPT ALL). To understand the effect of the ALL keyword, consider tables r and s. Suppose that a tuple occurs m times in r and n times in s, then it occurs:

To illustrate, let us consider the ENROLLMENT table (page 237), and the ASSISTANT table above. Let the two queries for our example be projections on SID. So, we will have two tuples for SID 200 and three tuples for SID 400 from the ENROLLMENT table and three tuples for SID 200 from the ASSISTANT table. Hence, SID 200 occurs 2 times in the projection of ENROLLMENT and 3 times in the projection of ASSISTANT.

The following query will therefore result in 5 instances (i.e. 2 + 3) of SID 200.

         SELECT SID FROM ENROLLMENTS
         UNION ALL
         SELECT SID FROM ASSISTANT

The following query will therefore result in 2 instances (i.e. min (2, 3) of SID 200.

         SELECT SID FROM ENROLLMENTS
         INTERSECT ALL
         SELECT SID FROM ASSISTANT

The following query will therefore result in 0 instances (i.e. max (0, 2-3) of SID 200.

         SELECT SID FROM ENROLLMENTS
         EXCEPT ALL
         SELECT SID FROM ASSISTANT

However, the following query will result in 1 instance (i.e. max (0, 3-2) of SID 200.

         SELECT SID FROM ASSISTANT
         EXCEPT ALL
         SELECT SID FROM ENROLLMENTS

As an exercise, verify the above and determine all tuples in the resulting tables for each of the above.

 

Aggregate/Built-in Functions

SQL provides 5 aggregate functions. These functions operate on the values of a column of a table and return a single value (see page 242).

       Function   Meaning
       ------------------------------------------
       AVG        Computes the average value
       MIN        Determines the minimum value
       MAX        Determines the maximum value
       SUM        Computes the sum of the values 
       COUNT      Counts the values

See page 242 for details of the following example.

         SELECT COUNT(Major)
           FROM STUDENT

Note that the DISTINCT keyword must be specified to ensure that duplicates are eliminated. Also, all aggregate functions except COUNT(*) ignore tuples with null values in the aggregated column. If there are no non-null values then the result is null.

 

Boolean Expressions and Nulls - Three Valued Logic (3VL)

Consider the following boolean expression that may appear in a WHERE clause:

Major = 'Math' AND Age < 30

SQL evaluates this expression for each tuple in the subject table. Let us assume that Major and Age cannot contain null values. In this case, the so called, two valued logic (2VL) is sufficient. That is, SQL determines the truth value of each of the comparisons and then the truth value for the conjunction.

Recall truth tables for 2VL given two expressions P and Q:


  AND: P  Q  P AND Q       NOT: P  NOT P
       -------------            --------
       T  T     T               T    F
       T  F     F               F    T
       F  T     F
       F  F     F

   OR: P  Q  P OR Q
       ------------ 
       T  T     T
       T  F     T
       F  T     T
       F  F     F

This scheme however breaks down when null values are allowed. First, note that any comparison involving a null value evaluates to null. This means that we must accommodate T, F, and null in determining the truth value of any boolean expression. This is what is known as three valued logic (3VL).

The truth tables for 3VL given expressions P and Q are:


  AND: P    Q    P AND Q    AND: P    Q    P AND Q    NOT: P    NOT P
       -----------------         -----------------         ----------
       T    T       T            F    F       F            T     F
       T    null    null         F    null    F            F     T
       null T       null         null F       F            null  null
       null null    null         null null    null

   OR: P    Q    P OR Q      OR: P    Q    P OR Q  
       -----------------         -----------------
       T    T       T            F    F       F  
       T    null    T            F    null    null 
       null T       T            null F       null
       null null    null         null null    null

SQL treats boolean expressions that evaluate to null as it would treat an expression that evaluates to F. Assume that for some tuple, Age is 20 and Major is null, then the condition:

Major = 'Math' AND Age < 30

evaluates to null and so the tuple would not be selected.

 

Grouping

SQL provides the GROUP BY clause to allow logical grouping, on the values of a column, for tuples that result from a query. However, GROUP BY must be used in conjunction with an aggregate function (see page 243).

See page 243 for details of the following example.

         SELECT Major, COUNT(*)
           FROM STUDENT
           GROUP BY Major

SQL also provides the HAVING clause to eliminate groups based on some condition.

See page 243 for details of the following example.

         SELECT Major, COUNT(*)
           FROM STUDENT
           GROUP BY Major
           HAVING COUNT(*) > 2 

 

Basic Join

Recall, from above, the SQL for producing a Cartesian product. The basic join is accomplished by adding a WHERE clause. In our discussion of the relational algebra join operator, we mentioned that a join is the linking of relations by a common attribute. The WHERE clause accomplishes the linking of the tables specified in the FROM clause by a common column (i.e. from the same domain). See page 246 for details of the following example.

         SELECT SID, Name, ClassName
           FROM STUDENT, ENROLLMENT
           WHERE SID=StudentNumber

Notice that this statement is simpler than the example on page 246. This is because the author is "qualifying" the column names with the table name of the column. This is a mechanism provided in SQL to remove ambiguity when the same column name is used in the tables in the FROM clause. It is not needed in this example and so qualification is not done here. However, there are many cases where qualification is required.

 

Readings: Chapter 9, pg 242-246