Structured Query Language (SQL)

SQL is the standard data manipulation language for relational database products. It was developed by D. Chamberlin and others at the IBM San Jose Research Laboratory (1974) and was initially called Structured English Query Language (SEQUEL). For legal reasons the name was later changed to SQL. Development of the language is a direct result of Codd's (1970) paper on the relational model.

In 1982 the American National Standards Institute (ANSI) chartered its Database Committee to develop a proposal for a standard relational language. The proposal was completed and ratified by ANSI in 1986 and in 1987 it was also accepted as an international standard by the International Organization for Standardization (ISO). Since then the standard has undergone several revisions. Our text is based on the 1992 revision, which is usually referred to as SQL-92. We will follow that standard for this class but where appropriate revisions from the 1999 standard will be introduced.

The function of the SQL language is to support the definition, manipulation, and control of data in a relational database and so the syntax of SQL may be classified thus:

  1. Data Definition Language (DDL):

    Syntax that addresses database and table creation. These statements are based on the CREATE, ALTER, DROP keywords.

  2. Data Manipulation Language (DML):

    Syntax that addresses data retrieval and update.

  3. Data Control:

    Syntax that addresses security and concurrency.

For this class we will be primarily interested in the DML. We will start with data retrieval statements and will discuss the syntax as it relates to the basic operations of the relational algebra. The examples presented on pages 236 - 248 will be used. The basic form for all queries is:

         SELECT column list
           FROM table list
          WHERE condition

Note:

 

Projection

Produces a table where the columns constitute a subset of the columns of the table specified in the FROM clause. See page 237 for details of the following example.

         SELECT SID, Name, Major
           FROM STUDENT

 

Selection

Produces a table that is a subset of the tuples of the table specified in the FROM clause. See page 238 for details of the following example.

         SELECT SID, Name, Major, GradeLevel
           FROM STUDENT
           WHERE Major='Math'

In cases such as this where all columns from the subject table are needed, an asterisk may be specified for the comma list. That is, we could write the following instead:

         SELECT *
           FROM STUDENT
           WHERE Major='Math'

Boolean expressions consist of simple logical expressions involving, comparison operators, connected by logical operators. The comparison operators defined for SQL are =, <, ≤, >, ≥, and <>. The logical operators are NOT, AND, OR (in precedence order). Parentheses may be used to override precedence.

In addition to these operators, several special operators are available.

       Symbol      Meaning
       -------------------------------------------------------
       IN          Used to check for set membership
       BETWEEN     Used to define range limits
       LIKE        Used for string pattern matching
       IS NULL     Used to check if an attribute valuee is null
  1. IN

    See page 239 for details of the following example. Note that the example presented in the text is non-standard SQL. The SQL standard (also SQL Server) stipulates () instead of [] to enclose items for the IN list.

             SELECT Name
               FROM STUDENT
               WHERE Major IN ['MATH', 'ACCOUNTING']
    

    is equivalent to:

             SELECT Name
               FROM STUDENT
               WHERE Major = 'MATH' OR Major = 'ACCOUNTING'
    
  2. BETWEEN

    See page 240 for details of the following example. Note that the end values of BETWEEN are included in the selected range.

             SELECT Name, Major
               FROM STUDENT
               WHERE SID BETWEEN 200 AND 300
    

    is equivalent to:

             SELECT Name, Major
               FROM STUDENT
               WHERE SID ≥ 200 and ≤ 300
    
  3. LIKE

    See page 240 for details of the following example. Note that the explanation of the % symbol is incorrect. That is, the % symbol represents a series of zero or more unspecified characters.

             SELECT Name
               FROM STUDENT
               WHERE Name LIKE '%S'
    

    Also, note that we may use the ESCAPE keyword to define an escape symbol to allow us to match on % or _. The following example allows for matching any string that contains the underscore character (the backslash \ is used as the escape symbol).

             SELECT Name
               FROM STUDENT
               WHERE Name LIKE '%\_%' ESCAPE '\'
    
  4. IS NULL

    See page 241 for details of the following example.

             SELECT Name
               FROM STUDENT
               WHERE GradeLevel IS NULL
    

Duplicates

SQL queries may result in duplicate tuples and so, strictly speaking, it is not correct to refer to SQL tables as relations. SQL does provide a mechanism to eliminate duplicates. We do this by specifying the keyword DISTINCT after SELECT.

         SELECT DISTINCT Major
           FROM STUDENT

 

Sorting

The tuples of a query may be ordered by the values in one or more columns. We may specify the keywords DESC for descending order or ASC for ascending order for each column (ascending order is the default). See page 241 for details of the following example.

         SELECT Name, Major, GradeLevel
           FROM STUDENT
           WHERE GradeLevel IN ('FR', 'SO', 'SN')
           ORDER BY Major ASC, GradeLevel DESC

 

Readings: Chapter 9, pg 236-241