Lecture #2-2

Relational Model

E. F. Codd proposed the Relational Model in his seminal paper "A Relational Model of Data for Large Shared Data Banks" in 1970. Codd argued that, despite significant progress in the development of database technology, the lack of a sound theoretical basis was a fundamental weakness of the emerging technology.

Codd argued that the set-theoretic notion of a relation should be used to represent the "things" of interest to an organization (i.e. entities). In the paper, Codd addresses both the structural and the manipulative aspects of relations. The structural aspect concerns how relations are defined and represented; the manipulative aspect concerns the allowable operations on relations (i.e. a relational algebra).

Relational Structure

The definition of a relation specifies the name of the relation, the attributes associated with the relation, and the "primary" key of the relation. The following general form is used to define relations:

relation_name (attribute list)

For example, a relation to represent customers may be defined as follows, where the primary key, CustomerNumber, is conventionally underlined.

CUSTOMER (CustomerNumber, Name, PhoneNumber)

Note that when specifying the attributes, the domain of each attribute must also be defined. The domain is the set of allowable values for the attribute.

An instance of a relation is represented by a table with a header, which contains attribute names as column headings, and a body, which contains tuples (or rows) corresponding to the individual items that constitute the relation.

e.g. An instance of the CUSTOMER relation could be:


         CustomerNumber  Name              PhoneNumber
         ------------------------------------------------
             0000001     John Brown        (312) 111-2222  
             0000002     Mary Jones        (708) 715-8262  
             0000003     Bill Winston      (847) 212-3000  
             0000004     John Brown        (312) 501-2024
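
The header/body view of an instance can be sketched in Python. This is only an illustration, not part of Codd's formalism: the class name Customer, the snake_case field names, and the choice of str as the domain of every attribute are assumptions made for the sketch.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    """One tuple (row) of the CUSTOMER relation."""
    customer_number: str  # primary key; a str domain is assumed here
    name: str
    phone_number: str


# Header: the attribute names of the relation.
HEADER = Customer._fields

# Body: the set of tuples shown in the table above.
CUSTOMER = {
    Customer("0000001", "John Brown", "(312) 111-2222"),
    Customer("0000002", "Mary Jones", "(708) 715-8262"),
    Customer("0000003", "Bill Winston", "(847) 212-3000"),
    Customer("0000004", "John Brown", "(312) 501-2024"),
}
```

A Python set is used for the body because, like a relation, it is unordered and holds no duplicates. The two "John Brown" rows remain distinct tuples because they differ in customer number and phone number.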

Note:

Observe that any instance of a relation is just a subset of the Cartesian product of the domains of its attributes.
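
The observation above can be checked on a small example. The sketch below assumes two deliberately tiny, hypothetical domains (NUMBERS and NAMES) so that the full Cartesian product stays small enough to enumerate.

```python
from itertools import product

# Hypothetical domains, kept tiny for illustration.
NUMBERS = {"0000001", "0000002"}
NAMES = {"John Brown", "Mary Jones"}

# The Cartesian product: every possible (number, name) combination.
all_pairs = set(product(NUMBERS, NAMES))

# One possible instance of a relation over these two attributes.
instance = {
    ("0000001", "John Brown"),
    ("0000002", "Mary Jones"),
}

# An instance is valid only if it is a subset of the product.
is_valid_instance = instance <= all_pairs
```

Here all_pairs contains four tuples and the instance picks out two of them, so is_valid_instance is True. A tuple containing a value outside its domain would fall outside the product, and the subset test would fail.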

Primary key

A discussion of primary keys first requires a definition of functional dependence.

Consider a relation with attributes A and B and possibly others. Attribute B is functionally dependent on attribute A if each value of A determines one and only one value of B.

e.g. Consider a "student" relation with attributes StudentID and DateOfBirth. DateOfBirth is functionally dependent on StudentID (we denote the dependence as StudentID -> DateOfBirth) since each student has exactly one date of birth. Notice that the reverse is not true. That is, StudentID is not functionally dependent on DateOfBirth, since two or more students may share the same date of birth.
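
A minimal sketch of this test against a concrete instance follows. The function name holds and the sample student rows are hypothetical. Note that a functional dependence is a statement about every legal instance of a relation, so a check like this can refute a dependence on given data but cannot prove it in general.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        a, b = row[lhs], row[rhs]
        if seen.setdefault(a, b) != b:
            return False  # same lhs value, two different rhs values
    return True


# Hypothetical sample data: students 1 and 2 share a date of birth.
STUDENTS = [
    {"StudentID": 1, "DateOfBirth": "2001-05-17"},
    {"StudentID": 2, "DateOfBirth": "2001-05-17"},
    {"StudentID": 3, "DateOfBirth": "2000-11-02"},
]

forward = holds(STUDENTS, "StudentID", "DateOfBirth")  # StudentID -> DateOfBirth
reverse = holds(STUDENTS, "DateOfBirth", "StudentID")  # DateOfBirth -> StudentID
```

On this data forward is True and reverse is False, because the date 2001-05-17 is paired with two different StudentID values.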

A primary key is an attribute (or, for a composite key, a set of attributes) that determines the value of every other attribute within the relation. That is, every attribute in the relation is functionally dependent on the primary key.

Note:

A primary key cannot contain null values.
A key may be composed of more than one attribute. Such keys are known as composite keys.
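
The two notes above suggest a simple check on an instance: the key attributes must contain no nulls and no repeated combination of values. The sketch below is an assumption-laden illustration: the function name can_be_key is hypothetical, Python's None stands in for null, and the rows are those of the CUSTOMER instance shown earlier.

```python
def can_be_key(rows, attrs):
    """Return True if `attrs` has no nulls and no repeated value in `rows`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if None in value or value in seen:
            return False
        seen.add(value)
    return True


CUSTOMER = [
    {"CustomerNumber": "0000001", "Name": "John Brown", "PhoneNumber": "(312) 111-2222"},
    {"CustomerNumber": "0000002", "Name": "Mary Jones", "PhoneNumber": "(708) 715-8262"},
    {"CustomerNumber": "0000003", "Name": "Bill Winston", "PhoneNumber": "(847) 212-3000"},
    {"CustomerNumber": "0000004", "Name": "John Brown", "PhoneNumber": "(312) 501-2024"},
]

by_number = can_be_key(CUSTOMER, ["CustomerNumber"])
by_name = can_be_key(CUSTOMER, ["Name"])
by_name_and_phone = can_be_key(CUSTOMER, ["Name", "PhoneNumber"])
```

CustomerNumber passes, Name fails because "John Brown" appears twice, and the composite (Name, PhoneNumber) passes on this particular instance. Passing on one instance only shows that the data does not rule the key out; choosing a primary key is a design decision about all legal instances.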

NULL Values

In a subsequent paper that appeared in 1979, Codd addressed the notion of null values. A null value is a special marker which indicates that an attribute value is unknown. Note that a null value is not the same as zero or a blank (zero-length) string.
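
The distinction between null, zero, and blank can be illustrated with a small sketch. Using Python's None to stand in for null is an assumption of the sketch, and the function name describe is hypothetical. None is only a rough analogue: in Python, None == None is True, which differs from the way SQL treats comparisons involving NULL.

```python
from typing import Optional


def describe(phone: Optional[str]) -> str:
    """Distinguish an unknown value from a known but blank one."""
    if phone is None:
        return "unknown"
    if phone == "":
        return "known to be blank"
    return "known: " + phone


unknown = describe(None)           # no value recorded
blank = describe("")               # value recorded, but empty
known = describe("(312) 111-2222")
```

The three calls return different descriptions, which is the point: an unknown phone number and an empty phone number are different facts about a customer.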

Properties of Relations

To summarize, we consider the following general properties:

Each relation must have a name that is distinct from the name of all other relations.
Each attribute of a relation must have a name that is distinct from the name of other attributes in the relation.
The values of an attribute must be from its associated domain.
The value of a particular attribute of a particular tuple must be an atomic value (e.g. it cannot be an array).
Note: We will address this point in more detail when we discuss normal forms.
The order of attributes in a relation has no significance.
The order of tuples in a relation has no significance.
Each tuple in a relation is distinct. That is, relations do not contain duplicate tuples.
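
The last three properties can be demonstrated with a short sketch. It assumes a hypothetical helper, make_tuple, that models a tuple as an unordered set of (attribute name, value) pairs, and it models a relation as a Python set of such tuples.

```python
def make_tuple(**attributes):
    """Model a tuple as an unordered, hashable set of (name, value) pairs."""
    return frozenset(attributes.items())


r1 = {
    make_tuple(CustomerNumber="0000001", Name="John Brown"),
    make_tuple(CustomerNumber="0000002", Name="Mary Jones"),
}

# Same content, with attributes and tuples listed in a different order,
# and one tuple written twice.
r2 = {
    make_tuple(Name="Mary Jones", CustomerNumber="0000002"),
    make_tuple(Name="John Brown", CustomerNumber="0000001"),
    make_tuple(CustomerNumber="0000001", Name="John Brown"),  # duplicate
}

same_relation = (r1 == r2)
```

Here same_relation is True and r2 holds only two tuples: the order of attributes and of tuples makes no difference, and the duplicate collapses into the tuple already present.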

Readings: Chapter 5, pg 122-125