Entity Relationship (ER) Model

Introduction

So far, we have examined the relational model, discussed relational algebra, and looked at SQL. In each case we assumed that relations/tables were defined and available. However, except for simple problems, the process of determining suitable relations/tables for a given problem is a non-trivial one. The phrase database design is often used to describe this process. We usually consider this process to involve four steps:

  1. Data Modeling:

    Creating a logical representation of the structure of a database. This step results in two primary artifacts:

    Several data modeling techniques are available. We will use the ER modeling technique proposed by Peter Chen in 1976.

  2. Model Transformation:

    Converting the logical representation into a schema. For a relational database design this involves determining table structures.

  3. Normalization:

    Refining the database schema to address weaknesses that may result in modification anomalies (see page 126). These weaknesses usually arise from dependencies between attributes.

  4. Implementation:

    Creating the physical database structures to support the logical representation. For a relational database design this involves determining the SQL DDL statements needed for each table.

Note: The following material should be read in conjunction with the readings outlined for Chapter 3 below.

 

The ER Data Model

The ER data model represents the things of interest to an organization as a set of basic objects, referred to as entities, and relationships between the objects. It is a graphical model. That is, these constructs are represented as annotated graphical primitives.

Following is an overview of the fundamental constructs of the ER model, the corresponding notation (graphical primitives), and heuristics/rules for determining the annotation.

Entities

As originally defined by Peter Chen, an entity is an object that exists and is distinguishable from other objects. For instance, John Smith with SSN 111-12-3456 is an entity, since he can be uniquely identified as a particular person. An entity may be concrete (i.e a person, a book, etc.) or abstract (i.e a project, a holiday, etc.). An entity class (or entity set) is a set of entities of the same type (e.g all employees of a bank).

Given a problem statement, or a set of requirements, we can often determine candidate entities by identifying noun phrases. However, the actual list of entities is a function of the needs of the organization.

Entities are denoted by rectangles with the name of the entity written inside the rectangle.

Note: We will use the term entity to refer to an "entity set" and the phrase entity instance to refer to occurrences of a particular entity (i.e such as John smith above).

Attributes

The characteristics, or properties, of an entity that describe it are known as attributes. For example, we may determine that Name, SSN, Address, Age, PhoneNumber are attributes of the entity EMPLOYEE. We are usually interested in the domain of an attribute (i.e the set of permitted values). In the case of Age for example, it must be a positive integer.

Some attributes may not be atomic. They may represent groups of attributes. For example Address may represent the attributes Street, City, State, Zip. Such attributes are said to be composite attributes.

Attributes may be divided into two groups:

Attributes are denoted by ovals and are connected to the corresponding entity by lines. The name of the attribute is written within the oval. Identifier names are underlined.

Relationships

A relationship is an association between several entities. For example, consider the entities DEPARTMENT and EMPLOYEE. We may define the relationship CONSISTS-OF to denote the association between DEPARTMENTS and their EMPLOYEES.

A relationship is denoted by a diamond and is connected to the associated entities by lines. The name of the relationship is written beneath the diamond.

Given a problem statement, or a set of requirements, we can often determine candidate relationships by identifying verbs.

Relationships are described in terms of:

A relationship may exist among entities of the same type (see degree 1 relationships above). Such entities are referred to as recursive relationships. See Fig 3-5 for an example. Note however that the cardinality for that example should be M:N not 1:N.

In determining the cardinality of a relationship, we were concerned about the maximum number of instances of an entity that could be involved in a relationship. We may also consider the minimum number. If the number is zero, we indicate this by superimposing a zero on the line representing the relationship, and adjacent to the entity. If the number is one, we indicate this by superimposing a dash perpendicular to the line representing the relationship, and adjacent to the entity. This quantity is referred to as the minimum cardinality of the relationship. The example presented in Fig 3-4 indicates that each dormitory is occupied by a minimum of one and a maximum of many students but each student is associated with a minimum of zero and a maximum of one dormitory.

Weak Entities

A weak entity is an entity whose presence in the database depends logically on another entity. We denote such entities by a rectangle with rounded corners. Hence, for a weak entity to be meaningful, it must be part of a 1:N relationship. See the example presented in Fig 3-7.

Recall, that attributes that may contain multiple values (i.e. repeating groups like arrays) are not allowed in the ER model (see page 52). We may use weak entities to represent such attributes. That is, for any attribute known to be multi-valued:

See page 58 and Fig 3-9.

 

Readings: Chapter 3, pg 51-58