Basic concepts
2.2.5 Schemas
A database schema is a formal description of all the database relations and all the relationships existing between them. In Chapter 3, Conceptual data modeling, and Chapter 4, Relational database design, you will learn more about a relational database schema.
2.2.6 Keys
The relational data model uses keys to define identifiers for a relation’s tuples. Keys are also used to enforce rules and constraints on database data, and those constraints are essential for maintaining data consistency and correctness. A relational DBMS lets you define such keys; once they are defined, the DBMS is responsible for verifying and maintaining the correctness and consistency of the database data. Let’s define each type of key.
2.2.6.1 Candidate keys
A candidate key is a unique identifier for the tuples of a relation. By definition, every relation has at least one candidate key (the first property of a relation). In practice, most relations have multiple candidate keys.
C. J. Date in [2.2] gives the following definition for a candidate key:
Let R be a relation with attributes A1, A2, …, An. The set K=(Ai, Aj, …, Ak) of attributes of R is said to be a candidate key of R if and only if it satisfies the following two time-independent properties:
Uniqueness
At any given time, no two distinct tuples of R have the same value for Ai, the same value for Aj, …, and the same value for Ak.
Minimality
None of Ai, Aj, …, Ak can be discarded from K without destroying the uniqueness property.
Every relation has at least one candidate key, because at least the combination of all of its attributes has the uniqueness property (the first property of a relation), but usually there is at least one other candidate key made of fewer attributes of the relation. For example, the CARS relation shown earlier in Figure 2.2 has only one candidate key, K=(Type, Producer, Model, FabricationYear, Color, Fuel), considering that we can have multiple cars with the same characteristics in the relation. Nevertheless, if we create another CARS relation, as in Figure 2.3, by adding two more attributes, SerialNumber (engine serial number) and IdentificationNumber (car identification number), we will have 3 candidate keys for that relation.
Figure 2.3 – The new CARS Relation and its candidate keys
(Table not reproduced. Surviving values: TYPE LIMOUSINE; PRODUCER values MERCEDES, AUDI and BMW; identification numbers SB24MEA, AB08DGF, SB06GHX, SB52MAG and AB02AMR; the remaining values NF37590, WM19875, MW79580 and WQ21998 appear to be serial numbers.)
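The uniqueness and minimality properties can be checked mechanically against a single relation instance. The following is a minimal sketch in Python, not part of the original text: the function names `is_unique` and `candidate_keys`, the reduced attribute list, and the sample rows (including how the values are paired) are all illustrative assumptions. Keep in mind that a check on one instance can only rule a key out; it cannot prove that the key holds for every possible instance.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """Uniqueness: no two rows share the same values for every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the attribute sets that are unique and minimal for this one instance."""
    keys = []
    # Visit smaller attribute sets first, so any superset of a key can be skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Minimality: a superset of a known key is unique but not minimal.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_unique(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical, reduced CARS instance (the pairing of values is assumed).
attributes = ("Type", "Producer", "SerialNumber", "IdentificationNumber")
rows = [
    {"Type": "LIMOUSINE", "Producer": "BMW", "SerialNumber": "NF37590",
     "IdentificationNumber": "SB24MEA"},
    {"Type": "LIMOUSINE", "Producer": "MERCEDES", "SerialNumber": "WM19875",
     "IdentificationNumber": "SB06GHX"},
    {"Type": "LIMOUSINE", "Producer": "AUDI", "SerialNumber": "MW79580",
     "IdentificationNumber": "SB52MAG"},
    {"Type": "LIMOUSINE", "Producer": "BMW", "SerialNumber": "WQ21998",
     "IdentificationNumber": "AB02AMR"},
]

keys = candidate_keys(rows, attributes)
print(keys)
```

For this sample the function reports SerialNumber and IdentificationNumber as single-attribute keys, because Type and Producer repeat across rows and every larger attribute set is skipped as non-minimal.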
A candidate key is sometimes called a unique key. A unique key can be specified at the Data Definition Language (DDL) level by adding the UNIQUE keyword next to the attribute name. If a relation has more than one candidate key, the one that is chosen to represent the relation is called the primary key, and the remaining candidate keys are called alternate keys.
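As an illustration of declaring unique keys in DDL, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in DBMS. The table name `cars`, the reduced column list, and the sample values are assumptions made for the example; other database systems accept similar UNIQUE syntax, but details vary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each candidate key is declared with UNIQUE next to the attribute name.
conn.execute("""
    CREATE TABLE cars (
        Producer             TEXT NOT NULL,
        SerialNumber         TEXT NOT NULL UNIQUE,
        IdentificationNumber TEXT NOT NULL UNIQUE
    )
""")
conn.execute("INSERT INTO cars VALUES ('AUDI', 'MW79580', 'SB52MAG')")

# A second car that reuses the same serial number violates the unique key.
try:
    conn.execute("INSERT INTO cars VALUES ('BMW', 'MW79580', 'AB02AMR')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM cars").fetchone()[0]
print(duplicate_rejected, row_count)
```

From the moment the constraint is declared, the DBMS itself refuses the duplicate, so the application does not have to check for it.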
Note:
To define candidate keys correctly, you have to consider all possible relation instances and understand the meaning of the attributes, so that you can determine whether duplicates are possible during the relation’s lifetime.
2.2.6.2 Primary keys
Chapter 2 – The relational data model 43
A primary key is a unique identifier of the relation tuples. As mentioned already, it is a candidate key that is chosen to represent the relation in the database and to provide a way to uniquely identify each tuple of the relation. A database relation always has a primary key.
A relational DBMS allows a primary key to be specified at the moment you create the relation (table). The DDL sublanguage usually has a PRIMARY KEY construct for that. For example, for the CARS relation from Figure 2.3 the primary key will be the candidate key IdentificationNumber. The values of this attribute must be “UNIQUE” and “NOT NULL” for all tuples from all relation instances.
There are situations when the real-world characteristics of the data modeled by a relation do not have unique values. For example, the first CARS relation from Figure 2.2 suffers from this inconvenience. In this case, the primary key must be the combination of all relation attributes. Such a primary key is not convenient in practice, as it would require too much physical storage space, and maintaining relationships between database relations would be more difficult. In those cases, the solution adopted is to introduce another attribute, like an ID, with no meaning to real-world data, which will have unique values and will be used as a primary key. This attribute is usually called a surrogate key. Sometimes, in database literature, you will also find it referred to as an artificial key.
Surrogate keys usually have unique numerical values. Those values are generated automatically, increasing or decreasing by a fixed increment (usually 1).
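The two situations described above, a natural primary key and a surrogate key, might look as follows in DDL. This is a sketch only, using Python's built-in sqlite3 module; the table names, the reduced column lists, and the sample values are assumptions. AUTOINCREMENT is SQLite's spelling; other systems use identity columns or sequences for the same purpose. NOT NULL is written out explicitly because SQLite does not always imply it for a non-integer PRIMARY KEY column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Natural primary key: the candidate key IdentificationNumber is chosen;
# SerialNumber remains an alternate key.
conn.execute("""
    CREATE TABLE cars (
        IdentificationNumber TEXT NOT NULL PRIMARY KEY,
        SerialNumber         TEXT NOT NULL UNIQUE,
        Producer             TEXT
    )
""")
conn.execute("INSERT INTO cars VALUES ('SB24MEA', 'NF37590', 'BMW')")
try:
    # Same primary key value again: rejected by the DBMS.
    conn.execute("INSERT INTO cars VALUES ('SB24MEA', 'WM19875', 'AUDI')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Surrogate primary key: the descriptive attributes are not unique,
# so an Id with no real-world meaning is generated by the DBMS.
conn.execute("""
    CREATE TABLE cars_no_natural_key (
        Id       INTEGER PRIMARY KEY AUTOINCREMENT,
        Type     TEXT,
        Producer TEXT
    )
""")
for _ in range(2):
    conn.execute(
        "INSERT INTO cars_no_natural_key (Type, Producer) "
        "VALUES ('LIMOUSINE', 'BMW')"
    )

surrogate_ids = [
    row[0]
    for row in conn.execute("SELECT Id FROM cars_no_natural_key ORDER BY Id")
]
print(duplicate_rejected, surrogate_ids)
```

The two identical cars in the second table are still distinguishable, because each row receives its own generated Id.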
2.2.6.3 Foreign keys
A foreign key is an attribute (or attribute combination) in one relation R2 whose values are required to match those of the primary key of some relation R1 (R1 and R2 not necessarily distinct). Note that a foreign key and the corresponding primary key should be defined on the same underlying domain.
For example, in Figure 2.4 we have another relation called OWNERS which contains the data about the owners of the cars from relation CARS.
Database Fundamentals
OWNERS Relation
Primary key: ID. Foreign key: IDENTIFICATION NUMBER.

ID  FIRST NAME  LAST NAME  IDENTIFICATION NUMBER
1   JOHN        SMITH      SB24MEA
2   MARY        FORD       AB08DGF
3   ANNE        SHEPARD    SB06GHX
4   WILLIAM     HILL       SB52MAG
5   JOE         PESCI      AB02AMR

(The address columns of the original figure are not reproduced; only fragments such as ALBA, SIBIU, SEBASTIAN and MOLDOVA survive.)
Figure 2.4 – The OWNERS relation and its primary and foreign keys
The IdentificationNumber foreign key from the OWNERS relation refers to the IdentificationNumber primary key from the CARS relation. In this manner, we are able to know which car belongs to each person.
Foreign-to-primary-key matches represent references from one relation to another. They are the “glue” that holds the database together. Another way of saying this is that foreign-to-primary-key matches represent certain relationships between tuples. Note carefully, however, that not all such relationships are represented by foreign-to-primary-key matches.
The DDL sublanguage usually has a FOREIGN KEY construct for defining foreign keys. For each foreign key, the corresponding primary key and its relation are also specified.
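A FOREIGN KEY declaration for the OWNERS and CARS relations could be sketched as follows, again with Python's built-in sqlite3 module. The reduced column lists and the non-existent identification number 'XX00XXX' are assumptions made for the example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; most other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

conn.execute("""
    CREATE TABLE cars (
        IdentificationNumber TEXT NOT NULL PRIMARY KEY
    )
""")
# The foreign key names both the referenced relation and its primary key.
conn.execute("""
    CREATE TABLE owners (
        Id                   INTEGER PRIMARY KEY,
        FirstName            TEXT,
        LastName             TEXT,
        IdentificationNumber TEXT,
        FOREIGN KEY (IdentificationNumber)
            REFERENCES cars (IdentificationNumber)
    )
""")

conn.execute("INSERT INTO cars VALUES ('SB24MEA')")
conn.execute("INSERT INTO owners VALUES (1, 'JOHN', 'SMITH', 'SB24MEA')")

# An owner that refers to a car not present in CARS is rejected.
try:
    conn.execute("INSERT INTO owners VALUES (2, 'MARY', 'FORD', 'XX00XXX')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign-to-primary-key match tells us which car belongs to each person.
owner_cars = conn.execute("""
    SELECT o.FirstName, o.LastName, c.IdentificationNumber
    FROM owners AS o
    JOIN cars AS c ON c.IdentificationNumber = o.IdentificationNumber
""").fetchall()
print(orphan_rejected, owner_cars)
```

Both columns are declared with the same type, which mirrors the requirement that a foreign key and the corresponding primary key be defined on the same underlying domain.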