Relational calculus

2.5. Relational calculus

Relational calculus represents an alternative to relational algebra as a candidate for the manipulative part of the relational data model. The difference between the two is as follows:

 Algebra provides a collection of explicit operations like union, intersect, difference, select, project, join, etc., that can be actually used to build some desired relation from the given relations in the database.

 Calculus provides a notation for formulating the definition of that desired relation in terms of those given relations.

For example, consider the query “Get owners’ complete names and cities for owners who own a red car.”

An algebraic version of this query could be:

 Join relation OWNERS with relation CARS on the IdentificationNumber attributes.

 Select from the resulting relation only those tuples with car Color = ’RED’.

 Project the result of that restriction on owner FirstName, LastName and City.

A calculus formulation, by contrast, might look like:

 Get FirstName, LastName and City for car owners such that there exists a car with the same IdentificationNumber and with RED color.

Here the user has merely stated the defining characteristics of the desired set of tuples, and it is left to the system to decide exactly how this will be done. We might say that the calculus formulation is descriptive where the algebraic one is prescriptive. The calculus simply states what the problem is, while the algebra gives a procedure for solving that problem.
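The contrast between the two styles can be sketched in Python. This is only an illustration under stated assumptions: the sample tuples and the function names `algebra_style` and `calculus_style` are hypothetical and not part of the relational model; relations are modeled as lists of dictionaries.

```python
# Hypothetical sample data; attribute names follow the running example.
OWNERS = [
    {"IdentificationNumber": "SB24MEA", "FirstName": "Ana",
     "LastName": "Pop", "City": "SIBIU"},
    {"IdentificationNumber": "B01XYZ", "FirstName": "Dan",
     "LastName": "Ilie", "City": "BUCHAREST"},
]
CARS = [
    {"IdentificationNumber": "SB24MEA", "Color": "RED"},
    {"IdentificationNumber": "B01XYZ", "Color": "BLUE"},
]


def algebra_style(owners, cars):
    """Prescriptive: join, then select, then project, in that order."""
    joined = [
        {**o, **c}
        for o in owners
        for c in cars
        if o["IdentificationNumber"] == c["IdentificationNumber"]
    ]
    selected = [t for t in joined if t["Color"] == "RED"]
    return {(t["FirstName"], t["LastName"], t["City"]) for t in selected}


def calculus_style(owners, cars):
    """Descriptive: state the condition each result tuple must satisfy."""
    return {
        (o["FirstName"], o["LastName"], o["City"])
        for o in owners
        if any(
            c["IdentificationNumber"] == o["IdentificationNumber"]
            and c["Color"] == "RED"
            for c in cars
        )
    }


algebra_result = algebra_style(OWNERS, CARS)
calculus_result = calculus_style(OWNERS, CARS)
```

Both functions return the same set of tuples for the same input. The first spells out a sequence of operations, while the second only states the membership condition.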

The fact is that algebra and calculus are precisely equivalent to one another. For every expression of the algebra, there is an equivalent expression in the calculus; likewise, for every expression of the calculus, there is an equivalent expression in the algebra. There is a one-to-one correspondence between the two of them. The different formalisms simply represent different styles of expression. Calculus is more like natural language while algebra is closer to a programming language.

Relational calculus is founded on a branch of mathematical logic called the predicate calculus. Kuhns [2.4] appears to have originated the idea of using predicate calculus as the basis for a database language, but Codd was the first to propose the concept of relational calculus, an applied predicate calculus specifically tailored to relational databases, in [2.3]. A language explicitly based on relational calculus was also presented by Codd in [2.5]. It was called data sublanguage ALPHA, and it was never implemented in its original form. The language QUEL from INGRES is very similar to data sublanguage ALPHA. Codd also gave an algorithm, Codd’s reduction algorithm, by which an arbitrary expression of the calculus can be reduced to a semantically equivalent expression of the algebra.

There are two types of relational calculus:

 Tuple-oriented relational calculus – based on the tuple variable concept

 Domain-oriented relational calculus – based on the domain variable concept

2.5.1 Tuple-oriented relational calculus

A tuple variable is a variable that ranges over some relation. It is a variable whose only permitted values are tuples of that relation. In other words, if tuple variable T ranges over relation R, then, at any given time, T represents some tuple t of R.

A tuple variable is defined as:

RANGE OF T IS X1; X2; …; Xn

where T is a tuple variable and X1, X2, …, Xn are tuple calculus expressions representing relations R1, R2, …, Rn. Relations R1, R2, …, Rn must all be union-compatible, and corresponding attributes must be identically named in every relation. Tuple variable T ranges over the union of those relations. If the list of tuple calculus expressions identifies just one named relation R (the normal case), then the tuple variable T ranges over just the tuples of that single relation.

Each occurrence of a tuple variable can be free or bound. If a tuple variable occurs in the context of an attribute reference of the form T.A, where A is an attribute of the relation over which T ranges, it is called a free tuple variable. If a tuple variable occurs as the variable immediately following one of the quantifiers, the existential quantifier ∃ or the universal quantifier ∀, it is called a bound variable.

A tuple calculus expression is defined as:

T.A, U.B, …, V.C WHERE f

where T, U, …, V are tuple variables, A, B, …, C are attributes of the associated relations, and f is a relational calculus formula containing exactly T, U, …, V as free variables. The value of this expression is defined to be a projection of that subset of the extended Cartesian product T×U×…×V (where T, U, …, V range over all of their possible values) for which f evaluates to true or, if “WHERE f” is omitted, a projection of that entire Cartesian product. The projection is taken over the attributes indicated by T.A, U.B, …, V.C. No target item may appear more than once in that list.

For example, the query “Get FirstName, LastName and City for car owners such that there exists a car with the same IdentificationNumber and with RED color” can be expressed as follows:

RANGE OF OWNERS IS OWNERS.FirstName, OWNERS.LastName, OWNERS.City WHERE
∃ CARS(CARS.IdentificationNumber=OWNERS.IdentificationNumber
AND CARS.Color=’RED’)
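The definition of a tuple calculus expression (a projection of the subset of the Cartesian product for which the formula holds) can be sketched as a small Python evaluator. This is a minimal sketch, not a real query engine: the helper names `tuple_calculus` and `exists`, the representation of relations as lists of dictionaries, and the sample tuples are all assumptions made for illustration.

```python
from itertools import product


def tuple_calculus(targets, ranges, formula=lambda env: True):
    """Evaluate 'T.A, U.B, ... WHERE f' (hypothetical helper).

    targets: list of (variable, attribute) pairs, the target items
    ranges:  dict mapping each free tuple variable to its relation
    formula: predicate over {variable: current tuple}; defaults to
             always true, which models an omitted WHERE clause
    """
    names = list(ranges)
    result = set()
    # Let the free variables range over the extended Cartesian product.
    for combo in product(*(ranges[name] for name in names)):
        env = dict(zip(names, combo))
        if formula(env):
            # Project onto the attributes named in the target list.
            result.add(tuple(env[var][attr] for var, attr in targets))
    return result


def exists(relation, condition):
    """Existential quantifier: some tuple of the relation satisfies it."""
    return any(condition(t) for t in relation)


# Hypothetical sample data.
owners = [
    {"IdentificationNumber": "SB24MEA", "FirstName": "Ana",
     "LastName": "Pop", "City": "SIBIU"},
    {"IdentificationNumber": "B01XYZ", "FirstName": "Dan",
     "LastName": "Ilie", "City": "BUCHAREST"},
]
cars = [
    {"IdentificationNumber": "SB24MEA", "Color": "RED"},
    {"IdentificationNumber": "B01XYZ", "Color": "BLUE"},
]

red_car_owners = tuple_calculus(
    targets=[("OWNERS", "FirstName"), ("OWNERS", "LastName"),
             ("OWNERS", "City")],
    ranges={"OWNERS": owners},  # OWNERS is the only free variable
    formula=lambda env: exists(
        cars,  # the CARS variable is bound by the quantifier
        lambda c: c["IdentificationNumber"]
        == env["OWNERS"]["IdentificationNumber"]
        and c["Color"] == "RED",
    ),
)
```

Here the free variable OWNERS appears in `ranges` and in the target list, while the bound variable CARS exists only inside the quantifier, mirroring the free/bound distinction described above.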


The tuple calculus is formally equivalent to the relational algebra. The QUEL language from INGRES is based on tuple-oriented relational calculus.

2.5.2 Domain-oriented relational calculus

Lacroix and Pirotte in [2.6] proposed an alternative relational calculus called the domain calculus, in which tuple variables are replaced by domain variables. A domain variable is a variable that ranges over a domain instead of a relation.

Each occurrence of a domain variable can also be free or bound. A bound domain variable occurs as the variable immediately following one of the quantifiers: the existential quantifier ∃ or the universal quantifier ∀. In all other cases, the variable is called a free variable.

Domain-oriented relational calculus uses membership conditions. A membership condition takes the form

R (term, term, …)

where R is a relation name, and each term is a pair of the form A:v, where A is an attribute of R and v is either a domain variable or a constant. The condition evaluates to true if and only if there exists a tuple in relation R having the specified values for the specified attributes.

For example, the expression OWNERS(IdentificationNumber:’SB24MEA’, City:’SIBIU’) is a membership condition which evaluates to true if and only if there exists a tuple in relation OWNERS with IdentificationNumber value SB24MEA and City value SIBIU. Likewise, the membership condition

R (A:AX, B:BX, …)

evaluates to true if and only if there exists an R tuple with A attribute value equal to the current value of domain variable AX (whatever that may be), the B attribute value equal to the current value of domain variable BX (again, whatever that may be) and so on.
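A membership condition can be sketched in Python as a function that checks whether some tuple matches every attribute:value term. The function name `member`, the use of keyword arguments for the terms, and the sample tuple are assumptions made for illustration only.

```python
def member(relation, **terms):
    """Membership condition R(A:v, B:w, ...), as a hypothetical helper.

    Returns True if and only if some tuple of the relation has the
    given value for every named attribute.
    """
    return any(
        all(t[attr] == value for attr, value in terms.items())
        for t in relation
    )


# Hypothetical sample data.
owners = [
    {"IdentificationNumber": "SB24MEA", "FirstName": "Ana",
     "LastName": "Pop", "City": "SIBIU"},
]

found = member(owners, IdentificationNumber="SB24MEA", City="SIBIU")
missing = member(owners, IdentificationNumber="SB24MEA", City="CLUJ")
```

With a constant in each term the condition is a plain lookup; passing the current value of a domain variable instead of a constant gives the R (A:AX, B:BX, …) form.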

For example, the query “Get FirstName, LastName and City for car owners such that there exists a car with the same IdentificationNumber and with RED color” can be expressed as follows:

FirstNameX, LastNameX, CityX WHERE ∃ IdentificationNumberX
(OWNERS(IdentificationNumber:IdentificationNumberX,
FirstName:FirstNameX, LastName:LastNameX, City:CityX)
AND CARS(IdentificationNumber:IdentificationNumberX, Color:’RED’))
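The domain calculus query can be sketched in Python by letting each domain variable range over a set of values and testing the two membership conditions. This sketch makes a simplifying assumption: each domain is approximated by the values that actually occur in the relations (the helper `active_domain`), since a true domain may be unbounded. The helper names and the sample data are hypothetical.

```python
from itertools import product


def member(relation, **terms):
    """Membership condition: some tuple matches every attribute:value term."""
    return any(
        all(t[attr] == value for attr, value in terms.items())
        for t in relation
    )


def active_domain(relation, attribute):
    """Assumption: approximate a domain by the values present in a relation."""
    return {t[attribute] for t in relation}


# Hypothetical sample data.
owners = [
    {"IdentificationNumber": "SB24MEA", "FirstName": "Ana",
     "LastName": "Pop", "City": "SIBIU"},
    {"IdentificationNumber": "B01XYZ", "FirstName": "Dan",
     "LastName": "Ilie", "City": "BUCHAREST"},
]
cars = [
    {"IdentificationNumber": "SB24MEA", "Color": "RED"},
    {"IdentificationNumber": "B01XYZ", "Color": "BLUE"},
]

# Values over which the bound variable IdentificationNumberX ranges.
id_values = (active_domain(owners, "IdentificationNumber")
             | active_domain(cars, "IdentificationNumber"))

# Free variables FirstNameX, LastNameX, CityX range over their domains.
red_car_owners = {
    (first_name, last_name, city)
    for first_name, last_name, city in product(
        active_domain(owners, "FirstName"),
        active_domain(owners, "LastName"),
        active_domain(owners, "City"),
    )
    if any(  # existential quantifier over IdentificationNumberX
        member(owners, IdentificationNumber=id_x, FirstName=first_name,
               LastName=last_name, City=city)
        and member(cars, IdentificationNumber=id_x, Color="RED")
        for id_x in id_values
    )
}
```

Unlike the tuple calculus version, the variables here hold individual values rather than whole tuples, and the relations appear only inside membership conditions.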

The domain calculus is formally equivalent to the relational algebra.

Database Fundamentals

A language, called ILL, based on that calculus is presented by Lacroix and Pirotte in [2.7]. Another relational language based on domain relational calculus is Query-By-Example (QBE).