The UNION, INTERSECTION, and MINUS Operations

6.2.1 The UNION, INTERSECTION, and MINUS Operations

The next group of relational algebra operations are the standard mathematical operations on sets. For example, to retrieve the Social Security numbers of all employees who either work in department 5 or directly supervise an employee who works in department 5, we can use the UNION operation as follows: 4

4 As a single relational algebra expression, this becomes Result ← π Ssn (σ Dno=5 (EMPLOYEE) ) ∪

π Super_ssn (σ Dno=5 (EMPLOYEE))

6.2 Relational Algebra Operations from Set Theory 153

DEP5_EMPS ← σ Dno = 5 ( EMPLOYEE )

RESULT1 ← π Ssn ( DEP5_EMPS ) RESULT2 ( Ssn )← π Super_ssn ( DEP5_EMPS ) RESULT ← RESULT1 ∪ RESULT2

The relation RESULT1 has the Ssn of all employees who work in department 5, whereas RESULT2 has the Ssn of all employees who directly supervise an employee

who works in department 5. The UNION operation produces the tuples that are in either RESULT1 or RESULT2 or both (see Figure 6.3), while eliminating any dupli- cates. Thus, the Ssn value ‘333445555’ appears only once in the result.

Several set theoretic operations are used to merge the elements of two sets in vari- ous ways, including UNION , INTERSECTION , and SET DIFFERENCE (also called MINUS or EXCEPT ). These are binary operations; that is, each is applied to two sets

(of tuples). When these operations are adapted to relational databases, the two rela- tions on which any of these three operations are applied must have the same type of tuples ; this condition has been called union compatibility or type compatibility. Two

relations R(A 1 ,A 2 , ..., A n ) and S(B 1 ,B 2 , ..., B n ) are said to be union compatible (or type compatible ) if they have the same degree n and if dom(A i ) = dom(B i ) for 1 fi

f n. This means that the two relations have the same number of attributes and each corresponding pair of attributes has the same domain.

We can define the three operations UNION , INTERSECTION , and SET DIFFERENCE on two union-compatible relations R and S as follows:

UNION : The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S. Duplicate tuples are eliminated.

INTERSECTION : The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.

SET DIFFERENCE (or MINUS ): The result of this operation, denoted by R – S, is a relation that includes all tuples that are in R but not in S.

We will adopt the convention that the resulting relation has the same attribute names as the first relation R. It is always possible to rename the attributes in the result using the rename operator.

Result of the UNION operation RESULT ← RESULT1 ∪

RESULT2. 333445555

154 Chapter 6 The Relational Algebra and Relational Calculus

Figure 6.4 illustrates the three operations. The relations STUDENT and INSTRUCTOR in Figure 6.4(a) are union compatible and their tuples represent the names of students and the names of instructors, respectively. The result of the UNION operation in Figure 6.4(b) shows the names of all students and instructors. Note that duplicate tuples appear only once in the result. The result of the INTERSECTION operation (Figure 6.4(c)) includes only those who are both students and instructors.

Notice that both UNION and INTERSECTION are commutative operations; that is,

R ∪ S = S ∪ R and R ∩ S = S ∩ R Both UNION and INTERSECTION can be treated as n-ary operations applicable to

any number of relations because both are also associative operations; that is, R ∪ (S ∪ T ) = (R ∪ S) ∪ T and (R ∩ S ) ∩ T = R ∩ (S ∩ T ) The MINUS operation is not commutative; that is, in general, R−S ≠ S−R

Figure 6.4

The set operations UNION, INTERSECTION, and MINUS. (a) Two union-compatible relations. (b) STUDENT ∪ INSTRUCTOR. (c) STUDENT ∩ INSTRUCTOR. (d) STUDENT − INSTRUCTOR. (e) INSTRUCTOR − STUDENT.

(a) STUDENT

INSTRUCTOR

Ln Susan

Yao Ramesh Shah

Ramesh Shah Johnny

Ricardo Browne

Kohler Barbara Jones

Barbara Jones Amy

Francis

Johnson

Ford Jimmy

Ford

Ramesh Shah

Amy

Wang Ernest

Lname Susan

(e) Fname

Smith Ramesh Shah

Barbara Jones

Ricardo

Browne

Amy

Ford

Francis

Johnson

Jimmy

Wang

6.2 Relational Algebra Operations from Set Theory 155

Figure 6.4(d) shows the names of students who are not instructors, and Figure 6.4(e) shows the names of instructors who are not students.

Note that INTERSECTION can be expressed in terms of union and set difference as follows:

R ∩ S = ((R ∪ S ) − (R − S )) − (S − R) In SQL, there are three operations— UNION , INTERSECT , and EXCEPT —that corre-

spond to the set operations described here. In addition, there are multiset opera- tions ( UNION ALL , INTERSECT ALL , and EXCEPT ALL ) that do not eliminate duplicates (see Section 4.3.4).