26.5.4 Clausal Form and Horn Clauses

Recall from Section 6.6 that a formula in the relational calculus is a condition that includes predicates called atoms (based on relation names). Additionally, a formula can have quantifiers—namely, the universal quantifier (for all) and the existential quantifier (there exists). In clausal form, a formula must be transformed into another formula with the following characteristics:

All variables in the formula are universally quantified. Hence, it is not necessary to include the universal quantifiers (for all) explicitly; the quantifiers are removed, and all variables in the formula are implicitly quantified by the universal quantifier.

In clausal form, the formula is made up of a number of clauses, where each clause is composed of a number of literals connected by OR logical connectives only. Hence, each clause is a disjunction of literals.

The clauses themselves are connected by AND logical connectives only, to form a formula. Hence, the clausal form of a formula is a conjunction of clauses.
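To make the structure concrete, here is a minimal sketch in Python of a formula in clausal form and how its truth value could be computed. It assumes propositional atoms (the implicit universal quantification over variables is not modeled), and the names `eval_literal`, `eval_clause`, and `eval_clausal_form` are illustrative, not taken from any library.

```python
# A literal is a pair (atom_name, is_positive).
# A clause is a list of literals, read as their disjunction (OR).
# A formula in clausal form is a list of clauses, read as their conjunction (AND).

def eval_literal(literal, assignment):
    """Truth value of one literal under an assignment of atoms to booleans."""
    name, is_positive = literal
    value = assignment[name]
    return value if is_positive else not value

def eval_clause(clause, assignment):
    """A clause is true if at least one of its literals is true."""
    return any(eval_literal(lit, assignment) for lit in clause)

def eval_clausal_form(clauses, assignment):
    """The formula is true if every clause is true."""
    return all(eval_clause(clause, assignment) for clause in clauses)

# Hypothetical example: (NOT(P) OR Q) AND (P OR R)
formula = [
    [("P", False), ("Q", True)],
    [("P", True), ("R", True)],
]

result_1 = eval_clausal_form(formula, {"P": True, "Q": True, "R": False})
result_2 = eval_clausal_form(formula, {"P": True, "Q": False, "R": False})
print(result_1, result_2)
```

In the second assignment the first clause fails, because NOT(P) and Q are both false, so the whole conjunction is false.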

It can be shown that any formula can be converted into clausal form. For our purposes, we are mainly interested in the form of the individual clauses, each of which is a disjunction of literals. Recall that literals can be positive literals or negative literals. Consider a clause of the form:

NOT(P1) OR NOT(P2) OR ... OR NOT(Pn) OR Q1 OR Q2 OR ... OR Qm    (1)

This clause has n negative literals and m positive literals. Such a clause can be transformed into the following equivalent logical formula:

P1 AND P2 AND ... AND Pn ⇒ Q1 OR Q2 OR ... OR Qm    (2)

where ⇒ is the implies symbol. The formulas (1) and (2) are equivalent, meaning that their truth values are always the same. This is the case because if all the Pi literals (i = 1, 2, ..., n) are true, formula (2) is true only if at least one of the Qi's is true, which is the meaning of the ⇒ (implies) symbol. For formula (1), if all the Pi literals (i = 1, 2, ..., n) are true, their negations are all false; so in this case formula (1) is true only if at least one of the Qi's is true. If instead some Pi is false, both formulas are true: formula (1) because NOT(Pi) is true, and formula (2) because its left-hand side is false.

In Datalog, rules are expressed as a restricted form of clauses called Horn clauses, in which a clause can contain at most one positive literal. Hence, a Horn clause is either of the form

NOT(P1) OR NOT(P2) OR ... OR NOT(Pn) OR Q    (3)

or of the form

NOT(P1) OR NOT(P2) OR ... OR NOT(Pn)    (4)

The Horn clause in (3) can be transformed into the clause

P1 AND P2 AND ... AND Pn ⇒ Q    (5)

which is written in Datalog as the following rule:

Q :- P1, P2, ..., Pn.    (6)

The Horn clause in (4) can be transformed into

P1 AND P2 AND ... AND Pn ⇒    (7)

which is written in Datalog as follows:

P1, P2, ..., Pn.    (8)

A Datalog rule, as in (6), is hence a Horn clause, and its meaning, based on formula (5), is that if the predicates P1 AND P2 AND ... AND Pn are all true for a particular binding to their variable arguments, then Q is also true and can hence be inferred. The Datalog expression (8) can be considered as an integrity constraint, where all the predicates must be true to satisfy the query.
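The equivalence of formulas (1) and (2), and the translation of a Horn clause into Datalog notation, can be checked mechanically for small cases. The following Python sketch is illustrative only: it treats the Pi and Qi as propositional atoms, compares both formulas by brute force over every truth assignment, and renders a Horn clause as text in the style of (6) or (8). The function names and the literal representation (name, is_positive) are assumptions made for this example, and `to_datalog` assumes the clause has at least one negative literal.

```python
from itertools import product

def clause_value(p_values, q_values):
    """Formula (1): NOT(P1) OR ... OR NOT(Pn) OR Q1 OR ... OR Qm."""
    return any(not p for p in p_values) or any(q_values)

def implication_value(p_values, q_values):
    """Formula (2): P1 AND ... AND Pn => Q1 OR ... OR Qm."""
    antecedent = all(p_values)
    consequent = any(q_values)
    # An implication is false only when the antecedent holds and the consequent fails.
    return not (antecedent and not consequent)

def equivalent(n, m):
    """Compare (1) and (2) under every assignment to n P's and m Q's."""
    for values in product([False, True], repeat=n + m):
        p_values, q_values = values[:n], values[n:]
        if clause_value(p_values, q_values) != implication_value(p_values, q_values):
            return False
    return True

def is_horn(clause):
    """A Horn clause has at most one positive literal."""
    return sum(1 for _, is_positive in clause if is_positive) <= 1

def to_datalog(clause):
    """Render a Horn clause (assumed to have a non-empty body) as in (6) or (8)."""
    if not is_horn(clause):
        raise ValueError("not a Horn clause")
    body = ", ".join(name for name, is_positive in clause if not is_positive)
    heads = [name for name, is_positive in clause if is_positive]
    return f"{heads[0]} :- {body}." if heads else f"{body}."

all_equivalent = all(equivalent(n, m) for n in range(1, 4) for m in range(0, 4))
rule_text = to_datalog([("P1", False), ("P2", False), ("Q", True)])
constraint_text = to_datalog([("P1", False), ("P2", False)])
print(all_equivalent, rule_text, constraint_text)
```

A clause with two or more positive literals, such as NOT(P1) OR Q1 OR Q2, is rejected by `to_datalog`, which mirrors the restriction that Datalog rules are Horn clauses.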

In general, a query in Datalog consists of two components:

A Datalog program, which is a finite set of rules

A literal P(X1, X2, ..., Xn), where each Xi is a variable or a constant

A Prolog or Datalog system has an internal inference engine that can be used to process and compute the results of such queries. Prolog inference engines typically return one result to the query (that is, one set of values for the variables in the query) at a time and must be prompted to return additional results. In contrast, Datalog returns results set-at-a-time.
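As a rough illustration of set-at-a-time evaluation, the Python sketch below answers a query made of a program (a finite set of rules) and a query literal by naive bottom-up evaluation: rules are applied repeatedly until no new fact can be inferred, and then all matching facts are returned together. This is one simple strategy chosen for the example, not a description of how any particular Datalog system is implemented. The `parent` and `ancestor` predicates, the data, and all function names are hypothetical; terms starting with an uppercase letter are treated as variables, and every head variable is assumed to appear in the rule body.

```python
def is_variable(term):
    """Assumed convention: variables start with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, binding):
    """Extend binding so that atom equals the ground fact, or return None."""
    (pred, args), (fact_pred, fact_args) = atom, fact
    if pred != fact_pred or len(args) != len(fact_args):
        return None
    extended = dict(binding)
    for term, value in zip(args, fact_args):
        if is_variable(term):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def satisfy(body, facts, binding):
    """Yield every binding that makes all body atoms true in facts."""
    if not body:
        yield binding
        return
    for fact in facts:
        extended = match(body[0], fact, binding)
        if extended is not None:
            yield from satisfy(body[1:], facts, extended)

def substitute(atom, binding):
    """Replace the variables of atom by their bound values."""
    pred, args = atom
    return (pred, tuple(binding.get(term, term) for term in args))

def evaluate(rules, facts):
    """Naive bottom-up evaluation: apply all rules until nothing new appears."""
    known = set(facts)
    while True:
        derived = {substitute(head, b)
                   for head, body in rules
                   for b in satisfy(body, known, {})}
        if derived <= known:
            return known
        known |= derived

def answer(rules, facts, query):
    """Return the complete set of argument tuples that match the query literal."""
    return {fact[1] for fact in evaluate(rules, facts)
            if match(query, fact, {}) is not None}

facts = {("parent", ("ann", "bob")), ("parent", ("bob", "cal"))}
rules = [
    # ancestor(X, Y) :- parent(X, Y).
    (("ancestor", ("X", "Y")), [("parent", ("X", "Y"))]),
    # ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
    (("ancestor", ("X", "Y")), [("parent", ("X", "Z")), ("ancestor", ("Z", "Y"))]),
]
answers = answer(rules, facts, ("ancestor", ("ann", "W")))
print(sorted(answers))
```

Here the query literal ancestor(ann, W) yields both bindings for W in a single result set, whereas a Prolog-style engine would typically present them one at a time.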