
26.5.8 Evaluation of Nonrecursive Datalog Queries

In order to use Datalog as a deductive database system, it is appropriate to define an inference mechanism based on relational database query processing concepts. The inherent strategy involves a bottom-up evaluation, starting with base relations; the order of operations is kept flexible and subject to query optimization. In this section we discuss an inference mechanism based on relational operations that can be applied to nonrecursive Datalog queries. We use the fact and rule base shown in Figures 26.14 and 26.15 to illustrate our discussion.

If a query involves only fact-defined predicates, the inference becomes one of searching among the facts for the query result. For example, a query such as DEPARTMENT(X, Research)? is a selection of all employee names X who work for the Research department. In relational algebra, it is the query π_{$1}(σ_{$2 = "Research"}(DEPARTMENT)), which can be answered by searching through the fact-defined predicate DEPARTMENT(X, Y). The query involves relational SELECT and PROJECT operations on a base relation, and it can be handled by standard database query processing and optimization techniques.
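As a minimal sketch of this lookup, the Python fragment below stores a fact-defined predicate as a set of tuples and answers DEPARTMENT(X, Research)? with a selection followed by a projection. The tuples are illustrative placeholders, not the contents of Figure 26.14, and the helper names select and project are hypothetical.

```python
# Fact-defined predicate stored as a set of tuples (illustrative data only).
DEPARTMENT = {
    ("John", "Research"),
    ("Franklin", "Research"),
    ("Alicia", "Administration"),
}

def select(relation, position, value):
    """SELECT: keep tuples whose 1-based column `position` (like $2) equals `value`."""
    return {t for t in relation if t[position - 1] == value}

def project(relation, *positions):
    """PROJECT: keep only the listed 1-based columns; the set removes duplicates."""
    return {tuple(t[p - 1] for p in positions) for t in relation}

# DEPARTMENT(X, Research)? is PROJECT $1 over SELECT $2 = "Research" over DEPARTMENT.
answer = project(select(DEPARTMENT, 2, "Research"), 1)
print(sorted(answer))
```

A real system would hand the equivalent expression to the DBMS optimizer instead of scanning an in-memory set; the sketch only shows which relational operations the query maps to.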


When a query involves rule-defined predicates, the inference mechanism must compute the result based on the rule definitions. If a query is nonrecursive and involves a predicate p that appears as the head of a rule p :– p_1, p_2, ..., p_n, the strategy is first to compute the relations corresponding to p_1, p_2, ..., p_n and then to compute the relation corresponding to p. It is useful to keep track of the dependency among the predicates of a deductive database in a predicate dependency graph. Figure 26.17 shows the graph for the fact and rule predicates shown in Figures 26.14 and 26.15. The dependency graph contains a node for each predicate. Whenever a predicate A is specified in the body (RHS) of a rule, and the head (LHS) of that rule is the predicate B, we say that B depends on A, and we draw a directed edge from A to B. This indicates that in order to compute the facts for the predicate B (the rule head), we must first compute the facts for all the predicates A in the rule body. If the dependency graph has no cycles, we call the rule set nonrecursive. If there is at least one cycle, we call the rule set recursive. In Figure 26.17, there is one recursively defined predicate, namely SUPERIOR, which has a recursive edge pointing back to itself. Additionally, because the predicate SUBORDINATE depends on SUPERIOR, it also requires recursion in computing its result.
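The construction of the dependency graph and the cycle test can be sketched in a few lines of Python. The rule list below records only which predicates appear in each rule body. It is an assumption chosen to match the dependencies described in the text, not a transcription of Figures 26.14 and 26.15, and the function names are hypothetical.

```python
# Rules as (head, [body predicates]); only the dependencies matter here.
# These rule bodies are assumptions consistent with the dependencies in the text.
RULES = [
    ("SUPERIOR", ["SUPERVISE"]),
    ("SUPERIOR", ["SUPERVISE", "SUPERIOR"]),  # recursive rule: head appears in body
    ("SUBORDINATE", ["SUPERIOR"]),
    ("SUPERVISOR", ["EMPLOYEE", "SUPERVISE"]),
    ("OVER_40K_EMP", ["EMPLOYEE", "SALARY"]),
    ("UNDER_40K_SUPERVISOR", ["SUPERVISOR", "OVER_40K_EMP"]),
]

def dependency_graph(rules):
    """Map each head B to the set of body predicates A it depends on (edge A -> B)."""
    deps = {}
    for head, body in rules:
        deps.setdefault(head, set()).update(body)
        for pred in body:
            deps.setdefault(pred, set())  # fact-defined predicates keep an empty set
    return deps

def transitive_deps(deps, pred):
    """All predicates reachable by following dependencies backward from `pred`."""
    seen, stack = set(), list(deps[pred])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(deps[p])
    return seen

deps = dependency_graph(RULES)
# A predicate lies on a cycle exactly when it depends on itself, directly or indirectly.
on_cycle = {p for p in deps if p in transitive_deps(deps, p)}
# A predicate needs recursion if it is on a cycle or depends on one that is.
needs_recursion = {p for p in deps if on_cycle & (transitive_deps(deps, p) | {p})}

print(sorted(on_cycle))
print(sorted(needs_recursion))
```

Under these assumed rules, the rule set is recursive because on_cycle is nonempty, and any query that avoids the predicates in needs_recursion can be evaluated without recursion.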

A query that includes only nonrecursive predicates is called a nonrecursive query. In this section we discuss only inference mechanisms for nonrecursive queries. In Figure 26.17, any query that does not involve the predicates SUBORDINATE or SUPERIOR is nonrecursive. In the predicate dependency graph, the nodes corresponding to fact-defined predicates do not have any incoming edges, since all fact-defined predicates have their facts stored in a database relation. The contents of a fact-defined predicate can be computed by directly retrieving the tuples in the corresponding database relation.

[Figure 26.17: Predicate dependency graph for Figures 26.14 and 26.15. Node labels recovered from the figure: UNDER_40K_SUPERVISOR, SUBORDINATE, MAIN_PRODUCT_EMP, OVER_40K_EMP, SUPERIOR, WORKS_ON, EMPLOYEE, SALARY, SUPERVISE.]


The main function of an inference mechanism is to compute the facts that correspond to query predicates. This can be accomplished by generating a relational expression involving relational operators such as SELECT, PROJECT, JOIN, UNION, and SET DIFFERENCE (with appropriate provision for dealing with safety issues) that, when executed, provides the query result. The query can then be executed by utilizing the internal query processing and optimization operations of a relational database management system. Whenever the inference mechanism needs to compute the fact set corresponding to a nonrecursive rule-defined predicate p, it first locates all the rules that have p as their head. The idea is to compute the fact set for each such rule and then to apply the UNION operation to the results, since UNION corresponds to a logical OR operation. The dependency graph indicates all predicates q on which each p depends, and since we assume that the predicate is nonrecursive, we can always determine a partial order among such predicates q. Before computing the fact set for p, we first compute the fact sets for all predicates q on which p depends, based on their partial order. For example, if a query involves the predicate UNDER_40K_SUPERVISOR, we must first compute both SUPERVISOR and OVER_40K_EMP. Since the latter two depend only on the fact-defined predicates EMPLOYEE, SALARY, and SUPERVISE, they can be computed directly from the stored database relations.
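The evaluation strategy just described can be sketched in Python as follows. The rule definitions in the comments, the 40000 threshold, and all tuples are assumptions made for illustration, since the contents of Figures 26.14 and 26.15 are not reproduced here. The sketch uses the standard-library graphlib module to obtain an order consistent with the dependency graph, and expresses each rule body as set operations that stand in for JOIN, SELECT, PROJECT, and SET DIFFERENCE.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Stored relations for the fact-defined predicates (illustrative tuples only).
facts = {
    "EMPLOYEE": {("John",), ("Franklin",), ("James",)},
    "SALARY": {("John", 30000), ("Franklin", 38000), ("James", 55000)},
    "SUPERVISE": {("Franklin", "John"), ("James", "Franklin")},
}

# Assumed rules, one function per rule, each reading already-computed relations:
#   SUPERVISOR(X)           :- EMPLOYEE(X), SUPERVISE(X, Y)
#   OVER_40K_EMP(X)         :- EMPLOYEE(X), SALARY(X, Y), Y >= 40000
#   UNDER_40K_SUPERVISOR(X) :- SUPERVISOR(X), NOT OVER_40K_EMP(X)
rules = {
    "SUPERVISOR": [
        # JOIN EMPLOYEE with SUPERVISE on the first column, then PROJECT it.
        lambda r: {(x,) for (x,) in r["EMPLOYEE"]
                   for (s, _) in r["SUPERVISE"] if s == x},
    ],
    "OVER_40K_EMP": [
        # JOIN EMPLOYEE with SALARY, SELECT salary >= 40000, PROJECT the name.
        lambda r: {(x,) for (x,) in r["EMPLOYEE"]
                   for (e, pay) in r["SALARY"] if e == x and pay >= 40000},
    ],
    "UNDER_40K_SUPERVISOR": [
        # SET DIFFERENCE implements the negated body literal.
        lambda r: r["SUPERVISOR"] - r["OVER_40K_EMP"],
    ],
}

# Each rule-defined predicate mapped to the predicates it depends on.
depends_on = {
    "SUPERVISOR": {"EMPLOYEE", "SUPERVISE"},
    "OVER_40K_EMP": {"EMPLOYEE", "SALARY"},
    "UNDER_40K_SUPERVISOR": {"SUPERVISOR", "OVER_40K_EMP"},
}

def evaluate(query_pred):
    relations = dict(facts)
    # static_order() yields each predicate after everything it depends on;
    # it raises graphlib.CycleError if the rule set is recursive.
    for pred in TopologicalSorter(depends_on).static_order():
        if pred in rules:
            # UNION the results of all rules that have `pred` as their head.
            relations[pred] = set().union(*(rule(relations) for rule in rules[pred]))
    return relations[query_pred]

print(sorted(evaluate("UNDER_40K_SUPERVISOR")))
```

For brevity the sketch computes every rule-defined predicate, not only those the query depends on. A fuller implementation would restrict evaluation to the query's dependencies and pass the resulting relational expression to the DBMS optimizer.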

This concludes our introduction to deductive databases. Additional material may be found at the book’s Website, where the complete Chapter 25 from the third edition is available. This includes a discussion on algorithms for recursive query processing. We have included an extensive bibliography of work in deductive databases, recursive query processing, magic sets, the combination of relational databases with deductive rules, and the GLUE-NAIL! system at the end of this chapter.