Translation of Global Queries to Fragment Queries

  • An access operation issued by an application can be expressed as a query which references global relations.

  • The DDBMS has to transform this query into simpler queries which refer only to fragments.

  • There are several ways to transform a query over global relations into queries over fragments.

  Equivalence Transformations for Queries

  • A relational query can be expressed using different languages, such as relational algebra and SQL.

  • Any of these can be used for expressing the semantics of the query.

  • We can interpret an expression of relational algebra not only as the specification of the semantics of a query, but also as the specification of a sequence of operations.

  • Two expressions with the same semantics can describe two different sequences of operations. For example,

      PJ_{NAME,DEPTNUM} SL_{DEPTNUM=15} EMP

    and

      SL_{DEPTNUM=15} PJ_{NAME,DEPTNUM} EMP

    are equivalent expressions but define two different sequences of operations.
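The equivalence of the two expressions can be checked on a small instance. The sketch below is illustrative only: the helpers `SL` and `PJ` are hypothetical Python stand-ins for the algebra operators, relations are modelled as lists of dictionaries, and the EMP tuples are made-up sample data.

```python
# Made-up sample tuples for EMP (not taken from the source).
EMP = [
    {"EMPNUM": 1, "NAME": "Ann", "SAL": 30000, "DEPTNUM": 15},
    {"EMPNUM": 2, "NAME": "Bob", "SAL": 40000, "DEPTNUM": 20},
    {"EMPNUM": 3, "NAME": "Cy", "SAL": 35000, "DEPTNUM": 15},
]

def SL(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def PJ(attrs, rel):
    """Projection: keep only attrs and drop duplicate tuples."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

in_dept_15 = lambda t: t["DEPTNUM"] == 15
attrs = ["NAME", "DEPTNUM"]

# Two different sequences of operations with the same semantics.
select_then_project = PJ(attrs, SL(in_dept_15, EMP))
project_then_select = SL(in_dept_15, PJ(attrs, EMP))

print(select_then_project == project_then_select)
```

The first sequence projects two tuples, the second projects three and then discards one; the results coincide, but the work done differs.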

  Operator Tree of a Query

  Operator trees are introduced to have a more practical representation of queries, in which expression manipulation is easier to follow.

  Q1: PJ_{SNUM} SL_{AREA=“NORTH”} (SUPPLY JN_{DEPTNUM=DEPTNUM} DEPT)

  Q1 requires the supplier number of suppliers that have issued a supply order in the North area of our company.

      PJ_{SNUM}
        SL_{AREA=“NORTH”}
          JN_{DEPTNUM=DEPTNUM}
            SUPPLY
            DEPT

  Fig: An operator tree for query Q1.

  • The leaves of the tree are global relations.

  • Each node represents a unary or binary operation.

  • A tree defines a partial order in which operations must be applied in order to produce the result of the query.

  • In this case, the join is applied first, followed by a selection and a projection.

  • The selection operation applies to the global relation DEPT.

  • Thus, a different ordering of operations could be selection, join, projection.

  • This inversion in the order of nodes of an operator tree corresponds to an equivalence transformation.

  The operator tree of an expression of relational algebra can be regarded as the parse tree of the expression itself, assuming the following grammar:

      R → identifier
      R → (R)
      R → un_op R
      R → R bin_op R
      un_op → SL_F | PJ_A
      bin_op → CP | UN | DF | JN_F | NJN | SJ_F | NSJ
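As a rough illustration of treating an operator tree as a parse tree, the sketch below encodes Q1 as nested tuples and lists one order of operations compatible with the tree. The tuple encoding and the function name `evaluation_order` are assumptions made for this example only.

```python
# Operator tree as nested tuples: (operator, subscript, *children).
# Leaves are the names of global relations. This encoding is hypothetical.
Q1 = ("PJ", "SNUM",
      ("SL", 'AREA="NORTH"',
       ("JN", "DEPTNUM=DEPTNUM", "SUPPLY", "DEPT")))

def evaluation_order(node):
    """Post-order walk: every operand is listed before the operation using it."""
    if isinstance(node, str):          # a leaf, i.e. a global relation
        return [node]
    op, subscript, *children = node
    order = []
    for child in children:
        order.extend(evaluation_order(child))
    order.append(f"{op}[{subscript}]")
    return order

order = evaluation_order(Q1)
print(order)
```

For Q1 the walk yields the join first, followed by the selection and the projection, which matches the order described for the tree.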

  Equivalence Transformations for the Relational Algebra

  • Two relations are equivalent when their tuples represent the same mapping from attribute names to values, even if the order of attributes is different.

  • Let E1 and E2 be two expressions of relational algebra. The two expressions are equivalent, written E1 ↔ E2, if substituting the same relations for identical names in the two expressions yields equivalent results.

  • Equivalence transformations can be given systematically for small expressions, i.e., expressions of two or three operand relations.

  • These transformations are classified into categories according to the type of the operators involved. Let U and B denote unary and binary algebraic operations, respectively. We have:

  • Commutativity of unary operations:

      U1 U2 R ↔ U2 U1 R

  • Commutativity of operands of binary operations:

      R B S ↔ S B R

  • Associativity of binary operations:

      R B (S B T) ↔ (R B S) B T

  • Idempotence of unary operations:

      U R ↔ U1 U2 R

  • Distributivity of unary operations with respect to binary operations:

      U(R B S) → U(R) B U(S)

  • Factorization of unary operations (this transformation is the inverse of distributivity):

      U(R) B U(S) → U(R B S)

  Table 1: Commutativity of unary operations, U1 U2 R ↔ U2 U1 R

      U1 \ U2      SL_{F2}    PJ_{A2}
      SL_{F1}      Y          Y
      PJ_{A1}      SNC1       SNC2

      SNC1: Attr(F2) ⊆ A1
      SNC2: A1 ≡ A2

  Attr(F) - the attributes which appear in a formula F.

  Attr(R) - the set of attributes of a relation R.

  The tables contain in each position a validity indicator:

      “Y” - the property can always be applied.
      “N” - it cannot be applied.
      “SNC” - specifying a condition which is necessary for applying the property.

  The validity indicator “Y” in the 1st row and 1st column means that the following transformation is correct:

      SL_{F1} SL_{F2} R ↔ SL_{F2} SL_{F1} R

  where F1 and F2 are two generic selection specifications.

  SNC1 in the 2nd row and 1st column means that the transformation

      PJ_{A1} SL_{F2} R ↔ SL_{F2} PJ_{A1} R

  is correct only if the specifications A1 and F2 satisfy the condition SNC1.

  Table 2: Commutativity of operands and associativity of binary operations

                                      UN    DF    CP    JN_F    SJ_F
      R * S ↔ S * R                   Y     N     Y     Y       N
      (R * S) * T ↔ R * (S * T)       Y     N     Y     SNC1    N

      SNC1, for (R JN_{F1} S) JN_{F2} T ↔ R JN_{F1} (S JN_{F2} T): Attr(F2) ⊆ Attr(S) ∪ Attr(T)

  Table 3: Idempotence of unary operations

      PJ_A (R) ↔ PJ_{A1} PJ_{A2} (R)      SNC: A ≡ A1, A ⊆ A2
      SL_F (R) ↔ SL_{F1} SL_{F2} (R)      SNC: F = F1 ∧ F2

  Table 4: Distributivity of unary operations with respect to binary operations

  SL_F (R * S) → SL_{FR} (R) * SL_{FS} (S)

      *          Validity    FR         FS
      UN         Y           FR = F     FS = F
      DF         Y           FR = F     FS = F
      CP         SNC1        FR = F1    FS = F2
      JN_{F3}    SNC1        FR = F1    FS = F2
      SJ_{F3}    Y           FR = F     FS = true

  PJ_A (R * S) → PJ_{AR} (R) * PJ_{AS} (S)

      *          Validity    AR                  AS
      UN         Y           AR = A              AS = A
      DF         N
      CP         Y           AR = A − Attr(S)    AS = A − Attr(R)
      JN_{F3}    SNC2        AR = A − Attr(S)    AS = A − Attr(R)
      SJ_{F3}    SNC2        AR = A − Attr(S)    AS = Attr(S)

  Table 5: Factorization of unary operations from binary operations

  SL_{FR} (R) * SL_{FS} (S) → SL_F (R * S)

      *          Validity    F
      UN         SNC1        F = FR = FS
      DF         SNC2        F = FR
      CP         Y           F = FR ∧ FS
      JN_{F1}    Y           F = FR ∧ FS
      SJ_{F1}    SNC4        F = FR

  PJ_{AR} (R) * PJ_{AS} (S) → PJ_A (R * S)

      *          Validity    A
      UN         SNC3        A = AR = AS
      DF         N
      CP         Y           A = AR ∪ AS
      JN_{F1}    Y           A = AR ∪ AS
      SJ_{F1}    Y           A = AR

      SNC1: FR = FS
      SNC2: FR ⇒ FS
      SNC3: Attr(R) = Attr(S)
      SNC4: FS = true

  • In addition to the transformations defined above, the following commutativity rule between the binary operations join and union is correct and extremely useful:

      (R' UN R'') JN_F (S' UN S'') → UN((R' JN_F S'), (R' JN_F S''), (R'' JN_F S'), (R'' JN_F S''))

  The property is shown here with two binary unions in the LHS, which gives rise to a union with four operands in the RHS.
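The rule can be checked on a toy instance. In the sketch below the helpers `UN`, `JN` and `as_set`, the attribute names `RK`, `SK`, `A`, `B`, and all tuples are made up for illustration; they are not part of the source.

```python
def UN(*rels):
    """Union of any number of relations, without duplicates."""
    out = []
    for rel in rels:
        for t in rel:
            if t not in out:
                out.append(t)
    return out

def JN(pred, r, s):
    """Join: concatenate every pair of tuples that satisfies pred."""
    return [{**a, **b} for a in r for b in s if pred(a, b)]

def as_set(rel):
    """Order-independent view of a relation, used only for comparison."""
    return {tuple(sorted(t.items())) for t in rel}

# R = R1 UN R2 and S = S1 UN S2 (hypothetical fragments).
R1 = [{"RK": 1, "A": "a1"}]
R2 = [{"RK": 2, "A": "a2"}]
S1 = [{"SK": 1, "B": "b1"}, {"SK": 2, "B": "b2"}]
S2 = [{"SK": 2, "B": "b3"}]

F = lambda a, b: a["RK"] == b["SK"]

lhs = JN(F, UN(R1, R2), UN(S1, S2))
rhs = UN(JN(F, R1, S1), JN(F, R1, S2), JN(F, R2, S1), JN(F, R2, S2))

print(as_set(lhs) == as_set(rhs))
```

Note that one of the four partial joins on the right-hand side, R1 with S2, is empty here; recognizing such empty joins in advance is what makes the rule useful.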

  • In nondistributed databases, general criteria have been given for applying equivalence transformations for the purpose of simplifying the execution of queries:

      Criterion 1. Use idempotence of selection and projection to generate appropriate selections and projections for each operand relation.

      Criterion 2. Push selections and projections down the tree as far as possible.

  • These criteria descend from the consideration that binary operations are the most expensive operations.

  • Therefore, it is convenient to reduce the sizes of operands of binary operations before performing them.

  • In DDBs, these criteria are even more important: binary operations require the comparison of operands that could be allocated at different sites.

  • Transmission of data is one of the major components of the costs and delays associated with query execution.

  • Thus, reducing the size of operands of binary operations is a major concern.

  The figure below shows a modified operator tree for query Q1, in which the following transformations have been applied:

  1. The selection is distributed with respect to the join; thus, the selection is applied directly to the DEPT relation.

  2. Two new projection operations are generated and are distributed with respect to the join.

      PJ_{SNUM}
        JN_{DEPTNUM=DEPTNUM}
          PJ_{SNUM,DEPTNUM}
            SUPPLY
          PJ_{DEPTNUM}
            SL_{AREA=“NORTH”}
              DEPT

  Fig: A modified operator tree for query Q1.
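The effect of the two transformations can be sketched on a small instance. Everything below is an assumption made for illustration: the sample tuples, the reduced attribute sets of SUPPLY and DEPT, and the helper functions `SL`, `PJ` and `JN`.

```python
# Made-up sample data; only the attributes used by Q1 are included.
SUPPLY = [
    {"SNUM": "S1", "PNUM": "P1", "DEPTNUM": 3},
    {"SNUM": "S2", "PNUM": "P1", "DEPTNUM": 12},
    {"SNUM": "S1", "PNUM": "P2", "DEPTNUM": 5},
    {"SNUM": "S3", "PNUM": "P3", "DEPTNUM": 25},
]
DEPT = [
    {"DEPTNUM": 3, "AREA": "NORTH"},
    {"DEPTNUM": 5, "AREA": "NORTH"},
    {"DEPTNUM": 12, "AREA": "SOUTH"},
    {"DEPTNUM": 25, "AREA": "SOUTH"},
]

def SL(pred, rel):
    return [t for t in rel if pred(t)]

def PJ(attrs, rel):
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

def JN(attr, r, s):
    """Equi-join on an attribute that has the same name in both operands."""
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]

north = lambda t: t["AREA"] == "NORTH"

# Original tree: join first, then selection and projection.
original = PJ(["SNUM"], SL(north, JN("DEPTNUM", SUPPLY, DEPT)))

# Modified tree: selection and projections pushed below the join.
left = PJ(["SNUM", "DEPTNUM"], SUPPLY)
right = PJ(["DEPTNUM"], SL(north, DEPT))
modified = PJ(["SNUM"], JN("DEPTNUM", left, right))

print(original == modified, len(DEPT), len(right))
```

Both trees return the same supplier numbers, but in the modified tree the right operand of the join has two narrow tuples instead of the four full DEPT tuples.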

  Operator Graph and Determination of Common Subexpressions

  An important issue in applying transformations to a query expression is to discover its common subexpressions, i.e., subexpressions which appear more than once in the query. A method to recognize them consists in transforming the corresponding operator tree into an operator graph by first merging identical leaves of the tree and then merging other intermediate nodes of the tree corresponding to the same operations and having the same operands.

Q2: Give the names of employees who work in a department whose manager has number 373 but who do not earn more than $35000.

      PJ_{EMP.NAME} ((EMP JN_{DEPTNUM=DEPTNUM} SL_{MGRNUM=373} DEPT) DF (SL_{SAL>35000} EMP JN_{DEPTNUM=DEPTNUM} SL_{MGRNUM=373} DEPT))

  The corresponding operator tree is shown in fig. (a), followed by progressive simplifications of the query in (b), (c) and (d).

  (a)

      PJ_{EMP.NAME}
        DF
          JN_{DEPTNUM=DEPTNUM}
            EMP
            SL_{MGRNUM=373}
              DEPT
          JN_{DEPTNUM=DEPTNUM}
            SL_{SAL>35000}
              EMP
            SL_{MGRNUM=373}
              DEPT

  • We start by merging leaves corresponding to EMP and DEPT relations.
  • We factorize the selection on SAL with respect to the join (we move the selection upward in doing this).
  • Now, we can merge the nodes corresponding to the selection on MGRNUM and finally the node corresponding to the join.

  We come to the operator tree of fig. (b).

  (b)

      PJ_{EMP.NAME}
        DF
          J
          SL_{SAL>35000}
            J

      where J is the single shared node JN_{DEPTNUM=DEPTNUM} with operands EMP and SL_{MGRNUM=373} DEPT.

  We recognize the following subexpression:

      EMP JN_{DEPTNUM=DEPTNUM} SL_{MGRNUM=373} DEPT

  Once common subexpressions are identified, we can use the following properties to further simplify an operator tree:

      R NJN R ↔ R
      R UN R ↔ R
      R DF R ↔ Ø
      R NJN SL_F R ↔ SL_F R
      R UN SL_F R ↔ R
      R DF SL_F R ↔ SL_{NOT F} R
      (SL_{F1} R) NJN (SL_{F2} R) ↔ SL_{F1 AND F2} R
      (SL_{F1} R) UN (SL_{F2} R) ↔ SL_{F1 OR F2} R
      (SL_{F1} R) DF (SL_{F2} R) ↔ SL_{F1 AND NOT F2} R

  The 6th property in the list, i.e. R DF SL_F R ↔ SL_{NOT F} R, is applied, reducing the operator tree to that in fig. (c).

  (c)

      PJ_{EMP.NAME}
        SL_{SAL≤35000}
          JN_{DEPTNUM=DEPTNUM}
            EMP
            SL_{MGRNUM=373}
              DEPT

  (d)

      PJ_{EMP.NAME}
        JN_{DEPTNUM=DEPTNUM}
          PJ_{NAME,DEPTNUM}
            SL_{SAL≤35000}
              EMP
          PJ_{DEPTNUM}
            SL_{MGRNUM=373}
              DEPT
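The property R DF SL_F R ↔ SL_{NOT F} R, used to obtain tree (c), can be checked on a toy relation. The tuples and the helpers `SL` and `DF` below are hypothetical and serve only as an illustration.

```python
# Made-up employees; SAL > 35000 plays the role of the formula F.
EMP = [
    {"NAME": "Ann", "SAL": 30000},
    {"NAME": "Bob", "SAL": 40000},
    {"NAME": "Cy", "SAL": 35000},
]

def SL(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def DF(r, s):
    """Difference: tuples of r that do not appear in s."""
    return [t for t in r if t not in s]

F = lambda t: t["SAL"] > 35000

lhs = DF(EMP, SL(F, EMP))            # R DF SL_F R
rhs = SL(lambda t: not F(t), EMP)    # SL_{NOT F} R

print(lhs == rhs)
```

The right-hand side needs a single scan of the relation, whereas the left-hand side evaluates a selection and then a difference.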

  

TRANSFORMING GLOBAL QUERIES INTO FRAGMENT QUERIES

  CANONICAL EXPRESSION OF A FRAGMENT QUERY

  • Replace each global relation with the algebraic expression giving the reconstruction of global relations from fragments.

  Replace leaves of operator

  Algebra of qualified relations

  • A qualified relation is a relation extended by a qualification.

  • We denote it as a pair [R : q_R], where R is a relation called the body of the qualified relation and q_R is a predicate called the qualification of the relation.

  • Horizontal fragments are typical examples.

  • The algebra of qualified relations is an extension of relational algebra which uses qualified relations as operands.

  • This algebra requires manipulating qualifications as well as relations.

  • Two qualified relations are equivalent if their bodies are equivalent relations and their qualifications represent the same truth function.

  

Rules defining the result of applying operations of relational algebra to qualified relations

      1. SL_F [R : q_R] ⇒ [SL_F R : F AND q_R]
      2. PJ_A [R : q_R] ⇒ [PJ_A R : q_R]
      3. [R : q_R] CP [S : q_S] ⇒ [R CP S : q_R AND q_S]
      4. [R : q_R] DF [S : q_S] ⇒ [R DF S : q_R]
      5. [R : q_R] UN [S : q_S] ⇒ [R UN S : q_R OR q_S]
      6. [R : q_R] JN_F [S : q_S] ⇒ [R JN_F S : q_R AND q_S AND F]

  We use qualifications for eliminating fragments which are not involved in the query.

  Eg: SL_{CITY=“NSP”} [SUPPLIER : CITY=“HYD”]

  This reduces to an empty relation. Here the SUPPLIER relation is qualified by CITY=“HYD”, so selecting tuples with CITY=“NSP” yields the qualification CITY=“NSP” AND CITY=“HYD”, which is contradictory, and the result is an empty relation.

  Criteria for simplifying expressions over fragmentation schema

  1. Use idempotence of selection and projection to generate appropriate selections and projections for each operand relation.

  2. Push selections and projections down in the tree as far as possible.

  3. Push selections down to the leaves of the tree, and apply them using the algebra of qualified relations; substitute the selection result with the empty relation if the qualification of the result is contradictory.

  4. Use the algebra of qualified relations to evaluate the qualification of operands of joins; substitute the subtree, including the join and its operands, with the empty relation if the qualification of the result is contradictory.

  

Simplifications of horizontally fragmented relations

  Consider query Q: SL_{DEPTNUM=1} DEPT, where DEPT is a horizontally fragmented relation.

  The canonical form of the query is

      SL_{DEPTNUM=1}
        UN
          [DEPT1 : DEPTNUM<=10]
          [DEPT2 : 10<DEPTNUM<=20]
          [DEPT3 : DEPTNUM>20]

  Pushing the selection down to the fragments (criterion 3) gives a contradictory qualification for DEPT2 and DEPT3, so the query simplifies to

      SL_{DEPTNUM=1} [DEPT1 : DEPTNUM<=10]
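Criterion 3 on this example can be sketched as a simple interval test. The representation below is an assumption made for illustration: each qualification is stored as a half-open interval (low, high] on DEPTNUM, and `relevant_fragments` is a hypothetical helper name.

```python
import math

# Qualification of each DEPT fragment as an interval low < DEPTNUM <= high.
FRAGMENTS = {
    "DEPT1": (-math.inf, 10),
    "DEPT2": (10, 20),
    "DEPT3": (20, math.inf),
}

def relevant_fragments(value, fragments=FRAGMENTS):
    """Fragments whose qualification does not contradict DEPTNUM = value.

    For SL_{DEPTNUM=value} [DEPTi : q], rule 1 gives the qualification
    (DEPTNUM = value) AND q, which is contradictory exactly when value
    falls outside the interval of q.
    """
    return [name for name, (low, high) in fragments.items()
            if low < value <= high]

print(relevant_fragments(1))
```

For DEPTNUM = 1 only DEPT1 survives, which corresponds to the simplified form of query Q.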

  Simplification of Joins between Horizontally Fragmented Relations

  Let us consider, for simplicity, the join between two fragmented relations R and S. There are two distinct possibilities of joining them:

  • The first one requires collecting all the fragments of R and S before performing the join.

  • The second one consists of performing the join between fragments and then collecting all the results into the same result relation; we refer to this second case as "distributed join."

  Neither of the above possibilities dominates the other. Very generally, we prefer the first solution if conditions on fragments are highly selective; the second solution is preferred if the join between fragments involves few pairs of fragments.

  Criterion 5. In order to distribute joins which appear in the global query, unions (representing fragment collections) must be pushed up, beyond the joins that we want to distribute.

  

Building a join graph requires, then, applying criterion 5 (for distributing the join) followed by criterion 4 (for eliminating joins between fragments that do not give any contribution to the result).

Let us show an example of a distributed join. We start from query Q4, which requires the number SNUM of all suppliers having a supply order. The algebraic expression of the query over the global schema is

      Q4: PJ_{SNUM} (SUPPLY NJN SUPPLIER)

  Fig. (b): Distributed join for query Q4.
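A join graph can be sketched as the set of fragment pairs that survive criterion 4. The qualifications below are hypothetical: each fragment is simply tagged with the supplier city it is assumed to cover, and two fragments are considered joinable only when the tags agree.

```python
from itertools import product

# Assumed fragmentation (for illustration only): SUPPLIER is split by city,
# and each SUPPLY fragment holds the orders of suppliers of one city.
SUPPLIER_FRAGMENTS = {"SUPPLIER1": "SF", "SUPPLIER2": "LA"}
SUPPLY_FRAGMENTS = {"SUPPLY1": "SF", "SUPPLY2": "LA"}

def join_graph(left, right):
    """Edges between fragments whose combined qualification is not contradictory."""
    return [(l, r)
            for (l, l_city), (r, r_city) in product(left.items(), right.items())
            if l_city == r_city]

edges = join_graph(SUPPLY_FRAGMENTS, SUPPLIER_FRAGMENTS)
print(edges)
```

Under these assumptions only two of the four possible fragment joins remain, so the distributed join involves few pairs of fragments.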

USING INFERENCE FOR FURTHER SIMPLIFICATIONS

Let us consider again the query Q1 that requires the supplier number of those suppliers having a supply order issued in the North area. Assume that the following knowledge is available to the query optimizer:

  1. The North area includes only departments 1 to 10.
  2. Orders from departments 1 to 10 are all addressed to suppliers of San Francisco.

We use the above knowledge to "infer" contradictions that allow eliminating sub-expressions.

a) From 1 above, we can write the following implications:

      AREA = "North" ⇒ NOT (10 < DEPTNUM <= 20)
      AREA = "North" ⇒ NOT (DEPTNUM > 20)

Using criterion 3, we apply the selection to fragments DEPT1, DEPT2, and DEPT3 and evaluate the qualification of the results. By virtue of the above implications, two of them are contradictory. This allows us to eliminate the sub-expressions for fragments DEPT2 and DEPT3.

  Fig. (a): Simplification of an operator tree using inference.

b) We then apply criterion 5 for distributing the join; in principle, we would need to join the subtree including DEPT1 with both subtrees including SUPPLY1 and SUPPLY2. But from 1 above, we know that:

      AREA = "North" ⇒ DEPTNUM <= 10

and from 2 above we know that:

      DEPTNUM <= 10 ⇒ NOT (SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = "LA")

By applying criterion 4, it is possible to deduce that only the subtree including SUPPLY1 needs to be joined. The final operator tree is shown in fig. (b).

  Fig. (b): Simplification of an operator tree using inference.

  

Simplification of Vertically Fragmented Relations

The simplification is to determine a proper subset of the fragments which is sufficient for answering the query, and then to eliminate all other fragments from the query expression, as well as the joins which are used in the inverse of the fragmentation schema for reconstructing the global relations.

Example: Consider query Q5, which requires names and salaries of employees. The query on the global schema is simply

      Q5: PJ_{NAME,SAL} EMP

The canonical operator tree of the expression is shown below.

  Canonical form of query Q5

      PJ_{NAME,SAL}
        JN_{EMPNUM=EMPNUM}
          UN
            [EMP1 : DEPTNUM<=10]
            [EMP2 : 10<DEPTNUM<=20]
            [EMP3 : DEPTNUM>20]
          [EMP4 : true]

  Simplified query

      PJ_{NAME,SAL} [EMP4 : true]

  DISTRIBUTED GROUPING AND AGGREGATE FUNCTION EVALUATION

  Database applications often require performing database access operations that cannot be expressed with relational algebra. Therefore, query languages for relational databases typically allow the formulation of queries that cannot be reduced to expressions of relational algebra.

  The most important of these additional features are the possibility of grouping tuples into disjoint subsets of relations and of evaluating aggregate functions over them.

  Query 6
      Select AVG(QUAN) from SUPPLY where PNUM=“P1”

  Query 7
      Select PNUM, SNUM, SUM(QUAN) from SUPPLY group by SNUM, PNUM

  Query 8
      Select PNUM, SNUM, SUM(QUAN) from SUPPLY group by SNUM, PNUM having SUM(QUAN)>300

  Extension of relational algebra

  Relational algebra is extended with the following group-by operation GB_{G,AF} R, such that:

  • G are the attributes which determine the grouping of R.

  • AF are aggregate functions to be evaluated on each group.

  • GB_{G,AF} R is a relation having a relation schema made by the attributes of G and the aggregate functions of AF.

  • Either G or AF may be unspecified.

  Query 6

      Select AVG(QUAN) from SUPPLY where PNUM=“P1”

      GB_{AVG(QUAN)} SL_{PNUM=“P1”} SUPPLY

  Query 7

      Select PNUM, SNUM, SUM(QUAN) from SUPPLY group by SNUM, PNUM

      GB_{SNUM,PNUM,SUM(QUAN)} SUPPLY

  Query 8

      Select PNUM, SNUM, SUM(QUAN) from SUPPLY group by SNUM, PNUM having SUM(QUAN)>300

      SL_{SUM(QUAN)>300} GB_{SNUM,PNUM,SUM(QUAN)} SUPPLY

  Properties of Group-by operation

      GB_{G,AF} (R1 UN R2) → (GB_{G,AF} R1) UN (GB_{G,AF} R2)

      SNC: For every i, j either (Gi ⊆ Rj) or (Gi ∩ Rj = Ø)

  where Gi denotes the i-th group and Rj the j-th operand of the union.
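The condition can be illustrated with two made-up SUPPLY fragments split by supplier. In this sketch `GB` is a hypothetical helper that groups on the given attributes and evaluates SUM(QUAN); the data is invented.

```python
from collections import defaultdict

# Hypothetical fragments: SUPPLY1 holds supplier S1, SUPPLY2 holds supplier S2.
SUPPLY1 = [
    {"SNUM": "S1", "PNUM": "P1", "QUAN": 100},
    {"SNUM": "S1", "PNUM": "P1", "QUAN": 50},
    {"SNUM": "S1", "PNUM": "P2", "QUAN": 30},
]
SUPPLY2 = [
    {"SNUM": "S2", "PNUM": "P1", "QUAN": 70},
    {"SNUM": "S2", "PNUM": "P2", "QUAN": 20},
]

def GB(group_attrs, rel):
    """Group rel on group_attrs and compute SUM(QUAN) for each group."""
    totals = defaultdict(int)
    for t in rel:
        totals[tuple(t[a] for a in group_attrs)] += t["QUAN"]
    return sorted(totals.items())

# Grouping on SNUM, PNUM: every group lies entirely inside one fragment,
# so the group-by distributes over the union.
g = ["SNUM", "PNUM"]
ok_global = GB(g, SUPPLY1 + SUPPLY2)
ok_local = sorted(GB(g, SUPPLY1) + GB(g, SUPPLY2))

# Grouping on PNUM only: groups span both fragments, the condition fails.
bad_global = GB(["PNUM"], SUPPLY1 + SUPPLY2)
bad_local = sorted(GB(["PNUM"], SUPPLY1) + GB(["PNUM"], SUPPLY2))

print(ok_global == ok_local, bad_global == bad_local)
```

When the groups span fragments, the local results are only partial sums and have to be combined further, which is the situation addressed by aggregate functions with a distributed computation.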

  Criterion 6

  • In order to distribute grouping and aggregate function evaluations appearing in the global query, unions (representing fragment collections) must be pushed up, beyond the corresponding group-by operation.

  Canonical form of query 8

      SL_{SUM(QUAN)>300}
        GB_{SNUM,PNUM,SUM(QUAN)}
          UN
            SUPPLY1
            SUPPLY2

  Distributed version of query 8

      UN
        SL_{SUM(QUAN)>300}
          GB_{SNUM,PNUM,SUM(QUAN)}
            SUPPLY1
        SL_{SUM(QUAN)>300}
          GB_{SNUM,PNUM,SUM(QUAN)}
            SUPPLY2

  • We say that the aggregate function F has a distributed computation if, for any multiset S and any decomposition of S into multisets S1, S2, ..., Sn, it is possible to determine a set of aggregate functions F1, ..., Fm and an expression E(F1, ..., Fm) such that

      F(S) = E(F1(S1), ..., F1(Sn), F2(S1), ..., F2(Sn), ..., Fm(S1), ..., Fm(Sn))

  • An aggregate function for which it is possible to find the functions Fi and the expression E(Fi) is the function average:

      AVG(S) = SUM(SUM(S1), SUM(S2), ..., SUM(Sn)) / SUM(COUNT(S1), ..., COUNT(Sn))

  • Similarly we have

      MIN(S) = MIN(MIN(S1), MIN(S2), ..., MIN(Sn))
      MAX(S) = MAX(MAX(S1), MAX(S2), ..., MAX(Sn))
      COUNT(S) = SUM(COUNT(S1), COUNT(S2), ..., COUNT(Sn))
      SUM(S) = SUM(SUM(S1), SUM(S2), ..., SUM(Sn))

  Ex: Consider Query 6

      GB_{AVG(QUAN)} SL_{PNUM=“P1”} SUPPLY

  Its canonical form is

      GB_{AVG(QUAN)}
        SL_{PNUM=“P1”}
          UN
            SUPPLY1
            SUPPLY2

  • We generate two independent subqueries, operating on the two fragments SUPPLY1 and SUPPLY2:

      GB_{SUM(QUAN),COUNT} SL_{PNUM=“P1”} SUPPLY1
      GB_{SUM(QUAN),COUNT} SL_{PNUM=“P1”} SUPPLY2

  Distributed version of Query 6

      E: AVG(QUAN) = SUM(S1,S2) / SUM(C1,C2)
        S1,C1: GB_{SUM(QUAN),COUNT}
          SL_{PNUM=“P1”}
            SUPPLY1
        S2,C2: GB_{SUM(QUAN),COUNT}
          SL_{PNUM=“P1”}
            SUPPLY2
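The distributed evaluation of Query 6 can be sketched as follows. The QUAN values and the function names `local_subquery` and `combine` are assumptions made for illustration; each list stands for the tuples with PNUM = “P1” stored in one fragment.

```python
# Made-up QUAN values of the tuples with PNUM = "P1" in each fragment.
SUPPLY1_QUAN = [100, 200]
SUPPLY2_QUAN = [300]

def local_subquery(quantities):
    """What each site returns: the pair (SUM(QUAN), COUNT)."""
    return sum(quantities), len(quantities)

def combine(partials):
    """E: AVG = SUM of the partial sums / SUM of the partial counts."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

partials = [local_subquery(SUPPLY1_QUAN), local_subquery(SUPPLY2_QUAN)]
distributed_avg = combine(partials)

# Averaging the local averages would give a different (wrong) answer.
average_of_averages = (sum(SUPPLY1_QUAN) / 2 + sum(SUPPLY2_QUAN) / 1) / 2

print(distributed_avg, average_of_averages)
```

Each site ships only two numbers, and the combining expression reproduces the global average, whereas the average of local averages does not when fragments differ in size.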

  Parametric Queries

  • Parametric queries are queries in which the formulas in the selection criteria include parameters whose values are not known when the query is compiled.

  • Ex: Consider Query 9

      Q9: SL_{DEPTNUM=$X OR DEPTNUM=$Y} DEPT

  Simplifications of Parametric Queries & Extension of Algebra

  The canonical form of query 9 is

      SL_{DEPTNUM=$X OR DEPTNUM=$Y}
        UN
          [DEPT1 : DEPTNUM<=10]
          [DEPT2 : 10<DEPTNUM<=20]
          [DEPT3 : DEPTNUM>20]

  Query tree with CUT operation

      CUT
        $X<=10 OR $Y<=10:
          SL_{DEPTNUM=$X OR DEPTNUM=$Y}
            [DEPT1 : DEPTNUM<=10]
        ($X>10 AND $X<=20) OR ($Y>10 AND $Y<=20):
          SL_{DEPTNUM=$X OR DEPTNUM=$Y}
            [DEPT2 : 10<DEPTNUM<=20]
        $X>20 OR $Y>20:
          SL_{DEPTNUM=$X OR DEPTNUM=$Y}
            [DEPT3 : DEPTNUM>20]
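The role of CUT at run time can be sketched as a function that, once the parameter values are known, keeps only the branches whose condition holds. The function name `cut` and the return format are assumptions made for this illustration.

```python
def cut(x, y):
    """Return the fragments to access for DEPTNUM = x OR DEPTNUM = y.

    Each test mirrors the condition attached to one branch of the CUT node.
    """
    selected = []
    if x <= 10 or y <= 10:
        selected.append("DEPT1")
    if (10 < x <= 20) or (10 < y <= 20):
        selected.append("DEPT2")
    if x > 20 or y > 20:
        selected.append("DEPT3")
    return selected

print(cut(5, 25))
print(cut(12, 15))
```

The query is compiled once with all three branches; at execution time only the selected branches are evaluated, so at most two fragments are accessed for any pair of parameter values.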