13.5.2.3 Aggregation Operations
Aggregation operations proceed somewhat like projections. The aggregate operations in SQL are count, sum, avg, min, and max:
• count: Consider a materialized view $v = {}_{A}\mathcal{G}_{\mathit{count}(B)}(r)$, which computes the count of the attribute B after grouping r by attribute A. When a set of tuples $i_r$ is inserted into r, for each tuple t in $i_r$ we do the following: We look for the group t.A in the materialized view. If it is not present, we add (t.A, 1) to the materialized view. If the group t.A is present, we add 1 to the count of the group.
When a set of tuples $d_r$ is deleted from r, for each tuple t in $d_r$ we do the following: We look for the group t.A in the materialized view, and subtract 1 from the count for the group. If the count becomes 0, we delete the tuple for the group t.A from the materialized view.
• sum: Consider a materialized view $v = {}_{A}\mathcal{G}_{\mathit{sum}(B)}(r)$.
When a set of tuples $i_r$ is inserted into r, for each tuple t in $i_r$ we do the following: We look for the group t.A in the materialized view. If it is not present, we add (t.A, t.B) to the materialized view; in addition, we store a count of 1 associated with (t.A, t.B), just as we did for projection. If the group t.A is present, we add the value of t.B to the aggregate value for the group, and add 1 to the count of the group.
When a set of tuples $d_r$ is deleted from r, for each tuple t in $d_r$ we do the following: We look for the group t.A in the materialized view, and subtract t.B from the aggregate value for the group. We also subtract 1 from the count for the group, and if the count becomes 0, we delete the tuple for the group t.A from the materialized view.
Without keeping the extra count value, we would not be able to distinguish a case where the sum for a group is 0 from the case where the last tuple in a group is deleted.
• avg: Consider a materialized view $v = {}_{A}\mathcal{G}_{\mathit{avg}(B)}(r)$.
Directly updating the average on an insert or delete is not possible, since it depends not only on the old average and the tuple being inserted/deleted, but also on the number of tuples in the group.
Instead, to handle the case of avg, we maintain the sum and count aggregate values as described earlier, and compute the average as the sum divided by the count.
• min, max: Consider a materialized view $v = {}_{A}\mathcal{G}_{\mathit{min}(B)}(r)$. (The case of max is exactly equivalent.) Handling insertions on r is straightforward: an inserted value replaces the stored minimum for its group only if it is smaller. Maintaining the aggregate values min and max on deletions may be more expensive. For example, if the tuple corresponding to the minimum value for a group is deleted from r, we have to look at the other tuples of r that are in the same group to find the new minimum value.
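
The maintenance rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular database system implements view maintenance: the class name GroupAggView, its dictionaries, and the r_after argument (the contents of r after the deletion, used only to rescan a group whose minimum was deleted) are all hypothetical. Tuples are modeled as (A, B) pairs held in memory.

```python
class GroupAggView:
    """Hypothetical in-memory view grouped by A, aggregating attribute B."""

    def __init__(self):
        self.count = {}    # group -> number of tuples in the group
        self.total = {}    # group -> sum of B over the group
        self.minimum = {}  # group -> min of B over the group

    def insert(self, inserted):
        """Apply a set of inserted tuples i_r, given as (A, B) pairs."""
        for a, b in inserted:
            if a not in self.count:
                # New group: start with count 1 and the tuple's own value.
                self.count[a] = 1
                self.total[a] = b
                self.minimum[a] = b
            else:
                self.count[a] += 1
                self.total[a] += b
                if b < self.minimum[a]:
                    self.minimum[a] = b

    def delete(self, deleted, r_after):
        """Apply a set of deleted tuples d_r; r_after is r after the deletion."""
        rescan = set()
        for a, b in deleted:
            self.count[a] -= 1
            self.total[a] -= b
            if self.count[a] == 0:
                # Last tuple of the group is gone: drop the group entirely.
                del self.count[a], self.total[a], self.minimum[a]
                rescan.discard(a)
            elif b == self.minimum[a]:
                # The stored minimum may no longer exist in r.
                rescan.add(a)
        for a in rescan:
            # The expensive case: look at the remaining tuples of the group.
            self.minimum[a] = min(y for x, y in r_after if x == a)

    def avg(self, a):
        """avg is derived from the maintained sum and count."""
        return self.total[a] / self.count[a]


# Usage: group "x" sums to 0 but still exists because its count is 2.
view = GroupAggView()
view.insert([("x", 5), ("x", -5), ("y", 2), ("y", 7)])
sum_x_before = view.total["x"]
count_x_before = view.count["x"]

# Deleting the minimum of group "y" forces a rescan of that group.
view.delete([("y", 2)], r_after=[("x", 5), ("x", -5), ("y", 7)])

# Deleting every tuple of group "x" removes the group from the view.
view.delete([("x", 5), ("x", -5)], r_after=[("y", 7)])
```

Only the rescan in delete touches the base relation; count, sum, and avg are maintained from the inserted and deleted tuples alone, which is why min and max are the more expensive aggregates to maintain.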