27.4.1.4 Implications of Using MVCC
Using the PostgreSQL MVCC scheme has implications in three different areas: (1) extra burden is placed on the storage system, since it needs to maintain different versions of tuples; (2) developing concurrent applications takes some extra care, since PostgreSQL MVCC can lead to subtle, but important, differences in how concurrent transactions behave, compared to systems where standard two-phase locking is used; (3) PostgreSQL performance depends on the characteristics of the workload running on it. The implications of PostgreSQL MVCC are described in more detail below.
Creating and storing multiple versions of every row can lead to excessive storage consumption. To alleviate this problem, PostgreSQL frees up space when possible by identifying and deleting versions of tuples that cannot be visible to any active or future transactions, and are therefore no longer needed. The task of freeing space is nontrivial, because indices may refer to the location of an unneeded tuple, so these references need to be deleted before reusing the space. To lessen this issue, PostgreSQL avoids indexing multiple versions of a tuple that have identical index attributes. This allows the space taken by nonindexed tuples to be freed efficiently by any transaction that finds such a tuple.
For more aggressive space reuse, PostgreSQL provides the vacuum command, which correctly updates indices for each freed tuple. PostgreSQL employs a background process to vacuum tables automatically, but the command can also be executed by the user directly. The vacuum command offers two modes of operation: Plain vacuum simply identifies tuples that are not needed, and makes their space available for reuse. This form of the command can operate in parallel with normal reading and writing of the table. Vacuum full does more extensive processing, including moving of tuples across blocks to try to compact the table to the minimum number of disk blocks. This form is much slower and requires an exclusive lock on each table while it is being processed.
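The difference between the two vacuum modes can be illustrated with a small toy model. The sketch below is not PostgreSQL code: the names `TupleVersion`, `is_dead`, `plain_vacuum`, and `vacuum_full` are hypothetical, the `xmin`/`xmax` fields stand for the creating and superseding transaction IDs, and the rule for deciding that a version is dead is deliberately simplified (it assumes every transaction older than the oldest active one has committed). Index maintenance is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    key: str
    value: int
    xmin: int                   # ID of the transaction that created this version
    xmax: Optional[int] = None  # ID of the transaction that superseded it, if any


def is_dead(v: TupleVersion, oldest_active_xid: int) -> bool:
    """Simplified rule (an assumption of this sketch): a version is dead once it
    was superseded by a transaction older than every active transaction."""
    return v.xmax is not None and v.xmax < oldest_active_xid


def plain_vacuum(block: List[Optional[TupleVersion]], oldest_active_xid: int) -> int:
    """Mark dead slots as free (None) in place; return how many slots were freed.
    The block itself stays where it is, so the table does not shrink."""
    freed = 0
    for i, v in enumerate(block):
        if v is not None and is_dead(v, oldest_active_xid):
            block[i] = None
            freed += 1
    return freed


def vacuum_full(blocks, block_size: int, oldest_active_xid: int):
    """Rewrite all live versions into as few blocks as possible."""
    live = [v for b in blocks for v in b
            if v is not None and not is_dead(v, oldest_active_xid)]
    return [live[i:i + block_size] for i in range(0, len(live), block_size)]


# Three blocks holding versions of rows "a" and "b".
blocks = [
    [TupleVersion("a", 1, xmin=10, xmax=20), TupleVersion("a", 2, xmin=20)],
    [TupleVersion("b", 5, xmin=11, xmax=30), TupleVersion("b", 6, xmin=30, xmax=40)],
    [TupleVersion("b", 7, xmin=40)],
]
OLDEST_ACTIVE_XID = 35  # hypothetical: transactions 35 and later may still be running

freed = sum(plain_vacuum(b, OLDEST_ACTIVE_XID) for b in blocks)
compacted = vacuum_full(blocks, block_size=2, oldest_active_xid=OLDEST_ACTIVE_XID)
print(freed, len(blocks), len(compacted))
```

In this model, plain vacuum frees slots inside the existing blocks, so the block count is unchanged, while the full variant rewrites the live versions into fewer blocks. The version of "b" with `xmax=40` survives both passes because a transaction that is still active might need to see it.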
Because of the use of multiversion concurrency control in PostgreSQL, porting applications from other environments to PostgreSQL might require some extra care to ensure data consistency. As an example, consider a transaction T_A executing a select statement. Since readers in PostgreSQL don't lock data, data read and selected by T_A can be overwritten by another concurrent transaction T_B, while T_A is still running. As a result, some of the data that T_A returns might not be current anymore at the time of completion of T_A. T_A might return rows that in the meantime have been changed or deleted by other transactions. To ensure the current validity of a row and protect it against concurrent updates, an application must either use select for share or explicitly acquire a lock with the appropriate lock table command.
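The scenario with transactions T_A and T_B can be mimicked with a toy version store. This is a minimal sketch under stated assumptions, not PostgreSQL's implementation: `ToyMVCCStore` and its methods are hypothetical names, a snapshot is modeled as a simple logical timestamp, and a writer that would have to wait for a share lock is modeled by raising an exception instead of blocking.

```python
class ToyMVCCStore:
    """Toy multiversion store with snapshot reads and optional share locks."""

    def __init__(self):
        self.versions = {}     # key -> list of (commit_ts, value), oldest first
        self.clock = 0         # logical commit timestamp
        self.share_locks = {}  # key -> set of transaction names holding a share lock

    def snapshot(self):
        """Return a snapshot: everything committed up to now is visible."""
        return self.clock

    def read(self, key, snapshot_ts):
        """Plain read: takes no lock, returns the newest version in the snapshot."""
        visible = [val for ts, val in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

    def read_for_share(self, key, txn):
        """Read the current version and keep a share lock on the row."""
        self.share_locks.setdefault(key, set()).add(txn)
        return self.read(key, self.clock)

    def write(self, key, value, txn):
        """Add a new version; refused while another transaction share-locks the row.
        (A real system would make the writer wait rather than fail.)"""
        holders = self.share_locks.get(key, set()) - {txn}
        if holders:
            raise RuntimeError(f"{txn} must wait: {key} is share-locked by {sorted(holders)}")
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def release(self, txn):
        """Drop all share locks held by txn, as happens at end of transaction."""
        for holders in self.share_locks.values():
            holders.discard(txn)


store = ToyMVCCStore()
store.write("balance", 100, txn="setup")

# T_A reads without locking; T_B overwrites the row while T_A is still running.
snap_a = store.snapshot()
store.write("balance", 50, txn="T_B")
stale = store.read("balance", snap_a)              # what T_A returns
current = store.read("balance", store.snapshot())  # what is actually current

# With a share lock, T_B cannot change the row until T_A finishes.
locked_value = store.read_for_share("balance", txn="T_A")
try:
    store.write("balance", 0, txn="T_B")
    blocked = False
except RuntimeError:
    blocked = True
store.release("T_A")
store.write("balance", 0, txn="T_B")
print(stale, current, locked_value, blocked)
```

The plain read returns a value that is already out of date when T_A completes, whereas the share-locked read stays valid for as long as T_A holds the lock. The cost is that T_B is delayed, which is the trade-off an application accepts when it asks for such a lock.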
PostgreSQL's approach to concurrency control performs best for workloads containing many more reads than updates, since in this case there is a very low chance that two updates will conflict and force a transaction to roll back. Two-phase locking may be more efficient for some update-intensive workloads, but this depends on many factors, such as the length of transactions and the frequency of deadlocks.