Multiversion Concurrency Control (MVCC)- How PostgreSQL attain data consistency



Posted: Tuesday, June 21, 2005

by

Introduction.

Multiversion Concurrency Control (MVCC) is an advanced technique for improving database performance in a multiuser environment. Unlike traditional database systems, which use locks for concurrency control, PostgreSQLMVCC). The main advantage to using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading. maintains data consistency by using a multiversion model (Multiversion Concurrency Control.

Table- and row-level locking facilities are also available in PostgreSQL for applications that cannot adapt easily to MVCC behavior. However, proper use of MVCC will generally provide better performance than locks.

Transaction Isolation.

The SQL standard defines four levels of transaction isolation in terms of three phenomena that must be prevented between concurrent transactions. These undesirable phenomena are:

dirty read - A transaction reads data written by a concurrent uncommitted transaction.

Non-repeatable read - A transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read).

phantom read - A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.

Read Committed Isolation Level

Read Committed is the default isolation level in PostgreSQL. When a transaction runs on this isolation level, a SELECT query sees only data committed before the query began it never sees either uncommitted data or changes committed during query execution by concurrent

Serializable Isolation Level.

The level Serializable provides the strictest transaction isolation. This level emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently. However, applications using this level must be prepared to retry transactions due to serialization failures.

Explicit Locking

PostgreSQL provides various lock modes to control concurrent access to data in tables. These modes can be used for application-controlled locking in situations where MVCC does not give the desired behavior. Also, most PostgreSQL commands automatically acquire locks of appropriate modes to ensure that referenced tables are not dropped or modified in incompatible ways while the command executes. (For example, ALTER TABLE cannot be executed concurrently with other operations on the same table.)


Table_level Locks


Non-conflicting lock modes may be held concurrently by many transactions. Notice in particular that some lock modes are self-conflicting (for example, an ACCESS EXCLUSIVE lock cannot be held by more than one transaction at a time) while others are not self-conflicting (for example, an ACCESS SHARE lock can be held by multiple transactions). Once acquired, a lock is held till end of transaction.


Row-Level Locks


In addition to table-level locks, there are row-level locks. A row-level lock on a specific row is automatically acquired when the row is updated (or deleted or marked for update). The lock is held until the transaction commits or rolls back. Row-level locks do not affect data querying they block writers to the same row only. To acquire a row-level lock on a row without actually modifying the row, select the row with SELECT FOR UPDATE. Note that once a particular row-level lock is acquired, the transaction may update the row multiple times without fear of conflicts.


Dead Locks


The use of explicit locking can increase the likelihood of deadlocks, wherein two (or more) transactions each hold locks that the other wants. For example, if transaction 1 acquires an exclusive lock on table A and then tries to acquire an exclusive lock on table B, while transaction 2 has already exclusive-locked table B and now wants an exclusive lock on table A, then neither one can proceed. PostgreSQL automatically detects deadlock situations and resolves them by aborting one of the transactions involved, allowing the other(s) to complete. (Exactly which transaction will be aborted is difficult to predict and should not be relied on.)


Locking and Indexes


Though PostgreSQL provides nonblocking read/write access to table data, nonblocking read/write access is not currently offered for every index access method implemented in PostgreSQL. B-tree indexes offer the best performance for concurrent applications since they also have more features than hash indexes, they are the recommended index type for concurrent applications that need to index scalar data. When dealing with non-scalar data, B-trees obviously cannot be used in that situation, application developers should be aware of the relatively poor concurrent performance of GiST and R-tree indexes.


Bibliography
.

www.postgresql.org

http://www.netcraft.com.au/geoffrey/postgresql/mvcc.html

http://www.linuxgazette.com/issue68/mitchell.html

 

This Article has been viewed 4,255 times. (Not updated in real-time.)
Top-level comments on this article: (3 total)
» left by Dayana
from Bangalore
6 years 229 days ago.
Good content , Simple & Understandable
» left by Cláudio BezerraLeopoldino
from Fortaleza Brazil
6 years 166 days ago.
The article has a wonderful theme, but isn't deep enough. Reading the text i couldn't know how MVCC works in Postgresql and how this works with another concurrency techniques in the database.
» left by Guillaume
from France
5 years 337 days ago.
Very good and clear explanation of a complex matter. Thank you John
We want your comments! If you can read this, you don't have javascript enabled, so you can't use this comment system. Please enable javascript.