Database ACID/BASE - Understanding the CAP Theorem

Learn what is the CAP Theorem in less than 5 minutes !
Tuesday, August 22, 2023
CAP Theorem Consistency Availabilty Partition-Tolerance CA CP AP

(Source: https://commons.wikimedia.org/wiki/File:CAP_Theorem.svg)

ACID vs. BASE

ACID

In traditional relational databases, ACID (Atomicity, Consistency, Isolation, Durability) represents a set of properties that ensure data integrity and reliability. Here’s a brief overview of each component:

  • Atomicity: Ensures that all operations within a transaction are completed; if not, the transaction is aborted, and no changes are made.

  • Consistency: Guarantees that a transaction brings the database from one valid state to another, maintaining database rules.

  • Isolation: Ensures that concurrent transactions do not interfere with each other, providing a reliable and predictable outcome.

  • Durability: Once a transaction is committed, the changes become permanent and survive any subsequent failures.

ACID Example: Imagine a banking application where a user transfers money from one account to another. In this scenario, ACID properties are crucial. If the transaction fails after deducting money from the sender’s account but before adding it to the recipient’s account, it would violate the consistency property. Atomicity ensures that either the entire transaction is completed successfully (both deductions and additions) or none of it takes place.

BASE

On the other hand, NoSQL databases often adopt the BASE (Basically Available, Soft-state, Eventually consistent) model, which focuses on achieving high availability and scalability in distributed systems:

  • Basically Available: The system is always available, even in the presence of network partitions or node failures. It sacrifices immediate consistency for increased availability.

  • Soft-state: The state of the system is allowed to change over time, even without input. It allows the system to make optimizations for performance or scalability by allowing certain temporary inconsistencies or partial views of the data. The data convergence process in a soft-state system might take some time, and the system doesn’t guarantee that all nodes see the same data at every instance.

  • Eventually Consistent: Given enough time and no further updates, all replicas of the data will converge to a consistent state. While the system may experience temporary inconsistencies, it eventually reaches a coherent state.

BASE Example: Consider a social media platform where users post updates and comments. In this scenario, eventual consistency is acceptable. When a user posts a new update, it might take a little time for all replicas across different data centers to synchronize. As long as all users eventually see the post, it satisfies the BASE property.

Understanding the CAP Theorem

The CAP theorem, proposed by computer scientist Eric Brewer, states that in a distributed system, it is impossible to achieve all three properties - Consistency, Availability, and Partition Tolerance - simultaneously. According to the theorem, a distributed system can satisfy at most two out of the three.

  1. Consistency: In a consistent system, all nodes see the same data simultaneously, no matter the number of replicas. Achieving this property requires synchronization and coordination among nodes, which may lead to increased latency, especially during network partitions.

  2. Availability: An available system guarantees that every request receives a response, either success or failure, without delay. This property emphasizes uninterrupted service even when certain nodes are down or unreachable. However, ensuring high availability might compromise consistency.

  3. Partition Tolerance: This property enables the system to continue functioning despite communication failures or network partitions between nodes. It ensures that the distributed system can survive and operate even if some nodes cannot communicate with each other.

In summary, the CAP theorem forces us to make a strategic decision based on the specific needs of our application:

  • If we prioritize Consistency and Availability, we might need to compromise on Partition Tolerance.

  • If Availability and Partition Tolerance are crucial, we might need to accept eventual consistency and loosen our requirements on immediate data coherence.

  • If we focus on Partition Tolerance, we might have to sacrifice either Consistency or Availability, depending on the situation.


Recommended articles