I/D/E · Patterns

CAP Theorem

Summary

The fundamental trade-off in distributed systems: during a network partition, you must choose between consistency and availability

TL;DR

The CAP theorem states that during a network partition, a distributed system must choose between consistency (all nodes see the same data) and availability (every request gets a response). You cannot have both simultaneously when the network splits. This isn’t a limitation to overcome—it’s a fundamental property of distributed systems proven mathematically.

Visual Overview

CAP Theorem Triangle
THE CAP TRIANGLE

                                                    
                   Consistency                      
                       (C)                          
                      /   \                         
                     /     \                        
                    /  CA   \                       
                   /  (myth) \                      
                  /           \                     
                 /     CP      \                    
                /   (banking)   \                   
               /                 \                  
              /        AP         \                 
             /    (social media)   \                
            /                       \               
     Availability  Partition Tolerance   
          (A)                        (P)            
                                                    
  During partition: Pick C or A (P is mandatory)    


WHAT EACH MEANS

  C - Consistency                                   
      Every read receives the most recent write     
      All nodes see the same data at same time      
                                                    
  A - Availability                                  
      Every request receives a response             
      (not guaranteed to be the latest data)        
                                                    
  P - Partition Tolerance                           
      System continues to operate despite           
      network failures between nodes                

Core Explanation

What is the CAP Theorem?

Real-World Analogy: Imagine you’re running a chain of coffee shops with a shared inventory system. You sell the last bag of a special blend at Shop A.

  • With strong consistency: Shop B can’t sell that blend until they confirm Shop A’s sale—but if the network is down between shops, Shop B is stuck (unavailable).
  • With high availability: Shop B continues taking orders, potentially selling a bag they don’t have—the system stays up but might be wrong (inconsistent).

You can’t have both during a network problem. That’s CAP.

The Key Insight: Partitions Happen

The theorem only constrains behavior during network partitions. When the network is healthy, CAP doesn’t force any trade-offs—systems can provide strong consistency and high availability simultaneously. The critical insight:

Partition tolerance is not optional. Networks will fail. The real question is: when they do, do you sacrifice C or A?

Normal Operation vs Partition
NORMAL OPERATION (No Partition)

                                                    
  Client writes x=1                                 
                                                   
                                 
     Node A  > Node B                    
      x=1    healthy  x=1                      
      network                    
                                                   
  Both nodes have x=1                             
  Reads from either return x=1                    
                                                    
   CAP doesn't constrain trade-offs here           
    (no partition means no forced choice)           


NETWORK PARTITION OCCURS

                                                    
  Client writes x=2 to Node A                       
                                                   
                        
     Node A  > Node B                    
      x=2    BROKEN!  x=1                      
                                 
                                                    
  Node A has x=2, Node B has x=1 (stale)           
  Client reads from Node B... what should happen?  
                                                    
   NOW you must choose C or A                      

The Impossible Choice

CP vs AP During Partition
CHOOSE CONSISTENCY (CP)

  Client reads from Node B during partition         
                                                    
  Node B: "I can't confirm I have latest data"      
          "I'll refuse to answer"                   
                                                    
                        
     Node A          Node B   ERROR         
      x=2             x=1       "Unavailable"  
                                 
                                                    
   Client never sees wrong data                   
   Client gets no data (system unavailable)       
                                                    
  Examples: ZooKeeper, etcd, Consul, HBase         


CHOOSE AVAILABILITY (AP)

  Client reads from Node B during partition         
                                                    
  Node B: "I might not have latest data"            
          "But I'll give you what I have"           
                                                    
                        
     Node A          Node B   x=1 (stale!)  
      x=2             x=1                      
                                 
                                                    
   Client always gets a response                  
   Response might be outdated                     
                                                    
  Examples: Cassandra, DynamoDB, CouchDB           

Making the Decision

The right choice depends on your domain. Ask: “What’s worse for my users—seeing old data or seeing nothing?”

DomainWorse OutcomeCAP Choice
Bank balanceShowing wrong balance → overdraftCP
Inventory countOverselling → angry customersCP
Social media feedSlight delay in seeing postsAP
Shopping cartCart contents briefly out of syncAP
Distributed lockTwo processes think they hold lockCP
DNS lookupSlight delay in DNS propagationAP

Real Systems and Their CAP Behaviors

These classifications describe default or typical configurations. Most systems are tunable—the same system can behave as CP or AP depending on consistency level settings.

SystemTypical BehaviorConfiguration NotesUse Case
ZooKeeperCP-leaningRequires majority for writes; minority partitions read-onlyDistributed coordination
etcdCP-leaningRaft consensus requires majorityKubernetes config store
ConsulTunableDefault CP; can configure for AP with stale readsService discovery
CassandraTunableCL=ONE is AP; CL=QUORUM is CP-leaningMulti-datacenter database
DynamoDBTunableEventually consistent reads (AP); strongly consistent reads (CP)High-availability storage
CockroachDBCP-leaningSerializable by default; unavailable without quorumDistributed SQL
MongoDBTunableDepends on write/read concern settingsDocument database
RiakAP-leaningSibling values, application-level conflict resolutionKey-value store

Interview note: Avoid saying “X is a CP system” without qualification. Real systems offer configuration knobs. Focus on describing behavior under specific configurations.

Case Study: Partition Scenario

5-Node Cluster During Partition
5-NODE CLUSTER PARTITIONS INTO [3] AND [2]

                                                    
   MAJORITY (3 nodes)        MINORITY (2 nodes)   
               
    A   B   C  D   E          
               

                                                    
   CP SYSTEM (e.g., etcd):                         
    Majority: Accepts reads AND writes          
    Minority: Read-only or unavailable          
                                                    
   AP SYSTEM (e.g., Cassandra with CL=ONE):        
    Majority: Accepts reads AND writes          
    Minority: Accepts reads AND writes          
    Warning: Writes may conflict!                
                                                    
   After partition heals:                          
    CP: Minority syncs from majority             
    AP: Conflict resolution needed               

When to Use Each

✓ Choose CP (Consistency) When

CP Use Cases
FINANCIAL TRANSACTIONS
Scenario: Bank transfer between accounts
Requirement: Cannot show wrong balance, cannot double-spend
Choice: CP - Reject transaction if partition prevents verification
Trade-off: "Transfer failed, try again" better than "You're overdrawn"

INVENTORY SYSTEMS
Scenario: E-commerce with limited stock
Requirement: Cannot sell more than available
Choice: CP - Refuse purchase if inventory state uncertain
Trade-off: Lost sale better than overselling

DISTRIBUTED LOCKS
Scenario: Only one process should run a job
Requirement: Cannot have two lock holders
Choice: CP - If unsure about lock state, assume not held
Trade-off: Job waits rather than running twice

LEADER ELECTION
Scenario: Exactly one leader needed
Requirement: Split-brain is catastrophic
Choice: CP - No leader during partition better than two leaders
Trade-off: System pauses vs corrupted state

✓ Choose AP (Availability) When

AP Use Cases
SOCIAL MEDIA FEEDS
Scenario: Displaying user's timeline
Requirement: Always show something, even if slightly outdated
Choice: AP - Show cached feed, sync when possible
Trade-off: Seeing post 30 seconds late is fine

SHOPPING CARTS
Scenario: User adding items to cart
Requirement: Never lose an add-to-cart action
Choice: AP - Accept writes, merge conflicts later
Trade-off: Cart might show duplicates briefly (resolved at checkout)

CACHING LAYERS
Scenario: CDN serving static content
Requirement: Low latency more important than freshness
Choice: AP - Serve from cache even if origin unreachable
Trade-off: Stale content for TTL duration

ANALYTICS / METRICS
Scenario: Collecting user behavior data
Requirement: Don't lose events, exact order less critical
Choice: AP - Buffer locally, sync when connected
Trade-off: Dashboards may be a few seconds behind

Interview Application

Common Interview Question

Q: “Explain the CAP theorem. How would you design a distributed database that needs to handle network partitions?”

Strong Answer:

“The CAP theorem, proven by Gilbert and Lynch in 2002, states that a distributed system cannot simultaneously provide all three guarantees during a network partition:

  • Consistency: Every read receives the most recent write
  • Availability: Every request receives a response
  • Partition Tolerance: System operates despite network failures

Key insight: Partition tolerance isn’t optional in distributed systems—networks will fail. So the real choice is between C and A when a partition occurs.

For my database design, I’d first ask: ‘What’s the cost of inconsistency vs unavailability?’

For a banking system: I’d choose CP. Wrong balances can cause overdrafts, legal issues. During a partition, the minority side becomes unavailable for writes. I’d implement this with consensus protocols like Raft, requiring majority quorum for commits.

For a social media feed: I’d choose AP. Users expect the app to work. Seeing a post 30 seconds late is acceptable. During partition, all nodes accept writes. After healing, I’d use CRDTs or last-write-wins to merge conflicts.

Important nuance: CAP is about the partition moment. Most of the time there’s no partition, and you can have all three. Also, different parts of a system can make different choices—payments might be CP while user profiles are AP.”

Follow-up: What is PACELC and how does it extend CAP?

“PACELC extends CAP to consider behavior when there’s no partition. It says:

  • PAC: During Partition, choose Availability or Consistency
  • ELC: Else (no partition), choose Latency or Consistency

Even without partitions, strong consistency requires coordination (more latency). So systems like DynamoDB offer eventual consistency for lower latency (EL) or strong consistency with higher latency (EC).

Example: Cassandra is PA/EL—it chooses availability during partitions and low latency normally. Spanner is PC/EC—it always chooses consistency, accepting latency for synchronous replication.”

Code Example

Demonstrating CP vs AP Behavior

"""
Simulating CAP theorem trade-offs in a distributed key-value store.
"""

from enum import Enum
from dataclasses import dataclass
from typing import Optional
import time

class CAPMode(Enum):
    CP = "consistency"  # Prioritize consistency over availability
    AP = "availability"  # Prioritize availability over consistency


@dataclass
class Node:
    name: str
    data: dict
    is_connected: bool = True  # Simulates network partition


class DistributedStore:
    """
    A simplified distributed store demonstrating CAP trade-offs.
    """

    def __init__(self, mode: CAPMode, nodes: list[Node]):
        self.mode = mode
        self.nodes = {n.name: n for n in nodes}
        self.primary = nodes[0].name

    def write(self, key: str, value: str) -> dict:
        """
        Write a key-value pair.
        Behavior differs based on CAP mode during partition.
        """
        primary = self.nodes[self.primary]

        # Write to primary
        primary.data[key] = {"value": value, "timestamp": time.time()}

        # Try to replicate to other nodes
        replication_results = []
        for name, node in self.nodes.items():
            if name == self.primary:
                continue

            if node.is_connected:
                # Node reachable - replicate synchronously
                node.data[key] = primary.data[key].copy()
                replication_results.append((name, "synced"))
            else:
                # Node unreachable - partition!
                replication_results.append((name, "unreachable"))

        # Check if we achieved quorum (majority)
        synced_count = sum(1 for _, status in replication_results if status == "synced")
        total_nodes = len(self.nodes)
        has_quorum = (synced_count + 1) > total_nodes / 2  # +1 for primary

        if self.mode == CAPMode.CP:
            # CP: Require quorum for write to succeed
            if not has_quorum:
                # Rollback primary write
                del primary.data[key]
                return {
                    "success": False,
                    "error": "Quorum not reached - write rejected for consistency",
                    "mode": "CP"
                }
            return {
                "success": True,
                "message": f"Write committed with quorum ({synced_count + 1}/{total_nodes})",
                "mode": "CP"
            }

        else:  # AP mode
            # AP: Accept write even without quorum
            return {
                "success": True,
                "message": f"Write accepted (synced to {synced_count + 1}/{total_nodes})",
                "warning": "Some nodes may have stale data" if not has_quorum else None,
                "mode": "AP"
            }

    def read(self, key: str, from_node: str) -> dict:
        """
        Read a key from a specific node.
        Behavior differs based on CAP mode during partition.
        """
        node = self.nodes.get(from_node)
        if not node:
            return {"success": False, "error": "Node not found"}

        if self.mode == CAPMode.CP:
            # CP: Only return value if we can verify it's current
            if not node.is_connected and from_node != self.primary:
                return {
                    "success": False,
                    "error": "Node partitioned - cannot guarantee consistency",
                    "mode": "CP"
                }

        # AP mode or connected node: return whatever we have
        if key in node.data:
            return {
                "success": True,
                "value": node.data[key]["value"],
                "timestamp": node.data[key]["timestamp"],
                "node": from_node,
                "mode": self.mode.value,
                "warning": "Value may be stale" if not node.is_connected else None
            }

        return {"success": False, "error": "Key not found"}


# Demonstration
if __name__ == "__main__":
    # Create 3 nodes
    nodes = [
        Node("node-a", {}),
        Node("node-b", {}),
        Node("node-c", {}),
    ]

    print("=" * 60)
    print("SCENARIO: Network partition isolates node-c")
    print("=" * 60)

    # Simulate partition - node-c is unreachable
    nodes[2].is_connected = False

    # Test CP mode
    print("\n--- CP MODE (Consistency Priority) ---")
    cp_store = DistributedStore(CAPMode.CP, nodes)

    result = cp_store.write("balance", "1000")
    print(f"Write result: {result}")
    # CP requires 2/3 nodes (quorum) - this succeeds

    # Read from partitioned node
    result = cp_store.read("balance", "node-c")
    print(f"Read from partitioned node-c: {result}")
    # CP refuses - can't guarantee consistency

    # Reset for AP test
    for n in nodes:
        n.data = {}
    nodes[2].is_connected = False

    # Test AP mode
    print("\n--- AP MODE (Availability Priority) ---")
    ap_store = DistributedStore(CAPMode.AP, nodes)

    result = ap_store.write("balance", "1000")
    print(f"Write result: {result}")
    # AP accepts - warns about partial sync

    # Now partition node-a and node-b, connect node-c
    nodes[0].is_connected = False
    nodes[1].is_connected = False
    nodes[2].is_connected = True

    # Read from node-c (which missed the write)
    result = ap_store.read("balance", "node-c")
    print(f"Read from node-c (missed write): {result}")
    # AP returns stale/missing data - but it responds!

Common Misconceptions

CAP Misconceptions
MISCONCEPTION 1: "Pick any 2 of 3"
Reality: P is not optional. Networks fail. Real choice is C vs A during partition.

MISCONCEPTION 2: "CAP applies all the time"
Reality: Only during partitions. Normal operation can have all three.

MISCONCEPTION 3: "Systems are purely CP or AP"
Reality: Systems can be tunable, or use CP for some operations and AP for others.

MISCONCEPTION 4: "Eventual consistency = no consistency"
Reality: Eventual consistency is a form of consistency - data converges over time.

MISCONCEPTION 5: "CP means always consistent"
Reality: CP means sacrificing availability for consistency DURING PARTITION.

See It In Action:

Related Concepts:

Quick Self-Check

  • Can explain CAP theorem in 60 seconds?
  • Understand why partition tolerance is non-optional?
  • Can give examples of CP vs AP system choices?
  • Know what happens to each side of a partition in CP vs AP?
  • Understand that CAP only applies during partitions?
  • Can explain why the same system might choose CP for some operations and AP for others?

Production signal

Why this concept matters

Interview 85% of distributed systems interviews
Production Every distributed database
Performance Architecture decisions
Scale Partition handling