
Immutability

Summary

A design principle in which data structures cannot be modified after creation. It simplifies distributed systems by removing in-place update conflicts and many race conditions.

TL;DR

Immutability means data cannot be changed after creation. In distributed systems, immutable data structures eliminate entire classes of concurrency bugs, enable caching without invalidation, simplify replication, and power systems like Kafka, Git, and event sourcing architectures.

Visual Overview

Mutable vs Immutable Data
MUTABLE DATA (Traditional Approach)

  Database Record: User Balance                     
                                                    
  T0: balance = $100                               
  T1: UPDATE balance = $80  (withdraw $20)         
  T2: UPDATE balance = $130 (deposit $50)          
                                                    
  Current State: balance = $130                    
  History: LOST
                                                    
  Problems:                                         
  - Race conditions (concurrent updates)           
  - No audit trail                                 
  - Cache invalidation needed                      
  - Difficult to debug past states                 


IMMUTABLE DATA (Append-Only Approach)

 Event Log: User Transactions 
 
 Event 1: {type: "DEPOSIT", amount: 100, time: T0}
 Event 2: {type: "WITHDRAW", amount: 20, time: T1}
 Event 3: {type: "DEPOSIT", amount: 50, time: T2}
 
 Current State: deposits - withdrawals = $130 
 History: PRESERVED  
 
 Benefits: 
  - No race conditions (only appends) 
  - Complete audit trail 
  - Cache forever (never invalidated) 
  - Time-travel debugging (replay to any point) 


CONCURRENCY COMPARISON:

Mutable (Requires Locking):
Thread A: READ balance=100 → UPDATE balance=80 
Thread B: READ balance=100 → UPDATE balance=150 
Result: Lost update! (one transaction overwrites other)

Immutable (Lock-Free):
Thread A: APPEND {withdraw: 20, id: 1} → No conflict
Thread B: APPEND {deposit: 50, id: 2} → No conflict
Result: Both transactions preserved 

Core Explanation

What is Immutability?

Immutability is a design principle where data structures cannot be modified after creation. Instead of updating existing data, you create new versions.

Programming Example:

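// MUTABLE (traditional)
let user = { name: "Alice", age: 30 };
user.age = 31; // Original data modified ✕

// IMMUTABLE (functional)
// Note: `const` only prevents rebinding; Object.freeze prevents (shallow) mutation
const originalUser = Object.freeze({ name: "Alice", age: 30 });
const updatedUser = { ...originalUser, age: 31 }; // New object created ✓
// 'originalUser' is unchanged (age is still 30)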

In distributed systems, immutability typically means:

  1. Append-only writes: New records added, existing records never modified
  2. Versioned data: Each change creates a new version
  3. Event logs: Store changes as immutable events
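
To make the "new version instead of update" idea concrete, here is a minimal sketch in plain JavaScript. The `nextVersion` helper and the `version` field are hypothetical names chosen for illustration, and `Object.freeze` is shallow, so nested objects would need freezing separately.

```javascript
// Hypothetical helper: returns a new frozen version instead of editing the old one.
function nextVersion(record, changes) {
  return Object.freeze({ ...record, ...changes, version: record.version + 1 });
}

const v1 = Object.freeze({ id: 123, name: 'Alice', age: 30, version: 1 });
const v2 = nextVersion(v1, { age: 31 });

// Writes to a frozen object are ignored (or throw a TypeError in strict mode).
try {
  v1.age = 99;
} catch (err) {
  // strict mode and ES modules end up here
}

console.log(v1.age, v2.age, v2.version); // 30 31 2
```

Both versions stay readable, which is what the versioned-data and event-log approaches above rely on.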

Why Immutability Matters in Distributed Systems

1. Eliminates Concurrency Bugs

Concurrency: Mutable vs Immutable
PROBLEM WITH MUTABLE DATA:

  Two servers updating same record           
                                             
  Server A: UPDATE inventory SET qty=9       
  Server B: UPDATE inventory SET qty=8       
                                             
  Race condition:                            
  - Who wins? (last write wins = data loss)  
  - Need distributed locks (slow)            
  - Need MVCC or optimistic locking          


SOLUTION WITH IMMUTABLE DATA:

 Two servers appending events 
 
 Server A: APPEND {sold: 1, timestamp: T1} 
 Server B: APPEND {sold: 2, timestamp: T2} 
 
 No race condition: 
 - Both events preserved 
 - No locks needed (append-only) 
 - Total sold = 3 (computed from events) 
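
The two scenarios above can be replayed step by step. This sketch simulates the interleaving in a single process (the variable names are illustrative), and it assumes the log itself serializes appends, which is the job of the log or broker in a real system.

```javascript
// MUTABLE: two servers do read-modify-write on the same record.
const inventory = { qty: 10 };
const readByA = inventory.qty;   // Server A reads 10
const readByB = inventory.qty;   // Server B reads 10 before A writes
inventory.qty = readByA - 1;     // A sells 1 and writes 9
inventory.qty = readByB - 2;     // B sells 2 and writes 8, overwriting A's write

// IMMUTABLE: each server appends its own event; nothing is overwritten.
const initialQty = 10;
const salesLog = [];
salesLog.push(Object.freeze({ server: 'A', sold: 1 }));
salesLog.push(Object.freeze({ server: 'B', sold: 2 }));
const totalSold = salesLog.reduce((sum, e) => sum + e.sold, 0);

console.log(inventory.qty);          // 8 (A's sale was lost)
console.log(initialQty - totalSold); // 7 (both sales counted)
```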

2. Enables Aggressive Caching

Caching: Mutable vs Immutable
MUTABLE DATA:
- Cache user profile
- User updates profile
- Must invalidate cache (cache invalidation is hard!)
- Cache miss on next read

IMMUTABLE DATA:

- Cache user profile version 5
- User updates profile → creates version 6
- Version 5 cache still valid (never expires)
- New requests use version 6 (different cache key)

Result: Cache can live forever, no invalidation needed
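
A rough sketch of version-keyed caching, using in-memory `Map`s as stand-ins for a profile store and a cache (all names here are hypothetical). Note the one piece that stays mutable: the pointer to the latest version, which must be looked up fresh or cached only briefly.

```javascript
const profileVersions = new Map(); // "userId:vN" -> frozen profile
const latestVersion = new Map();   // userId -> N (the only mutable pointer)
const cache = new Map();
let storeReads = 0;                // counts cache misses that hit the store

function saveProfile(userId, data) {
  const version = (latestVersion.get(userId) || 0) + 1;
  profileVersions.set(`${userId}:v${version}`, Object.freeze({ ...data, version }));
  latestVersion.set(userId, version);
  return version;
}

function getProfile(userId, version = latestVersion.get(userId)) {
  const key = `${userId}:v${version}`;
  if (!cache.has(key)) {
    storeReads += 1;
    cache.set(key, profileVersions.get(key));
  }
  return cache.get(key);
}

saveProfile('u1', { name: 'Alice' });
getProfile('u1');                       // miss: loads u1:v1
getProfile('u1');                       // hit
saveProfile('u1', { name: 'Alice A' });
const current = getProfile('u1');       // miss: u1:v2 is a new key
const old = getProfile('u1', 1);        // hit: u1:v1 is still valid

console.log(current.name, old.name, storeReads); // Alice A Alice 2
```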

3. Simplifies Replication

Replication: Mutable vs Immutable
MUTABLE REPLICATION:
Primary: UPDATE user SET name='Alice' WHERE id=123
Replica: Must apply same UPDATE

Problems:

- What if replica is behind? (out-of-order updates)
- What if UPDATE fails on replica? (inconsistency)
- How to handle conflicts? (complex merge logic)

IMMUTABLE REPLICATION:
Primary: APPEND event {id: 123, name: 'Alice', version: 5}
Replica: APPEND same event

Benefits:

- Events can be replayed in order
- Idempotent when each event carries a unique ID or offset (a re-delivered event is detected and skipped)
- No merge conflicts as long as one writer, or a consensus protocol, fixes the order
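
One way to get the ordering and idempotence properties listed above is to have the primary assign an offset to every event. The `Replica` class below is a hypothetical in-memory sketch of that idea, not the replication protocol of any particular system.

```javascript
class Replica {
  constructor() {
    this.log = []; // position in the array == offset of the event
  }

  // Applies an event only if it is the next one expected.
  apply(event) {
    if (event.offset < this.log.length) return 'duplicate'; // already have it
    if (event.offset > this.log.length) return 'gap';       // arrived too early
    this.log.push(Object.freeze({ ...event }));
    return 'appended';
  }
}

const replica = new Replica();
const e0 = { offset: 0, id: 123, name: 'Alice' };
const e1 = { offset: 1, id: 123, name: 'Alice A' };

const results = [
  replica.apply(e1), // out of order: replica asks for a retry later
  replica.apply(e0),
  replica.apply(e0), // re-delivery after a timeout
  replica.apply(e1),
];

console.log(results); // [ 'gap', 'appended', 'duplicate', 'appended' ]
```

A retried or reordered delivery leaves the replica's log identical to the primary's, which is the property the mutable UPDATE path has to work much harder to achieve.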

4. Time-Travel Debugging

MUTABLE: Only current state exists
- Bug in production?
- Cannot see what state was at T-1 hour

IMMUTABLE: Complete history exists

- Bug in production?
- Replay events to T-1 hour
- See exact state at any point in time
- Example: "What was user's cart at 3pm yesterday?"
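
The cart question can be answered by replaying events up to a cutoff. The event shapes and dates below are made up for illustration, and the sketch assumes the events are already sorted by time.

```javascript
const cartEvents = [
  { type: 'ITEM_ADDED', sku: 'book', at: Date.parse('2024-01-01T14:00:00Z') },
  { type: 'ITEM_ADDED', sku: 'pen', at: Date.parse('2024-01-01T14:30:00Z') },
  { type: 'ITEM_REMOVED', sku: 'book', at: Date.parse('2024-01-01T15:30:00Z') },
];

// Rebuilds the cart as it looked at `timestamp` (milliseconds since epoch).
function cartAt(events, timestamp) {
  const cart = new Set();
  for (const e of events) {
    if (e.at > timestamp) break; // later events had not happened yet
    if (e.type === 'ITEM_ADDED') cart.add(e.sku);
    if (e.type === 'ITEM_REMOVED') cart.delete(e.sku);
  }
  return [...cart].sort();
}

const at3pm = cartAt(cartEvents, Date.parse('2024-01-01T15:00:00Z'));
const at4pm = cartAt(cartEvents, Date.parse('2024-01-01T16:00:00Z'));

console.log(at3pm); // [ 'book', 'pen' ]
console.log(at4pm); // [ 'pen' ]
```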

Append-Only Logs

The most common form of immutability in distributed systems:

Kafka Topic (Append-Only Log)

  Partition 0: User Events                    
                                              
  Offset 0: {user: 1, action: "login"}       
  Offset 1: {user: 1, action: "view_product"}
  Offset 2: {user: 2, action: "login"}       
  Offset 3: {user: 1, action: "purchase"}    
                                              
  Properties:                                 
  - Clients can only append (no in-place updates or per-message deletes)
  - Each message has an immutable offset
  - Consumers can replay from any retained offset
  - Old messages are removed only by retention policy (time or size)


Benefits:
- High throughput (sequential disk writes)
- Multiple consumers can read same data
- Replay events for recovery or new consumers
- Audit trail preserved
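
The offset mechanics can be modeled in a few lines. This is an in-memory toy with hypothetical `AppendOnlyLog` and `Consumer` classes. It is not the Kafka client API and it ignores partitions, persistence, and retention.

```javascript
class AppendOnlyLog {
  constructor() {
    this.records = [];
  }
  // Appends a value and returns the offset it was assigned.
  append(value) {
    const offset = this.records.length;
    this.records.push(Object.freeze({ offset, value }));
    return offset;
  }
  read(fromOffset, maxRecords = 10) {
    return this.records.slice(fromOffset, fromOffset + maxRecords);
  }
}

class Consumer {
  constructor(log) {
    this.log = log;
    this.offset = 0; // each consumer tracks its own position
  }
  poll() {
    const batch = this.log.read(this.offset);
    this.offset += batch.length;
    return batch.map((record) => record.value);
  }
  seek(offset) {
    this.offset = offset; // replay: just move the position back
  }
}

const log = new AppendOnlyLog();
['login', 'view_product', 'purchase'].forEach((action) => log.append({ action }));

const analytics = new Consumer(log);
const billing = new Consumer(log);
const firstPass = analytics.poll().map((v) => v.action);
const billingPass = billing.poll().map((v) => v.action); // unaffected by analytics
analytics.seek(1);
const replayed = analytics.poll().map((v) => v.action);

console.log(firstPass, billingPass, replayed);
```

Reading never removes anything from the log, so adding a consumer or replaying costs the producer nothing.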

Versioned Data

Alternative approach: Keep multiple versions of data

DATABASE WITH VERSIONING (conceptual model)

  User Profile Versions                     
                                            
  Version 1: {name: "Alice", age: 30}       
  Version 2: {name: "Alice", age: 31}       
  Version 3: {name: "Alice A", age: 31}     
                                            
  Current: Version 3                        
  History: Versions 1-2 preserved           
                                            
  Implementation:                           
  - Each write creates new version          
  - Version ID/timestamp tracks changes     
  - Old versions kept or garbage collected  


Examples:

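- DynamoDB: version-number attribute for optimistic locking (history is kept only if the application writes each version as a new item)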
- PostgreSQL: MVCC (Multi-Version Concurrency Control)
- Git: Commit hashes
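
A small sketch of the versioning model, with a conditional write so a stale writer gets a conflict instead of silently overwriting. `VersionedStore` and its `expectedVersion` parameter are hypothetical and do not mirror any specific database API.

```javascript
class VersionedStore {
  constructor() {
    this.versions = new Map(); // key -> array of frozen versions, oldest first
  }

  // Writes a new version only if the caller saw the latest one.
  put(key, data, expectedVersion) {
    const history = this.versions.get(key) || [];
    if (expectedVersion !== history.length) {
      throw new Error(`version conflict: expected ${expectedVersion}, current ${history.length}`);
    }
    const record = Object.freeze({
      version: history.length + 1,
      data: Object.freeze({ ...data }),
    });
    this.versions.set(key, [...history, record]);
    return record.version;
  }

  // Returns the requested version, or the latest when none is given.
  get(key, version) {
    const history = this.versions.get(key) || [];
    return version === undefined ? history[history.length - 1] : history[version - 1];
  }
}

const store = new VersionedStore();
store.put('user:1', { name: 'Alice', age: 30 }, 0);
store.put('user:1', { name: 'Alice', age: 31 }, 1);

let conflict = null;
try {
  store.put('user:1', { name: 'Alice', age: 32 }, 1); // stale: version 2 already exists
} catch (err) {
  conflict = err.message;
}

console.log(store.get('user:1').data.age);    // 31
console.log(store.get('user:1', 1).data.age); // 30
console.log(conflict);                        // version conflict: expected 1, current 2
```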

Real Systems Using Immutability

| System | Immutability Model | Use Case | Benefits |
|---|---|---|---|
| Kafka | Append-only log | Message streaming | Replay, fault tolerance, high throughput |
| Git | Immutable commits | Version control | Complete history, branching, rollback |
| Blockchain | Immutable ledger | Cryptocurrency | Tamper-proof, audit trail |
| Event Sourcing | Event log | CQRS systems | Audit trail, time-travel, replay |
| S3 | Write-once objects | Object storage | Cache forever, versioning |
| Datomic | Immutable facts | Database | Query past states, time-travel |

Case Study: Kafka Log Immutability

Kafka Design Decisions
1. Messages are immutable after write
 - Producer writes message → never changed
 - Consumers cannot modify messages
 - Only deletion: Time-based retention (e.g., delete after 7 days)

2. Sequential writes to disk
 - Append-only = sequential I/O (fast!)
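 - Kafka's design docs cite ~600 MB/s for sequential writes vs ~100 KB/s for random writes on a six-disk 7200 rpm SATA array; SSDs narrow the gap, but sequential I/O is still faster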
 - Result: Kafka throughput in millions of msgs/sec

3. Zero-copy reads
 - Messages are immutable → safe to serve from the OS page cache
 - Send directly from page cache to the network socket (zero-copy via sendfile)
 - No copying into application buffers, and the broker does not re-serialize messages

4. Replayability
 - Consumer can reset offset and replay
 - Used for: Recovery, new consumers, backfilling data
 - Example: "Process last 24 hours of events again"

5. Log compaction (for keyed data)
 - Keep latest value per key
 - Delete old versions (garbage collection)
 - Still immutable: Never UPDATE, only APPEND + COMPACT
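
Log compaction can be pictured as a pure function from one log to a shorter one. The sketch below is a simplified model with illustrative record shapes. It keeps only the latest record per key, preserves the original offsets, and leaves the input untouched. Tombstones and segment handling are omitted.

```javascript
// Returns a new array containing only the latest record for each key.
function compact(log) {
  const lastOffsetByKey = new Map();
  for (const record of log) {
    lastOffsetByKey.set(record.key, record.offset);
  }
  return log.filter((record) => lastOffsetByKey.get(record.key) === record.offset);
}

const keyedLog = [
  { offset: 0, key: 'user:1', value: 'alice@old.example' },
  { offset: 1, key: 'user:2', value: 'bob@example.com' },
  { offset: 2, key: 'user:1', value: 'alice@new.example' },
];

const compacted = compact(keyedLog);

console.log(compacted.map((r) => r.offset)); // [ 1, 2 ]
console.log(keyedLog.length);                // 3 (original log is not modified)
```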
   

Case Study: Git Commits

Git Commit Immutability
Commit ID: SHA-1 hash of (content + metadata) by default; newer Git versions can also create SHA-256 repositories

 Commit abc123: 
 - Parent: def456 
 - Tree: Files snapshot 
 - Author: Alice 
 - Message: "Add feature X" 


Properties:

- Changing any field → different hash → different commit
- Cannot modify history without changing hash
- Result: Tamper-proof, verifiable history
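
The hash-chain property is easy to demonstrate. This sketch hashes a JSON serialization of a few fields with SHA-1 (Git's default hash) using Node's `crypto` module. It is a simplification for illustration and not Git's actual object format, and `commitId` is a hypothetical helper.

```javascript
const { createHash } = require('crypto');

// Derives an ID from the commit's content, including its parent's ID.
function commitId(commit) {
  // Fixed field order so equal commits always serialize identically.
  const payload = JSON.stringify([commit.parent, commit.tree, commit.author, commit.message]);
  return createHash('sha1').update(payload).digest('hex');
}

const first = { parent: null, tree: 'tree-1', author: 'Alice', message: 'Initial commit' };
const firstId = commitId(first);

const second = { parent: firstId, tree: 'tree-2', author: 'Alice', message: 'Add feature X' };
const secondId = commitId(second);

// Editing the first commit gives it a new ID...
const rewrittenId = commitId({ ...first, message: 'Initial commit (edited)' });
// ...so any child that points at the rewritten commit gets a new ID too.
const rewrittenChildId = commitId({ ...second, parent: rewrittenId });

console.log(rewrittenId !== firstId);       // true
console.log(rewrittenChildId !== secondId); // true
```

Because each ID covers the parent's ID, a change anywhere in history shows up as different IDs for every descendant.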

Benefits:
- Branching: Create alternate histories (branches)
- Merging: Combine histories deterministically
- Rollback: Revert to any commit
- Distributed: Clone full history to any machine

When to Use Immutability

✓ Perfect Use Cases

Event Sourcing Architectures

Event Sourcing Use Case
Scenario: Banking system
Requirement: Complete audit trail for compliance
Solution: Store all transactions as immutable events
Benefit: Can audit any account at any point in time

Message Streaming

Message Streaming Use Case
Scenario: Real-time analytics pipeline
Requirement: Multiple consumers, replayability
Solution: Kafka append-only log
Benefit: New analytics jobs can process historical data

Caching & CDN

Caching & CDN Use Case
Scenario: Static assets (images, JS, CSS)
Solution: Immutable URLs with content hash
Example: bundle.abc123.js (hash in filename)
Benefit: Cache forever with HTTP Cache-Control: immutable

Version Control

Version Control Use Case
Scenario: Collaborative document editing
Solution: Store every edit as immutable version
Benefit: Undo, redo, view history, branch documents

✕ When NOT to Use (or Use Carefully)

Storage-Constrained Systems

Problem: Immutable data accumulates until something removes it
Example: 1 billion events/day at an assumed 1 KB each is roughly 1 TB/day before replication
Solution: Log compaction, retention policies, snapshots
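
Snapshots, mentioned in the solution above, bound replay cost: state is saved at a known position and only later events are replayed. A minimal sketch with hypothetical helper names follows. Archiving or deleting the events covered by the snapshot is left out.

```javascript
// Folds one event into a balance.
function applyEvent(balance, event) {
  if (event.type === 'DEPOSIT') return balance + event.amount;
  if (event.type === 'WITHDRAW') return balance - event.amount;
  return balance;
}

// Records the state after the first `events.length` events.
function takeSnapshot(events) {
  return Object.freeze({
    balance: events.reduce(applyEvent, 0),
    eventCount: events.length,
  });
}

// Replays only the events appended after the snapshot was taken.
function balanceFrom(snapshot, events) {
  return events.slice(snapshot.eventCount).reduce(applyEvent, snapshot.balance);
}

const events = [
  { type: 'DEPOSIT', amount: 100 },
  { type: 'WITHDRAW', amount: 20 },
];
const snapshot = takeSnapshot(events);
events.push({ type: 'DEPOSIT', amount: 50 });

console.log(snapshot.balance);              // 80
console.log(balanceFrom(snapshot, events)); // 130
```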

GDPR Right to Delete

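Problem: An append-only log has no in-place delete, so personal data cannot simply be erased
Example: User requests account deletion (GDPR)
Solution: Tombstone records plus compaction, or crypto-shredding (encrypt each user's data with its own key, then delete the key)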
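
Encryption with key deletion (often called crypto-shredding) can be sketched with Node's built-in `crypto` module. The key map and function names are hypothetical, the user ID stays in plaintext in this toy, and whether the approach satisfies a given regulation is a legal question this sketch does not settle.

```javascript
const crypto = require('crypto');

const userKeys = new Map(); // userId -> 32-byte key (the only deletable state)
const eventLog = [];        // append-only; entries are never edited or removed

function appendEncrypted(userId, payload) {
  if (!userKeys.has(userId)) userKeys.set(userId, crypto.randomBytes(32));
  const iv = crypto.randomBytes(12); // fresh nonce per event
  const cipher = crypto.createCipheriv('aes-256-gcm', userKeys.get(userId), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(payload), 'utf8'),
    cipher.final(),
  ]);
  eventLog.push(Object.freeze({ userId, iv, ciphertext, tag: cipher.getAuthTag() }));
}

function readEvent(event) {
  const key = userKeys.get(event.userId);
  if (!key) return null; // key deleted: payload can no longer be decrypted
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, event.iv);
  decipher.setAuthTag(event.tag);
  const plaintext = Buffer.concat([decipher.update(event.ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8'));
}

function forgetUser(userId) {
  userKeys.delete(userId); // the log itself is untouched
}

appendEncrypted('u1', { email: 'alice@example.com' });
appendEncrypted('u2', { email: 'bob@example.com' });
const beforeDelete = readEvent(eventLog[0]);
forgetUser('u1');
const afterDelete = readEvent(eventLog[0]);

console.log(beforeDelete, afterDelete, eventLog.length);
```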

Real-Time Updates with Small Changes

Problem: Appending full document for small change is wasteful
Example: Updating single field in 1MB document
Solution: Append only the changed fields as a delta event, or use a hybrid approach (mutable store with a write-ahead log for durability)

Interview Application

Common Interview Question

Q: “Why does Kafka use immutable logs instead of a traditional database?”

Strong Answer:

“Kafka uses immutable append-only logs for several key reasons:

1. Performance:

  • Sequential disk writes are far faster than random writes (Kafka's design docs cite ~600 MB/s vs ~100 KB/s on spinning disks)
  • Append-only allows optimizing for sequential I/O
  • Result: Kafka achieves millions of messages/second throughput

2. Replayability:

  • Immutable messages can be read multiple times
  • Consumers can reset offset and replay historical data
  • Use cases: Recovery from consumer failures, backfilling data for new analytics

3. Simplifies Replication:

  • Replicas just copy log segments
  • No complex merge logic (events never change)
  • Idempotent replication (copying same event twice is safe)

4. Multiple Consumers:

  • Same log can be consumed by multiple independent consumers
  • Each consumer tracks own offset
  • Example: Real-time analytics + batch processing on same stream

5. Durability:

  • Once written and replicated, a message stays in the log until retention or compaction removes it
  • Replicas hold identical copies of the log (same order, same offsets)
  • Contrast with message queues that delete on consumption

Trade-offs:

  • Storage cost: Must retain logs (mitigated by log compaction + retention)
  • Cannot update: If message has error, must append correction event
  • But benefits far outweigh costs for streaming use cases”

Code Example

Immutable Event Sourcing Pattern

// MUTABLE APPROACH (traditional)
class BankAccount {
  constructor() {
    this.balance = 0; // Mutable state
  }

  deposit(amount) {
    this.balance += amount; // In-place update ✕
    // History lost!
  }

  withdraw(amount) {
    this.balance -= amount; // In-place update ✕
  }
}

// Problem: No audit trail, race conditions on concurrent updates

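// IMMUTABLE APPROACH (event sourcing)
const { randomUUID } = require('crypto');

// Helper used below: returns a unique event ID
const generateId = () => randomUUID();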
class BankAccountEventSourced {
  constructor() {
    this.events = []; // Immutable event log
  }

  // Commands: Append events (never modify existing)
  deposit(amount) {
    const event = {
      type: "DEPOSIT",
      amount: amount,
      timestamp: Date.now(),
      id: generateId(),
    };
    this.events.push(event); // Append-only ✓
    return event;
  }

  withdraw(amount) {
    const event = {
      type: "WITHDRAW",
      amount: amount,
      timestamp: Date.now(),
      id: generateId(),
    };
    this.events.push(event); // Append-only ✓
    return event;
  }

  // Query: Compute current state from events
  getBalance() {
    return this.events.reduce((balance, event) => {
      if (event.type === "DEPOSIT") return balance + event.amount;
      if (event.type === "WITHDRAW") return balance - event.amount;
      return balance;
    }, 0);
  }

  // Time-travel: Get balance at any point in history
  getBalanceAt(timestamp) {
    return this.events
      .filter(e => e.timestamp <= timestamp)
      .reduce((balance, event) => {
        if (event.type === "DEPOSIT") return balance + event.amount;
        if (event.type === "WITHDRAW") return balance - event.amount;
        return balance;
      }, 0);
  }

  // Audit: Get complete transaction history
  getAuditLog() {
    return this.events.map(e => ({
      type: e.type,
      amount: e.amount,
      timestamp: new Date(e.timestamp).toISOString(),
    }));
  }
}

// Usage
const account = new BankAccountEventSourced();
account.deposit(100);
account.withdraw(20);
account.deposit(50);

console.log(account.getBalance()); // 130
console.log(account.getBalanceAt(Date.now() - 1000)); // Balance 1 second ago (0 here: all events were just appended)
console.log(account.getAuditLog()); // Complete history

Immutable Cache Keys (Versioned Assets)

<!-- MUTABLE (cache invalidation problem) -->
<script src="/bundle.js"></script>
<!-- Updated bundle.js → Must invalidate CDN cache (complex!) -->

<!-- IMMUTABLE (cache forever) -->
<script src="/bundle.abc123.js"></script>
<!-- Updated bundle → New hash → New URL → Old cache unaffected ✓ -->

// Implementation
const crypto = require('crypto');
const fs = require('fs');

function generateImmutableAssetURL(filePath) {
  const content = fs.readFileSync(filePath);
  const hash = crypto.createHash('sha256')
    .update(content)
    .digest('hex')
    .substring(0, 8);

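  // Split on the last dot only (replace() would strip the first ".ext" match, e.g. in "app.js.map.js")
  const dotIndex = filePath.lastIndexOf('.');
  const basename = filePath.slice(0, dotIndex);
  const extension = filePath.slice(dotIndex + 1);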

  // Immutable URL: content hash in filename
  const immutableURL = `${basename}.${hash}.${extension}`;

  // HTTP headers for immutable cache
  // Cache-Control: public, max-age=31536000, immutable
  // Result: Browsers that support `immutable` skip revalidation for up to one year

  return immutableURL;
}

// Example
generateImmutableAssetURL('bundle.js');  // e.g. bundle.abc12345.js (actual hash depends on file contents)
// Change one byte → Different hash → Different URL → New cache entry

Prerequisites: None - foundational concept

Used In Systems:

  • Kafka: Message streaming with immutable logs
  • Git: Version control with immutable commits
  • Blockchain: Immutable distributed ledger

Explained In Detail:

  • Kafka Deep Dive - Immutable log architecture in depth

Quick Self-Check

  • Can explain immutability in 60 seconds?
  • Understand difference between mutable and immutable data?
  • Know 3 benefits of immutability in distributed systems?
  • Can explain how Kafka uses immutability for performance?
  • Understand trade-offs (storage cost, GDPR)?
  • Can implement simple event sourcing pattern?

Production signal

Why this concept matters

Interview: 55% of system design interviews
Production: Kafka, Git, blockchain
Performance: No race conditions
Scale: Simplified replication