Sliding Window | Concepts

TL;DR

Sliding window counter is a rate limiting algorithm that fixes the boundary-burst problem of fixed windows. It weights the previous window’s count based on how far into the current window you are, providing smoother rate enforcement with minimal memory overhead (just two counters per key). Unlike token bucket, it doesn’t allow bursts—it enforces strict rate limits.

Visual Overview

Sliding Window Counter

THE BOUNDARY BURST PROBLEM
┌────────────────────────────────────────────────────┐
│  Fixed Window Rate Limit: 100 requests per minute  │
│                                                    │
│  Window 1 (2:00-2:59):    Window 2 (3:00-3:59):    │
│  [.......●●●●●●●●●●]     [●●●●●●●●●●.......]       │
│       100 at 2:55              100 at 3:00         │
│                ↓                    ↓              │
│            └──────────────────────────┘            │
│               200 requests in 10 seconds!          │
│                                                    │
│  Both are "100/minute"—but the system sees 2x!     │
└────────────────────────────────────────────────────┘

SLIDING WINDOW SOLUTION
┌────────────────────────────────────────────────────┐
│  At 3:15 (15 seconds = 25% into current window):   │
│                                                    │
│  Previous window (2:00-2:59): 80 requests          │
│  Current window (3:00-3:59): 30 requests so far    │
│  Elapsed into current window: 25%                  │
│                                                    │
│  ┌─────────────────────────────────────────┐       │
│  │ Weighted count = prev × (1 - elapsed) + curr │  │
│  │               = 80 × 0.75 + 30              │   │
│  │               = 60 + 30                     │   │
│  │               = 90                          │   │
│  └─────────────────────────────────────────┘       │
│                                                    │
│  90 < 100 limit → ALLOWED                          │
│  At 110: 80 × 0.75 + 31 = 91 → ALLOWED             │
│  At 120: 80 × 0.75 + 41 = 101 → REJECTED           │
│                                                    │
└────────────────────────────────────────────────────┘

Core Explanation

What is Sliding Window?

Real-World Analogy: Imagine a highway toll booth that tracks cars per hour. A fixed window system would reset at exactly each hour—so a convoy could race through at 11:59 and again at 12:00. A sliding window instead asks: “In the last 60 minutes from right now, how many cars passed?” It smoothly weights recent and older traffic.

Like looking backward through a sliding 1-hour window that moves with you through time, instead of fixed calendar hours.

How It Works

The algorithm maintains two counters per rate-limited key:

prev_count: Requests in the previous complete window
curr_count: Requests in the current ongoing window

Sliding Window State

SLIDING WINDOW STATE MACHINE
┌────────────────────────────────────────────────────┐
│                                                    │
│  On Request Arrival:                               │
│  ┌─────────────────────────────────────────────┐   │
│  │ 1. Determine current window index           │   │
│  │    window = floor(now / window_size)        │   │
│  │                                             │   │
│  │ 2. If new window, rotate counters           │   │
│  │    prev_count = curr_count                  │   │
│  │    curr_count = 0                           │   │
│  │                                             │   │
│  │ 3. Calculate elapsed fraction               │   │
│  │    elapsed = (now % window_size) / window_size  │
│  └─────────────────────────────────────────────┘   │
│                    ↓                               │
│  ┌─────────────────────────────────────────────┐   │
│  │ 4. Calculate weighted count                 │   │
│  │    weighted = prev × (1 - elapsed) + curr   │   │
│  │                                             │   │
│  │ 5. Check: weighted < limit?                 │   │
│  │    YES → curr_count++, return ALLOW         │   │
│  │    NO  → return REJECT (429)                │   │
│  └─────────────────────────────────────────────┘   │
│                                                    │
│  State per key: ~24 bytes (2 counters + window_id) │
│  Time complexity: O(1) per request                 │
│                                                    │
└────────────────────────────────────────────────────┘

The Algorithm

on request:
  window = floor(now / window_size)
  elapsed = (now % window_size) / window_size

  if window != current_window:
    # Rotate to new window
    prev_count = curr_count
    curr_count = 0
    current_window = window

  weighted_count = prev_count * (1 - elapsed) + curr_count

  if weighted_count < limit:
    curr_count += 1
    return ALLOW
  else:
    return REJECT

Key Properties

Property	Value	Notes
Time Complexity	O(1) per request	Simple arithmetic
Space Complexity	O(2) per key	Two counters + window ID
Burst Prevention	Yes	No sudden spikes allowed
Accuracy	Approximate	Weighted estimate, not exact

Why Weighted Counting Works

The weighting formula estimates how many requests occurred in the “last window_size seconds” as if we were looking backward from now:

Weighted Count Visualization

VISUALIZING THE SLIDING WINDOW
┌────────────────────────────────────────────────────┐
│                                                    │
│  Real time: ──────────────────────────────────►    │
│                                                    │
│  Previous Window        Current Window             │
│  [----------------]     [----------------]         │
│        2:00-2:59              3:00-3:59            │
│                                                    │
│  At 3:15 (25% into current window):                │
│                                                    │
│                     ← 60 seconds →                 │
│                    [==============]                │
│                    ↑              ↑                │
│                  2:15           3:15               │
│                    ↑              ↑                │
│              75% of prev     25% of curr           │
│                                                    │
│  We estimate: 75% of prev's requests likely        │
│  occurred in the part that overlaps our window.    │
│  Plus 100% of current window so far.               │
│                                                    │
└────────────────────────────────────────────────────┘

Sliding Window vs Fixed Window vs Token Bucket

Algorithm Comparison

COMPARISON TABLE
┌──────────────┬──────────────┬──────────────┬──────────────┐
│ Aspect       │ Fixed Window │ Sliding Win  │ Token Bucket │
├──────────────┼──────────────┼──────────────┼──────────────┤
│ Memory/key   │ 1 counter    │ 2 counters   │ 2 values     │
│ Boundary     │ 2× burst     │ Prevented    │ N/A          │
│ Bursts       │ At boundary  │ No bursts    │ Allowed      │
│ Accuracy     │ Exact/window │ Approximate  │ Exact/bucket │
│ Complexity   │ Simplest     │ Simple       │ Simple       │
│ Best for     │ Never use    │ Strict limit │ Bursty OK    │
└───────────────────────────────────────────────────────────┘

WHEN EACH MAKES SENSE
┌────────────────────────────────────────────────────┐
│                                                    │
│  Fixed Window: ✗ Avoid (boundary burst problem)    │
│                                                    │
│  Sliding Window: ✓ Billing, quotas, SLAs           │
│  └─ "Exactly 1000 requests per hour"               │
│  └─ No acceptable overage                          │
│                                                    │
│  Token Bucket: ✓ APIs with bursty clients          │
│  └─ "100 burst, 10/sec sustained"                  │
│  └─ Legitimate burst traffic expected              │
│                                                    │
└────────────────────────────────────────────────────┘

Real Systems Using Sliding Window

System	Implementation	Configuration	Use Case
Stripe API	Sliding window style	Per-endpoint limits	Payment processing
Cloudflare Rate Limiting	Configurable algorithms	Rule-based	Edge protection
Redis Rate Limiter	redis-cell module	Configurable	General rate limiting
Envoy	Token bucket + sliding window	Filter chain	Service mesh
HAProxy	stick-tables	Connection rate limiting	Load balancer

Note: Many systems offer configurable algorithms. Verify current behavior in documentation.

Configuration Patterns

Common Sliding Window Configurations

BILLING/QUOTA ENFORCEMENT
┌────────────────────────────────────────────────────┐
│  Use Case: API usage billing                       │
│  Window: 1 hour                                    │
│  Limit: Based on subscription tier                 │
│  Why sliding: Prevents gaming at window boundaries │
└────────────────────────────────────────────────────┘

SLA ENFORCEMENT
┌────────────────────────────────────────────────────┐
│  Use Case: Partner API contracts                   │
│  Window: 1 minute                                  │
│  Limit: Contracted rate                            │
│  Why sliding: Consistent experience, no spikes     │
└────────────────────────────────────────────────────┘

ABUSE PREVENTION
┌────────────────────────────────────────────────────┐
│  Use Case: Login attempts, password resets         │
│  Window: 15 minutes                                │
│  Limit: 5 attempts                                 │
│  Why sliding: Prevents rapid-fire attacks          │
└────────────────────────────────────────────────────┘

When to Use Sliding Window

✓ Perfect Use Cases

Sliding Window Use Cases

BILLING/METERING
Scenario: API calls billed per 1000 requests
Requirement: Cannot exceed purchased quota
Configuration: window=1h, limit=1000/tier
Trade-off: Slightly more state than fixed window

SLA ENFORCEMENT
Scenario: Partner contract guarantees X req/sec
Requirement: Smooth, predictable rate limiting
Configuration: window=1min, limit=600
Trade-off: No burst tolerance for legitimate spikes

SECURITY RATE LIMITS
Scenario: Login, password reset, verification codes
Requirement: Prevent rapid-fire attacks
Configuration: window=15min, limit=5
Trade-off: Legitimate users with typos may hit limit

FAIR USAGE POLICIES
Scenario: Shared resource (storage, compute)
Requirement: No single user dominates
Configuration: window=1h, limit=proportional_share
Trade-off: Bursty but legitimate workloads throttled

✕ When NOT to Use

When Sliding Window May Not Fit

BURSTY BUT LEGITIMATE TRAFFIC
Problem: Mobile apps burst on launch, then go idle
Example: 50 API calls in 1 second on app open
Alternative: Token bucket (allows burst up to capacity)
When OK: If burst is never acceptable

REAL-TIME BIDDING/TRADING
Problem: Milliseconds matter, weighted calc adds latency
Example: High-frequency trading, ad bidding
Alternative: Local in-memory counters, approximate
When OK: If microsecond latency isn't critical

LOW-VOLUME ENDPOINTS
Problem: Overhead not worth it for low traffic
Example: Admin endpoints with 10 requests/day
Alternative: Simple fixed window or none
When OK: High-value endpoints worth protecting

EXACT COUNTING REQUIRED
Problem: Sliding window is an approximation
Example: Regulatory requirement for exact counts
Alternative: Sliding window log (stores each timestamp)
When OK: Approximation within 5% is acceptable

Interview Application

Common Interview Question

Q: “Explain the boundary burst problem in rate limiting and how sliding window solves it.”

Strong Answer:

“The boundary burst problem occurs with fixed window rate limiting. If your window is 1 minute and resets at :00, a client can make 100 requests at 0:59 and 100 more at 1:00—effectively 200 requests in 2 seconds while technically respecting ‘100/minute’.

Sliding window counter solves this by weighting the previous window:

The Formula:
weighted = prev_window × (1 - elapsed_fraction) + curr_window
If we’re 25% into the current window, we count 75% of the previous window’s requests plus 100% of current. This approximates ‘requests in the last 60 seconds from right now.’

Example: At 1:15 (25% into minute 1):

Previous window (minute 0): 80 requests

Current window (minute 1): 25 requests so far

Weighted: 80 × 0.75 + 25 = 60 + 25 = 85

If limit is 100: ALLOWED

Trade-offs:

Pro: Prevents boundary gaming, smooth enforcement

Con: Approximate (not exact count), no burst tolerance

When I’d use it:

Billing/quotas where overage isn’t acceptable

SLA enforcement with partner contracts

When I’d use token bucket instead:

APIs where legitimate burst traffic is expected (mobile app startup)”

Follow-up: How would you implement sliding window for distributed rate limiting?

“For distributed sliding window, I’d use Redis with atomic operations:

Key structure: ratelimit:{user}:{window} stores count per window

Atomic increment: Use INCR with EXPIRE

Two-key lookup: Fetch current and previous window counts

Lua script approach (atomic):
-- Get both windows atomically
local prev = redis.call('GET', prev_key) or 0
local curr = redis.call('GET', curr_key) or 0
local weighted = prev * (1 - elapsed) + curr
if weighted < limit then
  redis.call('INCR', curr_key)
  redis.call('EXPIRE', curr_key, window_size * 2)
  return 1
end
return 0
Considerations:

Redis failures: Fail open (allow) or fail closed (reject) based on use case

Multi-region: Local Redis per region with eventual sync, accept slight inaccuracy

Clock skew: Use Redis server time, not client time”

Follow-up: What’s the sliding window log variant?

“Sliding window log stores the timestamp of each request instead of just a count. To check the limit, you count timestamps within the last window_size seconds.

Pros: Exact counting, no approximation Cons: O(N) space per key, O(log N) per request with cleanup

When to use: Regulatory requirements for exact counts, very low-volume endpoints where precision matters.

For high-throughput APIs, the counter variant’s approximation (typically within 1-5% of actual) is acceptable and much more efficient.”

Code Example

Sliding Window Counter (Python)

import time
from typing import Tuple, Dict
from dataclasses import dataclass

@dataclass
class WindowState:
    """State for sliding window rate limiter."""
    prev_count: int = 0
    curr_count: int = 0
    current_window: int = 0

class SlidingWindowRateLimiter:
    """
    Sliding window counter rate limiter.

    Prevents boundary burst problem by weighting previous window.
    """

    def __init__(self, limit: int, window_seconds: int):
        """
        Args:
            limit: Maximum requests per window
    # ... omitted: keep concept snippets short
    print("Waiting 5 seconds (half a window)...\n")
    time.sleep(5)

    # More requests - previous window still partially counted
    for i in range(3):
        allowed, info = limiter.is_allowed(user)
        status = "✓ ALLOWED" if allowed else "✗ REJECTED"
        print(f"Request {i+9}: {status}")
        print(f"  Weighted count: {info['weighted_count']}")
        print(f"  Remaining: {info['remaining']}\n")

Sliding Window with Redis (Production)

import time
import redis

class RedisSlidingWindow:
    """
    Distributed sliding window rate limiter using Redis.

    Uses two keys per client: previous and current window counts.
    """

    LUA_SCRIPT = """
    local curr_key = KEYS[1]
    local prev_key = KEYS[2]
    local limit = tonumber(ARGV[1])
    local window_size = tonumber(ARGV[2])
    local elapsed = tonumber(ARGV[3])

    -- Get counts (default 0)
    local prev_count = tonumber(redis.call('GET', prev_key) or 0)
    local curr_count = tonumber(redis.call('GET', curr_key) or 0)

    -- Calculate weighted count
    # ... omitted: keep concept snippets short
        result = self.script(
            keys=[curr_key, prev_key],
            args=[self.limit, self.window_seconds, elapsed]
        )

        return result[0] == 1, {
            "remaining": result[1],
            "limit": self.limit,
            "weighted": round(result[2], 2)
        }

Express.js Middleware

const Redis = require('ioredis');

const redis = new Redis(process.env.REDIS_URL);

function slidingWindowLimiter({ limit, windowSeconds, keyGenerator }) {
  const luaScript = `
    local curr_key = KEYS[1]
    local prev_key = KEYS[2]
    local limit = tonumber(ARGV[1])
    local window_size = tonumber(ARGV[2])
    local elapsed = tonumber(ARGV[3])

    local prev_count = tonumber(redis.call('GET', prev_key) or 0)
    local curr_count = tonumber(redis.call('GET', curr_key) or 0)

    local weighted = prev_count * (1 - elapsed) + curr_count

    if weighted < limit then
      redis.call('INCR', curr_key)
      redis.call('EXPIRE', curr_key, window_size * 2)
      return {1, math.floor(limit - weighted - 1)}
    else
  // ... omitted: keep concept snippets short
    }
  };
}

// Usage: 100 requests per minute, smooth enforcement
app.use('/api/', slidingWindowLimiter({
  limit: 100,
  windowSeconds: 60,
  keyGenerator: (req) => req.user?.id || req.ip,
}));

See It In Action:

Rate Limiting Explainer - Compares sliding window with token bucket

Related Concepts:

Rate Limiting - Parent concept
Token Bucket - Alternative algorithm (allows bursts)

Quick Self-Check

Can explain the boundary burst problem in 30 seconds?
Understand the weighted count formula and why it works?
Know the trade-off between sliding window and token bucket?
Can implement sliding window counter with Redis?
Understand when approximation is acceptable vs when you need exact counts?
Know why sliding window is preferred for billing/quotas?