Fix the Subtle Bug: Async Rate Limiter with Token Bucket Algorithm

The following Python async rate limiter uses a token bucket algorithm, but it has several subtle bugs that cause it to fail under concurrent usage. Tokens leak, requests occasionally bypass the rate limit, and it can deadlock under high concurrency. Find and fix ALL the bugs, and explain each one.

```python
import asyncio
import time

class TokenBucketRateLimiter:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate  # tokens per second
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.time()
        self._lock = asyncio.Lock()

    def _refill(self):
        now = time.time()
        elapsed = now - self.last_refill
        self.tokens += elapsed * self.rate
        self.tokens = min(self.tokens, self.capacity)
        self.last_refill = now

    async def acquire(self, tokens: int = 1):
        while True:
            async with self._lock:
                self._refill()
                if self.tokens >= tokens:
                    self.tokens -= tokens
                    return True
                wait_time = (tokens - self.tokens) / self.rate
                await asyncio.sleep(wait_time)

    async def __aenter__(self):
        self.acquire()
        return self

    async def __aexit__(self, *args):
        pass

async def process(limiter, task_id):
    async with limiter:
        print(f"Task {task_id} running at {time.monotonic():.2f}")
        await asyncio.sleep(0.1)

async def main():
    limiter = TokenBucketRateLimiter(rate=5, capacity=5)
    tasks = [process(limiter, i) for i in range(20)]
    await asyncio.gather(*tasks)

asyncio.run(main())
```

Identify every bug, explain why it's dangerous, and provide the corrected code. Bonus: suggest any design improvements for production use.



Bugs Identified

Bug 1: Missing await in __aenter__ (CRITICAL)

Location: async def __aenter__(self)

async def __aenter__(self):
    self.acquire()  # BUG: Missing await!
    return self

Why it's dangerous: This is the most critical bug. acquire() is a coroutine but isn't being awaited. This means:

  • The coroutine is created but never executed
  • No tokens are actually acquired
  • All requests bypass the rate limiter entirely
  • This causes the "requests occasionally bypass the rate limit" issue
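A tiny standalone sketch (with a hypothetical `acquire` stub, not the limiter itself) makes the failure mode visible: calling a coroutine function without `await` only creates a coroutine object and runs nothing.

```python
import asyncio

async def acquire() -> bool:
    return True

async def demo():
    missing = acquire()                 # BUG pattern: a coroutine object, never executed
    is_coro = asyncio.iscoroutine(missing)
    missing.close()                     # close it to avoid a "never awaited" RuntimeWarning
    got = await acquire()               # correct: the coroutine actually runs
    return is_coro, got

is_coro, got = asyncio.run(demo())
print(is_coro, got)  # True True
```

Python 3.8+ also emits `RuntimeWarning: coroutine 'acquire' was never awaited` when such an object is garbage-collected, which is often the first visible symptom of this bug.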

Bug 2: Sleeping While Holding the Lock

Location: async def acquire(self, tokens: int = 1)

async with self._lock:
    self._refill()
    if self.tokens >= tokens:
        self.tokens -= tokens
        return True
    wait_time = (tokens - self.tokens) / self.rate
    await asyncio.sleep(wait_time)  # BUG: Lock is still held here!

Why it's dangerous:

  • The task sleeps with the lock held, so no other coroutine can acquire, refill, or even inspect the bucket until the sleeper wakes
  • Under high concurrency, every waiter serializes behind each sleeper's full wait_time and throughput collapses - this is the "deadlock under high concurrency" symptom
  • wait_time is a snapshot of the token state; by the time the task wakes and loops, other tasks may already have consumed the refilled tokens, so a single sleep is never sufficient
  • Because waiters are not queued fairly, a task can be starved indefinitely if others keep grabbing tokens first
  • The fix is to release the lock before sleeping, then loop back and re-check availability under the lock
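A minimal, self-contained timing experiment (hypothetical helper names, not the limiter itself) shows why the placement of the sleep matters: tasks that sleep while holding a shared asyncio.Lock run fully serialized.

```python
import asyncio
import time

async def hold_and_sleep(lock: asyncio.Lock, delay: float):
    async with lock:
        await asyncio.sleep(delay)  # sleeping while the lock is still held

async def demo() -> float:
    lock = asyncio.Lock()
    start = time.monotonic()
    # Five tasks, each sleeping 0.05 s while holding the same lock
    await asyncio.gather(*(hold_and_sleep(lock, 0.05) for _ in range(5)))
    return time.monotonic() - start

elapsed = asyncio.run(demo())
# Fully serialized: roughly 5 * 0.05 = 0.25 s, instead of ~0.05 s if the sleeps overlapped
print(f"{elapsed:.2f}s")
```

Moving the `await asyncio.sleep(delay)` outside the `async with` block lets the sleeps overlap and the total time drops back to roughly one delay.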

Bug 3: Using time.time() Instead of time.monotonic()

Location: self.last_refill = time.time() and now = time.time()

Why it's dangerous:

  • time.time() reports wall-clock time, which can jump backwards (manual clock changes, NTP corrections)
  • If clock goes backwards, elapsed becomes negative
  • This causes token count to decrease unexpectedly
  • Can lead to permanent starvation or incorrect rate limiting
  • time.monotonic() is immune to clock adjustments
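A quick standard-library check of the guarantee: time.monotonic() never decreases within a process, whereas time.time() makes no such promise.

```python
import time

samples = [time.monotonic() for _ in range(10_000)]
# The monotonic clock is guaranteed non-decreasing, so this always holds
assert all(b >= a for a, b in zip(samples, samples[1:]))
print("monotonic never went backwards")
```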

Bug 4: Event-Loop Binding of asyncio.Lock

Why it's dangerous:

  • On Python versions before 3.10, asyncio.Lock() binds to the event loop returned by get_event_loop() at creation time; constructing the limiter outside a running loop, or using it from a different loop, can raise "got Future attached to a different loop" errors
  • On Python 3.10+, the lock binds to the running loop on first use, but sharing one limiter instance across multiple event loops is still unsafe
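One defensive pattern (a sketch, not part of the original code; the class name is hypothetical) is to create the lock lazily, inside a running loop, so it binds to the loop that actually uses it:

```python
import asyncio
from typing import Optional

class LazyLockLimiter:
    """Sketch: defer asyncio.Lock creation until an event loop is running."""

    def __init__(self):
        self._lock: Optional[asyncio.Lock] = None

    @property
    def lock(self) -> asyncio.Lock:
        if self._lock is None:
            # Created on first use, from inside the running event loop
            self._lock = asyncio.Lock()
        return self._lock

async def demo() -> bool:
    limiter = LazyLockLimiter()  # safe to construct outside any loop
    async with limiter.lock:     # lock is created here, inside the loop
        return True

print(asyncio.run(demo()))  # True
```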

Corrected Code

import asyncio
import time

class TokenBucketRateLimiter:
    def __init__(self, rate: float, capacity: int):
        """
        Initialize token bucket rate limiter.
        
        Args:
            rate: Number of tokens added per second
            capacity: Maximum number of tokens in the bucket
        """
        self.rate = rate  # tokens per second
        self.capacity = capacity
        self.tokens = float(capacity)  # Use float for precision
        self.last_refill = time.monotonic()  # FIX: Use monotonic
        self._lock = asyncio.Lock()

    def _refill(self):
        """Refill tokens based on elapsed time."""
        now = time.monotonic()  # FIX: Use monotonic
        elapsed = now - self.last_refill
        
        # Add tokens proportional to elapsed time
        self.tokens += elapsed * self.rate
        self.tokens = min(self.tokens, self.capacity)
        self.last_refill = now

    async def acquire(self, tokens: int = 1) -> bool:
        """
        Acquire tokens from the bucket.
        
        Args:
            tokens: Number of tokens to acquire
            
        Returns:
            True when tokens are successfully acquired
        """
        if tokens > self.capacity:
            raise ValueError(f"Cannot acquire {tokens} tokens; capacity is {self.capacity}")
        
        while True:
            async with self._lock:
                self._refill()
                
                if self.tokens >= tokens:
                    self.tokens -= tokens
                    return True
                
                # Calculate wait time
                wait_time = (tokens - self.tokens) / self.rate
            
            # FIX: Sleep outside the lock, then re-check
            # Add small buffer to account for scheduling overhead
            await asyncio.sleep(wait_time + 0.001)
            
            # After waking, loop back to re-check token availability
            # This fixes the race condition

    async def __aenter__(self):
        await self.acquire()  # FIX: Added await
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        pass

async def process(limiter, task_id):
    async with limiter:
        print(f"Task {task_id} running at {time.monotonic():.2f}")
        await asyncio.sleep(0.1)

async def main():
    limiter = TokenBucketRateLimiter(rate=5, capacity=5)
    tasks = [process(limiter, i) for i in range(20)]
    await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(main())
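As a rough sanity check (a standalone sketch using a stripped-down bucket, not the full class above), the fixed acquire loop should make 10 sequential acquisitions against a capacity of 5 at 50 tokens/s take at least ~0.1 s, since the last 5 tokens must be refilled:

```python
import asyncio
import time

class MiniBucket:
    """Stripped-down token bucket using the fixed acquire loop."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, float(capacity)
        self.tokens, self.last = float(capacity), time.monotonic()
        self._lock = asyncio.Lock()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    async def acquire(self, n: int = 1):
        while True:
            async with self._lock:
                self._refill()
                if self.tokens >= n:
                    self.tokens -= n
                    return True
                wait = (n - self.tokens) / self.rate
            await asyncio.sleep(wait + 0.001)  # sleep outside the lock, then re-check

async def demo() -> float:
    bucket = MiniBucket(rate=50, capacity=5)
    start = time.monotonic()
    for _ in range(10):  # 5 from the initial burst + 5 refilled at 50/s
        await bucket.acquire()
    return time.monotonic() - start

elapsed = asyncio.run(demo())
print(f"{elapsed:.2f}s")
```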

Production-Ready Enhancements

Here's an enhanced version with additional production features:

import asyncio
import time
from typing import Optional
from contextlib import asynccontextmanager

class TokenBucketRateLimiter:
    def __init__(
        self, 
        rate: float, 
        capacity: int,
        initial_tokens: Optional[int] = None
    ):
        """
        Initialize token bucket rate limiter.
        
        Args:
            rate: Number of tokens added per second
            capacity: Maximum number of tokens in the bucket
            initial_tokens: Initial token count (defaults to capacity)
        """
        if rate <= 0:
            raise ValueError("Rate must be positive")
        if capacity <= 0:
            raise ValueError("Capacity must be positive")
            
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(initial_tokens if initial_tokens is not None else capacity)
        self.last_refill = time.monotonic()
        self._lock = asyncio.Lock()
        self._closed = False

    def _refill(self):
        """Refill tokens based on elapsed time."""
        now = time.monotonic()
        elapsed = now - self.last_refill
        
        self.tokens = min(
            self.tokens + elapsed * self.rate,
            self.capacity
        )
        self.last_refill = now

    async def acquire(self, tokens: int = 1, timeout: Optional[float] = None) -> bool:
        """
        Acquire tokens from the bucket.
        
        Args:
            tokens: Number of tokens to acquire
            timeout: Maximum time to wait (None for infinite)
            
        Returns:
            True if tokens acquired, False if timeout
            
        Raises:
            ValueError: If tokens > capacity
            RuntimeError: If limiter is closed
        """
        if self._closed:
            raise RuntimeError("Rate limiter is closed")
            
        if tokens > self.capacity:
            raise ValueError(
                f"Cannot acquire {tokens} tokens; capacity is {self.capacity}"
            )
        
        start_time = time.monotonic()
        
        while True:
            async with self._lock:
                if self._closed:
                    raise RuntimeError("Rate limiter is closed")
                    
                self._refill()
                
                if self.tokens >= tokens:
                    self.tokens -= tokens
                    return True
                
                wait_time = (tokens - self.tokens) / self.rate
            
            # Check timeout
            if timeout is not None:
                elapsed = time.monotonic() - start_time
                if elapsed >= timeout:
                    return False
                wait_time = min(wait_time, timeout - elapsed)
            
            # Sleep and retry
            await asyncio.sleep(wait_time + 0.001)

    async def try_acquire(self, tokens: int = 1) -> bool:
        """
        Try to acquire tokens without waiting.
        
        Returns:
            True if tokens acquired, False otherwise
        """
        if self._closed:
            raise RuntimeError("Rate limiter is closed")
            
        async with self._lock:
            self._refill()
            
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False

    async def __aenter__(self):
        await self.acquire()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        pass

    async def close(self):
        """Close the rate limiter and reject future requests."""
        async with self._lock:
            self._closed = True

    @asynccontextmanager
    async def limit(self, tokens: int = 1):
        """Context manager for acquiring tokens."""
        await self.acquire(tokens)
        try:
            yield
        finally:
            pass

    def get_available_tokens(self) -> float:
        """Get current available tokens (approximate, no lock)."""
        # Approximate calculation without lock for monitoring
        now = time.monotonic()
        elapsed = now - self.last_refill
        return min(
            self.tokens + elapsed * self.rate,
            self.capacity
        )

# Example usage
async def process(limiter, task_id):
    try:
        # With timeout
        if await limiter.acquire(timeout=5.0):
            print(f"Task {task_id} running at {time.monotonic():.2f}")
            await asyncio.sleep(0.1)
        else:
            print(f"Task {task_id} timed out")
    except Exception as e:
        print(f"Task {task_id} error: {e}")

async def main():
    limiter = TokenBucketRateLimiter(rate=5, capacity=5)
    
    # Using context manager
    async with limiter.limit():
        print("Protected operation")
    
    # Using acquire with timeout
    tasks = [process(limiter, i) for i in range(20)]
    await asyncio.gather(*tasks)
    
    await limiter.close()

if __name__ == "__main__":
    asyncio.run(main())

Summary of Fixes

  1. Added await in __aenter__ - Critical fix to actually acquire tokens
  2. Moved the sleep outside the lock - fixes the lock-held sleep (the deadlock) and loops back to re-check tokens after waking
  3. Changed to time.monotonic() - Immune to clock adjustments
  4. Added validation - Check tokens <= capacity, positive rates
  5. Added timeout support - Prevent indefinite waiting
  6. Added try_acquire - Non-blocking acquisition
  7. Added close() method - Graceful shutdown
  8. Added monitoring - get_available_tokens() for observability
  9. Better error handling - Clear exceptions and states

The corrected code now properly rate-limits concurrent requests without token leakage, bypasses, or deadlocks!
