
High-performance, multi-algorithm rate limiting for FastAPI with Redis and in-memory backends


⚡ FastAPI Advanced Rate Limiter

High-performance, production-ready rate limiting for FastAPI applications.




FastAPI Advanced Rate Limiter is a battle-tested library providing 6 different rate limiting algorithms with support for both in-memory and Redis backends. Perfect for APIs, microservices, and any FastAPI application that needs protection from abuse, overload, or DDoS attacks.


🎯 Key Features

  • 🚀 6 Production-Ready Algorithms — Token Bucket, Leaky Bucket, Queue-based, Fixed Window, Sliding Window, and Sliding Window Log
  • 🔄 Dual Backend Support — In-memory (zero dependencies) or Redis (distributed, cluster-friendly)
  • 🎚️ Flexible Scoping — Global, per-user, or per-IP rate limiting
  • 🔒 Thread-Safe — Per-key locks and atomic Redis operations
  • ⚡ High Performance — Benchmarked at 15+ req/s for window-based algorithms
  • 📊 Rich Monitoring — get_status(), get_wait_time(), get_retry_after() helpers
  • 🧪 Comprehensive Tests — 72 test scenarios covering all algorithms, scopes, and backends
  • 📚 FastAPI-Style Docs — Clear examples and tutorials


💡 Why Rate Limiting?

Rate limiting is essential for:

  • 🛡️ Preventing Abuse — Stop malicious users from overwhelming your API
  • ⚖️ Fair Usage — Ensure all users get equal access to resources
  • 💰 Cost Control — Prevent unexpected bills from cloud services
  • 🎯 SLA Compliance — Meet service level agreements and uptime guarantees
  • 🚦 Traffic Shaping — Smooth out traffic spikes to protect downstream services

Without rate limiting, a single user can:

  • Consume all server resources
  • Trigger cascading failures
  • Generate massive cloud bills
  • Degrade service for legitimate users

⚡ Quick Start

30 seconds to your first rate limiter:

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter
import redis

app = FastAPI()

# Initialize rate limiter
redis_client = redis.Redis.from_url("redis://localhost:6379", decode_responses=True)
limiter = SlidingWindowRateLimiter(
    capacity=100,      # 100 requests
    fill_rate=10,      # per 10 seconds
    scope="user",      # per-user limits
    backend="redis",   # distributed
    redis_client=redis_client
)

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # Get user identifier (from auth, IP, etc.)
    user_id = request.headers.get("X-User-ID") or request.client.host

    # Check rate limit; return a response directly, because an HTTPException
    # raised inside middleware bypasses FastAPI's exception handlers
    if not limiter.allow_request(user_id):
        wait_time = limiter.get_wait_time(user_id)
        return JSONResponse(
            status_code=429,
            content={"detail": f"Rate limit exceeded. Retry after {wait_time:.0f} seconds."},
            headers={"Retry-After": str(int(wait_time))}
        )

    return await call_next(request)

@app.get("/")
async def root():
    return {"message": "Hello World!"}

Run it:

uvicorn main:app --reload

Test it:

# Make requests
curl http://localhost:8000/
# After 100 requests in 10 seconds, you'll get:
# {"detail":"Rate limit exceeded. Retry after 5 seconds."}

📦 Installation

From PyPI (Recommended)

pip install fastapi-advanced-rate-limiter

With Redis Support

pip install fastapi-advanced-rate-limiter redis

From Source (for Contributors)

If you want to contribute or modify the package:

git clone https://github.com/awais7012/FastAPI-RateLimiter.git
cd FastAPI-RateLimiter
pip install -e .

With Development Tools

For contributors who want to run tests and development tools:

pip install -e ".[dev]"

Requirements

  • Python: 3.8+
  • FastAPI: 0.68.0+
  • Redis (optional): 4.0.0+ (install separately if using Redis backend)

🧮 Algorithms Guide

Choose the right algorithm for your use case:

🪙 Token Bucket — Best for APIs with occasional bursts

How it works: Imagine a bucket that slowly fills with tokens. Each request takes a token. If the bucket is empty, requests are denied.

from fastapi_advanced_rate_limiter import TokenBucketLimiter

limiter = TokenBucketLimiter(
    capacity=10,    # Bucket holds 10 tokens (burst size)
    fill_rate=1.0,  # Refill 1 token per second
    scope="user",
    backend="memory"
)

Behavior:

  • Allows bursts up to capacity
  • Maintains long-term average rate of fill_rate
  • Good for file uploads, batch operations

Pros: ✅ Flexible, allows bursts
Cons: ⚠️ Can be "drained" by burst traffic

Use when: User-triggered actions (file uploads, form submissions)
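The refill-and-spend mechanic can be sketched in a few lines of plain Python (a standalone illustration of the algorithm, not the library's internal implementation):

```python
import time

class TinyTokenBucket:
    """Minimal token bucket: refills continuously, spends one token per request."""

    def __init__(self, capacity: float, fill_rate: float):
        self.capacity = capacity       # maximum burst size
        self.fill_rate = fill_rate     # tokens added per second
        self.tokens = capacity         # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1           # spend a token
            return True
        return False

bucket = TinyTokenBucket(capacity=3, fill_rate=1.0)
burst = [bucket.allow() for _ in range(4)]  # burst of 4: first 3 pass, 4th is denied
```

After the burst drains the bucket, roughly one request per `1 / fill_rate` seconds becomes available again.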


💧 Leaky Bucket — Best for traffic shaping

How it works: Requests fill a bucket with a hole at the bottom. The bucket leaks at a constant rate. Overflow = rejected.

from fastapi_advanced_rate_limiter import LeakyBucketLimiter

limiter = LeakyBucketLimiter(
    capacity=5,     # Bucket capacity
    fill_rate=1.0,  # Leak rate (1 req/sec)
    scope="user",
    backend="memory"
)

Behavior:

  • Enforces smooth, consistent output rate
  • No burst allowance
  • Perfect for calling external APIs

Pros: ✅ Smooth, predictable rate
Cons: ❌ Strict (can frustrate users with spiky traffic)

Use when: Calling rate-limited external APIs, message queues
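The drain-at-a-constant-rate behavior can be sketched as follows (a standalone illustration, not the library's internals):

```python
import time

class TinyLeakyBucket:
    """Minimal leaky bucket: requests add water, which drains at a constant rate."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity       # how much water fits before overflow
        self.leak_rate = leak_rate     # units drained per second
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain whatever leaked out since the last check
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1            # this request's unit of water
            return True
        return False                   # bucket would overflow: reject

bucket = TinyLeakyBucket(capacity=2, leak_rate=1.0)
results = [bucket.allow() for _ in range(3)]  # third request overflows
```

Unlike the token bucket, the bucket starts empty, so there is no burst credit: output never exceeds `leak_rate` for long.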


โฑ๏ธ Fixed Window โ€” Fastest, simplest

How it works: Count requests in fixed time windows (e.g., per minute). Reset count at window boundaries.

from fastapi_advanced_rate_limiter import FixedWindowRateLimiter

limiter = FixedWindowRateLimiter(
    capacity=1000,  # 1000 requests
    fill_rate=100,  # per 10 seconds (capacity/fill_rate)
    scope="global",
    backend="redis"
)

Behavior:

  • Simple counter with time-based reset
  • Very fast (Redis INCR operation)
  • Can get 2x capacity at window boundaries

Pros: ✅ Fastest, O(1), minimal memory
Cons: ⚠️ Boundary burst vulnerability

Use when: High-throughput internal APIs, coarse-grained limits
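The boundary burst mentioned above is easy to demonstrate with a minimal counter (a standalone sketch, not the library's implementation): four requests within 0.4 seconds all pass a 2-per-10-seconds limit because they straddle a window boundary.

```python
import math

class TinyFixedWindow:
    """Minimal fixed window: one counter per window index, reset implicitly."""

    def __init__(self, capacity: int, window_seconds: float):
        self.capacity = capacity
        self.window = window_seconds
        self.counts = {}

    def allow(self, now: float) -> bool:
        idx = math.floor(now / self.window)        # which window this request falls in
        self.counts[idx] = self.counts.get(idx, 0) + 1
        return self.counts[idx] <= self.capacity

fw = TinyFixedWindow(capacity=2, window_seconds=10.0)
# Two requests at the end of one window, two at the start of the next:
# all four are allowed, i.e. 2x capacity in a fraction of a second.
boundary = [fw.allow(9.8), fw.allow(9.9), fw.allow(10.1), fw.allow(10.2)]
fifth = fw.allow(10.3)  # same window as the last two: now over capacity
```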


🪟 Sliding Window — Best balance (RECOMMENDED) ⭐

How it works: Combines current and previous window counts with a weighted average to smooth transitions.

from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter

limiter = SlidingWindowRateLimiter(
    capacity=100,
    fill_rate=10,
    scope="user",
    backend="redis"  # Redis version performs best!
)

Behavior:

  • Prevents boundary bursts
  • Still O(1) time complexity
  • Smooth rate limiting without fixed window issues

Pros: ✅ Best balance of accuracy and performance
Cons: ⚠️ Slight approximation (but negligible in practice)

Use when: Production APIs, user-facing services (RECOMMENDED for most cases)
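The weighted-average estimate can be written out directly (an illustration of the general technique; the library's exact formula may differ):

```python
def sliding_estimate(prev_count, curr_count, elapsed, window):
    """Estimate requests in the trailing `window` seconds: weight the previous
    window's count by the fraction of it still inside the trailing window."""
    prev_weight = (window - elapsed) / window
    return prev_count * prev_weight + curr_count

# 60 s window, 40 s into the current one: the previous window (80 requests)
# still covers a third of the trailing minute.
est = sliding_estimate(prev_count=80, curr_count=30, elapsed=40, window=60)
allowed = est < 100  # compare the estimate against capacity
```

Only two counters per key are needed, which is why this stays O(1) while avoiding the fixed window's boundary burst.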


📜 Sliding Window Log — Most accurate

How it works: Logs exact timestamp of each request. Prunes old timestamps outside the window.

from fastapi_advanced_rate_limiter import SlidingWindowLogRateLimiter

limiter = SlidingWindowLogRateLimiter(
    capacity=10,
    fill_rate=10/300,  # 10 per 5 minutes
    scope="ip",
    backend="redis"
)

Behavior:

  • Perfect accuracy (no approximation)
  • Uses Redis sorted sets for efficiency
  • O(n) time complexity

Pros: ✅ Perfect accuracy, no boundary issues
Cons: ❌ Higher memory usage, not suitable for huge scale

Use when: Security-critical endpoints (login, payment), billing APIs
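The prune-and-count idea can be sketched with a deque of timestamps (a standalone illustration; the library uses Redis sorted sets for the same effect):

```python
from collections import deque

class TinyWindowLog:
    """Minimal sliding window log: one timestamp per request, old ones pruned."""

    def __init__(self, capacity: int, window_seconds: float):
        self.capacity = capacity
        self.window = window_seconds
        self.log = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the trailing window
        while self.log and now - self.log[0] >= self.window:
            self.log.popleft()
        if len(self.log) < self.capacity:
            self.log.append(now)
            return True
        return False

log = TinyWindowLog(capacity=2, window_seconds=10.0)
seq = [log.allow(0.0), log.allow(1.0), log.allow(2.0), log.allow(11.0)]
# third is denied (two requests already in the window);
# fourth is allowed because the t=0.0 entry has aged out by t=11.0
```

Memory grows with the number of requests in the window, which is the cost of the exactness.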


📦 Queue-based Limiter

How it works: Maintains a queue of recent request timestamps. Similar to Sliding Window Log.

from fastapi_advanced_rate_limiter import QueueLimiter

limiter = QueueLimiter(
    capacity=50,
    fill_rate=5.0,
    scope="global",
    backend="memory"
)

Behavior:

  • Fair queuing
  • Predictable wait times

Use when: Background job processing, task queues


📊 Performance Benchmarks

Test Setup: 5 concurrent users, 10-second tests, mixed traffic patterns

Algorithm            Memory (req/s)   Redis (req/s)   Success Rate   Best For
-------------------  ---------------  --------------  -------------  --------------------------
Fixed Window         15.99 🥇         15.33           53-59%         High throughput
Sliding Window       15.03            14.32           51-66%         Production (best balance) ⭐
Sliding Window Log   15.41            14.58           52-67%         Accuracy-critical
Token Bucket         4.76             3.69            14-16%         Burst-friendly APIs
Leaky Bucket         4.76             2.93            10-16%         Traffic shaping
Queue Limiter        2.76             3.03            9-12%          Fair queuing

Key Findings:

  • Window-based algorithms are 3-4x faster than token/leaky bucket
  • Redis adds ~10% overhead but enables distributed rate limiting
  • Sliding Window with Redis has 65.7% success rate (best balanced algorithm)

Winner: ๐Ÿ† Sliding Window (Redis) โ€” Best for production APIs


🎯 Usage Examples

Example 1: Per-User Rate Limiting

from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter

app = FastAPI()
security = HTTPBearer()
limiter = SlidingWindowRateLimiter(capacity=100, fill_rate=10, scope="user", backend="redis")

def get_current_user(credentials: HTTPAuthorizationCredentials = Depends(security)):
    # Your auth logic here (validate credentials.credentials, look up the user)
    return {"user_id": "user_123"}

@app.get("/api/data")
async def get_data(user = Depends(get_current_user)):
    if not limiter.allow_request(user["user_id"]):
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    
    return {"data": "Your protected data"}

Example 2: Per-IP Rate Limiting (for public endpoints)

from fastapi import FastAPI, Request, HTTPException
from fastapi_advanced_rate_limiter import FixedWindowRateLimiter

app = FastAPI()
limiter = FixedWindowRateLimiter(capacity=1000, fill_rate=100, scope="ip", backend="redis")

@app.get("/public/api")
async def public_endpoint(request: Request):
    client_ip = request.client.host
    
    if not limiter.allow_request(client_ip):
        raise HTTPException(
            status_code=429,
            detail="Too many requests from your IP",
            headers={"Retry-After": "60"}
        )
    
    return {"message": "Public data"}

Example 3: Global Rate Limiting

from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import LeakyBucketLimiter

# Limit total API traffic
global_limiter = LeakyBucketLimiter(
    capacity=10000,
    fill_rate=1000,  # 1000 req/sec globally
    scope="global",
    backend="redis"
)

@app.middleware("http")
async def global_rate_limit(request: Request, call_next):
    if not global_limiter.allow_request(None):  # None for global scope
        # Return the response directly; HTTPException raised in middleware
        # would surface as a 500 instead of a 503
        return JSONResponse(status_code=503, content={"detail": "Service temporarily unavailable"})
    return await call_next(request)

Example 4: Multi-Layer Rate Limiting (Defense in Depth)

from fastapi.responses import JSONResponse

# Layer 1: Global limit (protect infrastructure)
global_limiter = FixedWindowRateLimiter(capacity=100000, fill_rate=10000, scope="global", backend="redis")

# Layer 2: Per-IP limit (prevent DDoS)
ip_limiter = SlidingWindowRateLimiter(capacity=1000, fill_rate=100, scope="ip", backend="redis")

# Layer 3: Per-user limit (fair usage)
user_limiter = TokenBucketLimiter(capacity=100, fill_rate=10, scope="user", backend="redis")

@app.middleware("http")
async def layered_rate_limit(request: Request, call_next):
    # Return responses directly: HTTPException raised in middleware would
    # surface as a 500 instead of the intended status code

    # Check global limit first
    if not global_limiter.allow_request(None):
        return JSONResponse(status_code=503, content={"detail": "Service overloaded"})

    # Check IP limit
    if not ip_limiter.allow_request(request.client.host):
        return JSONResponse(status_code=429, content={"detail": "IP rate limit exceeded"})

    # Check user limit (if authenticated)
    user_id = request.headers.get("X-User-ID")
    if user_id and not user_limiter.allow_request(user_id):
        return JSONResponse(status_code=429, content={"detail": "User rate limit exceeded"})

    return await call_next(request)

Example 5: Custom Response with Retry-After

from fastapi.responses import JSONResponse

@app.exception_handler(HTTPException)
async def rate_limit_handler(request: Request, exc: HTTPException):
    if exc.status_code == 429:
        # Get wait time from limiter
        user_id = request.headers.get("X-User-ID") or request.client.host
        wait_time = limiter.get_wait_time(user_id)

        return JSONResponse(
            status_code=429,
            content={
                "error": "Rate limit exceeded",
                "retry_after_seconds": int(wait_time),
                "message": f"Please wait {int(wait_time)} seconds before retrying"
            },
            headers={"Retry-After": str(int(wait_time))}
        )
    # An exception handler must return a response, not the exception itself
    return JSONResponse(status_code=exc.status_code, content={"detail": exc.detail}, headers=exc.headers)

🔧 API Reference

Common Methods (All Limiters)

allow_request(identifier=None) -> bool

Check if a request should be allowed.

Parameters:

  • identifier (str, optional): User ID, IP address, or None for global scope

Returns:

  • bool: True if allowed, False if rate limited

Example:

if limiter.allow_request("user_123"):
    # Process request
    pass
else:
    # Return 429
    pass

get_status(identifier=None) -> dict

Get current limiter status for monitoring.

Returns:

{
    "tokens_remaining": 7.5,  # Token Bucket
    "capacity": 10,
    "fill_rate": 1.0,
    "utilization_pct": 25.0
}

Example:

status = limiter.get_status("user_123")
print(f"User has {status['tokens_remaining']} requests remaining")

get_wait_time(identifier=None) -> float

Calculate seconds until next request would be allowed.

Returns:

  • float: Seconds to wait (0.0 if request would be allowed immediately)

Example:

wait = limiter.get_wait_time("user_123")
if wait > 0:
    print(f"Please wait {wait:.1f} seconds")

reset(identifier=None) -> None

Reset rate limit for an identifier (useful for testing or admin actions).

Example:

# Admin endpoint to reset user's rate limit
@app.post("/admin/reset-limit/{user_id}")
async def reset_user_limit(user_id: str):
    limiter.reset(user_id)
    return {"message": f"Rate limit reset for {user_id}"}

🧱 Backend Comparison

In-Memory Backend

Pros:

  • ⚡ Ultra-fast (no network overhead)
  • 🎯 Zero dependencies
  • 🧪 Perfect for development/testing

Cons:

  • โŒ Not shared across app instances
  • โŒ Lost on restart
  • โŒ Not suitable for production clusters

Use when:

  • Development/testing
  • Single-instance deployments
  • Non-critical rate limiting

Redis Backend

Pros:

  • ๐ŸŒ Shared across all app instances
  • ๐Ÿ’พ Persistent across restarts
  • ๐Ÿ”’ Atomic operations (race-condition free)
  • ๐Ÿ“ˆ Horizontally scalable

Cons:

  • ๐Ÿข ~10% slower (network latency)
  • ๐Ÿงฑ Requires Redis service
  • ๐Ÿ’ฐ Additional infrastructure cost

Use when:

  • Production environments
  • Load-balanced apps (multiple instances)
  • Microservices architecture
  • When rate limits must persist across restarts

Setup Redis:

# Docker
docker run -d --name redis -p 6379:6379 redis

# Or use Redis Cloud (free tier)
# https://redis.com/try-free/

💡 Best Practices

1. Choose the Right Algorithm

# For most APIs → Sliding Window
limiter = SlidingWindowRateLimiter(...)

# For burst-heavy traffic → Token Bucket
limiter = TokenBucketLimiter(...)

# For critical operations → Sliding Window Log
limiter = SlidingWindowLogRateLimiter(...)

2. Use Appropriate Scope

# Authenticated users → per-user
scope="user"

# Public APIs → per-IP
scope="ip"

# Infrastructure protection → global
scope="global"

3. Set Realistic Limits

# Don't be too strict!
# Bad: 10 req/hour (users will be frustrated)
# Good: 1000 req/hour (generous but protective)

limiter = SlidingWindowRateLimiter(
    capacity=1000,
    fill_rate=1000/3600,  # 1000 per hour
    scope="user",
    backend="redis"
)

4. Always Include Retry-After Header

if not limiter.allow_request(user_id):
    wait_time = limiter.get_wait_time(user_id)
    raise HTTPException(
        status_code=429,
        headers={"Retry-After": str(int(wait_time))}
    )

5. Monitor Your Rate Limiters

@app.get("/metrics/rate-limits")
async def get_rate_limit_metrics():
    return {
        "global": global_limiter.get_status(None),
        "sample_user": user_limiter.get_status("user_123")
    }

🧪 Testing

Run All Tests

# Run comprehensive test suite
python tests/all_limiter_test.py

# Or with pytest
pytest tests/

Test Coverage

  • ✅ All 6 algorithms
  • ✅ All 3 scopes (global, user, IP)
  • ✅ Both backends (memory, Redis)
  • ✅ Concurrent access (thread safety)
  • ✅ Multi-layer rate limiting
  • ✅ Total: 72 test scenarios

Example Test Output

==========================================================================================
  FINAL COMPARISON SUMMARY
==========================================================================================
Limiter                   Backend       Allowed    Blocked     Rate/s   Success%       
------------------------------------------------------------------------------------------
Fixed Window              memory            168        145     15.99      53.7%        
Sliding Window            redis             153         80     14.32      65.7%  ⭐
Sliding Window Log        redis             155         95     14.58      62.0%        
Token Bucket              memory             50        262      4.76      16.0%        

๐Ÿ† Best Performance: Sliding Window (redis): 14.32 req/s with 65.7% success rate

๐Ÿค Contributing

We welcome contributions! Here's how:

  1. Fork the repo
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Development Setup

git clone https://github.com/awais7012/FastAPI-RateLimiter.git
cd FastAPI-RateLimiter
pip install -e ".[dev]"

Code Style

  • Follow PEP 8
  • Add docstrings to all functions
  • Include type hints
  • Write tests for new features

๐Ÿ› Troubleshooting

Redis Connection Errors

# Problem: redis.exceptions.ConnectionError
# Solution: Ensure Redis is running

# Check Redis
docker ps | grep redis

# Or start Redis
docker run -d --name redis -p 6379:6379 redis

Rate Limit Not Working Across Instances

# Problem: Rate limits don't work across multiple app instances
# Solution: Use Redis backend, not memory

# โŒ Wrong (memory backend)
limiter = SlidingWindowRateLimiter(..., backend="memory")

# โœ… Correct (Redis backend)
limiter = SlidingWindowRateLimiter(..., backend="redis", redis_client=redis_client)

Import Errors

# Problem: ModuleNotFoundError: No module named 'fastapi_advanced_rate_limiter'
# Solution: Install the package from PyPI

pip install fastapi-advanced-rate-limiter

# Or if you cloned the repo
pip install -e .

📄 License

MIT License - see LICENSE file for details.


๐Ÿ™ Acknowledgments

  • Inspired by FastAPI documentation style
  • Rate limiting algorithms based on industry best practices
  • Built with โค๏ธ by Ahmed Awais (Romeo)

📞 Support


Built for developers who love clean, scalable FastAPI tooling.
โญ Star us on GitHub if you find this useful!
