
High-performance, multi-algorithm rate limiting for FastAPI with Redis and in-memory backends


⚡ FastAPI Advanced Rate Limiter

High-performance, production-ready rate limiting for FastAPI applications.




FastAPI Advanced Rate Limiter is a battle-tested library providing 6 different rate limiting algorithms with support for both in-memory and Redis backends. Perfect for APIs, microservices, and any FastAPI application that needs protection from abuse, overload, or DDoS attacks.


🎯 Key Features

  • 🚀 6 Production-Ready Algorithms — Token Bucket, Leaky Bucket, Queue-based, Fixed Window, Sliding Window, and Sliding Window Log
  • 🔄 Dual Backend Support — In-memory (zero dependencies) or Redis (distributed, cluster-friendly)
  • 🎚️ Flexible Scoping — Global, per-user, or per-IP rate limiting
  • 🔒 Thread-Safe — Per-key locks and atomic Redis operations
  • ⚡ High Performance — Benchmarked at 15+ req/s for window-based algorithms
  • 📊 Rich Monitoring — get_status(), get_wait_time(), get_retry_after() helpers
  • 🧪 Comprehensive Tests — 72 test scenarios covering all algorithms, scopes, and backends
  • 📚 FastAPI-Style Docs — Clear examples and tutorials


💡 Why Rate Limiting?

Rate limiting is essential for:

  • 🛡️ Preventing Abuse — Stop malicious users from overwhelming your API
  • ⚖️ Fair Usage — Ensure all users get equal access to resources
  • 💰 Cost Control — Prevent unexpected bills from cloud services
  • 🎯 SLA Compliance — Meet service level agreements and uptime guarantees
  • 🚦 Traffic Shaping — Smooth out traffic spikes to protect downstream services

Without rate limiting, a single user can:

  • Consume all server resources
  • Trigger cascading failures
  • Generate massive cloud bills
  • Degrade service for legitimate users

⚡ Quick Start

30 seconds to your first rate limiter:

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter
import redis

app = FastAPI()

# Initialize rate limiter
redis_client = redis.Redis.from_url("redis://localhost:6379", decode_responses=True)
limiter = SlidingWindowRateLimiter(
    capacity=100,      # 100 requests
    fill_rate=10,      # per 10 seconds
    scope="user",      # per-user limits
    backend="redis",   # distributed
    redis_client=redis_client
)

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # Get user identifier (from auth, IP, etc.)
    user_id = request.headers.get("X-User-ID") or request.client.host

    # Check rate limit; return a response directly, because an HTTPException
    # raised inside middleware bypasses FastAPI's exception handlers
    if not limiter.allow_request(user_id):
        wait_time = limiter.get_wait_time(user_id)
        return JSONResponse(
            status_code=429,
            content={"detail": f"Rate limit exceeded. Retry after {wait_time:.0f} seconds."},
            headers={"Retry-After": str(int(wait_time))}
        )

    return await call_next(request)

@app.get("/")
async def root():
    return {"message": "Hello World!"}

Run it:

uvicorn main:app --reload

Test it:

# Make requests
curl http://localhost:8000/
# After 100 requests in 10 seconds, you'll get:
# {"detail":"Rate limit exceeded. Retry after 5 seconds."}

📦 Installation

From PyPI (Recommended)

pip install fastapi-advanced-rate-limiter

With Redis Support

pip install fastapi-advanced-rate-limiter redis

From Source (for Contributors)

If you want to contribute or modify the package:

git clone https://github.com/awais7012/FastAPI-RateLimiter.git
cd FastAPI-RateLimiter
pip install -e .

With Development Tools

For contributors who want to run tests and development tools:

pip install -e ".[dev]"

Requirements

  • Python: 3.8+
  • FastAPI: 0.68.0+
  • Redis (optional): 4.0.0+ (install separately if using Redis backend)

🧮 Algorithms Guide

Choose the right algorithm for your use case:

🪙 Token Bucket — Best for APIs with occasional bursts

How it works: Imagine a bucket that slowly fills with tokens. Each request takes a token. If the bucket is empty, requests are denied.

from fastapi_advanced_rate_limiter import TokenBucketLimiter

limiter = TokenBucketLimiter(
    capacity=10,    # Bucket holds 10 tokens (burst size)
    fill_rate=1.0,  # Refill 1 token per second
    scope="user",
    backend="memory"
)

Behavior:

  • Allows bursts up to capacity
  • Maintains long-term average rate of fill_rate
  • Good for file uploads, batch operations

Pros: ✅ Flexible, allows bursts
Cons: ⚠️ Can be "drained" by burst traffic

Use when: User-triggered actions (file uploads, form submissions)
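The refill-and-spend mechanic can be sketched in a few lines of plain Python (a standalone illustration of the algorithm, not the library's internal implementation):

```python
import time

class TinyTokenBucket:
    """Minimal token bucket: refills continuously, spends one token per request."""

    def __init__(self, capacity: float, fill_rate: float):
        self.capacity = capacity       # maximum burst size
        self.fill_rate = fill_rate     # tokens added per second
        self.tokens = capacity         # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1           # spend a token
            return True
        return False

bucket = TinyTokenBucket(capacity=3, fill_rate=1.0)
burst = [bucket.allow() for _ in range(4)]  # burst of 4: first 3 pass, 4th is denied
```

After the burst drains the bucket, roughly one request per `1 / fill_rate` seconds becomes available again.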


💧 Leaky Bucket — Best for traffic shaping

How it works: Requests fill a bucket with a hole at the bottom. The bucket leaks at a constant rate. Overflow = rejected.

from fastapi_advanced_rate_limiter import LeakyBucketLimiter

limiter = LeakyBucketLimiter(
    capacity=5,     # Bucket capacity
    fill_rate=1.0,  # Leak rate (1 req/sec)
    scope="user",
    backend="memory"
)

Behavior:

  • Enforces smooth, consistent output rate
  • No burst allowance
  • Perfect for calling external APIs

Pros: ✅ Smooth, predictable rate
Cons: ❌ Strict (can frustrate users with spiky traffic)

Use when: Calling rate-limited external APIs, message queues
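The drain-at-a-constant-rate behavior can be sketched as follows (a standalone illustration, not the library's internals):

```python
import time

class TinyLeakyBucket:
    """Minimal leaky bucket: requests add water, which drains at a constant rate."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity       # how much water fits before overflow
        self.leak_rate = leak_rate     # units drained per second
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain whatever leaked out since the last check
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1            # this request's unit of water
            return True
        return False                   # bucket would overflow: reject

bucket = TinyLeakyBucket(capacity=2, leak_rate=1.0)
results = [bucket.allow() for _ in range(3)]  # third request overflows
```

Unlike the token bucket, the bucket starts empty, so there is no burst credit: output never exceeds `leak_rate` for long.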


โฑ๏ธ Fixed Window โ€” Fastest, simplest

How it works: Count requests in fixed time windows (e.g., per minute). Reset count at window boundaries.

from fastapi_advanced_rate_limiter import FixedWindowRateLimiter

limiter = FixedWindowRateLimiter(
    capacity=1000,  # 1000 requests
    fill_rate=100,  # per 10 seconds (capacity/fill_rate)
    scope="global",
    backend="redis"
)

Behavior:

  • Simple counter with time-based reset
  • Very fast (Redis INCR operation)
  • Can get 2x capacity at window boundaries

Pros: ✅ Fastest, O(1), minimal memory
Cons: ⚠️ Boundary burst vulnerability

Use when: High-throughput internal APIs, coarse-grained limits
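The boundary burst mentioned above is easy to demonstrate with a minimal counter (a standalone sketch, not the library's implementation): four requests within 0.4 seconds all pass a 2-per-10-seconds limit because they straddle a window boundary.

```python
import math

class TinyFixedWindow:
    """Minimal fixed window: one counter per window index, reset implicitly."""

    def __init__(self, capacity: int, window_seconds: float):
        self.capacity = capacity
        self.window = window_seconds
        self.counts = {}

    def allow(self, now: float) -> bool:
        idx = math.floor(now / self.window)        # which window this request falls in
        self.counts[idx] = self.counts.get(idx, 0) + 1
        return self.counts[idx] <= self.capacity

fw = TinyFixedWindow(capacity=2, window_seconds=10.0)
# Two requests at the end of one window, two at the start of the next:
# all four are allowed, i.e. 2x capacity in a fraction of a second.
boundary = [fw.allow(9.8), fw.allow(9.9), fw.allow(10.1), fw.allow(10.2)]
fifth = fw.allow(10.3)  # same window as the last two: now over capacity
```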


🪟 Sliding Window — Best balance (RECOMMENDED) ⭐

How it works: Combines current and previous window counts with a weighted average to smooth transitions.

from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter

limiter = SlidingWindowRateLimiter(
    capacity=100,
    fill_rate=10,
    scope="user",
    backend="redis"  # Redis version performs best!
)

Behavior:

  • Prevents boundary bursts
  • Still O(1) time complexity
  • Smooth rate limiting without fixed window issues

Pros: ✅ Best balance of accuracy and performance
Cons: ⚠️ Slight approximation (but negligible in practice)

Use when: Production APIs, user-facing services (RECOMMENDED for most cases)
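The weighted-average estimate can be written out directly (an illustration of the general technique; the library's exact formula may differ):

```python
def sliding_estimate(prev_count, curr_count, elapsed, window):
    """Estimate requests in the trailing `window` seconds: weight the previous
    window's count by the fraction of it still inside the trailing window."""
    prev_weight = (window - elapsed) / window
    return prev_count * prev_weight + curr_count

# 60 s window, 40 s into the current one: the previous window (80 requests)
# still covers a third of the trailing minute.
est = sliding_estimate(prev_count=80, curr_count=30, elapsed=40, window=60)
allowed = est < 100  # compare the estimate against capacity
```

Only two counters per key are needed, which is why this stays O(1) while avoiding the fixed window's boundary burst.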


📜 Sliding Window Log — Most accurate

How it works: Logs exact timestamp of each request. Prunes old timestamps outside the window.

from fastapi_advanced_rate_limiter import SlidingWindowLogRateLimiter

limiter = SlidingWindowLogRateLimiter(
    capacity=10,
    fill_rate=10/300,  # 10 per 5 minutes
    scope="ip",
    backend="redis"
)

Behavior:

  • Perfect accuracy (no approximation)
  • Uses Redis sorted sets for efficiency
  • O(n) time complexity

Pros: ✅ Perfect accuracy, no boundary issues
Cons: ❌ Higher memory usage, not suitable for huge scale

Use when: Security-critical endpoints (login, payment), billing APIs
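The prune-and-count idea can be sketched with a deque of timestamps (a standalone illustration; the library uses Redis sorted sets for the same effect):

```python
from collections import deque

class TinyWindowLog:
    """Minimal sliding window log: one timestamp per request, old ones pruned."""

    def __init__(self, capacity: int, window_seconds: float):
        self.capacity = capacity
        self.window = window_seconds
        self.log = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the trailing window
        while self.log and now - self.log[0] >= self.window:
            self.log.popleft()
        if len(self.log) < self.capacity:
            self.log.append(now)
            return True
        return False

log = TinyWindowLog(capacity=2, window_seconds=10.0)
seq = [log.allow(0.0), log.allow(1.0), log.allow(2.0), log.allow(11.0)]
# third is denied (two requests already in the window);
# fourth is allowed because the t=0.0 entry has aged out by t=11.0
```

Memory grows with the number of requests in the window, which is the cost of the exactness.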


📦 Queue-based Limiter

How it works: Maintains a queue of recent request timestamps. Similar to Sliding Window Log.

from fastapi_advanced_rate_limiter import QueueLimiter

limiter = QueueLimiter(
    capacity=50,
    fill_rate=5.0,
    scope="global",
    backend="memory"
)

Behavior:

  • Fair queuing
  • Predictable wait times

Use when: Background job processing, task queues


📊 Performance Benchmarks

Test Setup: 5 concurrent users, 10-second tests, mixed traffic patterns

Algorithm            Memory (req/s)   Redis (req/s)   Success Rate   Best For
-------------------  ---------------  --------------  -------------  --------------------------
Fixed Window         15.99 🥇         15.33           53-59%         High throughput
Sliding Window       15.03            14.32           51-66%         Production (best balance) ⭐
Sliding Window Log   15.41            14.58           52-67%         Accuracy-critical
Token Bucket         4.76             3.69            14-16%         Burst-friendly APIs
Leaky Bucket         4.76             2.93            10-16%         Traffic shaping
Queue Limiter        2.76             3.03            9-12%          Fair queuing

Key Findings:

  • Window-based algorithms are 3-4x faster than token/leaky bucket
  • Redis adds ~10% overhead but enables distributed rate limiting
  • Sliding Window with Redis has 65.7% success rate (best balanced algorithm)

Winner: ๐Ÿ† Sliding Window (Redis) โ€” Best for production APIs


🎯 Usage Examples

Example 1: Per-User Rate Limiting

from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter

app = FastAPI()
security = HTTPBearer()
limiter = SlidingWindowRateLimiter(capacity=100, fill_rate=10, scope="user", backend="redis")

def get_current_user(credentials: HTTPAuthorizationCredentials = Depends(security)):
    # Your auth logic here (validate credentials.credentials, look up the user)
    return {"user_id": "user_123"}

@app.get("/api/data")
async def get_data(user = Depends(get_current_user)):
    if not limiter.allow_request(user["user_id"]):
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    
    return {"data": "Your protected data"}

Example 2: Per-IP Rate Limiting (for public endpoints)

from fastapi import FastAPI, Request, HTTPException
from fastapi_advanced_rate_limiter import FixedWindowRateLimiter

app = FastAPI()
limiter = FixedWindowRateLimiter(capacity=1000, fill_rate=100, scope="ip", backend="redis")

@app.get("/public/api")
async def public_endpoint(request: Request):
    client_ip = request.client.host
    
    if not limiter.allow_request(client_ip):
        raise HTTPException(
            status_code=429,
            detail="Too many requests from your IP",
            headers={"Retry-After": "60"}
        )
    
    return {"message": "Public data"}

Example 3: Global Rate Limiting

from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import LeakyBucketLimiter

# Limit total API traffic
global_limiter = LeakyBucketLimiter(
    capacity=10000,
    fill_rate=1000,  # 1000 req/sec globally
    scope="global",
    backend="redis"
)

@app.middleware("http")
async def global_rate_limit(request: Request, call_next):
    if not global_limiter.allow_request(None):  # None for global scope
        # Return the response directly; HTTPException raised in middleware
        # would surface as a 500 instead of a 503
        return JSONResponse(status_code=503, content={"detail": "Service temporarily unavailable"})
    return await call_next(request)

Example 4: Multi-Layer Rate Limiting (Defense in Depth)

from fastapi.responses import JSONResponse

# Layer 1: Global limit (protect infrastructure)
global_limiter = FixedWindowRateLimiter(capacity=100000, fill_rate=10000, scope="global", backend="redis")

# Layer 2: Per-IP limit (prevent DDoS)
ip_limiter = SlidingWindowRateLimiter(capacity=1000, fill_rate=100, scope="ip", backend="redis")

# Layer 3: Per-user limit (fair usage)
user_limiter = TokenBucketLimiter(capacity=100, fill_rate=10, scope="user", backend="redis")

@app.middleware("http")
async def layered_rate_limit(request: Request, call_next):
    # Return responses directly: HTTPException raised in middleware would
    # surface as a 500 instead of the intended status code

    # Check global limit first
    if not global_limiter.allow_request(None):
        return JSONResponse(status_code=503, content={"detail": "Service overloaded"})

    # Check IP limit
    if not ip_limiter.allow_request(request.client.host):
        return JSONResponse(status_code=429, content={"detail": "IP rate limit exceeded"})

    # Check user limit (if authenticated)
    user_id = request.headers.get("X-User-ID")
    if user_id and not user_limiter.allow_request(user_id):
        return JSONResponse(status_code=429, content={"detail": "User rate limit exceeded"})

    return await call_next(request)

Example 5: Custom Response with Retry-After

from fastapi.responses import JSONResponse

@app.exception_handler(HTTPException)
async def rate_limit_handler(request: Request, exc: HTTPException):
    if exc.status_code == 429:
        # Get wait time from limiter
        user_id = request.headers.get("X-User-ID") or request.client.host
        wait_time = limiter.get_wait_time(user_id)

        return JSONResponse(
            status_code=429,
            content={
                "error": "Rate limit exceeded",
                "retry_after_seconds": int(wait_time),
                "message": f"Please wait {int(wait_time)} seconds before retrying"
            },
            headers={"Retry-After": str(int(wait_time))}
        )
    # An exception handler must return a response, not the exception itself
    return JSONResponse(status_code=exc.status_code, content={"detail": exc.detail}, headers=exc.headers)

🔧 API Reference

Common Methods (All Limiters)

allow_request(identifier=None) -> bool

Check if a request should be allowed.

Parameters:

  • identifier (str, optional): User ID, IP address, or None for global scope

Returns:

  • bool: True if allowed, False if rate limited

Example:

if limiter.allow_request("user_123"):
    # Process request
    pass
else:
    # Return 429
    pass

get_status(identifier=None) -> dict

Get current limiter status for monitoring.

Returns:

{
    "tokens_remaining": 7.5,  # Token Bucket
    "capacity": 10,
    "fill_rate": 1.0,
    "utilization_pct": 25.0
}

Example:

status = limiter.get_status("user_123")
print(f"User has {status['tokens_remaining']} requests remaining")

get_wait_time(identifier=None) -> float

Calculate seconds until next request would be allowed.

Returns:

  • float: Seconds to wait (0.0 if request would be allowed immediately)

Example:

wait = limiter.get_wait_time("user_123")
if wait > 0:
    print(f"Please wait {wait:.1f} seconds")

reset(identifier=None) -> None

Reset rate limit for an identifier (useful for testing or admin actions).

Example:

# Admin endpoint to reset user's rate limit
@app.post("/admin/reset-limit/{user_id}")
async def reset_user_limit(user_id: str):
    limiter.reset(user_id)
    return {"message": f"Rate limit reset for {user_id}"}

🧱 Backend Comparison

In-Memory Backend

Pros:

  • ⚡ Ultra-fast (no network overhead)
  • 🎯 Zero dependencies
  • 🧪 Perfect for development/testing

Cons:

  • โŒ Not shared across app instances
  • โŒ Lost on restart
  • โŒ Not suitable for production clusters

Use when:

  • Development/testing
  • Single-instance deployments
  • Non-critical rate limiting

Redis Backend

Pros:

  • ๐ŸŒ Shared across all app instances
  • ๐Ÿ’พ Persistent across restarts
  • ๐Ÿ”’ Atomic operations (race-condition free)
  • ๐Ÿ“ˆ Horizontally scalable

Cons:

  • ๐Ÿข ~10% slower (network latency)
  • ๐Ÿงฑ Requires Redis service
  • ๐Ÿ’ฐ Additional infrastructure cost

Use when:

  • Production environments
  • Load-balanced apps (multiple instances)
  • Microservices architecture
  • When rate limits must persist across restarts

Setup Redis:

# Docker
docker run -d --name redis -p 6379:6379 redis

# Or use Redis Cloud (free tier)
# https://redis.com/try-free/

💡 Best Practices

1. Choose the Right Algorithm

# For most APIs → Sliding Window
limiter = SlidingWindowRateLimiter(...)

# For burst-heavy traffic → Token Bucket
limiter = TokenBucketLimiter(...)

# For critical operations → Sliding Window Log
limiter = SlidingWindowLogRateLimiter(...)

2. Use Appropriate Scope

# Authenticated users → per-user
scope="user"

# Public APIs → per-IP
scope="ip"

# Infrastructure protection → global
scope="global"

3. Set Realistic Limits

# Don't be too strict!
# Bad: 10 req/hour (users will be frustrated)
# Good: 1000 req/hour (generous but protective)

limiter = SlidingWindowRateLimiter(
    capacity=1000,
    fill_rate=1000/3600,  # 1000 per hour
    scope="user",
    backend="redis"
)

4. Always Include Retry-After Header

if not limiter.allow_request(user_id):
    wait_time = limiter.get_wait_time(user_id)
    raise HTTPException(
        status_code=429,
        headers={"Retry-After": str(int(wait_time))}
    )

5. Monitor Your Rate Limiters

@app.get("/metrics/rate-limits")
async def get_rate_limit_metrics():
    return {
        "global": global_limiter.get_status(None),
        "sample_user": user_limiter.get_status("user_123")
    }

🧪 Testing

Run All Tests

# Run comprehensive test suite
python tests/all_limiter_test.py

# Or with pytest
pytest tests/

Test Coverage

  • ✅ All 6 algorithms
  • ✅ All 3 scopes (global, user, IP)
  • ✅ Both backends (memory, Redis)
  • ✅ Concurrent access (thread safety)
  • ✅ Multi-layer rate limiting
  • ✅ Total: 72 test scenarios

Example Test Output

==========================================================================================
  FINAL COMPARISON SUMMARY
==========================================================================================
Limiter                   Backend       Allowed    Blocked     Rate/s   Success%       
------------------------------------------------------------------------------------------
Fixed Window              memory            168        145     15.99      53.7%        
Sliding Window            redis             153         80     14.32      65.7%  ⭐
Sliding Window Log        redis             155         95     14.58      62.0%        
Token Bucket              memory             50        262      4.76      16.0%        

๐Ÿ† Best Performance: Sliding Window (redis): 14.32 req/s with 65.7% success rate

๐Ÿค Contributing

We welcome contributions! Here's how:

  1. Fork the repo
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Development Setup

git clone https://github.com/awais7012/FastAPI-RateLimiter.git
cd FastAPI-RateLimiter
pip install -e ".[dev]"

Code Style

  • Follow PEP 8
  • Add docstrings to all functions
  • Include type hints
  • Write tests for new features

๐Ÿ› Troubleshooting

Redis Connection Errors

# Problem: redis.exceptions.ConnectionError
# Solution: Ensure Redis is running

# Check Redis
docker ps | grep redis

# Or start Redis
docker run -d --name redis -p 6379:6379 redis

Rate Limit Not Working Across Instances

# Problem: Rate limits don't work across multiple app instances
# Solution: Use Redis backend, not memory

# โŒ Wrong (memory backend)
limiter = SlidingWindowRateLimiter(..., backend="memory")

# โœ… Correct (Redis backend)
limiter = SlidingWindowRateLimiter(..., backend="redis", redis_client=redis_client)

Import Errors

# Problem: ModuleNotFoundError: No module named 'fastapi_advanced_rate_limiter'
# Solution: Install the package from PyPI

pip install fastapi-advanced-rate-limiter

# Or if you cloned the repo
pip install -e .

📄 License

MIT License - see LICENSE file for details.


๐Ÿ™ Acknowledgments

  • Inspired by FastAPI documentation style
  • Rate limiting algorithms based on industry best practices
  • Built with โค๏ธ by Ahmed Awais (Romeo)

📞 Support


Built for developers who love clean, scalable FastAPI tooling.
โญ Star us on GitHub if you find this useful!
