# FastAPI Advanced Rate Limiter
High-performance, production-ready rate limiting for FastAPI applications.
FastAPI Advanced Rate Limiter is a battle-tested library providing 6 rate limiting algorithms with support for both in-memory and Redis backends. It suits APIs, microservices, and any FastAPI application that needs protection from abuse, overload, or DDoS attacks.
## Key Features

- **6 Production-Ready Algorithms** – Token Bucket, Leaky Bucket, Queue-based, Fixed Window, Sliding Window, and Sliding Window Log
- **Dual Backend Support** – in-memory (zero dependencies) or Redis (distributed, cluster-friendly)
- **Flexible Scoping** – global, per-user, or per-IP rate limiting
- **Thread-Safe** – per-key locks and atomic Redis operations
- **High Performance** – benchmarked at 15+ req/s for window-based algorithms
- **Rich Monitoring** – `get_status()`, `get_wait_time()`, and `get_retry_after()` helpers
- **Comprehensive Tests** – test scenarios covering all algorithms and backends
- **FastAPI-Style Docs** – clear examples and tutorials
## Table of Contents
- Why Rate Limiting?
- Quick Start
- Installation
- Algorithms Guide
- Usage Examples
- Performance Benchmarks
- API Reference
- Backend Comparison
- Best Practices
- Testing
- Contributing
- License
## Why Rate Limiting?

Rate limiting is essential for:

- **Preventing Abuse** – stop malicious users from overwhelming your API
- **Fair Usage** – ensure all users get equal access to resources
- **Cost Control** – prevent unexpected bills from cloud services
- **SLA Compliance** – meet service level agreements and uptime guarantees
- **Traffic Shaping** – smooth out traffic spikes to protect downstream services
Without rate limiting, a single user can:
- Consume all server resources
- Trigger cascading failures
- Generate massive cloud bills
- Degrade service for legitimate users
## Quick Start
30 seconds to your first rate limiter:
```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter
import redis

app = FastAPI()

# Initialize the rate limiter
redis_client = redis.Redis.from_url("redis://localhost:6379", decode_responses=True)
limiter = SlidingWindowRateLimiter(
    capacity=100,     # 100 requests
    fill_rate=10,     # per 10 seconds
    scope="user",     # per-user limits
    backend="redis",  # distributed
    redis_client=redis_client
)

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # Get a user identifier (from auth, IP, etc.)
    user_id = request.headers.get("X-User-ID") or request.client.host

    # Check the rate limit
    if not limiter.allow_request(user_id):
        wait_time = limiter.get_wait_time(user_id)
        # Return the 429 directly: an HTTPException raised inside middleware
        # bypasses FastAPI's exception handlers and surfaces as a 500.
        return JSONResponse(
            status_code=429,
            content={"detail": f"Rate limit exceeded. Retry after {wait_time:.0f} seconds."},
            headers={"Retry-After": str(int(wait_time))}
        )
    return await call_next(request)

@app.get("/")
async def root():
    return {"message": "Hello World!"}
```
Run it:

```bash
uvicorn main:app --reload
```

Test it:

```bash
# Make requests
curl http://localhost:8000/

# After 100 requests in 10 seconds, you'll get:
# {"detail":"Rate limit exceeded. Retry after 5 seconds."}
```
## Installation

### From PyPI (Recommended)

```bash
pip install fastapi-advanced-rate-limiter
```

### With Redis Support

```bash
pip install fastapi-advanced-rate-limiter redis
```

### From Source (for Contributors)

If you want to contribute or modify the package:

```bash
git clone https://github.com/awais7012/FastAPI-RateLimiter.git
cd FastAPI-RateLimiter
pip install -e .
```

### With Development Tools

For contributors who want to run tests and development tools:

```bash
pip install -e ".[dev]"
```

### Requirements

- Python: 3.8+
- FastAPI: 0.68.0+
- Redis (optional): 4.0.0+ (install separately if using the Redis backend)
## Algorithms Guide

Choose the right algorithm for your use case:
### Token Bucket – Best for APIs with occasional bursts

How it works: imagine a bucket that slowly fills with tokens. Each request takes a token. If the bucket is empty, the request is denied.

```python
from fastapi_advanced_rate_limiter import TokenBucketLimiter

limiter = TokenBucketLimiter(
    capacity=10,     # Bucket holds 10 tokens (burst size)
    fill_rate=1.0,   # Refill 1 token per second
    scope="user",
    backend="memory"
)
```

Behavior:

- Allows bursts up to `capacity`
- Maintains a long-term average rate of `fill_rate`
- Good for file uploads and batch operations

- **Pros:** flexible; allows bursts
- **Cons:** can be "drained" by burst traffic
- **Use when:** user-triggered actions (file uploads, form submissions)
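The refill-and-consume mechanic described above can be sketched in a few lines of plain Python. This is a simplified single-key illustration of the algorithm, not the library's actual implementation:

```python
import time

class TokenBucket:
    """Minimal token bucket: tokens refill lazily, each request consumes one."""

    def __init__(self, capacity: float, fill_rate: float):
        self.capacity = capacity
        self.fill_rate = fill_rate   # tokens added per second
        self.tokens = capacity       # start full, so an initial burst is allowed
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, fill_rate=1.0)
results = [bucket.allow() for _ in range(5)]
print(results)  # a burst of 3 is allowed, then requests are denied
```

Note how the burst allowance falls out naturally: a full bucket admits `capacity` back-to-back requests before the `fill_rate` average takes over.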
### Leaky Bucket – Best for traffic shaping

How it works: requests fill a bucket with a hole at the bottom. The bucket leaks at a constant rate; overflow is rejected.

```python
from fastapi_advanced_rate_limiter import LeakyBucketLimiter

limiter = LeakyBucketLimiter(
    capacity=5,      # Bucket capacity
    fill_rate=1.0,   # Leak rate (1 req/sec)
    scope="user",
    backend="memory"
)
```

Behavior:

- Enforces a smooth, consistent output rate
- No burst allowance
- Well suited to calling external APIs

- **Pros:** smooth, predictable rate
- **Cons:** strict (can frustrate users with spiky traffic)
- **Use when:** calling rate-limited external APIs, message queues
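The leak-at-a-constant-rate behavior can be sketched similarly. This is a simplified illustration driven by explicit timestamps, not the library's internals:

```python
class LeakyBucket:
    """Minimal leaky bucket: the level drains at a constant rate; overflow is rejected."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate   # requests drained per second
        self.level = 0.0
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Drain whatever leaked out since the last request
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1          # this request occupies one unit of the bucket
            return True
        return False                 # bucket full: overflow rejected

bucket = LeakyBucket(capacity=2, leak_rate=1.0)
results = [bucket.allow(0.0), bucket.allow(0.1), bucket.allow(0.2), bucket.allow(1.5)]
print(results)  # the third rapid request overflows; by t=1.5 enough has leaked out
```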
### Fixed Window – Fastest, simplest

How it works: count requests in fixed time windows (e.g., per minute) and reset the count at window boundaries.

```python
from fastapi_advanced_rate_limiter import FixedWindowRateLimiter

limiter = FixedWindowRateLimiter(
    capacity=1000,   # 1000 requests
    fill_rate=100,   # per 10 seconds (capacity/fill_rate)
    scope="global",
    backend="redis"
)
```

Behavior:

- Simple counter with a time-based reset
- Very fast (a Redis INCR operation)
- Can admit up to 2x capacity at window boundaries

- **Pros:** fastest, O(1), minimal memory
- **Cons:** boundary burst vulnerability
- **Use when:** high-throughput internal APIs, coarse-grained limits
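The counter-per-window idea is easy to see in miniature. The sketch below keys counts by window index, which plays the same role as a per-window Redis key incremented with INCR (illustrative only; the key naming is an assumption, not the library's scheme):

```python
def make_fixed_window(capacity: int, window_seconds: float):
    """Minimal fixed-window counter keyed by window index."""
    counts = {}  # window index -> request count

    def allow(now: float) -> bool:
        window = int(now // window_seconds)  # analogous to a Redis key like "rl:<id>:<window>"
        counts[window] = counts.get(window, 0) + 1
        return counts[window] <= capacity

    return allow

allow = make_fixed_window(capacity=2, window_seconds=10.0)
results = [allow(0.0), allow(1.0), allow(2.0), allow(11.0)]
print(results)  # the third request in window 0 is blocked; window 1 starts fresh
```

The boundary-burst caveat is visible here too: requests at t=9.9 and t=10.1 land in different windows, so a client can briefly see up to 2x capacity.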
### Sliding Window – Best balance (RECOMMENDED)

How it works: combines the current and previous window counts with a weighted average to smooth transitions.

```python
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter

limiter = SlidingWindowRateLimiter(
    capacity=100,
    fill_rate=10,
    scope="user",
    backend="redis"  # The Redis version performs best!
)
```

Behavior:

- Prevents boundary bursts
- Still O(1) time complexity
- Smooth rate limiting without fixed-window issues

- **Pros:** best balance of accuracy and performance
- **Cons:** slight approximation (negligible in practice)
- **Use when:** production APIs, user-facing services (recommended for most cases)
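The weighted average mentioned above can be written out directly. This is the standard sliding-window-counter estimate, shown for illustration; the library's exact formula may differ:

```python
def sliding_window_estimate(prev_count: int, curr_count: int,
                            elapsed: float, window: float) -> float:
    """Estimate requests in the trailing window: the previous window's count
    is weighted by how much of it still overlaps the trailing window."""
    prev_weight = (window - elapsed) / window
    return prev_count * prev_weight + curr_count

# 40 requests in the previous 10 s window, 30 so far in the current one,
# 2.5 s into the current window: 75% of the previous window still counts.
estimate = sliding_window_estimate(40, 30, elapsed=2.5, window=10.0)
print(estimate)  # compared against capacity to allow or deny
```

Because only two counters per key are needed, the check stays O(1) while smoothing out the fixed-window boundary burst.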
### Sliding Window Log – Most accurate

How it works: logs the exact timestamp of each request and prunes timestamps that fall outside the window.

```python
from fastapi_advanced_rate_limiter import SlidingWindowLogRateLimiter

limiter = SlidingWindowLogRateLimiter(
    capacity=10,
    fill_rate=10/300,  # 10 per 5 minutes
    scope="ip",
    backend="redis"
)
```

Behavior:

- Perfect accuracy (no approximation)
- Uses Redis sorted sets for efficiency
- O(n) time complexity

- **Pros:** perfect accuracy, no boundary issues
- **Cons:** higher memory usage; not suitable for very large scale
- **Use when:** security-critical endpoints (login, payment), billing APIs
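The log-and-prune approach can be sketched with a deque of timestamps; a Redis-backed version keeps the same data in a sorted set and prunes with a range removal. This sketch is illustrative, not the library's code:

```python
import collections

def make_sliding_log(capacity: int, window_seconds: float):
    """Minimal sliding-window log: keep exact timestamps, prune, then count."""
    log = collections.deque()  # timestamps in arrival order

    def allow(now: float) -> bool:
        # Drop timestamps that have fallen outside the trailing window
        while log and log[0] <= now - window_seconds:
            log.popleft()
        if len(log) < capacity:
            log.append(now)
            return True
        return False

    return allow

allow = make_sliding_log(capacity=2, window_seconds=10.0)
results = [allow(0.0), allow(1.0), allow(5.0), allow(11.5)]
print(results)  # full at t=5; by t=11.5 both old timestamps have expired
```

Memory grows with the number of stored timestamps per key, which is why this variant is reserved for low-volume, accuracy-critical endpoints.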
### Queue-Based Limiter

How it works: maintains a queue of recent request timestamps, similar to the Sliding Window Log.

```python
from fastapi_advanced_rate_limiter import QueueLimiter

limiter = QueueLimiter(
    capacity=50,
    fill_rate=5.0,
    scope="global",
    backend="memory"
)
```

Behavior:

- Fair queuing
- Predictable wait times

**Use when:** background job processing, task queues
## Performance Benchmarks

Test setup: 5 concurrent users, 10-second tests, mixed traffic patterns.

| Algorithm | Memory (req/s) | Redis (req/s) | Success Rate | Best For |
|---|---|---|---|---|
| Fixed Window | 15.99 | 15.33 | 53-59% | High throughput |
| Sliding Window | 15.03 | 14.32 | 51-66% | Production (best balance) |
| Sliding Window Log | 15.41 | 14.58 | 52-67% | Accuracy-critical |
| Token Bucket | 4.76 | 3.69 | 14-16% | Burst-friendly APIs |
| Leaky Bucket | 4.76 | 2.93 | 10-16% | Traffic shaping |
| Queue Limiter | 2.76 | 3.03 | 9-12% | Fair queuing |

Key findings:

- Window-based algorithms are 3-4x faster than the token/leaky bucket algorithms
- Redis adds ~10% overhead but enables distributed rate limiting
- Sliding Window with Redis has a 65.7% success rate (the best-balanced algorithm)

Winner: Sliding Window (Redis) – best for production APIs.
## Usage Examples

### Example 1: Per-User Rate Limiting

```python
from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import HTTPBearer
from fastapi_advanced_rate_limiter import SlidingWindowRateLimiter

app = FastAPI()
security = HTTPBearer()
limiter = SlidingWindowRateLimiter(capacity=100, fill_rate=10, scope="user", backend="redis")

def get_current_user(token: str = Depends(security)):
    # Your auth logic here
    return {"user_id": "user_123"}

@app.get("/api/data")
async def get_data(user = Depends(get_current_user)):
    if not limiter.allow_request(user["user_id"]):
        raise HTTPException(status_code=429, detail="Rate limit exceeded")
    return {"data": "Your protected data"}
```
### Example 2: Per-IP Rate Limiting (for public endpoints)

```python
from fastapi import FastAPI, Request, HTTPException
from fastapi_advanced_rate_limiter import FixedWindowRateLimiter

app = FastAPI()
limiter = FixedWindowRateLimiter(capacity=1000, fill_rate=100, scope="ip", backend="redis")

@app.get("/public/api")
async def public_endpoint(request: Request):
    client_ip = request.client.host
    if not limiter.allow_request(client_ip):
        raise HTTPException(
            status_code=429,
            detail="Too many requests from your IP",
            headers={"Retry-After": "60"}
        )
    return {"message": "Public data"}
```
### Example 3: Global Rate Limiting

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import LeakyBucketLimiter

app = FastAPI()

# Limit total API traffic
global_limiter = LeakyBucketLimiter(
    capacity=10000,
    fill_rate=1000,  # 1000 req/sec globally
    scope="global",
    backend="redis"
)

@app.middleware("http")
async def global_rate_limit(request: Request, call_next):
    if not global_limiter.allow_request(None):  # None for global scope
        # Return the response directly: exceptions raised in middleware
        # bypass FastAPI's exception handlers.
        return JSONResponse(status_code=503, content={"detail": "Service temporarily unavailable"})
    return await call_next(request)
```
### Example 4: Multi-Layer Rate Limiting (Defense in Depth)

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_advanced_rate_limiter import (
    FixedWindowRateLimiter,
    SlidingWindowRateLimiter,
    TokenBucketLimiter,
)

app = FastAPI()

# Layer 1: Global limit (protect infrastructure)
global_limiter = FixedWindowRateLimiter(capacity=100000, fill_rate=10000, scope="global", backend="redis")

# Layer 2: Per-IP limit (prevent DDoS)
ip_limiter = SlidingWindowRateLimiter(capacity=1000, fill_rate=100, scope="ip", backend="redis")

# Layer 3: Per-user limit (fair usage)
user_limiter = TokenBucketLimiter(capacity=100, fill_rate=10, scope="user", backend="redis")

@app.middleware("http")
async def layered_rate_limit(request: Request, call_next):
    # Middleware must return a response itself: a raised HTTPException
    # would bypass FastAPI's exception handlers here.

    # Check the global limit first
    if not global_limiter.allow_request(None):
        return JSONResponse(status_code=503, content={"detail": "Service overloaded"})

    # Check the IP limit
    if not ip_limiter.allow_request(request.client.host):
        return JSONResponse(status_code=429, content={"detail": "IP rate limit exceeded"})

    # Check the user limit (if authenticated)
    user_id = request.headers.get("X-User-ID")
    if user_id and not user_limiter.allow_request(user_id):
        return JSONResponse(status_code=429, content={"detail": "User rate limit exceeded"})

    return await call_next(request)
```
### Example 5: Custom Response with Retry-After

```python
from fastapi import FastAPI, Request, HTTPException
from fastapi.exception_handlers import http_exception_handler
from fastapi.responses import JSONResponse

app = FastAPI()
# `limiter` is the SlidingWindowRateLimiter from the Quick Start

@app.exception_handler(HTTPException)
async def rate_limit_handler(request: Request, exc: HTTPException):
    if exc.status_code == 429:
        # Get the wait time from the limiter
        user_id = request.headers.get("X-User-ID") or request.client.host
        wait_time = limiter.get_wait_time(user_id)
        return JSONResponse(
            status_code=429,
            content={
                "error": "Rate limit exceeded",
                "retry_after_seconds": int(wait_time),
                "message": f"Please wait {int(wait_time)} seconds before retrying"
            },
            headers={"Retry-After": str(int(wait_time))}
        )
    # Fall back to FastAPI's default handler for other HTTP errors
    return await http_exception_handler(request, exc)
```
## API Reference

### Common Methods (All Limiters)

#### `allow_request(identifier=None) -> bool`

Check whether a request should be allowed.

Parameters:

- `identifier` (str, optional): user ID, IP address, or `None` for global scope

Returns:

- `bool`: `True` if allowed, `False` if rate limited

Example:

```python
if limiter.allow_request("user_123"):
    # Process request
    pass
else:
    # Return 429
    pass
```

#### `get_status(identifier=None) -> dict`

Get the current limiter status for monitoring.

Returns:

```python
{
    "tokens_remaining": 7.5,  # Token Bucket
    "capacity": 10,
    "fill_rate": 1.0,
    "utilization_pct": 25.0
}
```

Example:

```python
status = limiter.get_status("user_123")
print(f"User has {status['tokens_remaining']} requests remaining")
```

#### `get_wait_time(identifier=None) -> float`

Calculate the number of seconds until the next request would be allowed.

Returns:

- `float`: seconds to wait (`0.0` if a request would be allowed immediately)

Example:

```python
wait = limiter.get_wait_time("user_123")
if wait > 0:
    print(f"Please wait {wait:.1f} seconds")
```

#### `reset(identifier=None) -> None`

Reset the rate limit for an identifier (useful for testing or admin actions).

Example:

```python
# Admin endpoint to reset a user's rate limit
@app.post("/admin/reset-limit/{user_id}")
async def reset_user_limit(user_id: str):
    limiter.reset(user_id)
    return {"message": f"Rate limit reset for {user_id}"}
```
## Backend Comparison

### In-Memory Backend

Pros:

- Ultra-fast (no network overhead)
- Zero dependencies
- Perfect for development/testing

Cons:

- Not shared across app instances
- Lost on restart
- Not suitable for production clusters

Use when:

- Development/testing
- Single-instance deployments
- Non-critical rate limiting

### Redis Backend

Pros:

- Shared across all app instances
- Persistent across restarts
- Atomic operations (race-condition free)
- Horizontally scalable

Cons:

- ~10% slower (network latency)
- Requires a Redis service
- Additional infrastructure cost

Use when:

- Production environments
- Load-balanced apps (multiple instances)
- Microservices architectures
- Rate limits must persist across restarts

Set up Redis:

```bash
# Docker
docker run -d --name redis -p 6379:6379 redis

# Or use Redis Cloud (free tier)
# https://redis.com/try-free/
```
## Best Practices

### 1. Choose the Right Algorithm

```python
# For most APIs: Sliding Window
limiter = SlidingWindowRateLimiter(...)

# For burst-heavy traffic: Token Bucket
limiter = TokenBucketLimiter(...)

# For critical operations: Sliding Window Log
limiter = SlidingWindowLogRateLimiter(...)
```

### 2. Use the Appropriate Scope

```python
# Authenticated users: per-user
scope="user"

# Public APIs: per-IP
scope="ip"

# Infrastructure protection: global
scope="global"
```

### 3. Set Realistic Limits

```python
# Don't be too strict!
# Bad: 10 req/hour (users will be frustrated)
# Good: 1000 req/hour (generous but protective)
limiter = SlidingWindowRateLimiter(
    capacity=1000,
    fill_rate=1000/3600,  # 1000 per hour
    scope="user",
    backend="redis"
)
```

### 4. Always Include a Retry-After Header

```python
if not limiter.allow_request(user_id):
    wait_time = limiter.get_wait_time(user_id)
    raise HTTPException(
        status_code=429,
        headers={"Retry-After": str(int(wait_time))}
    )
```
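On the consumer side, a well-behaved client honors that header. Below is a minimal sketch of a retry helper; it is not part of this library, and the transport is injected as a callable so the logic is easy to test:

```python
import time

def request_with_retries(do_request, max_attempts: int = 3, sleep=time.sleep):
    """Call do_request() until it succeeds, honoring 429 + Retry-After.
    do_request returns a (status_code, headers, body) tuple."""
    for attempt in range(max_attempts):
        status, headers, body = do_request()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait the server-specified delay before retrying
            sleep(float(headers.get("Retry-After", 1)))
    return status, body

# Simulated server: the first call is rate limited, the second succeeds.
responses = iter([(429, {"Retry-After": "2"}, None), (200, {}, "ok")])
waits = []
status, body = request_with_retries(lambda: next(responses), sleep=waits.append)
print(status, body, waits)
```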
### 5. Monitor Your Rate Limiters

```python
@app.get("/metrics/rate-limits")
async def get_rate_limit_metrics():
    return {
        "global": global_limiter.get_status(None),
        "sample_user": user_limiter.get_status("user_123")
    }
```
## Testing

### Run All Tests

```bash
# Run the comprehensive test suite
python tests/all_limiter_test.py

# Or with pytest
pytest tests/
```

### Test Coverage

- All 6 algorithms
- All 3 scopes (global, user, IP)
- Both backends (memory, Redis)
- Concurrent access (thread safety)
- Multi-layer rate limiting
- Total: 72 test scenarios

### Example Test Output

```
==========================================================================================
FINAL COMPARISON SUMMARY
==========================================================================================
Limiter               Backend   Allowed   Blocked   Rate/s   Success%
------------------------------------------------------------------------------------------
Fixed Window          memory    168       145       15.99    53.7%
Sliding Window        redis     153       80        14.32    65.7%  *
Sliding Window Log    redis     155       95        14.58    62.0%
Token Bucket          memory    50        262       4.76     16.0%

Best Performance: Sliding Window (redis): 14.32 req/s with 65.7% success rate
```
## Contributing

We welcome contributions! Here's how:

1. Fork the repo
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Commit your changes: `git commit -m 'Add amazing feature'`
4. Push to the branch: `git push origin feature/amazing-feature`
5. Open a Pull Request

### Development Setup

```bash
git clone https://github.com/awais7012/FastAPI-RateLimiter.git
cd FastAPI-RateLimiter
pip install -e ".[dev]"
```

### Code Style

- Follow PEP 8
- Add docstrings to all functions
- Include type hints
- Write tests for new features
## Troubleshooting

### Redis Connection Errors

Problem: `redis.exceptions.ConnectionError`. Solution: ensure Redis is running.

```bash
# Check Redis
docker ps | grep redis

# Or start Redis
docker run -d --name redis -p 6379:6379 redis
```

### Rate Limits Not Shared Across Instances

Problem: rate limits don't apply across multiple app instances. Solution: use the Redis backend, not memory.

```python
# Wrong (memory backend)
limiter = SlidingWindowRateLimiter(..., backend="memory")

# Correct (Redis backend)
limiter = SlidingWindowRateLimiter(..., backend="redis", redis_client=redis_client)
```

### Import Errors

Problem: `ModuleNotFoundError: No module named 'fastapi_advanced_rate_limiter'`. Solution: install the package from PyPI.

```bash
pip install fastapi-advanced-rate-limiter

# Or if you cloned the repo
pip install -e .
```
## License

MIT License - see the LICENSE file for details.

## Acknowledgments

- Inspired by the FastAPI documentation style
- Rate limiting algorithms based on industry best practices
- Built with care by Ahmed Awais (Romeo)

## Support

- Bug reports: GitHub Issues
- Discussions: GitHub Discussions
- Email: ahmadawaisgithub@gmail.com
- PyPI: fastapi-advanced-rate-limiter

Built for developers who love clean, scalable FastAPI tooling. Star the repo on GitHub if you find it useful!