Rate-Limit Plugin

The Rate-Limit plugin implements token bucket rate limiting to protect your server from abuse and ensure fair resource usage. It tracks request rates per client and rejects excess requests with a 429 Too Many Requests response and a Retry-After header.

Overview

Rate limiting is essential for protecting web services from abuse, preventing DoS attacks, and ensuring fair resource allocation among users. This plugin uses the token bucket algorithm, which allows burst traffic while maintaining a sustainable request rate over time. Clients are identified by IP address, authenticated user, or Host header, depending on the configured key strategy.

Key Features

Token bucket algorithm: Allows bursts while maintaining average rate
Multiple key strategies: Rate limit by IP, user, or host
Configurable limits: Set requests per second and burst capacity
Standard HTTP responses: Returns 429 Too Many Requests with Retry-After
Rate limit headers: Informs clients about their current limits
Automatic cleanup: Removes old rate limit buckets to save memory

Configuration

The rate-limit plugin is configured as part of the plugin pipeline in your host configuration:

<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/rate-limit.so</span>
    <meta itemprop="requests_per_second" content="10">
    <meta itemprop="burst_capacity" content="20">
    <meta itemprop="key_strategy" content="ip">
</li>

Configuration Parameters

Parameter	Type	Required	Default	Description
`requests_per_second`	Float	No	10.0	Sustained request rate (tokens added per second)
`burst_capacity`	Float	No	2x requests_per_second	Maximum burst size (bucket capacity)
`key_strategy`	String	No	"ip"	How to identify clients: "ip", "user", or "host"
`cleanup_interval`	Integer	No	300	Seconds between cleanup cycles
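
The defaults in the table can be summarized in a short Python sketch. The `RateLimitConfig` class and the `from_params` helper are hypothetical names used only for illustration and are not part of the plugin. The sketch assumes parameters arrive as strings, as they do in the `content` attributes shown above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitConfig:
    """Illustrative mirror of the parameter table (not the plugin's own type)."""
    requests_per_second: float = 10.0
    burst_capacity: Optional[float] = None  # None means "use 2 x requests_per_second"
    key_strategy: str = "ip"
    cleanup_interval: int = 300

    def __post_init__(self) -> None:
        # The documented default for burst_capacity depends on requests_per_second.
        if self.burst_capacity is None:
            self.burst_capacity = 2 * self.requests_per_second


def from_params(params: Mapping[str, str]) -> RateLimitConfig:
    """Build a config from string parameters, filling in the documented defaults."""
    kwargs = {}
    if "requests_per_second" in params:
        kwargs["requests_per_second"] = float(params["requests_per_second"])
    if "burst_capacity" in params:
        kwargs["burst_capacity"] = float(params["burst_capacity"])
    if "key_strategy" in params:
        kwargs["key_strategy"] = params["key_strategy"]
    if "cleanup_interval" in params:
        kwargs["cleanup_interval"] = int(params["cleanup_interval"])
    return RateLimitConfig(**kwargs)
```

For example, `from_params({"requests_per_second": "1"})` yields a burst capacity of 2.0 because no explicit `burst_capacity` was given.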

Token Bucket Algorithm

The token bucket algorithm works as follows:

Each client gets a bucket with a capacity (burst_capacity)
Tokens are added at a constant rate (requests_per_second)
Each request consumes one token
If no tokens are available, the request is rejected
Unused tokens accumulate up to the bucket capacity

Example: With 10 requests/second and burst capacity of 20:

A client can make 20 requests immediately (burst)
Then must wait for tokens to refill at 10/second
After 1 second without requests, 10 tokens have been restored, so they can make 10 more
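
The steps and the worked example above can be reproduced with a minimal Python sketch of a token bucket. This is an illustration of the general algorithm, not the plugin's actual implementation: the `TokenBucket` class, its method names, and the choice to start each bucket full are assumptions. A fake clock is injected so the example is deterministic.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: refills continuously, capped at `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate              # tokens added per second (requests_per_second)
        self.capacity = capacity      # maximum tokens held (burst_capacity)
        self.clock = clock
        self.tokens = capacity        # assumption: a new client starts with a full bucket
        self.last_refill = clock()

    def try_consume(self) -> bool:
        """Refill for the elapsed time, then take one token if available."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Reproduce the example: 10 requests/second, burst capacity of 20.
now = [0.0]                                   # controllable fake clock
bucket = TokenBucket(rate=10, capacity=20, clock=lambda: now[0])

burst = sum(bucket.try_consume() for _ in range(25))
print(burst)       # 20: the initial burst, the remaining 5 are rejected

now[0] += 1.0                                 # one second passes with no requests
refilled = sum(bucket.try_consume() for _ in range(25))
print(refilled)    # 10: one second of refill at 10 tokens/second
```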

Key Strategies

IP-Based (default)

Rate limits are applied per IP address:

Extracts IP from X-Forwarded-For or X-Real-IP headers
Falls back to connection IP if headers absent
Best for public APIs and general protection

User-Based

Rate limits are applied per authenticated user:

Uses authenticated_user metadata from auth plugins
Falls back to IP for unauthenticated requests
Ideal for authenticated APIs

Host-Based

Rate limits are applied per Host header:

Uses the Host header from the request
Useful for multi-tenant applications
Allows different limits per virtual host
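
The three strategies and their fallbacks can be sketched as a single key-selection function. This is a hedged illustration, not the plugin's code: `rate_limit_key` and its parameters are hypothetical, header names are assumed to be lowercased by the caller, taking the first `X-Forwarded-For` entry is an assumption that is only safe behind a trusted proxy, and the IP fallback for a missing Host header is also an assumption.

```python
from typing import Mapping, Optional


def rate_limit_key(strategy: str,
                   headers: Mapping[str, str],
                   connection_ip: str,
                   authenticated_user: Optional[str] = None) -> str:
    """Return the bucket key for a request (header names must be lowercase)."""

    def client_ip() -> str:
        forwarded = headers.get("x-forwarded-for")
        if forwarded:
            # Assumption: the first entry is the original client.
            return forwarded.split(",")[0].strip()
        # Fall back to X-Real-IP, then to the connection's peer address.
        return headers.get("x-real-ip") or connection_ip

    if strategy == "user":
        # Unauthenticated requests fall back to the IP-based key.
        return authenticated_user if authenticated_user else client_ip()
    if strategy == "host":
        # Assumption: fall back to the client IP when no Host header is present.
        return headers.get("host") or client_ip()
    return client_ip()


print(rate_limit_key("ip", {"x-forwarded-for": "203.0.113.7, 10.0.0.1"}, "10.0.0.1"))
print(rate_limit_key("user", {}, "198.51.100.4", authenticated_user="alice"))
print(rate_limit_key("host", {"host": "a.example.com"}, "198.51.100.4"))
```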

Plugin Pipeline Placement

Important: Rate limiting placement depends on your protection goals:

Before auth plugins: Protect against authentication brute force
After auth plugins: Rate limit authenticated users differently

Typical pipeline orders:

Protect Everything (Including Auth)

1. rate-limit.so      → Rate limit all requests ✓
2. basic-auth.so      → Authentication
3. file-handler.so    → Content serving

Rate Limit Authenticated Users

1. basic-auth.so      → Authentication
2. rate-limit.so      → Rate limit by user ✓
3. file-handler.so    → Content serving
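
To see why the order matters for user-based limiting, consider a toy Python model of a pipeline. Everything here is hypothetical (the plugin functions, the dictionary-based request, the hard-coded user); it only illustrates that a rate limiter running before authentication cannot see `authenticated_user` and falls back to the IP.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]              # toy request: a plain metadata dictionary
Plugin = Callable[[Request], None]


def basic_auth(request: Request) -> None:
    # Stand-in for an auth plugin: records who the caller is.
    request["authenticated_user"] = "alice"


def rate_limit(request: Request) -> None:
    # User-based strategy: use the user if known, otherwise fall back to the IP.
    request["rate_limit_key"] = request.get("authenticated_user", request["ip"])


def run(pipeline: List[Plugin], ip: str) -> str:
    """Run plugins in order and report which key the rate limiter chose."""
    request: Request = {"ip": ip}
    for plugin in pipeline:
        plugin(request)
    return request["rate_limit_key"]


before_auth = run([rate_limit, basic_auth], "192.168.1.100")
after_auth = run([basic_auth, rate_limit], "192.168.1.100")
print(before_auth)   # 192.168.1.100: no user metadata yet, so the IP is used
print(after_auth)    # alice: the auth plugin ran first
```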

HTTP Headers

Response Headers (Successful Requests)

Header	Description	Example
`X-RateLimit-Limit`	Maximum requests in current window	20
`X-RateLimit-Remaining`	Requests remaining in current window	15
`X-RateLimit-Reset`	Seconds until limit resets	60

Rate Limited Response (429)

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 10

{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again later.",
  "retry_after_seconds": 10
}
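
On the client side, the `Retry-After` value can be turned into a wait time. The response above uses a number of seconds; HTTP also permits an HTTP-date in this header, so the sketch below handles both as a precaution. The helper name `retry_delay_seconds` and the 10-second fallback are assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(retry_after: Optional[str],
                        now: Optional[datetime] = None,
                        default: float = 10.0) -> float:
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    if retry_after is None:
        return default
    value = retry_after.strip()
    if value.isdigit():
        # Delay given directly in seconds, as in the 429 response above.
        return float(value)
    try:
        when = parsedate_to_datetime(value)   # HTTP-date form
    except (TypeError, ValueError):
        return default                        # unparseable value: use the fallback
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


print(retry_delay_seconds("10"))
fixed_now = datetime(2015, 10, 21, 7, 27, 30, tzinfo=timezone.utc)
print(retry_delay_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now=fixed_now))
```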

Examples

Basic Rate Limiting

<!-- 10 requests per second per IP -->
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/rate-limit.so</span>
</li>

Aggressive Rate Limiting

<!-- 1 request per second, burst of 3 -->
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/rate-limit.so</span>
    <meta itemprop="requests_per_second" content="1">
    <meta itemprop="burst_capacity" content="3">
</li>

User-Based Rate Limiting

<!-- Rate limit authenticated users -->
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/basic-auth.so</span>
    <meta itemprop="authfile" content="file://./users.html">
</li>
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/rate-limit.so</span>
    <meta itemprop="key_strategy" content="user">
    <meta itemprop="requests_per_second" content="100">
</li>

API with Different Limits

<!-- Public API: strict limits -->
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/directory.so</span>
    <meta itemprop="directory" content="/api/public">
</li>
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/rate-limit.so</span>
    <meta itemprop="requests_per_second" content="10">
</li>

<!-- Premium API: relaxed limits -->
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/directory.so</span>
    <meta itemprop="directory" content="/api/premium">
</li>
<li itemprop="plugin" itemscope itemtype="https://rustybeam.net/schema/Plugin">
    <span itemprop="library">file://./plugins/rate-limit.so</span>
    <meta itemprop="requests_per_second" content="1000">
</li>

Testing Rate Limits

Bash Script

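#!/bin/bash
# Test rate limiting by sending requests faster than the bucket refills

URL="http://localhost:3000/api/test"

echo "Making 25 requests rapidly..."
for i in {1..25}; do
    # One request per iteration: dump the response headers, discard the body
    headers=$(curl -s -o /dev/null -D - "$URL")
    code=$(echo "$headers" | head -1 | awk '{print $2}')

    if [ "$code" = "429" ]; then
        echo "Request $i: RATE LIMITED (429)"
        # Read Retry-After from the same response instead of issuing a second request
        retry=$(echo "$headers" | grep -i "^retry-after:" | awk '{print $2}' | tr -d '\r')
        echo "  Retry after: ${retry}s"
    else
        echo "Request $i: SUCCESS ($code)"
    fi
    # No delay here: pausing 0.1s or more would let a 10 requests/second
    # bucket refill as fast as it drains, so no request would ever be limited
done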

JavaScript Client

async function testRateLimit() {
    const url = 'http://localhost:3000/api/data';
    let successCount = 0;
    let rateLimitCount = 0;
    
    for (let i = 0; i < 25; i++) {
        try {
            const response = await fetch(url);
            
            if (response.status === 429) {
                rateLimitCount++;
                const retryAfter = response.headers.get('Retry-After');
                console.log(`Rate limited! Retry after ${retryAfter}s`);
                
                // Parse rate limit info
                const data = await response.json();
                console.log(data);
            } else {
                successCount++;
                const remaining = response.headers.get('X-RateLimit-Remaining');
                console.log(`Success! Remaining: ${remaining}`);
            }
        } catch (error) {
            console.error('Request failed:', error);
        }
        
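        // Keep the delay well below 1/requests_per_second: at 10 requests/second,
        // a 100 ms pause refills one token per request and the bucket never drains
        await new Promise(resolve => setTimeout(resolve, 10));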
    }
    
    console.log(`Results: ${successCount} successful, ${rateLimitCount} rate limited`);
}

testRateLimit();

Python with Retry Logic

import requests
import time

def make_request_with_retry(url, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url)
        
        if response.status_code == 429:
            retry_after = int(response.headers.get('Retry-After', 10))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
        else:
            return response
    
    return None

# Test the rate limiter
url = 'http://localhost:3000/api/data'

for i in range(30):
    response = make_request_with_retry(url)
    if response and response.status_code == 200:
        remaining = response.headers.get('X-RateLimit-Remaining')
        print(f"Request {i+1}: Success (Remaining: {remaining})")
    else:
        print(f"Request {i+1}: Failed after retries")

Best Practices

Set realistic limits: Balance protection with usability
Monitor rate limiting: Track 429 responses to tune limits
Document limits: Inform API users about rate limits
Implement client backoff: Clients should respect Retry-After
Consider user tiers: Different limits for different user types
Whitelist critical services: Health checks, monitoring, etc.

Performance Considerations

Memory usage: Each tracked client uses ~64 bytes
Cleanup cycles: Old buckets removed every 5 minutes
Lock contention: Uses mutex for thread safety
CPU overhead: Minimal - simple arithmetic operations
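
A cleanup cycle can be pictured as a pass over the tracked keys that drops idle ones. The sketch below is an assumption about how such a pass might look, not the plugin's implementation: the `cleanup_stale` name, the `last_seen` map, and the rule that a bucket is "old" once it has been idle longer than the cleanup interval are all hypothetical.

```python
from typing import Dict


def cleanup_stale(last_seen: Dict[str, float], now: float,
                  max_idle: float = 300.0) -> int:
    """Drop keys idle for longer than max_idle seconds; return how many were removed."""
    # Collect first, then delete, to avoid mutating the dict while iterating over it.
    stale = [key for key, seen in last_seen.items() if now - seen > max_idle]
    for key in stale:
        del last_seen[key]
    return len(stale)


seen = {"192.168.1.100": 0.0, "192.168.1.101": 250.0, "192.168.1.102": 599.0}
removed = cleanup_stale(seen, now=600.0)
print(removed, sorted(seen))   # 2 ['192.168.1.102']
```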

Capacity Planning

Memory usage estimation:

Memory = (Number of unique clients) × 64 bytes

Examples:
- 10,000 clients = ~640 KB
- 100,000 clients = ~6.4 MB
- 1,000,000 clients = ~64 MB
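
The same estimate as a small Python helper, using the approximate 64 bytes per client quoted above. The function name is hypothetical, and the figure covers only the per-client bucket, not any overhead of the surrounding map.

```python
BYTES_PER_CLIENT = 64  # approximate per-client figure from this section


def estimated_memory_bytes(unique_clients: int) -> int:
    """Memory = (number of unique clients) x 64 bytes."""
    return unique_clients * BYTES_PER_CLIENT


for clients in (10_000, 100_000, 1_000_000):
    megabytes = estimated_memory_bytes(clients) / 1_000_000
    print(f"{clients:>9,} clients ~ {megabytes:.2f} MB")
```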

Security Considerations

IP spoofing: Use trusted proxy headers only
Distributed attacks: IP-based limiting may not stop botnets
Shared IPs: Clients behind a corporate NAT share one IP address and therefore one limit
Header manipulation: Validate X-Forwarded-For chain
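
One common way to validate an X-Forwarded-For chain is to honor the header only when the direct peer is a known proxy, then walk the chain from right to left and take the first address that is not a trusted proxy. The Python sketch below illustrates that general technique; it is not the plugin's behavior, and the function name, parameters, and the example network `10.0.0.0/8` are assumptions.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip_from_forwarded(connection_ip: str,
                             x_forwarded_for: Optional[str],
                             trusted_proxies: Iterable[str]) -> str:
    """Pick the client IP, trusting X-Forwarded-For only via known proxies."""
    networks = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_valid(ip: str) -> bool:
        try:
            ipaddress.ip_address(ip)
            return True
        except ValueError:
            return False

    def is_trusted(ip: str) -> bool:
        return is_valid(ip) and any(ipaddress.ip_address(ip) in net for net in networks)

    # Ignore the header entirely unless the direct peer is a trusted proxy.
    if not x_forwarded_for or not is_trusted(connection_ip):
        return connection_ip

    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    # Walk from the nearest hop outwards; the first untrusted hop is the client.
    for hop in reversed(hops):
        if not is_trusted(hop):
            # A malformed entry means the chain cannot be relied on.
            return hop if is_valid(hop) else connection_ip
    # Every hop was a trusted proxy: use the outermost one, if any.
    return hops[0] if hops else connection_ip


trusted = ["10.0.0.0/8"]
# The leftmost entry is client-supplied and may be spoofed; it is skipped here.
print(client_ip_from_forwarded("10.0.0.5", "1.2.3.4, 203.0.113.7, 10.0.0.9", trusted))
```

The call above prints `203.0.113.7`, the address appended by the first trusted proxy, rather than the spoofable `1.2.3.4`.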

Troubleshooting

Common Issues

Issue	Cause	Solution
All requests rate limited	Limits too low or shared IP	Increase limits or use different key strategy
No rate limiting occurring	Plugin not in pipeline or limits too high	Check configuration and lower limits for testing
Wrong client identification	Proxy headers not configured	Ensure proxy sends X-Forwarded-For
Memory growth	Cleanup not running	Check cleanup_interval setting
Inconsistent limits	Multiple rate limit plugins	Use single plugin or coordinate limits

Debug Logging

Run the server with the -v flag to see rate limiting decisions:

./rusty-beam -v config.html

[RateLimit] Request blocked for key: 192.168.1.100 (retry after: Some(10s))

Integration with Other Plugins

Basic-Auth Plugin

When using user-based rate limiting, place this plugin after basic-auth in the pipeline so it can read the authenticated_user metadata.

Directory Plugin

Combine with the directory plugin to apply different rate limits to different paths.

Access-Log Plugin

The access-log plugin will log 429 responses, helping monitor rate limit effectiveness.