Rate Limiting: Protecting APIs from DDoS and Brute-Force Attacks (2024)

Rate limiting protects APIs and websites from abuse, brute-force attacks, and some forms of DDoS. Without it, a single client can send 10,000 requests per second and take your server down. OWASP lists the absence of rate limiting in its API Security Top 10 (API4:2019 "Lack of Resources & Rate Limiting", broadened to "Unrestricted Resource Consumption" in the 2023 edition). This article covers rate limiting in detail: the algorithms, Nginx and Redis configuration, and best practices for protecting your API.

If you are building a public API or a SaaS product, rate limiting is a required feature. The reason is economic as well as security-related: no limit means runaway costs. AWS API Gateway bills per request, so a single abusive client can multiply your bill, by as much as 100x in a bad case.

What Is Rate Limiting, and Why Does It Matter?

Rate limiting caps the number of requests a client can send within a given period of time. For example: 100 requests per minute for each IP, or 1000 requests per hour for each API key.

Prevent abuse: No single client can consume all of the server's resources. This is the most common use case for rate limiting.
Cost control: When your API's pricing is based on request count, no limit means runaway costs. With managed gateways such as AWS API Gateway or Google Cloud API Gateway, every request is billed.
Fairness: All users get stable access, because no client can monopolize bandwidth.
Security: Slows down brute-force login, credential stuffing, and scraping, and blunts application-layer DDoS attacks.
System stability: Smooths spikes and helps avoid cascading failures when traffic surges, so the system sheds excess load instead of collapsing under it.

The 6 Most Common Rate Limiting Algorithms

There are many rate limiting algorithms, each with different trade-offs. Below are the six most common, from simple to complex.

1. Fixed Window Counter

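Counts requests within a fixed time window (for example, 100 requests per minute). It is the simplest rate limiting algorithm, but it has an edge case at window boundaries: 59 requests at 00:59 plus 59 requests at 01:01 gives 118 requests in about two seconds, even though each window stayed under its limit. In the worst case a client gets through twice the limit around a boundary.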

-- Redis implementation (Lua script) for Fixed Window
-- KEYS[1] = counter key, ARGV[1] = limit, ARGV[2] = window length in seconds
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local current = redis.call('INCR', key)
if current == 1 then
  redis.call('EXPIRE', key, window)
end
if current > limit then
  return 0 -- rejected
end
return 1 -- allowed
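
To make the boundary problem concrete, here is a minimal in-memory sketch of the same fixed-window logic in JavaScript. The class name `FixedWindowLimiter`, the client ID, and the explicit `nowMs` argument are assumptions made for illustration; a real deployment would keep the counter in Redis as in the Lua script above.

```javascript
// In-memory fixed-window counter (illustrative; state is not shared across
// processes and old windows are never evicted in this sketch).
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.counters = new Map(); // "client:windowId" -> request count
  }

  // Returns true if a request from clientId is allowed at time nowMs.
  allow(clientId, nowMs) {
    const windowId = Math.floor(nowMs / this.windowMs);
    const key = `${clientId}:${windowId}`;
    const count = (this.counters.get(key) || 0) + 1;
    this.counters.set(key, count);
    return count <= this.limit;
  }
}

// Boundary burst: 100 requests at 00:59 and 100 more at 01:01.
const fw = new FixedWindowLimiter(100, 60000);
let fwAllowed = 0;
for (let i = 0; i < 100; i++) if (fw.allow('1.2.3.4', 59000)) fwAllowed++;
for (let i = 0; i < 100; i++) if (fw.allow('1.2.3.4', 61000)) fwAllowed++;
console.log(fwAllowed); // 200 requests accepted within about two seconds
```

Both bursts fall into different windows, so each window sees exactly its limit and nothing is rejected. The sliding-window variants below exist to close this gap.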

2. Sliding Window Log

More accurate than Fixed Window. It stores a timestamp for every request and rejects a request when the number of timestamps inside the window has reached the limit. Accuracy is high, but so is memory use (one entry per request), which makes it a poor fit for high-traffic APIs.

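-- Redis sorted-set approach for Sliding Window Log
-- KEYS[1] = rate-limit key
-- ARGV = { now in ms, window in seconds, limit, unique request id }
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2]) * 1000
local limit = tonumber(ARGV[3])

-- Drop timestamps that have fallen out of the window
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
local count = redis.call('ZCARD', key)
if count >= limit then
  return 0 -- rejected
end
-- The member must be unique per request: using the timestamp alone would
-- collapse requests that arrive in the same millisecond into one entry
redis.call('ZADD', key, now, ARGV[1] .. ':' .. ARGV[4])
redis.call('PEXPIRE', key, window)
return 1 -- allowed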
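
The same idea can be sketched in plain JavaScript to show how it behaves at a window boundary. This is an in-memory illustration only; the class name `SlidingWindowLog` and the injected `nowMs` clock are assumptions, and the pruning step mirrors what `ZREMRANGEBYSCORE` does in the Redis version.

```javascript
// In-memory sliding window log: one stored timestamp per accepted request.
class SlidingWindowLog {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.logs = new Map(); // clientId -> ascending array of timestamps (ms)
  }

  allow(clientId, nowMs) {
    const log = this.logs.get(clientId) || [];
    // Drop timestamps at or before (now - window)
    while (log.length > 0 && log[0] <= nowMs - this.windowMs) log.shift();
    this.logs.set(clientId, log);
    if (log.length >= this.limit) return false;
    log.push(nowMs);
    return true;
  }
}

// Same boundary burst as before: 100 requests at 00:59, 100 more at 01:01.
const swl = new SlidingWindowLog(100, 60000);
let swlAllowed = 0;
for (let i = 0; i < 100; i++) if (swl.allow('1.2.3.4', 59000)) swlAllowed++;
for (let i = 0; i < 100; i++) if (swl.allow('1.2.3.4', 61000)) swlAllowed++;
console.log(swlAllowed); // 100: the second burst is rejected
```

The second burst is refused because the 100 timestamps from 00:59 are still inside the trailing 60-second window. The cost is visible in the data structure: memory grows with the number of accepted requests per client.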

3. Token Bucket (Recommended)

The most widely used rate limiting algorithm. Each client has a bucket of tokens, each request consumes one token, and the bucket refills at a fixed rate. Clients can burst by spending stored tokens, but the average rate never exceeds the refill rate. It is the recommended choice for most use cases.

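-- Token bucket Lua script for Redis
local key = KEYS[1]
local limit = tonumber(ARGV[1])     -- bucket capacity (maximum burst)
local rate = tonumber(ARGV[2])      -- tokens per second
local now = tonumber(ARGV[3])       -- current time in seconds (same unit as rate)
local requested = tonumber(ARGV[4]) -- tokens this request costs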

local data = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(data[1]) or limit
local last_refill = tonumber(data[2]) or now

local elapsed = math.max(0, now - last_refill)
local refill = elapsed * rate
tokens = math.min(limit, tokens + refill)

local allowed = 0
if tokens >= requested then
  tokens = tokens - requested
  allowed = 1
end

-- HSET accepts multiple field/value pairs since Redis 4.0; HMSET is deprecated
redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
redis.call('EXPIRE', key, 3600)
return allowed
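
For readers who want to step through the refill arithmetic without Redis, here is a small in-memory sketch that follows the same steps as the Lua script. The class name `TokenBucket`, the method name `tryRemove`, and the millisecond clock passed in by the caller are assumptions for illustration.

```javascript
// In-memory token bucket mirroring the Lua script: refill, cap, then spend.
class TokenBucket {
  constructor(capacity, refillPerSec, nowMs) {
    this.capacity = capacity;         // maximum burst size
    this.refillPerSec = refillPerSec; // sustained rate
    this.tokens = capacity;           // a new bucket starts full
    this.lastRefillMs = nowMs;
  }

  tryRemove(nowMs, requested = 1) {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens < requested) return false;
    this.tokens -= requested;
    return true;
  }
}

const tb = new TokenBucket(10, 5, 0); // burst of 10, refill 5 tokens/second
let tbBurst = 0;
for (let i = 0; i < 15; i++) if (tb.tryRemove(0)) tbBurst++;
console.log(tbBurst);                // 10: the stored tokens allow a burst
console.log(tb.tryRemove(1000, 5));  // true: 5 tokens refilled after 1 s
console.log(tb.tryRemove(1000));     // false: the bucket is empty again
```

Capacity controls how large a burst can be, while the refill rate controls the long-run average, which is why the two are configured separately.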

4. Leaky Bucket

Requests are processed at a fixed rate, and anything that overflows the bucket is dropped. This is the algorithm behind Nginx's limit_req. It is well suited to smoothing out traffic spikes.
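
A minimal sketch of the idea, assuming the "meter" form of the algorithm: the bucket level drains at a constant rate, each request adds one unit, and a request that would overflow the bucket is rejected immediately. The class name `LeakyBucket` and the injected clock are hypothetical; a queue-based variant would delay requests instead of rejecting them.

```javascript
// Leaky bucket as a meter: level drains at leakPerSec, each request adds 1.
class LeakyBucket {
  constructor(capacity, leakPerSec, nowMs) {
    this.capacity = capacity;     // how much backlog the bucket may hold
    this.leakPerSec = leakPerSec; // constant drain (processing) rate
    this.level = 0;
    this.lastMs = nowMs;
  }

  offer(nowMs) {
    const elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
    this.level = Math.max(0, this.level - elapsedSec * this.leakPerSec);
    this.lastMs = nowMs;
    if (this.level + 1 > this.capacity) return false; // overflow: drop
    this.level += 1;
    return true;
  }
}

const lb = new LeakyBucket(5, 1, 0); // holds 5, drains 1 request/second
const lbResults = [];
for (let i = 0; i < 7; i++) lbResults.push(lb.offer(0));
console.log(lbResults);      // five true, then two false
console.log(lb.offer(2000)); // true: two units have leaked out after 2 s
```

Compared with the token bucket, the output rate here is what stays constant, so downstream services see a steady flow even when clients arrive in bursts.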

5. Sliding Window Counter

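Combines Fixed Window and Sliding Window. It keeps one counter per fixed window and uses a weighted average of the current and previous windows to estimate the number of requests in the sliding window. It is more memory-efficient than Sliding Window Log and more accurate than Fixed Window.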
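
One common way to compute the weighted average is: estimated = previous_count × (1 − elapsed_fraction_of_current_window) + current_count. The sketch below applies that formula for a single client; the class name `SlidingWindowCounter` and the injected clock are assumptions, and a production version would keep the two counters per client in a shared store.

```javascript
// Sliding window counter for one client: two counters instead of a full log.
class SlidingWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windowId = 0;
    this.curr = 0; // accepted requests in the current fixed window
    this.prev = 0; // accepted requests in the previous fixed window
  }

  allow(nowMs) {
    const id = Math.floor(nowMs / this.windowMs);
    if (id !== this.windowId) {
      // Roll over: the old count only matters if it is the adjacent window
      this.prev = id === this.windowId + 1 ? this.curr : 0;
      this.curr = 0;
      this.windowId = id;
    }
    const elapsedFraction = (nowMs % this.windowMs) / this.windowMs;
    const estimated = this.prev * (1 - elapsedFraction) + this.curr;
    if (estimated >= this.limit) return false;
    this.curr += 1;
    return true;
  }
}

// Boundary burst again: 100 requests at 00:59, 100 more at 01:01.
const swc = new SlidingWindowCounter(100, 60000);
let swcAllowed = 0;
for (let i = 0; i < 100; i++) if (swc.allow(59000)) swcAllowed++;
for (let i = 0; i < 100; i++) if (swc.allow(61000)) swcAllowed++;
console.log(swcAllowed); // 102
```

At 01:01 the previous window still carries a weight of 59/60, so the estimate starts at about 98.3 and only two more requests fit. The result is an approximation (it assumes requests were spread evenly over the previous window), which is the price of storing two numbers instead of every timestamp.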

6. Adaptive Rate Limiting

The limit changes with system load: when the server is busy the limit drops, and when it is idle the limit rises. This is a more advanced approach, typically used in microservices architectures.
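
As a rough illustration, the function below scales a base limit by a load figure between 0 and 1. The function name `adaptiveLimit`, the 50% threshold, the linear scaling, and the 10% floor are all assumptions chosen for the example; how load is measured (CPU, queue depth, latency) is outside the sketch.

```javascript
// Scale a base limit down as load rises. Thresholds here are illustrative.
function adaptiveLimit(baseLimit, load, minFactor = 0.1) {
  const clamped = Math.min(1, Math.max(0, load));
  // Full limit up to 50% load, then scale linearly down to minFactor at 100%
  const factor = clamped <= 0.5 ? 1 : 1 - (clamped - 0.5) * 2 * (1 - minFactor);
  return Math.max(1, Math.round(baseLimit * factor));
}

console.log(adaptiveLimit(100, 0.2));  // 100: idle server, full limit
console.log(adaptiveLimit(100, 0.75)); // 55: limit reduced under load
console.log(adaptiveLimit(100, 1.0));  // 10: saturated server, floor applies
```

The computed value would then be fed into whichever limiter is in use, for example as the capacity of a token bucket.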

Configuring Nginx Rate Limiting in Detail

Nginx has native rate limiting built on the leaky bucket algorithm. It enforces limits at the HTTP layer with no external dependency, which makes it a good fit for small to medium applications.

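# nginx.conf (http context)
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

# Emit Retry-After only on 429 responses; add_header skips empty values.
# A bare "add_header Retry-After 60 always;" would tag every response.
map $status $retry_after {
    429     60;
    default "";
}

# Virtual host / location for rate limiting
location /api/ {
    limit_req zone=api_limit burst=20 nodelay;

    # Return 429 (instead of the default 503) with a Retry-After header
    limit_req_status 429;
    add_header Retry-After $retry_after always;

    proxy_pass http://backend;
}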

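# Per-user rate limiting, keyed on the Authorization header (e.g. an API key).
# Requests without the header have an empty key and are not counted by this zone.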
limit_req_zone $http_authorization zone=per_user:10m rate=5r/s;

location /api/private/ {
    limit_req zone=per_user burst=10 nodelay;
    auth_request /verify_token;
    proxy_pass http://backend;
}

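# Rate limiting for POST requests only. limit_req_zone takes a single key, and
# $request_body has not been read yet when the limit is checked, so key on the
# client address and leave the key empty (= not limited) for other methods.
map $request_method $post_limit_key {
    POST    $binary_remote_addr;
    default "";
}
limit_req_zone $post_limit_key zone=post_limit:10m rate=1r/s;

location /api/submit/ {
    limit_req zone=post_limit burst=5;
    limit_req_status 429;
    proxy_pass http://backend;
}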

Redis-Based Rate Limiting (Distributed)

Nginx's limit_req state is local to each server and is not synchronized across instances. Redis-based rate limiting keeps its state in a shared store, so it scales horizontally and suits distributed systems.

Express.js + Redis middleware

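const express = require('express');
const Redis = require('ioredis');

const app = express();
const client = new Redis(); // defaults to 127.0.0.1:6379

// Fixed-window limiter: one counter per client IP per window.
// Not async: app.use() needs the middleware function itself, not a Promise.
function rateLimit(options = {}) {
  const { maxRequests = 100, windowMs = 60000, keyPrefix = 'rl:' } = options;

  return async (req, res, next) => {
    try {
      const now = Date.now();
      const windowId = Math.floor(now / windowMs);
      const resetAt = (windowId + 1) * windowMs; // end of the current window (ms)
      const key = `${keyPrefix}${req.ip}:${windowId}`;

      // INCR and PEXPIRE in one MULTI so counter keys always expire
      const results = await client.multi().incr(key).pexpire(key, windowMs).exec();
      const current = results[0][1];
      const retryAfter = Math.ceil((resetAt - now) / 1000);

      res.set('X-RateLimit-Limit', String(maxRequests));
      res.set('X-RateLimit-Remaining', String(Math.max(0, maxRequests - current)));
      res.set('X-RateLimit-Reset', String(Math.ceil(resetAt / 1000))); // Unix seconds

      if (current > maxRequests) {
        // Headers must be set before json() sends the response
        res.set('Retry-After', String(retryAfter));
        return res.status(429).json({ error: 'Too many requests', retryAfter });
      }
      next();
    } catch (err) {
      next(err); // surface Redis failures to the Express error handler
    }
  };
}

app.use('/api', rateLimit({ maxRequests: 100, windowMs: 60000 }));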

PHP / Laravel middleware

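<?php

// The opening of this snippet was lost in the source. The namespace, class name
// (RateLimit), and default parameter values below are assumptions.
namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Redis;

class RateLimit
{
  public function handle(Request $request, Closure $next, $maxRequests = 100, $windowSeconds = 60)
  {
    // Route middleware parameters arrive as strings
    $maxRequests = (int) $maxRequests;
    $windowSeconds = (int) $windowSeconds;

    // One counter per client IP per fixed window
    $key = 'rl:' . $request->ip() . ':' . floor(time() / $windowSeconds);
    
    $current = Redis::incr($key);
    if ($current === 1) {
      Redis::expire($key, $windowSeconds);
    }
    
    // ttl() returns -1 or -2 if the key has no expiry or is gone; never advertise less than 1 second
    $ttl = max(1, Redis::ttl($key));
    
    if ($current > $maxRequests) {
      return response()->json([
        'error' => 'Too many requests',
        'retryAfter' => $ttl
      ], 429)->withHeaders([
        'Retry-After' => $ttl,
        'X-RateLimit-Limit' => $maxRequests,
        'X-RateLimit-Remaining' => 0
      ]);
    }
    
    return $next($request)->withHeaders([
      'X-RateLimit-Limit' => $maxRequests,
      'X-RateLimit-Remaining' => max(0, $maxRequests - $current)
    ]);
  }
}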

API Gateway Rate Limiting (Kong, Tyk, AWS)

If you use an API gateway (Kong, Tyk, AWS API Gateway), rate limiting is handled at the gateway level: simple, with no need to implement it in application code. For microservices this is usually the most practical place to enforce it, because every service behind the gateway gets the same policy.

Kong: the rate-limiting-advanced plugin (Kong Enterprise) supports a Redis backend, sliding windows, and multiple limits per consumer; the open-source rate-limiting plugin covers the basic cases.
Tyk: quota plus rate limiter, per-key configuration, global and per-endpoint limits.
AWS API Gateway: throttling (burst + rate), per-API-key quotas, and Usage Plans, all built in.
Cloudflare: Rate Limiting rules run at the edge, so rejected requests never reach your origin.

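# Kong declarative config (decK) for rate limiting
# rate-limiting-advanced is a Kong Enterprise plugin; check field names against
# the plugin schema for your Kong version.
plugins:
- name: rate-limiting-advanced
  config:
    limit: [100]          # array: one entry per window
    window_size: [60]     # seconds, paired with limit
    window_type: sliding
    identifier: consumer
    strategy: redis
    sync_rate: 1          # seconds between counter syncs to Redis
    redis:
      host: redis.cluster.internal
      port: 6379
      # decK substitutes environment variables that carry the DECK_ prefix
      password: ${{ env "DECK_REDIS_PASSWORD" }}
      timeout: 1000
    error_code: 429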

Security-specific Rate Limiting (Brute-force Protection)

Rate limiting is not only for general API traffic; it also belongs on login, password reset, and other sensitive endpoints, where it is the main defense against brute-force attacks.

Login Brute-Force Protection

Progressive lockout: 5 failed attempts trigger a 1-minute lock, 10 attempts a 15-minute lock, and 20 attempts a 1-hour lock; beyond that, lock durations keep growing exponentially (exponential backoff).
CAPTCHA after N attempts: Require Google reCAPTCHA v2 or hCaptcha after 3 failed logins. Combining rate limiting with a CAPTCHA raises the cost of automated guessing.
Log & alert: Alert when an IP or user is flagged repeatedly, and monitor rate limit violations to detect attacks early.
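
The lockout schedule above can be expressed as a small lookup function. This is a sketch under stated assumptions: the thresholds come from the list, while the function name `lockoutSeconds`, the rule of doubling every 5 further failures past 20, and the 24-hour cap are choices made for the example.

```javascript
// Map a count of consecutive failed logins to a lock duration in seconds.
function lockoutSeconds(failedAttempts) {
  if (failedAttempts < 5) return 0;
  if (failedAttempts < 10) return 60;      // 1 minute
  if (failedAttempts < 20) return 15 * 60; // 15 minutes
  // 20 attempts -> 1 hour, then double for every 5 further failures (assumed)
  const doublings = Math.floor((failedAttempts - 20) / 5);
  return Math.min(24 * 3600, 3600 * 2 ** doublings); // capped at 24 h (assumed)
}

console.log(lockoutSeconds(3));  // 0
console.log(lockoutSeconds(5));  // 60
console.log(lockoutSeconds(12)); // 900
console.log(lockoutSeconds(20)); // 3600
console.log(lockoutSeconds(25)); // 7200
```

The failure counter is usually tracked per account and per IP separately, and reset on a successful login, so that one attacker cannot lock out a legitimate user indefinitely from a single address.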

API Scraping Prevention

Request fingerprinting: Detect bots through User-Agent patterns, missing headers, or a challenge that requires JavaScript execution (challenge-response).
Signature-based throttle: Require clients to sign each request with a secret key and verify the signature server-side, so limits can be tied to a verified identity rather than an IP.
Behavior analysis: If an IP's access pattern looks like scraping (sequential IDs, full scans), challenge or block it. This is a form of adaptive rate limiting.
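
Request signing is typically done with an HMAC. The sketch below uses Node's built-in `crypto` module; the canonical string layout (method, path, timestamp, body joined by newlines), the function names, and the 300-second clock-skew allowance are assumptions for illustration, not a standard.

```javascript
const nodeCrypto = require('node:crypto');

// Sign "METHOD\nPATH\nTIMESTAMP\nBODY" with HMAC-SHA256 (layout is assumed).
function signRequest(secret, method, path, timestamp, body) {
  const payload = [method.toUpperCase(), path, String(timestamp), body].join('\n');
  return nodeCrypto.createHmac('sha256', secret).update(payload).digest('hex');
}

function verifyRequest(secret, method, path, timestamp, body, signature, nowSec, maxSkewSec = 300) {
  // Reject stale timestamps to limit replay of captured requests
  if (Math.abs(nowSec - timestamp) > maxSkewSec) return false;
  const expected = Buffer.from(signRequest(secret, method, path, timestamp, body), 'hex');
  const given = Buffer.from(String(signature), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return given.length === expected.length && nodeCrypto.timingSafeEqual(given, expected);
}

const demoSig = signRequest('s3cret', 'post', '/api/orders', 1700000000, '{"id":1}');
console.log(verifyRequest('s3cret', 'POST', '/api/orders', 1700000000, '{"id":1}', demoSig, 1700000010)); // true
console.log(verifyRequest('s3cret', 'POST', '/api/orders', 1700000000, '{"id":2}', demoSig, 1700000010)); // false
```

Once a request carries a verified signature, the rate limit key can be the client's key ID, which is much harder to rotate than an IP address.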

Common HTTP Headers for Rate Limiting

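Rate limiting headers tell the client its limit and when to retry. The `X-RateLimit-*` names below are a widely used convention rather than a formal standard; `Retry-After` is defined by the HTTP specification (RFC 9110):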

Header	Meaning	Example
`X-RateLimit-Limit`	Maximum requests per window	100
`X-RateLimit-Remaining`	Requests remaining in the current window	45
`X-RateLimit-Reset`	Unix timestamp (seconds) at which the limit resets	1716547200
`Retry-After`	Seconds to wait before retrying (sent with a 429)	60
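
A server can derive all four headers from the limiter's state. The helper below is a sketch; the function name `rateLimitHeaders` and its parameters are assumptions, and the example values are the ones from the table.

```javascript
// Build rate limit headers from limiter state. Times are Unix seconds.
function rateLimitHeaders(limit, used, resetEpochSec, nowEpochSec) {
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, limit - used)),
    'X-RateLimit-Reset': String(resetEpochSec),
  };
  if (used > limit) {
    // Only a rejected request needs to be told how long to wait
    headers['Retry-After'] = String(Math.max(1, resetEpochSec - nowEpochSec));
  }
  return headers;
}

const okHeaders = rateLimitHeaders(100, 55, 1716547200, 1716547155);
console.log(okHeaders['X-RateLimit-Remaining']); // "45"
const limitedHeaders = rateLimitHeaders(100, 101, 1716547200, 1716547140);
console.log(limitedHeaders['Retry-After']); // "60"
```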

Response When Rate Limited (429)

When a client exceeds its rate limit, the server returns HTTP 429 Too Many Requests. A typical response body looks like this:

{
  "error": "Too many requests",
  "message": "You have exceeded the limit of 100 requests per minute. Please try again later.",
  "code": "RATE_LIMIT_EXCEEDED",
  "retryAfter": 45,
  "docs": "https://vnhte.com/api-rate-limits"
}

HTTP Status: 429 Too Many Requests. Do not use 403 Forbidden (it suggests a permissions problem) or 503 Service Unavailable (it implies the server is down).
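
On the client side, a well-behaved consumer reads `Retry-After` before retrying. The header may carry either a number of seconds or an HTTP date, so a client needs to handle both. The function name `retryDelayMs` and the 1-second fallback are assumptions for this sketch.

```javascript
// Convert a Retry-After header value into a delay in milliseconds.
function retryDelayMs(retryAfter, nowMs, fallbackMs = 1000) {
  if (retryAfter == null || retryAfter === '') return fallbackMs;
  const value = String(retryAfter).trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000; // delta-seconds form
  const dateMs = Date.parse(value);                      // HTTP-date form
  return Number.isNaN(dateMs) ? fallbackMs : Math.max(0, dateMs - nowMs);
}

console.log(retryDelayMs('45', Date.now())); // 45000
console.log(retryDelayMs(undefined, Date.now())); // 1000 (fallback)
```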

Monitoring & Alerting for Rate Limiting

Without monitoring, you cannot tell whether an attack is under way or whether your limits are hurting real users. These are the metrics and alerts to set up:

Metric: rate_limit_exceeded_total, labeled by endpoint, client (IP or API key), and response code. This is the core metric for rate limit monitoring.
Alert when rate limit rejections exceed 5% of total requests, since at that level legitimate users are probably being affected.
Track abuse patterns: a single IP, a single user token, or geographic clustering.
Dashboard: percentage of requests allowed versus rejected, to judge whether the limits are set sensibly.
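
The counter and the 5% alert rule can be sketched in a few lines. This is an in-memory illustration; the class name `RateLimitMetrics` and its methods are hypothetical, and a real setup would export the counter to a metrics system rather than hold it in process memory.

```javascript
// Track allowed/rejected decisions and evaluate the 5% alert threshold.
class RateLimitMetrics {
  constructor() {
    this.allowed = 0;
    this.rejected = new Map(); // "endpoint|client" -> rejected count
  }

  record(endpoint, clientId, wasAllowed) {
    if (wasAllowed) { this.allowed += 1; return; }
    const key = `${endpoint}|${clientId}`;
    this.rejected.set(key, (this.rejected.get(key) || 0) + 1);
  }

  rejectedTotal() {
    let total = 0;
    for (const n of this.rejected.values()) total += n;
    return total;
  }

  rejectionRatio() {
    const total = this.allowed + this.rejectedTotal();
    return total === 0 ? 0 : this.rejectedTotal() / total;
  }

  shouldAlert(threshold = 0.05) {
    return this.rejectionRatio() > threshold;
  }
}

const metrics = new RateLimitMetrics();
for (let i = 0; i < 94; i++) metrics.record('/api/items', '1.2.3.4', true);
for (let i = 0; i < 6; i++) metrics.record('/api/login', '9.9.9.9', false);
console.log(metrics.shouldAlert()); // true: 6 of 100 requests were rejected
```

Keeping the rejected counts keyed by endpoint and client also makes it easy to spot whether the rejections come from one abusive source or are spread across many users.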

Rate Limiting Best Practices & Checklist

[ ] Token bucket for public APIs, sliding window for sensitive endpoints.
[ ] Use a Redis cluster for distributed rate limiting, so limits hold across multiple servers.
[ ] Return 429 with a correct Retry-After header. Clients need to know when to retry.
[ ] Separate limits for authenticated and anonymous users. Authenticated users usually get a higher limit.
[ ] Progressive lockout on login endpoints to slow brute-force attempts.
[ ] CAPTCHA after N failed attempts, as a second layer on top of rate limiting.
[ ] Log and monitor rate limit violations to detect attacks.
[ ] Allowlist trusted IPs (partners, internal services) so rate limiting is not applied to them.
[ ] Per-endpoint limits, with expensive operations limited more tightly than cheap reads. Write operations usually get a lower limit than read operations.
[ ] Document rate limits in the API docs and warn users before they hit them. Transparency helps users plan.
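
Several checklist items (separate tiers, per-endpoint limits, allowlisting) come down to resolving the right limit for each request. One possible shape for that lookup is sketched below; the policy object, the endpoint paths, the IP address, and every number in it are hypothetical values chosen for the example.

```javascript
// Hypothetical policy: requests per minute by tier, with per-endpoint overrides.
const LIMIT_POLICY = {
  defaults: { anonymous: 60, authenticated: 600 },
  endpoints: {
    'POST /api/login': { anonymous: 5, authenticated: 5 },
    'POST /api/reports': { anonymous: 0, authenticated: 10 },
  },
  allowlist: new Set(['10.0.0.5']), // trusted internal callers
};

function resolveLimit(policy, method, path, ip, isAuthenticated) {
  if (policy.allowlist.has(ip)) return Infinity; // no limit for trusted IPs
  const tier = isAuthenticated ? 'authenticated' : 'anonymous';
  const rule = policy.endpoints[`${method.toUpperCase()} ${path}`] || policy.defaults;
  return rule[tier];
}

console.log(resolveLimit(LIMIT_POLICY, 'GET', '/api/items', '1.2.3.4', false));  // 60
console.log(resolveLimit(LIMIT_POLICY, 'POST', '/api/login', '1.2.3.4', false)); // 5
console.log(resolveLimit(LIMIT_POLICY, 'GET', '/api/items', '10.0.0.5', false)); // Infinity
```

Keeping the policy as data rather than code makes it straightforward to publish the same numbers in the API documentation.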

Conclusion

Rate limiting is a security baseline for any public API. For public APIs, SaaS products, and any application with authentication, it is mandatory. A sound strategy layers three controls: token bucket at the API gateway (the first layer), a Redis-based per-consumer limit in the application, and progressive lockout on login (the security layer). Monitor all of them to catch abuse early.

Implemented correctly, rate limiting protects your API from DDoS, brute-force attacks, and abuse. For related reading, see the guide on website speed optimization and the performance best practices.