Rate Limiting for APIs — How to Protect Your API from Abuse with Throttling and Effective Controls

[2026-06-23] Author: Ing. Calogero Bono

Do you have a public API? If you haven't implemented rate limiting, someone is probably using it for scraping, DDoS, or brute force. It's not a matter of 'if' but 'when'. At Meteora Web, we see it every day in projects coming to us: unprotected APIs collapsing under malicious load or burning third-party budgets because a misconfigured client spams requests.

In this hands-on guide, we'll show you what rate limiting is, how to implement it in Node.js with Express and in NGINX, and which throttling strategies to use to protect your APIs without penalizing legitimate users.

What are rate limiting and throttling, and why are they essential for APIs?

Rate limiting controls the number of requests a client can make within a given time window. For example: max 100 requests every 15 minutes per IP. Throttling is a softer variant: instead of blocking, it slows down the response or reduces priority. Analogy: rate limiting is a traffic light that turns red (block), throttling is a speed bump that forces slowing down.
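
To make the "100 requests every 15 minutes per IP" example concrete, here is a minimal in-memory sketch of a fixed-window counter. The names createFixedWindowLimiter and isAllowed are hypothetical, and the clock is injected so the behavior can be checked without waiting. It is an illustration of the idea under those assumptions, not a substitute for a maintained library.

```javascript
// Minimal fixed-window counter (illustrative sketch, not production code).
// Assumptions: state lives in a Map inside a single process, and entries
// are never evicted, so memory grows with the number of distinct clients.
function createFixedWindowLimiter({ windowMs, max, now = Date.now }) {
  const hits = new Map(); // clientId -> { windowStart, count }

  return function isAllowed(clientId) {
    const t = now();
    const entry = hits.get(clientId);
    // Start a new window if none exists or the current one has expired.
    if (!entry || t - entry.windowStart >= windowMs) {
      hits.set(clientId, { windowStart: t, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}
```

Calling isAllowed(req.ip) once per request and answering 429 when it returns false gives the "red light" behavior described above; a throttling variant would delay the response instead of rejecting it.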

Why do you need them? For four reasons: protection from abuse, security, fair sharing of capacity among clients, and cost control. An e-commerce client we worked with had a public catalog API with no limits. A competitor scraped the entire database every 10 seconds for days. Simple rate limiting would have stopped it. Instead, the server crashed and the client lost sales for hours.

An API without rate limiting is like a bank without doors: sooner or later someone walks in. If you integrate paid third-party APIs (e.g., OpenAI, Stripe), rate limiting prevents you from exceeding your budget due to an infinite loop.

Action for you: Check if your API has a limit right now. Open a terminal and fire 50 consecutive requests with curl. If every response is 200 and none carries rate-limit headers, you probably have no limit in place.

How to implement rate limiting in a REST API with Node.js and Express?

One of the fastest ways to add rate limiting to an Express API is the express-rate-limit middleware. It counts requests per fixed window out of the box, can share counters across servers through external stores such as Redis, and emits standard rate-limit headers. We use it in production for dozens of projects.

Install the package:

npm install express-rate-limit

Basic configuration:

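const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();
// Behind a reverse proxy, configure app.set('trust proxy', ...) so that
// req.ip is the client address rather than the proxy's.

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per window
  standardHeaders: true, // send RateLimit-* response headers
  legacyHeaders: false, // omit the older X-RateLimit-* headers
  message: { error: 'Too many requests. Please try again later.' }
});

// Apply to all /api/ routes
app.use('/api/', limiter);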

You can customize per route. For authentication (prone to brute force), use stricter limits:

const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  message: { error: 'Too many login attempts. Try again in 15 minutes.' }
});
app.use('/api/auth/login', authLimiter);

Important: In multi-server environments, use a shared store like Redis. Example with rate-limit-redis:

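// Depending on the rate-limit-redis version, the class is either the default
// export or a named export ({ RedisStore }); check the README for your version.
const RedisStore = require('rate-limit-redis');
// Assumption: ./redis exports a connected ioredis client (it provides call()).
const redisClient = require('./redis');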

const limiter = rateLimit({
  store: new RedisStore({
    sendCommand: (...args) => redisClient.call(...args),
  }),
  windowMs: 15 * 60 * 1000,
  max: 100,
});

Action for you: Add this middleware to your project. Test with a simple script:

for i in {1..110}; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/api/; done | tail -n 15

You should see 429 after the hundredth attempt.

Which throttling strategies should you choose for your use case?

Not all APIs have the same traffic pattern. Here are the most common strategies with pros and cons:

  • Fixed Window – counts requests in a fixed time interval. Simple but allows bursts at the window boundary. Good to start.
  • Sliding Window – tracks timestamps of requests, more precise. We use it almost always in production with Redis.
  • Token Bucket – accumulates tokens over time, allows controlled bursts. Ideal for APIs with occasional bursts (e.g., push notifications).
  • Leaky Bucket – requests enter a queue and are processed at a constant rate. Useful for backend with limited resources.
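
As an illustration of the token bucket entry in the list above, here is a minimal in-memory sketch. The names createTokenBucket and tryRemoveToken are hypothetical, and the injected clock exists only to make the behavior easy to verify. The capacity bounds the size of a burst, while the refill rate sets the sustained throughput.

```javascript
// Token bucket sketch: `capacity` bounds the burst, `refillPerSec` the
// sustained rate. Assumption: one bucket per client, kept in process memory.
function createTokenBucket({ capacity, refillPerSec, now = Date.now }) {
  let tokens = capacity; // start full, so an initial burst is allowed
  let last = now();

  return function tryRemoveToken() {
    const t = now();
    // Refill in proportion to the elapsed time, capped at capacity.
    tokens = Math.min(capacity, tokens + ((t - last) / 1000) * refillPerSec);
    last = t;
    if (tokens >= 1) {
      tokens -= 1;
      return true; // request may proceed
    }
    return false; // bucket empty: reject or delay the request
  };
}
```

A leaky bucket differs mainly in what happens to the excess: instead of rejecting when the bucket is empty, requests wait in a queue that drains at a constant rate.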

At Meteora Web, we chose sliding window with Redis for a client integrating a billing API with usage-based pricing. With a simple token bucket we would have exceeded the budget during peaks, because bursts are allowed by design. With sliding window we limited requests granularly, saving hundreds of euros each month.

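How to implement sliding window with Redis manually? The Redis store shown above shares counters across servers, but as far as we can tell it still counts per fixed window. For a true sliding window you can write custom logic on a sorted set with ZADD and ZREMRANGEBYSCORE. Simplified example: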

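const crypto = require('crypto');

const WINDOW = 60; // seconds
const MAX = 10; // requests allowed per window

// Named checkRateLimit to avoid clashing with the express-rate-limit import.
// Assumes redisClient is an ioredis client (exec() resolves to [err, result] pairs).
async function checkRateLimit(clientId) {
  const now = Date.now();
  const key = `ratelimit:${clientId}`;
  const multi = redisClient.multi();
  // Drop entries that have fallen out of the window
  multi.zremrangebyscore(key, 0, now - WINDOW * 1000);
  // Unique member, so two requests in the same millisecond are both counted
  multi.zadd(key, now, `${now}:${crypto.randomUUID()}`);
  multi.zcard(key);
  // Let idle keys expire on their own
  multi.expire(key, WINDOW);
  const results = await multi.exec();
  const count = results[2][1]; // result of ZCARD
  if (count > MAX) {
    // Note: rejected requests are still recorded and count toward the window
    throw new Error('Rate limit exceeded');
  }
}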
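
A check like the one above still has to be wired into request handling. The following is a sketch of an Express-style middleware that accepts any async function that rejects when a client is over its limit. The names rateLimitMiddleware and check are hypothetical, and keying on req.ip is an assumption that only holds if the proxy settings are correct.

```javascript
// Wraps an async limit check in an Express-style middleware (sketch).
// Assumptions: `check(clientId)` rejects when the client is over its limit,
// and `req.ip` is an acceptable client identifier.
function rateLimitMiddleware(check, windowSeconds) {
  return async function (req, res, next) {
    let allowed = true;
    try {
      await check(req.ip);
    } catch (err) {
      // Any failure, including a store outage, is treated as "over limit"
      // (fail closed). Inspect `err` here if you prefer to fail open.
      allowed = false;
    }
    if (allowed) {
      return next();
    }
    // Conservative hint: the window length is an upper bound on the wait.
    res.set('Retry-After', String(windowSeconds));
    res.status(429).json({ error: 'Too many requests. Please try again later.' });
  };
}
```

It would be mounted like any other middleware, for example app.use('/api/', rateLimitMiddleware(myCheck, 60)), where myCheck stands for your own check function.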

Action for you: If you already have an API in production, move from in-memory store to Redis sliding window. Clients using fixed window often report unexpected bursts. Test the new configuration by simulating load with Artillery.

How to protect APIs with NGINX or an API Gateway?

You don't always want to modify your application code. At the reverse proxy or API Gateway level, you can implement rate limiting transparently. NGINX, which many already use, offers the limit_req module. Configuration example:

http {
  limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

  server {
    location /api/ {
      limit_req zone=api burst=20 nodelay;
      proxy_pass http://backend;
    }
  }
}

Here we allow 10 requests per second per IP, with a burst of 20. Excess requests get 503 (by default). Advantages: no backend code changes, centralized configuration. Disadvantage: less flexibility than application logic (e.g., different limits for authenticated vs anonymous users).
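
For comparison, the per-user flexibility mentioned above is easy to express in application code. The sketch below assumes an earlier authentication middleware sets req.user for logged-in requests; the function names and the numbers are illustrative. To our knowledge express-rate-limit accepts functions for both max and keyGenerator, but verify the option names against the version you have installed.

```javascript
// Sketch: different limits for authenticated vs anonymous clients.
// Assumption: an earlier auth middleware sets `req.user = { id }` for
// logged-in requests; the numbers are illustrative.
function limitFor(req) {
  return req.user ? 1000 : 100;
}

function keyFor(req) {
  // Count authenticated traffic per account, anonymous traffic per IP.
  return req.user ? `user:${req.user.id}` : `ip:${req.ip}`;
}

// With express-rate-limit, both could be passed as options:
// rateLimit({ windowMs: 15 * 60 * 1000, max: limitFor, keyGenerator: keyFor })
```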

More structured alternatives are API Gateways like Kong or AWS API Gateway. They're useful for complex architectures, but watch out for vendor lock-in. We prefer solutions that keep control with the client: a well-configured NGINX with Lua scripting can cover most of what an enterprise gateway does.

Action for you: If you already have an NGINX proxy, add the limit_req directive to your API block. Then test with wrk or siege. Example with wrk:

wrk -t2 -c20 -d10s http://localhost/api/

Observe how many 503s you get; wrk reports them in its summary as "Non-2xx or 3xx responses". Tweak the parameters until you find the sweet spot.

How to monitor and test rate limiting in production?

Rate limiting should never be left unmonitored. Three things to do immediately:

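  1. Add response headers. Express-rate-limit adds them automatically if standardHeaders: true. You'll get RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset, plus Retry-After on 429s (the X-RateLimit-* variants are the legacy headers, controlled by legacyHeaders). NGINX does not add these headers on its own, but limit_req_status 429 makes it answer 429 instead of the default 503, and rejected requests appear in the error log.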
  2. Log every block. Example with Express: use a custom handler.
  3. Set alerts. If 429s exceed a threshold, something is wrong (attack or false positive).
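
For the second point, a custom handler could look like the sketch below. It assumes the (req, res, next, options) signature of the handler option in express-rate-limit. The name createLoggingHandler is hypothetical, and logger stands for any object with a warn() method.

```javascript
// Sketch of a handler that logs each blocked request before replying.
// Assumptions: the signature (req, res, next, options) matches the `handler`
// option of express-rate-limit; `logger` is any object with a warn() method.
function createLoggingHandler(logger = console) {
  return function (req, res, next, options) {
    logger.warn(`rate limit hit: ip=${req.ip} ${req.method} ${req.originalUrl}`);
    // Reply with the status code and message configured on the limiter.
    res.status(options.statusCode).send(options.message);
  };
}

// Possible usage:
// rateLimit({ windowMs: 15 * 60 * 1000, max: 100, handler: createLoggingHandler() })
```

Writing the log line in a stable format makes it easy to count blocks per IP later and to feed the alerts from the third point.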

Test the system with a load script. Simple command to see behavior locally:

for i in {1..120}; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/api/; done | sort | uniq -c

You should see a mix of 200 and 429. If you get only 200, rate limiting isn't working.

Action for you: Set an alert in Grafana or via webhook when the percentage of 429 exceeds 5% of total requests per minute. If you have an external API, use logs to analyze blocked IPs.
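
The 5% rule above reduces to a small calculation. As a sketch, assuming you can collect the status codes of the last minute from your logs or metrics (the function name shouldAlert is hypothetical):

```javascript
// Sketch: decide whether the share of 429s in one minute of traffic
// crosses an alert threshold (5% by default, as suggested above).
function shouldAlert(statusCodes, threshold = 0.05) {
  if (statusCodes.length === 0) return false; // no traffic, nothing to report
  const blocked = statusCodes.filter((code) => code === 429).length;
  return blocked / statusCodes.length > threshold;
}
```

In practice the same ratio would usually be expressed as a query in your monitoring system; the function only shows what that query has to compute.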

What to do now

Here's your operational checklist to secure your API:

  • Identify critical routes (login, registration, search, checkout).
  • Choose your strategy – fixed for simplicity, sliding window for production.
  • Implement with express-rate-limit (Node.js) or limit_req (NGINX).
  • Configure headers and logging for debugging and monitoring.
  • Test with load tools (wrk, artillery, autocannon).
  • Monitor in production – alert on too many 429s.

If you don't have time or skills to implement it, reach out. At Meteora Web we do this for our clients every week, integrating it into custom architectures. Rate limiting is not a luxury – it's a deadbolt for your API. Don't wait for someone to break it.

Ing. Calogero Bono

Computer engineer and co-founder of Meteora Web. Expert in software architecture, cybersecurity, and scalable systems development.