HIGH | rate limiting bypass | express | bearer tokens

Rate Limiting Bypass in Express with Bearer Tokens

Rate Limiting Bypass in Express with Bearer Tokens — how this specific combination creates or exposes the vulnerability

Rate limiting in Express restricts the number of requests a client can make within a time window. When Bearer tokens are used for authentication, bypasses become possible if the rate limiter is not scoped to the token or the identity behind it. A typical vulnerability arises when limits are keyed only on IP address or endpoint path, without incorporating the token into the limiting key. In such configurations the limiter cannot tell identities apart: a client can replay one valid Bearer token from many source addresses and receive a fresh budget at each one, multiplying the allowed request volume beyond the intended policy, while unrelated clients behind a shared IP compete for a single budget.

Another bypass pattern occurs when the token is present but not validated before rate limiting is applied. If the middleware order is incorrect—such as applying rate limiting before authentication or skipping rate limits for requests with missing or malformed tokens—an attacker can send unauthenticated requests or use invalid tokens to consume rate capacity reserved for legitimate tokens. This can lead to denial of service against authenticated users or allow brute-force attempts against token-protected endpoints.
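One consequence of limiting before validation can be sketched without Express at all. The snippet assumes a hypothetical limiter that takes its key from whatever follows `Bearer ` in the header before any authentication has run. `createFixedWindowLimiter` and `keyFromUnvalidatedHeader` are illustrative names, not part of express-rate-limit, and the counter is a simplified stand-in for a real store.

```javascript
// Minimal fixed-window counter, standing in for a rate limiter store (illustrative only).
function createFixedWindowLimiter(max) {
  const hits = new Map(); // key -> request count in the current window
  return {
    // Returns true if the request is allowed, false if it is throttled.
    allow(key) {
      const count = (hits.get(key) || 0) + 1;
      hits.set(key, count);
      return count <= max;
    },
    bucketCount() {
      return hits.size;
    },
  };
}

// Hypothetical key function that trusts the header before authentication runs.
function keyFromUnvalidatedHeader(authorization, ip) {
  const prefix = 'Bearer ';
  return authorization.startsWith(prefix) ? authorization.slice(prefix.length) : ip;
}

// An attacker sends 1,000 requests from one IP, each with a different made-up token.
const limiter = createFixedWindowLimiter(100);
let allowed = 0;
for (let i = 0; i < 1000; i++) {
  const key = keyFromUnvalidatedHeader(`Bearer fake-${i}`, '203.0.113.7');
  if (limiter.allow(key)) allowed++;
}

console.log(allowed);               // 1000: no request was throttled
console.log(limiter.bucketCount()); // 1000: one bucket per forged token
```

Because every forged token opens its own bucket, the limiter never throttles the attacker and its store grows with each request. Keying on a validated identity, or on IP for requests that have not been authenticated yet, avoids both effects.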

Express applications often use packages like express-rate-limit, whose default key generator is based on the client IP. When Bearer tokens are introduced, that default does not automatically include the token value. Unless the key function is explicitly customized to incorporate the token (or a token-derived identifier), the limiter treats all tokens arriving from one IP as a single client and the same token arriving from different IPs as separate clients, so the limit no longer tracks identity. This is especially risky in shared environments or APIs where token issuance is not tightly controlled per user.

Additionally, stateless token handling can interact poorly with rate limiters backed by an in-memory store. Such a store keeps its counters per process, so they are not shared across instances and are lost on restart. If tokens are short-lived and rotated frequently, and the limiter keys on the raw token value, every newly issued token starts with an empty bucket, so bursts of requests spread across fresh tokens can slip through within the same window. Adversaries may script token acquisition or exploit token issuance logic to generate multiple valid tokens, testing whether the rate limiter tracks identities rather than individual credentials or network endpoints.

To illustrate a vulnerable setup, consider an Express app that applies rate limiting globally without token awareness:

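const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

// Default key generator: one counter per client IP address
const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 100, // limit each IP to 100 requests per windowMs
  message: 'Too many requests, try again later.'
});
app.use(limiter);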

In this configuration, every request from the same IP shares one counter, regardless of which Bearer token it carries, and the same token gets a separate counter at each source address. An attacker who spreads requests across several addresses can send 100 requests per minute from each one, and clients behind a shared IP can exhaust the limit for legitimate users on the same address. Correctly scoping the limit requires customizing the key generator to include the token (or the identity derived from it), so that each identity has its own rate bucket.
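A small simulation makes the difference between the two keying strategies concrete. It is a sketch only: `admitted`, `byIp`, and `byToken` are hypothetical helpers, the counter is a simplified fixed window rather than express-rate-limit's own store, and the addresses and token name are made up.

```javascript
// Count how many requests a fixed-window limiter would admit for a given key function.
function admitted(requests, keyOf, max) {
  const hits = new Map(); // key -> request count in the current window
  let ok = 0;
  for (const req of requests) {
    const key = keyOf(req);
    const count = (hits.get(key) || 0) + 1;
    hits.set(key, count);
    if (count <= max) ok++;
  }
  return ok;
}

const byIp = (req) => req.ip;
const byToken = (req) => req.token;

// One token replayed 200 times from each of 5 source addresses (1,000 requests total).
const distributed = [];
for (let a = 1; a <= 5; a++) {
  for (let i = 0; i < 200; i++) {
    distributed.push({ ip: `198.51.100.${a}`, token: 'token-A' });
  }
}

console.log(admitted(distributed, byIp, 100));    // 500: 100 per address
console.log(admitted(distributed, byToken, 100)); // 100: one budget for the token
```

The reverse also holds: keyed by token alone, a client holding several tokens gets one budget per token, which is why a key derived from the underlying identity is usually the safer choice.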

Bearer Tokens-Specific Remediation in Express — concrete code fixes

Remediation centers on ensuring the rate limiter incorporates the Bearer token (or a normalized identity derived from it) as part of the key used to bucket requests. This prevents different tokens from sharing the same rate limit window when they represent distinct identities.

First, extract the token reliably from the Authorization header and use it in the rate limiter key. The example below creates a per-token rate limit for authenticated endpoints:

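const crypto = require('crypto');
const rateLimit = require('express-rate-limit');

const tokenLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 100, // limit each token to 100 requests per windowMs
  keyGenerator: (req) => {
    const auth = req.headers.authorization || '';
    const prefix = 'Bearer ';
    if (auth.startsWith(prefix)) {
      // Hash the token so the raw credential is never stored as a limiter key
      const token = auth.slice(prefix.length);
      return crypto.createHash('sha256').update(token).digest('hex');
    }
    // Fall back to IP if there is no Bearer token
    return req.ip;
  },
  message: 'Too many requests, try again later.'
});

// Apply to authenticated routes only; authenticateToken (defined below) runs first
app.use('/api', authenticateToken, tokenLimiter);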

The keyGenerator derives the key from the token, so each distinct Bearer token has its own rate window. With the ordering shown, authenticateToken rejects requests without a valid token before they reach the limiter, so the IP fallback is only a safety net. Throttling unauthenticated or malformed requests themselves requires a separate IP-keyed limiter placed ahead of authentication.

Second, enforce authentication before rate limiting to avoid leaking rate capacity to unauthenticated traffic. Define an authenticateToken middleware that validates the Bearer token against your identity provider or token store:

function authenticateToken(req, res, next) {
  const auth = req.headers.authorization || '';
  const prefix = 'Bearer ';
  if (!auth.startsWith(prefix)) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  const token = auth.slice(prefix.length);
  // Replace with actual token validation logic (e.g., verify JWT, check revocation)
  if (isValidToken(token)) {
    req.user = extractUserFromToken(token);
    return next();
  }
  return res.status(401).json({ error: 'Invalid token' });
}

By ordering the middleware so that authenticateToken runs before tokenLimiter, you ensure that only requests with valid tokens consume rate limits. This ordering also allows you to attach user identity to the request object for logging or further processing.
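Once identity is attached to the request, the limiter key can be derived from it instead of from the token, so that rotating tokens does not reset a client's budget. The sketch below assumes, hypothetically, that the authentication step stores a stable `id` on `req.user`. That shape is not defined by the article or by Express, and `identityKey` is an illustrative name.

```javascript
const crypto = require('node:crypto');

// Build a limiter key from the authenticated identity when one is available.
// Assumes (hypothetically) that authentication stored a stable `id` on req.user.
function identityKey(req) {
  if (req.user && req.user.id !== undefined) {
    return `user:${req.user.id}`;
  }
  const auth = (req.headers && req.headers.authorization) || '';
  const prefix = 'Bearer ';
  if (auth.startsWith(prefix)) {
    // No identity attached: fall back to a hash of the token, never the raw value.
    const digest = crypto.createHash('sha256').update(auth.slice(prefix.length)).digest('hex');
    return `token:${digest}`;
  }
  return `ip:${req.ip}`;
}

// Two different tokens issued to the same user map to the same bucket.
const first = identityKey({ user: { id: 42 }, headers: { authorization: 'Bearer aaa' }, ip: '192.0.2.1' });
const second = identityKey({ user: { id: 42 }, headers: { authorization: 'Bearer bbb' }, ip: '192.0.2.9' });
console.log(first === second); // true
```

A function like this could be passed as the `keyGenerator` option of the limiter, provided it runs after the authentication middleware.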

For applications using multiple APIs behind a gateway, consider namespacing the token in the key to avoid collisions across services:

keyGenerator: (req) => {
  const token = extractBearer(req);
  return `api:${token}`;
}
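The `extractBearer` helper used above is not defined in the snippet. One possible shape is sketched here, together with a key generator factory. Both `extractBearer` and `namespacedKey` are hypothetical, and the `billing` label is only an example of a service name.

```javascript
// Hypothetical helper: return the Bearer credential from a request, or null if absent.
function extractBearer(req) {
  const auth = (req.headers && req.headers.authorization) || '';
  // The auth scheme name is case-insensitive; the credential itself is not.
  const match = /^Bearer +(\S+)$/i.exec(auth.trim());
  return match ? match[1] : null;
}

// Namespaced key generator; `service` is an illustrative label such as 'billing'.
function namespacedKey(service) {
  return (req) => {
    const token = extractBearer(req);
    return token ? `${service}:${token}` : `${service}:ip:${req.ip}`;
  };
}

const billingKey = namespacedKey('billing');
console.log(billingKey({ headers: { authorization: 'Bearer abc123' }, ip: '192.0.2.1' })); // billing:abc123
console.log(billingKey({ headers: {}, ip: '192.0.2.1' }));                                 // billing:ip:192.0.2.1
```

Returning `null` for a missing token keeps tokenless requests out of a shared `api:undefined` bucket. In a real deployment the token portion of the key would preferably be a hash or a user identifier rather than the raw credential.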

Finally, monitor rate limit rejections per token and correlate with authentication logs to detect abnormal patterns, such as a single IP cycling through many tokens or repeated failures for a specific token, which may indicate enumeration or abuse.
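As a rough illustration of the "single IP cycling through many tokens" signal, the sketch below counts distinct token identifiers per IP within a window. `createTokenCyclingDetector` is a hypothetical helper and the threshold of 3 is arbitrary. In practice the identifier would be a token hash or token ID rather than the raw token, and the counts would feed into existing logging or alerting.

```javascript
// Track how many distinct tokens each IP has presented within the current window.
// `threshold` is an illustrative tuning knob, not a recommended value.
function createTokenCyclingDetector(threshold) {
  const tokensByIp = new Map(); // ip -> Set of token identifiers seen
  return {
    // Record one request; returns true once the IP exceeds the threshold.
    observe(ip, tokenId) {
      let seen = tokensByIp.get(ip);
      if (!seen) {
        seen = new Set();
        tokensByIp.set(ip, seen);
      }
      seen.add(tokenId);
      return seen.size > threshold;
    },
    // Call at the end of each window to start counting afresh.
    reset() {
      tokensByIp.clear();
    },
  };
}

const detector = createTokenCyclingDetector(3);
const flags = ['t1', 't2', 't2', 't3', 't4'].map((t) => detector.observe('203.0.113.7', t));
console.log(flags); // [ false, false, false, false, true ]
```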

Related CWEs: resourceConsumption

| CWE ID | Name | Severity |
|---|---|---|
| CWE-400 | Uncontrolled Resource Consumption | HIGH |
| CWE-770 | Allocation of Resources Without Limits | MEDIUM |
| CWE-799 | Improper Control of Interaction Frequency | MEDIUM |
| CWE-835 | Infinite Loop | HIGH |
| CWE-1050 | Excessive Platform Resource Consumption | MEDIUM |

Frequently Asked Questions

Can rotating Bearer tokens bypass rate limiting if the limiter is IP-based?
Not from a single address: an IP-keyed limiter ignores the token, so every token sent from the same IP counts against the same bucket. The gap is that no per-user limit exists at all. One token can be replayed from many IPs with a fresh budget at each, and rotating tokens does pay off as soon as the key is the raw token value. Include the token or, better, a token-derived identifier in the rate limiter key to scope limits per identity.

Should rate limiting be applied before or after Bearer token validation in Express?
Apply authentication before the per-token rate limiter. Validating the Bearer token first ensures that only authenticated requests consume rate capacity, preventing unauthenticated or tokenless requests from exhausting the limit.