HIGH · log injection · django · bearer tokens

Log Injection in Django with Bearer Tokens

Log Injection in Django with Bearer Tokens — how this specific combination creates or exposes the vulnerability

Log injection occurs when untrusted input is written directly into application or system logs without proper sanitization, enabling an attacker to forge log entries, inject newlines, or insert misleading context. In Django, this risk is pronounced when Bearer tokens from Authorization headers are handled naively and then echoed into logs. A common pattern is to extract the token for validation and then include it verbatim in request-scoped logging, metrics, or debug output, which turns the token into a controllable data field rather than an opaque credential.

Consider a Django view that reads Authorization: Bearer <token>, validates the token, and then logs the token for debugging or audit purposes. If the token string reaches the logger with newline characters in it (for example, a token like abc%0A%20%20malicious: injected that is URL-decoded somewhere before logging, or a raw line break let through by a lenient proxy or server), writing it directly into a log line can produce multiple log entries or inject structured-looking content into the log stream. This can obscure the true request origin, spoof service identifiers, or make a single request look like several distinct events. In distributed environments, logs are often aggregated and parsed automatically; injected newlines or structured text (such as key-value pairs) can break the expected parsing, leading to missed alerts or, worse, crafted log lines that trigger incorrect downstream rules.
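The effect can be reproduced with the standard logging module alone. The sketch below is illustrative only: the logger name injection_demo, the helper capture_log_lines, and the forged token value are assumptions made up for the demonstration, and the token is assumed to already contain a decoded line break when it reaches the logger.

```python
import io
import logging


def capture_log_lines(token: str) -> list[str]:
    """Log a token the unsafe way and return the resulting log lines."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

    demo_logger = logging.getLogger('injection_demo')  # hypothetical logger name
    demo_logger.handlers = [handler]
    demo_logger.setLevel(logging.INFO)
    demo_logger.propagate = False

    # Unsafe: the raw token is interpolated straight into the message
    demo_logger.info('auth ok token=%s', token)
    return stream.getvalue().splitlines()


# One request, one logging call, but the token carries a forged second entry
forged = 'abc\nINFO auth ok token=admin-session'
lines = capture_log_lines(forged)
for line in lines:
    print(line)
```

A single call to the logger yields two lines, and the second one is indistinguishable from a genuine entry to anything that reads the log line by line.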

Another vector involves logging the full Authorization header or token-derived metadata (e.g., scopes, user ID) without sanitization. If the log format is JSON and the token is placed into a JSON string by plain string formatting rather than a JSON encoder, newline or quote characters inside the token can break the JSON structure, causing parsing failures or letting the attacker add or override fields such as the log level. A crafted token can also carry content aimed at whatever consumes the logs, such as a URL or markup that a log viewer renders or that a forwarding rule acts on. Although the token remains opaque to the application, its uncontrolled inclusion in logs expands the attacker’s influence on the logging pipeline, potentially aiding in information leakage or social engineering.
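A small sketch of the JSON case, assuming a log line that is built by string formatting; the function names and the crafted value are hypothetical and only serve the comparison with json.dumps.

```python
import json


def naive_json_line(token: str) -> str:
    # Unsafe: string formatting performs no JSON escaping
    return '{"level": "INFO", "token": "%s"}' % token


def safe_json_line(token: str) -> str:
    # json.dumps escapes quotes and control characters inside the value
    return json.dumps({'level': 'INFO', 'token': token})


# The quote closes the token string early and smuggles in a second "level" key
crafted = 'abc", "level": "CRITICAL'

naive_record = json.loads(naive_json_line(crafted))
safe_record = json.loads(safe_json_line(crafted))

print(naive_record['level'])  # level as seen by a parser of the naive line
print(safe_record['level'])   # level as seen by a parser of the encoded line
```

Python's json parser keeps the last value for a duplicated key, so the naive line is read back with the attacker's level, while the encoded line keeps the crafted text confined to the token field.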

Django’s request logging via LOGGING configuration can inadvertently surface raw headers when custom filters or formatters include request attributes. For instance, a custom logging filter that adds request.META.get('HTTP_AUTHORIZATION') to every log record must sanitize the value before interpolation. Without validation and normalization (e.g., masking or omitting the token), each log line becomes a potential injection surface. Real-world log injection techniques such as newline injection, tag injection, or field manipulation apply directly to these logs, especially when tokens contain unexpected characters or when token generation schemes embed structured hints (like prefixes or delimiters) that are not treated as opaque data.
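One way to keep such a filter safe is to attach only facts about the header, never its content. The sketch below is a minimal illustration: the class name AuthorizationSummaryFilter and the attribute names it sets are hypothetical, and it assumes the log record carries a request attribute, as records emitted on Django's django.request logger usually do.

```python
import logging
from types import SimpleNamespace

KNOWN_SCHEMES = {'bearer', 'basic', 'digest'}


class AuthorizationSummaryFilter(logging.Filter):
    """Attach non-sensitive facts about the Authorization header to a record."""

    def filter(self, record: logging.LogRecord) -> bool:
        request = getattr(record, 'request', None)
        meta = getattr(request, 'META', None) or {}
        header = meta.get('HTTP_AUTHORIZATION', '')
        scheme = header.split(' ', 1)[0].lower()

        record.authorization_present = bool(header)
        # Report the scheme only if it is in a fixed allow-list, so no
        # attacker-chosen text is ever copied onto the record
        if not header:
            record.authorization_scheme = 'none'
        elif scheme in KNOWN_SCHEMES:
            record.authorization_scheme = scheme
        else:
            record.authorization_scheme = 'other'
        return True  # never drop the record, only annotate it


# Stand-in for a Django request, used here only to exercise the filter
record = logging.LogRecord('demo', logging.INFO, 'demo.py', 0, 'request handled', None, None)
record.request = SimpleNamespace(META={'HTTP_AUTHORIZATION': 'Bearer abc\nforged entry'})
AuthorizationSummaryFilter().filter(record)
print(record.authorization_present, record.authorization_scheme)
```

Because the filter emits a boolean and a value from a closed set, nothing the client sends can change the shape of the log line.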

Bearer Tokens-Specific Remediation in Django — concrete code fixes

Remediation centers on treating Bearer tokens as untrusted data, avoiding their inclusion in logs, and ensuring any logged representation is normalized and masked. Below are concrete, safe patterns for handling Authorization headers in Django while preserving auditability.

Extract and validate the token without logging it. Use a masked representation for any logging needs:

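import logging
import re
import uuid

from django.http import HttpResponse

logger = logging.getLogger(__name__)

def masked_token(token: str | None) -> str:
    if not token:
        return 'none'
    # Drop anything outside the RFC 6750 token alphabet first, so CR/LF or
    # quotes in the first or last characters cannot reach the log
    safe = re.sub(r'[^A-Za-z0-9._~+/=-]', '', token)
    # Keep a stable, non-sensitive representation
    return f'{safe[:4]}...{safe[-4:]}' if len(safe) > 8 else '****'

def my_view(request):
    auth = request.META.get('HTTP_AUTHORIZATION', '')
    token = None
    if auth.startswith('Bearer '):
        token = auth[7:].strip()
    # Validate token (pseudocode)
    # is_valid = validate_bearer_token(token)
    # Log using masked token, never raw token
    masked = masked_token(token)
    request_id = uuid.uuid4().hex  # random per-request correlation ID
    logger.info('API request', extra={
        'request_id': request_id,
        'masked_token': masked,
        'path': request.path,
        'method': request.method,
    })
    return HttpResponse(status=204)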

Configure structured logging safely by ensuring token fields are escaped and never interpolated into format strings. With Python’s logging module, pass token data via extra and let a JSON-encoding formatter serialize it, rather than building the line by hand:

import logging
import json

logger = logging.getLogger('django')

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # Build the payload from known record attributes only
        payload = {
            'ts': self.formatTime(record),
            'level': record.levelname,
            'msg': record.getMessage(),
            'request_id': getattr(record, 'request_id', ''),
            'masked_token': getattr(record, 'masked_token', ''),
        }
        # Use json.dumps to safely escape control characters
        return json.dumps(payload, ensure_ascii=False)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)

Sanitize token-derived metadata in logs. If you must log scope or user ID derived from the token, compute these server-side and log only validated, canonical values:

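def get_user_from_token(token: str):
    # decode_jwt is a placeholder for your JWT library's decode call; it must
    # verify the signature and expiry before any claim is trusted
    payload = decode_jwt(token)
    # Return a stable user identifier
    return payload.get('sub')

# Inside the view, after the token has been extracted from the header:
user_id = get_user_from_token(token) if token else 'unknown'
logger.info('Authorization success', extra={
    'user_id': user_id,
    'masked_token': masked_token(token),
})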
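If token-derived values end up in plain-text logs, it may also help to neutralize control characters before logging them. The helper below is a sketch under that assumption; sanitize_log_value and its max_len default are hypothetical names and values, not part of Django or the logging module.

```python
import re

# C0 and C1 control characters plus the Unicode line and paragraph separators
_CONTROL_CHARS = re.compile(r'[\x00-\x1f\x7f-\x9f\u2028\u2029]')


def sanitize_log_value(value: object, max_len: int = 128) -> str:
    """Return a bounded, single-line representation of a value for logging."""
    text = str(value)[:max_len]
    # Replace each control character with a visible \uXXXX escape
    return _CONTROL_CHARS.sub(lambda m: f'\\u{ord(m.group()):04x}', text)


cleaned = sanitize_log_value('user-42\nlevel=CRITICAL')
print(cleaned)
```

Escaping rather than deleting keeps the attempted injection visible to whoever reviews the log, while the value still occupies a single line.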

Audit your logging configuration for direct header inclusion. Search for patterns like 'Authorization': request.META.get('HTTP_AUTHORIZATION') or custom filters that forward raw headers. Replace them with masked or omitted values:

# Avoid this in logging filters or formatters:
# {'authorization': request.META.get('HTTP_AUTHORIZATION')}

# Prefer this:
{'authorization_present': bool(request.META.get('HTTP_AUTHORIZATION'))}
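The search itself can be scripted. The following is a rough sketch, assuming the header is read through request.META or request.headers; the pattern and the function name find_raw_header_uses are illustrative, and every hit still needs manual review, since reading the header for validation is legitimate.

```python
import re

# Common ways of reading the Authorization header in Django code
RAW_HEADER_PATTERN = re.compile(
    r"HTTP_AUTHORIZATION"
    r"|headers\[\s*['\"]Authorization['\"]\s*\]"
    r"|headers\.get\(\s*['\"]Authorization['\"]"
)


def find_raw_header_uses(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line touching the header."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if RAW_HEADER_PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = (
    "def view(request):\n"
    "    logger.info('auth=%s', request.META.get('HTTP_AUTHORIZATION'))\n"
    "    return ok()\n"
)
hits = find_raw_header_uses(sample)
for lineno, line in hits:
    print(lineno, line)
```

Running this over the files that define logging filters, formatters, and middleware gives a short list of places to check for raw header values flowing into log records.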

Frequently Asked Questions

Can a Bearer token's newline characters affect log integrity in Django?

Yes. If a token containing newline or carriage-return characters is written directly into logs, it can inject additional log entries or corrupt structured log formats, making it difficult to distinguish real events from injected lines.

What is a safe way to include token context in Django logs without risking injection?

Log only a masked representation of the token (e.g., its first and last four characters, taken after stripping any characters outside the expected token alphabet), a boolean indicating presence, or a server-side user identifier derived from validated token claims. Never log the raw token.
