Log Injection in Django with Bearer Tokens
Log Injection in Django with Bearer Tokens — how this specific combination creates or exposes the vulnerability
Log injection occurs when untrusted input is written directly into application or system logs without proper sanitization, enabling an attacker to forge log entries, inject newlines, or insert misleading context. In Django, this risk is pronounced when Bearer tokens from Authorization headers are handled naively and then echoed into logs. A common pattern is to extract the token for validation and then include it verbatim in request-scoped logging, metrics, or debug output, which turns the token into a controllable data field rather than an opaque credential.
Consider a Django view that reads Authorization: Bearer <token>, validates the token, and then logs the token for debugging or audit purposes. If the token string contains newline characters by the time it is logged (e.g., an attacker-supplied value like abc%0A%20%20malicious: injected that some layer percent-decodes before logging), writing it directly into a log line can produce multiple log entries or inject structured-looking content into the log stream. This can obscure the true request origin, spoof service identifiers, or create the appearance of multiple distinct events from a single request. In distributed environments, logs are often aggregated and parsed automatically; injected newlines or structured text (such as key-value pairs) can break expected parsing, leading to missed alerts or, worse, crafted log lines that trigger incorrect downstream rules.
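The effect is easy to reproduce with the standard logging module alone. The sketch below is illustrative only: the logger name injection_demo, the helper capture_log_output, and the forged message are made up for the demonstration, and it assumes the newline has already reached the application in decoded form.

```python
import io
import logging


def capture_log_output(token: str, template: str = 'auth token=%s') -> str:
    """Log the token through a plain-text formatter and return what was written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

    logger = logging.getLogger('injection_demo')  # hypothetical logger name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False

    logger.info(template, token)
    return stream.getvalue()


hostile = 'abc\nINFO admin login succeeded'

# Raw interpolation: the embedded newline starts a second, forged record.
forged = capture_log_output(hostile)

# repr-style interpolation: the newline is rendered as an escape sequence.
quoted = capture_log_output(hostile, template='auth token=%r')
```

With the %s template the captured output contains two lines, and the second looks like a genuine INFO record. With %r the newline is written as an escape sequence and the record stays on one line, although the token itself is still disclosed.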
Another vector involves logging the full Authorization header or token-derived metadata (e.g., scopes, user ID) without sanitization. If the log format is JSON and the token is placed as a JSON string without proper escaping, newline or quote characters inside the token can break the JSON structure, causing parsing failures or enabling log injection that manipulates log levels or associated metadata fields. Attackers may also attempt token exfiltration via log injection by embedding external endpoints in the token itself (e.g., a crafted token containing a URL), aiming to have the log line reach an external system when logs are forwarded. Although the token remains opaque to the application, its uncontrolled inclusion in logs expands the attacker’s influence on the logging pipeline, potentially aiding in information leakage or social engineering.
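The JSON case can be sketched the same way. The two helper functions and the hostile values below are hypothetical; the point is only the contrast between building a JSON line by string interpolation and serializing a dict with json.dumps.

```python
import json


def naive_json_line(token: str) -> str:
    # Anti-pattern: the token is pasted into a JSON template unescaped.
    return '{"msg": "auth", "token": "%s"}' % token


def safe_json_line(token: str) -> str:
    # json.dumps escapes quotes, backslashes, and control characters.
    return json.dumps({'msg': 'auth', 'token': token})


# A quote in the token closes the string early and smuggles in extra fields.
field_injection = 'abc", "level": "DEBUG", "user": "admin'
naive_record = json.loads(naive_json_line(field_injection))
safe_record = json.loads(safe_json_line(field_injection))

# A raw newline makes the naive line invalid JSON altogether.
try:
    json.loads(naive_json_line('abc\ndef'))
    naive_newline_parses = True
except json.JSONDecodeError:
    naive_newline_parses = False
```

In the naive variant the parsed record gains level and user keys that the application never wrote, while the json.dumps variant keeps the whole hostile string inside the token field.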
Django’s request logging via LOGGING configuration can inadvertently surface raw headers when custom filters or formatters include request attributes. For instance, a custom logging filter that adds request.META.get('HTTP_AUTHORIZATION') to every log record must sanitize the value before interpolation. Without validation and normalization (e.g., masking or omitting the token), each log line becomes a potential injection surface. Real-world log injection techniques such as newline injection, tag injection, or field manipulation apply directly to these logs, especially when tokens contain unexpected characters or when token generation schemes embed structured hints (like prefixes or delimiters) that are not treated as opaque data.
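As a defense-in-depth measure, a logging filter can neutralize control characters in every record before it is formatted. This is a minimal sketch rather than a complete sanitizer: the names neutralize, ControlCharFilter, and filter_demo are hypothetical, only ASCII control characters are covered (Unicode line separators are not), and it does not replace the advice to keep the header out of logs entirely.

```python
import io
import logging
import re

# ASCII control characters, including CR, LF, and TAB.
_CONTROL_CHARS = re.compile(r'[\x00-\x1f\x7f]')


def neutralize(value: str) -> str:
    # Replace each control character with a visible backslash-x escape.
    return _CONTROL_CHARS.sub(lambda m: f'\\x{ord(m.group()):02x}', value)


class ControlCharFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message once, sanitize it, and drop the args so the
        # formatter does not interpolate them a second time.
        record.msg = neutralize(record.getMessage())
        record.args = None
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(ControlCharFilter())
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

demo_logger = logging.getLogger('filter_demo')  # hypothetical logger name
demo_logger.handlers.clear()
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False

demo_logger.info('authorization=%s', 'Bearer abc\nINFO forged entry')
output = stream.getvalue()
```

Attaching the filter to the handler rather than to a single logger means records from child loggers that reach this handler are sanitized as well.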
Bearer Tokens-Specific Remediation in Django — concrete code fixes
Remediation centers on treating Bearer tokens as untrusted data, avoiding their inclusion in logs, and ensuring any logged representation is normalized and masked. Below are concrete, safe patterns for handling Authorization headers in Django while preserving auditability.
- Extract and validate the token without logging it. Use a masked representation for any logging needs:
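import logging
import re
import uuid

from django.http import JsonResponse

logger = logging.getLogger(__name__)

# Anything outside the b64token alphabet defined in RFC 6750
_UNSAFE = re.compile(r'[^A-Za-z0-9._~+/=-]')

def masked_token(token: str | None) -> str:
    if not token:
        return 'none'
    if len(token) <= 8:
        return '****'
    # Keep a stable, non-sensitive representation; replace characters outside
    # the token alphabet so the mask itself cannot carry newlines or quotes
    return _UNSAFE.sub('?', f'{token[:4]}...{token[-4:]}')

def my_view(request):
    auth = request.META.get('HTTP_AUTHORIZATION', '')
    token = None
    if auth.startswith('Bearer '):
        token = auth[7:].strip()
    # Validate token (pseudocode)
    # is_valid = validate_bearer_token(token)
    # Log using masked token, never raw token
    masked = masked_token(token)
    request_id = uuid.uuid4().hex  # random per-request correlation ID
    logger.info('API request', extra={
        'request_id': request_id,
        'masked_token': masked,
        'path': request.path,
        'method': request.method,
    })
    return JsonResponse({'request_id': request_id})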
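Where even a few characters of the token should not appear in logs, a keyed fingerprint is one alternative to prefix/suffix masking. This is a different technique from the masking above, sketched under assumptions: token_fingerprint and FINGERPRINT_KEY are hypothetical names, and the key shown is a placeholder that would in practice come from a secret setting.

```python
import hashlib
import hmac
from typing import Optional

# Placeholder only; load a real secret from configuration.
FINGERPRINT_KEY = b'replace-with-a-secret-from-settings'


def token_fingerprint(token: Optional[str], key: bytes = FINGERPRINT_KEY) -> str:
    """Return a short, stable identifier that reveals no token characters."""
    if not token:
        return 'none'
    digest = hmac.new(key, token.encode('utf-8', 'replace'), hashlib.sha256)
    # The hex digest contains only 0-9 and a-f, so it cannot carry
    # newlines, quotes, or other characters with meaning in a log format.
    return digest.hexdigest()[:12]


fingerprint = token_fingerprint('abc\nINFO forged entry')
```

The same token always maps to the same fingerprint, so requests can still be correlated, while the keyed hash keeps the value from being matched against a list of candidate tokens by someone who only has the logs.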
- Configure structured logging safely by ensuring token fields are escaped and never interpolated into format strings. With Python’s logging module, pass token data via extra and format with caution:
import json
import logging

logger = logging.getLogger('django')

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # Build the payload as a dict so values are serialized, never interpolated
        payload = {
            'ts': self.formatTime(record),
            'level': record.levelname,
            'msg': record.getMessage(),
            'request_id': getattr(record, 'request_id', ''),
            'masked_token': getattr(record, 'masked_token', ''),
        }
        # json.dumps escapes quotes, backslashes, and control characters
        return json.dumps(payload, ensure_ascii=False)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
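In a Django project the same wiring is usually expressed declaratively, since Django passes settings.LOGGING to logging.config.dictConfig by default. The following is a sketch of such a dictionary using a trimmed-down formatter; the logger name api and the dotted path mentioned in the comment are assumptions for illustration.

```python
import json
import logging
import logging.config


class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            'level': record.levelname,
            'msg': record.getMessage(),
            'masked_token': getattr(record, 'masked_token', ''),
        }, ensure_ascii=False)


LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        # '()' names a factory; in settings.py this would normally be a
        # dotted-path string such as 'myproject.logging.JsonFormatter'
        # (hypothetical path).
        'json': {'()': JsonFormatter},
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'formatter': 'json'},
    },
    'loggers': {
        'api': {'handlers': ['console'], 'level': 'INFO', 'propagate': False},
    },
}

logging.config.dictConfig(LOGGING)

# Format one record by hand to inspect what the handler would emit.
api_logger = logging.getLogger('api')
record = api_logger.makeRecord(
    'api', logging.INFO, 'demo.py', 0, 'API request', (), None,
    extra={'masked_token': 'ab\ncd'},
)
line = api_logger.handlers[0].format(record)
```

Even when a newline slips into a field, the emitted line stays a single JSON object, and a log parser recovers the original value intact.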
- Sanitize token-derived metadata in logs. If you must log scope or user ID derived from the token, compute these server-side and log only validated, canonical values:
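def get_user_from_token(token: str):
    # Decode and validate token; return a stable user identifier.
    # decode_jwt is a placeholder for your own verified JWT decoding routine;
    # it must check the signature before any claim is trusted
    payload = decode_jwt(token)
    return payload.get('sub')

# Inside the view, after the token has been extracted:
user_id = get_user_from_token(token) if token else 'unknown'
logger.info('Authorization success', extra={
    'user_id': user_id,
    'masked_token': masked_token(token),
})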
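"Validated, canonical values" can be made concrete with an allow-list check on each claim before it is logged. This is a sketch only: canonical_user_id is a hypothetical helper, and the permitted pattern (letters, digits, underscore, hyphen, at most 64 characters) is an assumption that should be adapted to the identifier format your system actually issues.

```python
import re

# Assumed identifier format; adjust to match real user IDs.
_USER_ID = re.compile(r'[A-Za-z0-9_-]{1,64}')


def canonical_user_id(claims: dict) -> str:
    """Return the 'sub' claim only if it matches the expected shape."""
    sub = claims.get('sub')
    if isinstance(sub, str) and _USER_ID.fullmatch(sub):
        return sub
    # Never echo the rejected value; log a fixed marker instead.
    return 'invalid'


accepted = canonical_user_id({'sub': 'user_42'})
rejected = canonical_user_id({'sub': 'user_42\nINFO forged entry'})
```

Returning a fixed marker for anything unexpected keeps malformed claims out of the log stream while still recording that the request carried one.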
- Audit your logging configuration for direct header inclusion. Search for patterns like 'Authorization': request.META.get('HTTP_AUTHORIZATION') or custom filters that forward raw headers. Replace them with masked or omitted values:
# Avoid this in logging filters or formatters:
# {'authorization': request.META.get('HTTP_AUTHORIZATION')}
# Prefer this:
{'authorization_present': bool(request.META.get('HTTP_AUTHORIZATION'))}
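The search step can be partly automated. The sketch below is a crude heuristic, not a static analyzer: find_header_reads is a hypothetical helper that lists every non-comment line reading the Authorization header, so that each hit can be reviewed by hand for whether the value reaches a log call.

```python
import re

# Matches request.META['HTTP_AUTHORIZATION']-style reads and
# request.headers['Authorization'] / request.headers.get('Authorization').
_HEADER_READ = re.compile(
    r"""HTTP_AUTHORIZATION|headers(?:\.get\(|\[)\s*['"]Authorization['"]"""
)


def find_header_reads(source: str):
    """Return (line_number, stripped_line) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith('#'):
            continue  # skip commented-out code
        if _HEADER_READ.search(stripped):
            hits.append((lineno, stripped))
    return hits


sample = '\n'.join([
    "auth = request.META.get('HTTP_AUTHORIZATION')",
    "# request.META.get('HTTP_AUTHORIZATION')  (commented out)",
    'token = request.headers.get("Authorization")',
    "raw = request.headers['Authorization']",
    "path = request.path",
])
hits = find_header_reads(sample)
```

Running it over each Python file in a project gives a review list; it will flag legitimate reads too, such as the authentication code itself, so the output is a starting point rather than a verdict.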