Severity: HIGH | Regex DoS | Django | Basic Auth

Regex DoS in Django with Basic Auth

Regex DoS in Django with Basic Auth: how this specific combination creates or exposes the vulnerability

When a Django application accepts HTTP Basic Authentication, the client sends its credentials as a base64-encoded username:password string in the Authorization header. Base64 is an encoding, not encryption, so the security of the scheme rests on transport encryption and on how the server parses and validates the decoded username and password. If application code or a third-party library uses regular expressions for that parsing or validation, some patterns exhibit catastrophic backtracking, with matching time growing exponentially in the input length. This is a Regex DoS (ReDoS) scenario: an attacker sends a long or malformed Authorization value crafted to force the regex engine into excessive backtracking, consuming CPU and potentially degrading service availability.
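To make the encoding concrete, the sketch below builds a Basic Auth header value and decodes it again using only the Python standard library. The helper names build_basic_auth_header and decode_basic_auth_header are illustrative and are not part of Django.

```python
import base64


def build_basic_auth_header(username: str, password: str) -> str:
    """Return the value a client would send in the Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth_header(value: str) -> tuple[str, str]:
    """Reverse the encoding; anyone who can read the header can do this."""
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Auth header")
    decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    # Split on the first colon only, so the password may contain ":".
    username, _, password = decoded.partition(":")
    return username, password


header = build_basic_auth_header("user", "pass")
print(header)                            # Basic dXNlcjpwYXNz
print(decode_basic_auth_header(header))  # ('user', 'pass')
```

Everything after decoding is ordinary attacker-controlled text, which is why the validation step that follows matters.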

In a Django application using Basic Auth, a common pitfall is validating the decoded username or password with an ambiguous regex, meaning one that can match the same input in many different ways. Patterns with nested quantifiers are the typical example, and Python’s re module only supports atomic groups and possessive quantifiers from version 3.11 onward. Consider a naive routine that requires the whole username to match a pattern like (a+)+: an attacker can supply a long run of a characters followed by one non-matching character, and the engine tries an exponential number of ways to split the run before it reports failure. Because Basic Auth is stateless, credentials are parsed on every request, so a handful of such requests can tie up worker processes and deny service to all users, not just callers of the targeted endpoint.
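The growth in matching time can be observed directly. The following is a minimal sketch that times an anchored form of the nested-quantifier pattern against a single-quantifier pattern accepting the same strings. The payload lengths are assumptions kept deliberately small so the script finishes quickly; absolute timings depend on the machine and Python version.

```python
import re
import time

VULNERABLE = re.compile(r"^(a+)+$")  # nested quantifier: many ways to split a run of "a"
LINEAR = re.compile(r"^a+$")         # accepts the same strings with one quantifier


def time_match(pattern: re.Pattern, text: str) -> float:
    """Return the seconds spent on a single match attempt."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


def compare(lengths=(12, 14, 16, 18, 20)) -> list[tuple[int, float, float]]:
    """Time both patterns against failing payloads of increasing length."""
    rows = []
    for n in lengths:
        payload = "a" * n + "!"  # the trailing "!" forces the match to fail
        rows.append((n, time_match(VULNERABLE, payload), time_match(LINEAR, payload)))
    return rows


if __name__ == "__main__":
    for n, nested, linear in compare():
        print(f"n={n:2d}  nested={nested:.4f}s  linear={linear:.6f}s")
```

On a backtracking engine such as CPython's re, the nested column should grow rapidly with each extra character while the linear column stays close to constant.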

Django’s built-in authentication views and decorators do not inherently use vulnerable regexes, but developers sometimes introduce custom authentication backends or middleware that perform additional credential validation using regular expressions. If such custom logic is added without attention to regex safety, the attack surface expands. The scanner category “Input Validation” in middleBrick specifically tests for patterns like system prompt leakage and injection probes, but it also checks for server-side behaviors that indicate unsafe handling of malformed inputs, including malformed authorization headers that could trigger regex inefficiencies. Because Basic Auth credentials are visible in request logs and network traces when not protected by encryption, there is also a data exposure risk if logs inadvertently store raw header values.
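One way to limit the log exposure mentioned above is to redact the credential portion of the header before anything is written to a log. The helper below is a sketch under that assumption; the function name redact_authorization is hypothetical, and it deliberately uses string methods rather than a regex.

```python
def redact_authorization(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers that is safe to log."""
    safe = dict(headers)
    for name in list(safe):
        # Django exposes the header as HTTP_AUTHORIZATION in request.META;
        # raw header mappings usually call it "Authorization".
        normalized = name.upper().replace("-", "_")
        if normalized in ("AUTHORIZATION", "HTTP_AUTHORIZATION"):
            scheme, sep, _ = safe[name].partition(" ")
            # Keep the scheme only when the value has the expected shape;
            # otherwise hide the whole value.
            if sep and scheme.isalpha():
                safe[name] = f"{scheme} [REDACTED]"
            else:
                safe[name] = "[REDACTED]"
    return safe


print(redact_authorization({"HTTP_AUTHORIZATION": "Basic dXNlcjpwYXNz"}))
```

The original mapping is left untouched, so the view can still read the real header while only the redacted copy reaches the logger.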

Another subtle interaction involves the use of regexes in URL routing combined with Basic Auth. If a Django URL pattern uses a regex group to capture credentials or tokens, and that pattern is combined with a view that performs additional regex validation, an attacker can craft a path that maximizes backtracking in both routing and authentication checks. This layered use of regex can amplify the impact, turning a theoretically benign pattern into a practical denial-of-service vector. The middleBrick “Unsafe Consumption” and “Input Validation” checks are designed to surface such risky configurations by correlating runtime behavior with spec definitions, including OpenAPI/Swagger 2.0, 3.0, and 3.1 documents with full $ref resolution, to ensure that declared patterns do not contradict actual runtime handling.

To summarize, the combination of Django, HTTP Basic Auth, and unsafe regular expressions creates a scenario where an attacker can force the server into heavy computation using maliciously crafted credentials. This does not require authentication bypass or data theft; simply triggering the regex processing path with a pathological input can be sufficient to degrade performance. Because middleBrick tests unauthenticated attack surfaces, it can identify endpoints where Basic Auth is exposed and where input validation patterns might be vulnerable, providing findings with severity and remediation guidance rather than attempting to fix the issue directly.

Basic Auth-Specific Remediation in Django — concrete code fixes

Remediation focuses on avoiding custom regex validation of credentials and ensuring that any pattern matching that remains uses safe constructs. If you must validate usernames or passwords against a specific format, prefer plain string checks, or reuse Django’s built-in username validators (ASCIIUsernameValidator and UnicodeUsernameValidator in django.contrib.auth.validators), which rely on a single character-class pattern with no nested quantifiers.

Example 1: Using Django’s built-in authentication without custom regex

from django.http import HttpResponse
from django.views.decorators.http import require_http_methods

@require_http_methods(["GET"])
def profile_view(request):
    # Rely on Django's standard authentication middleware, which
    # populates request.user; no credential regex is involved here.
    if request.user.is_authenticated:
        return HttpResponse(f"Hello, {request.user.username}")
    return HttpResponse("Unauthorized", status=401)

Example 2: Custom Basic Auth parsing without dangerous regex

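import base64
import binascii

from django.contrib.auth import authenticate
from django.http import HttpResponse
from django.views.decorators.http import require_http_methods

MAX_AUTH_HEADER_LENGTH = 1024  # assumed cap; tune for your deployment


def _unauthorized():
    # 401 plus a challenge tells clients to retry with Basic credentials.
    # The realm value "api" is a placeholder.
    response = HttpResponse("Unauthorized", status=401)
    response["WWW-Authenticate"] = 'Basic realm="api"'
    return response


@require_http_methods(["GET"])
def safe_basic_auth_view(request):
    auth = request.META.get("HTTP_AUTHORIZATION", "")
    # Reject oversized or non-Basic headers before doing any decoding work
    if len(auth) > MAX_AUTH_HEADER_LENGTH or not auth.lower().startswith("basic "):
        return _unauthorized()
    try:
        encoded = auth.split(" ", 1)[1].strip()
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return _unauthorized()
    # partition() splits on the first colon only, so passwords may contain ":"
    username, _, password = decoded.partition(":")
    if not username or not password:
        return _unauthorized()
    # Verify the credentials with Django's auth backends; no regex involved
    user = authenticate(request, username=username, password=password)
    if user is None:
        return _unauthorized()
    return HttpResponse(f"Authenticated as {user.get_username()}")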

Example 3: Validating format with safe string methods instead of regex

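def is_valid_username_format(value: str) -> bool:
    # Allow only letters, digits, and underscores, length 3..30.
    # Note: str.isalnum() also accepts non-ASCII letters and digits;
    # add a c.isascii() check if usernames must be ASCII-only.
    if not (3 <= len(value) <= 30):
        return False
    return all(c.isalnum() or c == "_" for c in value)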

# Usage in a view or serializer:
# if is_valid_username_format(candidate): ...

General recommendations

  • Do not use regex patterns with nested quantifiers like (a+)+, (a|aa)*, or (x+)* on user-controlled input.
  • If regex is unavoidable, keep patterns simple and linear, and use atomic groups (?>...) or possessive quantifiers (*+, ++, ?+) where the engine supports them; Python’s re module added both in version 3.11.
  • Always serve Basic Auth over HTTPS to protect credentials in transit, and keep raw Authorization header values out of application and access logs.
  • Use middleBrick’s “Encryption” and “Data Exposure” checks to verify that credentials are not leaked in logs or error messages.
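If a regex is still required, one way to follow the recommendations above is to cap the input length before matching and to use a single bounded character class with no nested quantifiers or alternation. The pattern, the length limit, and the function name below are assumptions chosen for illustration.

```python
import re

MAX_USERNAME_LENGTH = 30  # assumed limit, mirroring the 3..30 range used in Example 3

# One character class with a bounded repeat: there is only one way to match
# any given input, so the engine has nothing to backtrack over.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,30}")


def username_matches(value: str) -> bool:
    # Reject oversized input before the regex engine sees it.
    if len(value) > MAX_USERNAME_LENGTH:
        return False
    # fullmatch() anchors both ends, so a trailing newline is not accepted
    # the way it would be with a "$" anchor.
    return USERNAME_RE.fullmatch(value) is not None


print(username_matches("alice_01"))        # True
print(username_matches("a" * 5000 + "!"))  # False, rejected by the length check
```

The length check is the cheaper of the two guards and also bounds the cost of any other pattern applied to the same value later in the request.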

Related CWEs: Input Validation

CWE ID    Name                          Severity
CWE-20    Improper Input Validation     HIGH
CWE-22    Path Traversal                HIGH
CWE-74    Injection                     CRITICAL
CWE-77    Command Injection             CRITICAL
CWE-78    OS Command Injection          CRITICAL
CWE-79    Cross-site Scripting (XSS)    HIGH
CWE-89    SQL Injection                 CRITICAL
CWE-90    LDAP Injection                HIGH
CWE-91    XML Injection                 HIGH
CWE-94    Code Injection                CRITICAL

Frequently Asked Questions

Can middleBrick detect Regex DoS risks in Basic Auth endpoints?
Yes. middleBrick runs an unauthenticated scan that includes the Input Validation check, which tests for server-side behaviors that can indicate unsafe regex patterns and excessive backtracking, even when Basic Auth is present.
Does middleBrick fix the vulnerabilities it finds?
No. middleBrick detects and reports findings with severity and remediation guidance. It does not fix, patch, block, or remediate issues. Developers should apply safe coding practices, such as avoiding nested quantifiers in regex and using Django’s built-in authentication utilities.