Severity: HIGH | Tags: unicode normalization, django, basic auth

Unicode Normalization in Django with Basic Auth

Unicode Normalization in Django with Basic Auth: how this specific combination creates or exposes the vulnerability

Unicode normalization inconsistencies between the credentials Django stores and the raw credentials sent via HTTP Basic Auth can make authentication checks behave unpredictably and expose account takeover risks. Django normalizes usernames to NFKC when accounts are created through the built-in user manager and forms (normalize_username()), but it hashes passwords exactly as supplied, and the default ModelBackend looks up the username exactly as it is passed to authenticate(). The HTTP Basic Auth header, in turn, transmits whatever bytes the client provides. If the client sends a decomposed Unicode form (e.g., é as e + combining acute) while the stored value uses the composed form (é), the comparison may succeed or fail depending on the exact code path. Custom code that normalizes on only one side can then let an attacker authenticate with, or register, a visually identical but different byte sequence.
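To make the byte-level difference concrete, the following minimal sketch builds two Basic Auth header values for the same visible username. The helper name basic_header and the sample credentials are illustrative assumptions, not part of Django or any scanner.

```python
import base64
import unicodedata


def basic_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value from UTF-8 credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# The same visible name, "café", in two Unicode forms.
composed = unicodedata.normalize("NFC", "caf\u00e9")    # é as one code point (U+00E9)
decomposed = unicodedata.normalize("NFD", "caf\u00e9")  # e + combining acute (U+0301)

# Visually identical, but different code points and different bytes on the wire.
same_text = composed == decomposed
same_header = basic_header(composed, "secret") == basic_header(decomposed, "secret")

# Normalizing both sides to one form makes the comparison stable.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

Here same_text and same_header are both False, while same_after_nfc is True, which is why the comparison has to happen on normalized values.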

During a middleBrick scan, the Authentication check flags this as a finding when the endpoint accepts Basic Auth and the framework does not enforce canonical normalization before credential validation. Attack patterns include credential confusion (using different Unicode representations of the same visual account name) and privilege confusion if the normalized lookup resolves to a different user than intended. Input validation checks highlight that Basic Auth credentials are not normalized to NFC or NFKC before being compared against Django’s stored representation, and Data Exposure findings may appear if error messages or logs inadvertently disclose whether normalization succeeded, aiding account enumeration.
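The credential-confusion pattern can be illustrated by grouping candidate usernames by their normalized form. This is a hedged sketch, not middleBrick’s implementation; the function name find_normalization_collisions and the sample names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def find_normalization_collisions(usernames):
    """Group usernames that become identical after NFKC normalization."""
    groups = defaultdict(list)
    for name in usernames:
        groups[unicodedata.normalize("NFKC", name)].append(name)
    # Keep only canonical forms reached by more than one distinct input.
    return {canonical: names for canonical, names in groups.items() if len(names) > 1}


candidates = [
    "admin",
    "\uff41\uff44\uff4d\uff49\uff4e",  # "admin" written with fullwidth letters
    "jose\u0301",                      # "jose" + combining acute accent
    "jos\u00e9",                       # precomposed "josé"
    "alice",
]
collisions = find_normalization_collisions(candidates)
```

Any group with more than one member marks accounts or login attempts that look the same to a person but differ in bytes, which is the condition the finding describes.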

OpenAPI/Swagger analysis with full $ref resolution helps correlate the runtime behavior: if the security scheme is defined as type: http and scheme: basic, but the server implementation does not explicitly normalize the username and password, the spec documents the authentication scheme while the runtime may still accept or reject mismatched Unicode forms inconsistently. This gap is surfaced in the report’s cross-references between spec definitions and runtime findings, emphasizing the need to normalize inputs before they reach Django’s authentication backend.
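As a rough illustration of the spec side of that correlation, the sketch below lists the security schemes declared as HTTP Basic in an already parsed OpenAPI 3 document. It assumes the document is a plain dict with $ref values already resolved; the function name basic_auth_schemes and the sample spec are hypothetical and do not reflect how middleBrick works internally.

```python
def basic_auth_schemes(spec: dict) -> list[str]:
    """Return names of security schemes declared as HTTP Basic in an OpenAPI 3 dict."""
    schemes = spec.get("components", {}).get("securitySchemes", {})
    return [
        name
        for name, scheme in schemes.items()
        if scheme.get("type") == "http"
        and str(scheme.get("scheme", "")).lower() == "basic"
    ]


# Hypothetical spec fragment used only for this example.
spec = {
    "openapi": "3.0.3",
    "components": {
        "securitySchemes": {
            "basicAuth": {"type": "http", "scheme": "basic"},
            "bearerAuth": {"type": "http", "scheme": "bearer"},
            "apiKey": {"type": "apiKey", "in": "header", "name": "X-API-Key"},
        }
    },
}
found = basic_auth_schemes(spec)
```

Endpoints protected by any scheme in found are the ones where normalization before credential validation is worth verifying at runtime.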

Basic Auth-Specific Remediation in Django: concrete code fixes

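Apply Unicode normalization to the username and password before they are used for authentication in Django. For HTTP Basic Auth, extract credentials from the Authorization header, decode them, normalize with unicodedata, and pass the normalized values to Django’s authentication backend. The example below is Django middleware for endpoints that rely on Basic Auth. It rewrites the Authorization header so that downstream code sees normalized credentials; usernames use NFKC to match Django’s normalize_username(), and passwords use NFC. Place it before any middleware or view that reads the header.

import base64
import unicodedata
from django.utils.deprecation import MiddlewareMixin

class BasicAuthNormalizationMiddleware(MiddlewareMixin):
    def process_request(self, request):
        auth = request.META.get('HTTP_AUTHORIZATION', '')
        if not auth.startswith('Basic '):
            return None
        encoded = auth.split(' ', 1)[1]
        try:
            decoded = base64.b64decode(encoded).decode('utf-8')
        except ValueError:
            # Malformed base64 or non-UTF-8 bytes: keep the original header
            # so downstream code rejects the invalid credentials itself.
            return None
        username, _, password = decoded.partition(':')
        # NFKC matches what Django's normalize_username() stores.
        username = unicodedata.normalize('NFKC', username)
        password = unicodedata.normalize('NFC', password)
        if ':' in username:
            # A compatibility character (e.g. fullwidth colon) folded into ':';
            # re-encoding would shift the username/password boundary.
            return None
        # Re-encode so downstream Basic Auth handling sees normalized values.
        token = base64.b64encode(f'{username}:{password}'.encode('utf-8'))
        request.META['HTTP_AUTHORIZATION'] = 'Basic ' + token.decode('ascii')
        return None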
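The header handling can also be factored into a framework-independent function, which makes it easy to unit test without a Django project. This is a sketch under two assumptions: usernames are stored in NFKC (what Django’s normalize_username() produces) and passwords are compared in NFC. The name normalize_basic_header is hypothetical.

```python
import base64
import unicodedata
from typing import Optional


def normalize_basic_header(header: str) -> Optional[str]:
    """Return the Basic header with normalized credentials, or None if unusable."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None  # no colon separator, so this is not a user:password pair
    username = unicodedata.normalize("NFKC", username)
    password = unicodedata.normalize("NFC", password)
    if ":" in username:
        # A compatibility character (e.g. fullwidth colon) folded into ":";
        # re-encoding would move the username/password boundary.
        return None
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")
```

Returning None for anything that cannot be processed safely lets the caller leave the original header in place, so the normal authentication path rejects the request.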

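In your authentication view or token-based login, explicitly normalize before calling authenticate(). This example normalizes in the view and then calls Django’s built-in authenticate(), so the lookup uses the same form as the stored username (NFKC when the account was created through Django’s own manager or forms).

import base64
import unicodedata
from django.contrib.auth import authenticate
from django.http import HttpResponse

def login_with_basic_auth(request):
    auth = request.META.get('HTTP_AUTHORIZATION', '')
    if auth.startswith('Basic '):
        encoded = auth.split(' ', 1)[1]
        try:
            decoded = base64.b64decode(encoded).decode('utf-8')
        except ValueError:
            decoded = ''  # malformed header: fall through to the 401 below
        username, _, password = decoded.partition(':')
        # NFKC matches Django's normalize_username(); NFC for the password
        username = unicodedata.normalize('NFKC', username)
        password = unicodedata.normalize('NFC', password)
        user = authenticate(request, username=username, password=password)
        if user is not None:
            # proceed with login
            return HttpResponse(status=204)
    # Same response for every failure, so callers cannot tell which step failed
    response = HttpResponse(status=401)
    # The realm name is illustrative
    response['WWW-Authenticate'] = 'Basic realm="api", charset="UTF-8"'
    return response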

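Additionally, ensure that usernames are normalized at rest with the same form used at login. Django’s built-in create_user() and UserCreationForm already apply NFKC through normalize_username(); for custom user models or import scripts that bypass them, override clean() or use validators to normalize usernames before saving. Because passwords are hashed exactly as supplied, normalize them when they are set as well as when they are checked, otherwise existing hashes will not match. Combine these practices with rate limiting and proper error handling to reduce enumeration risks flagged by middleBrick’s Rate Limiting and Data Exposure checks.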
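The need to normalize at rest as well as at login can be shown with a plain hash comparison. The sketch below uses hashlib’s PBKDF2 as a stand-in for a password hasher; it is not Django’s hashing API, and the function name hash_password, the fixed salt, and the low iteration count are demo-only assumptions.

```python
import hashlib
import unicodedata


def hash_password(password: str, salt: bytes, normalize: bool) -> bytes:
    """Derive a PBKDF2-SHA256 hash, optionally normalizing the password to NFC first."""
    if normalize:
        password = unicodedata.normalize("NFC", password)
    # Iteration count kept low for the demo; real deployments use far more.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 1000)


salt = b"fixed-demo-salt"   # fixed only to make the comparison reproducible
composed = "p\u00e4ss"      # "päss" with a precomposed ä
decomposed = "pa\u0308ss"   # "päss" as a + combining diaeresis

# Without normalization the two spellings hash differently.
raw_match = hash_password(composed, salt, False) == hash_password(decomposed, salt, False)
# Normalizing both at set time and at check time makes them agree.
normalized_match = hash_password(composed, salt, True) == hash_password(decomposed, salt, True)
```

Here raw_match is False and normalized_match is True: normalizing only at login would break accounts whose stored hash was derived from a non-normalized password, so both code paths have to apply the same form.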

Frequently Asked Questions

Why does Unicode normalization matter when using HTTP Basic Auth in Django?
Because HTTP Basic Auth transmits credentials as raw bytes, while Django normalizes usernames when accounts are created and hashes passwords exactly as supplied. Mismatches between decomposed and composed forms can make authentication behave inconsistently and, where only one side is normalized, let attackers use visually identical but different byte sequences.
Can middleBrick detect Unicode normalization issues during a scan?
Yes, middleBrick’s Authentication check can flag endpoints where credentials are accepted via Basic Auth but normalization is not enforced before comparison, and Data Exposure findings may highlight inconsistencies in error handling that aid enumeration.