Unicode Normalization in Django with Mutual TLS
Unicode Normalization in Django with Mutual TLS: how this specific combination creates or exposes the vulnerability
Unicode normalization affects how strings are compared and stored. In Django, model fields and form validation often rely on exact string matching. When a client presents a certificate containing Unicode identities (for example, common names or organizational units), the server-side certificate parsing may hand the application a normalized or an unnormalized representation before it is used in business logic. If the application compares certificate-derived identifiers (such as usernames or tenant keys) without normalizing them first, an attacker can bypass authorization checks by presenting a composed or decomposed equivalent of an existing identifier. Homoglyphs from other scripts are a related risk, but normalization alone does not fold them. This becomes particularly relevant in Mutual TLS setups where client certificates carry identity information used to map to user accounts or scopes.
Mutual TLS requires both the client and the server to present valid certificates. In Django, this typically means configuring the HTTP layer (the WSGI/ASGI server or a reverse proxy) to request and validate client certificates, then extracting identity details from the certificate’s subject fields. Django itself does not parse certificates, so the string your view receives depends on whichever component decoded the subject: Python’s ssl module, a library such as pyOpenSSL, or a proxy forwarding the value in a header. None of these is guaranteed to apply Unicode normalization, so the value generally reaches the application in whatever form the issuer encoded. An attacker can exploit this by obtaining a certificate whose username is visually identical to an existing one but uses a different normalization form, leading to authentication or authorization confusion. For example, a certificate subject may carry josé with a precomposed é (U+00E9) while the database stores the same name as e (U+0065) followed by COMBINING ACUTE ACCENT (U+0301). A raw comparison treats these as different strings, so two visually identical accounts can coexist; and if one layer normalizes while another does not, a check may pass for one account while the lookup resolves to the other.
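The composed and decomposed forms can be compared directly with the standard library. The following is a minimal sketch using only unicodedata; the variable names are illustrative and not taken from any particular application.

```python
import unicodedata

# Precomposed form: "é" is the single code point U+00E9.
composed = "jos\u00e9"
# Decomposed form: "e" (U+0065) followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "jose\u0301"

# The two strings render identically but are different code point sequences.
raw_equal = composed == decomposed

# After normalizing both sides to the same form, they compare equal.
nfc_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(raw_equal, nfc_equal)  # False True
```

The raw comparison fails even though a person reading the two names would see no difference, which is exactly the gap an exact-match lookup leaves open.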
Additionally, path or host-header logic that incorporates certificate fields can be abused when normalization differences affect how strings are matched against allowlists or denylists. A check that relies on exact string comparison without normalization treats two logically identical identifiers as different, so a blocked value can slip past a denylist in its alternate form, or a second account can be created alongside an existing one. This can intersect with other findings such as IDOR if the mismatched identity leads to accessing another user’s resources. Since middleBrick tests input validation and authentication across normalized and unnormalized inputs, it can surface these inconsistencies by observing whether endpoints accept mismatched Unicode representations during Mutual TLS authentication.
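One way to keep list matching consistent is to normalize both the list entries and the candidate value with the same function. The sketch below assumes a hypothetical allowlist of tenant keys; `canonical`, `ALLOWED_TENANTS`, and `is_allowed_tenant` are illustrative names, and NFC is just the form chosen for this example.

```python
import unicodedata


def canonical(value: str) -> str:
    """Return the NFC form used for every comparison in this sketch."""
    return unicodedata.normalize("NFC", value)


# Hypothetical allowlist of tenant keys, normalized once at load time.
ALLOWED_TENANTS = frozenset(canonical(t) for t in ["acme", "jos\u00e9-labs"])


def is_allowed_tenant(cert_value: str) -> bool:
    """Check a certificate-derived value against the normalized allowlist."""
    return canonical(cert_value) in ALLOWED_TENANTS
```

With this in place, is_allowed_tenant("jose\u0301-labs") matches the precomposed entry. Note that NFC does not fold lookalike characters from other scripts, so a value beginning with Cyrillic U+0430 instead of Latin "a" is still treated as a different tenant.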
Mutual TLS-Specific Remediation in Django: concrete code fixes
To securely handle Unicode normalization in a Mutual TLS-enabled Django application, normalize certificate-derived identifiers before using them in authentication or authorization logic. Use Python’s standard library unicodedata module to apply one consistent form, typically NFC or NFD, to both the values extracted from certificates and the values stored in the database. Combine this with strict certificate validation and explicit mapping of certificate fields to user accounts.
Example: Configure Mutual TLS in Django with normalized certificate extraction
import ssl
from django.conf import settings

def get_ssl_context():
    # Server-side context: Purpose.CLIENT_AUTH means "authenticate clients"
    ssl_context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Reject the handshake unless the client presents a valid certificate
    ssl_context.verify_mode = ssl.CERT_REQUIRED
    ssl_context.load_cert_chain(
        certfile=settings.SSL_CERT_PATH,
        keyfile=settings.SSL_KEY_PATH,
    )
    # Load trusted CA certificates for client verification
    ssl_context.load_verify_locations(cafile=settings.SSL_CA_PATH)
    # check_hostname is left at its server-side default: hostname checking
    # applies to client-side sockets only, so the client identity is
    # verified at the application layer instead
    return ssl_context
In your WSGI server or ASGI configuration, use this SSL context so that client certificates are validated before the Django request reaches your views. Once a client certificate is validated, extract subject fields and normalize them.
from unicodedata import normalize

def extract_normalized_identity(cert):
    # Assumes cert is a pyOpenSSL X509 object, which provides get_subject()
    subject = cert.get_subject()
    # CN may be missing or None; fall back to an empty string
    common_name = getattr(subject, 'CN', None) or ''
    # Apply NFC normalization for consistent comparison
    return normalize('NFC', common_name)
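If the certificate is instead available as the dictionary returned by the standard library’s ssl.SSLSocket.getpeercert(), the subject is a nested tuple rather than an object with attributes. The following is a hedged sketch for that case; `common_name_from_peercert` and the sample dictionary are illustrative, and how the dictionary reaches your Django code depends on the server in front of it and is not shown here.

```python
import unicodedata


def common_name_from_peercert(peercert: dict) -> str:
    """Return the NFC-normalized commonName from a getpeercert()-style dict.

    Returns an empty string when no commonName is present.
    """
    # 'subject' is a tuple of RDNs; each RDN is a tuple of (name, value) pairs.
    for rdn in peercert.get("subject", ()):
        for name, value in rdn:
            if name == "commonName":
                return unicodedata.normalize("NFC", value)
    return ""


# Hand-written sample in the shape getpeercert() produces (not a real cert).
sample = {
    "subject": (
        (("organizationName", "Example Org"),),
        (("commonName", "jose\u0301"),),
    )
}
identity = common_name_from_peercert(sample)
```

Here the decomposed common name in the sample comes back in its precomposed NFC form, so it can be compared against stored values that were normalized the same way.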
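Use this normalized identity when querying your user model. The lookup is only reliable if stored usernames were normalized with the same form, so ensure registration and login paths also normalize input before storing or comparing it. Note that Django’s built-in normalize_username() on AbstractBaseUser applies NFKC; if accounts are created through create_user(), consider using NFKC for certificate identities as well so both sides agree.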
from django.contrib.auth import get_user_model

User = get_user_model()

def authenticate_with_certificate(cert):
    identity = extract_normalized_identity(cert)
    # A certificate without a usable common name must never match a user
    if not identity:
        return None
    try:
        # Exact match against usernames stored in the same normalization form
        return User.objects.get(username=identity)
    except User.DoesNotExist:
        # Handle missing user appropriately, e.g., log and deny access
        return None
For authorization, avoid relying on raw certificate fields in URLs or query parameters. Instead, map the normalized identity to a scoped token or session and enforce object-level permissions using Django’s permission system or a policy library. This reduces the risk of IDOR when normalization discrepancies would otherwise allow username confusion.
Finally, include normalization in your validation tests so that middleware or unit tests verify consistent handling. middleBrick’s checks around input validation and authentication can be used to confirm that your endpoints treat non-normalized equivalents of accepted identities consistently during Mutual TLS authentication, either resolving them to the same account or rejecting them outright, and never to a different account.
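A normalization test can be as small as the sketch below. It uses only unittest and unicodedata; `normalize_identity` is a hypothetical stand-in for whatever helper your application uses, and in a Django project the same assertions could live in a django.test.TestCase.

```python
import unittest
import unicodedata


def normalize_identity(value: str) -> str:
    # Hypothetical stand-in for the application's normalization helper.
    return unicodedata.normalize("NFC", value)


class NormalizationTests(unittest.TestCase):
    def test_decomposed_matches_composed(self):
        # "e" + combining acute must resolve to the same identity as U+00E9.
        self.assertEqual(
            normalize_identity("jose\u0301"),
            normalize_identity("jos\u00e9"),
        )

    def test_idempotent(self):
        # Normalizing an already normalized value must not change it.
        once = normalize_identity("jose\u0301")
        self.assertEqual(normalize_identity(once), once)

    def test_distinct_names_stay_distinct(self):
        # Normalization must not merge genuinely different usernames.
        self.assertNotEqual(
            normalize_identity("jos\u00e9"),
            normalize_identity("jose"),
        )


if __name__ == "__main__":
    unittest.main()
```

The third test guards against over-normalizing: the goal is to unify equivalent encodings of one name, not to collapse accented and unaccented names into a single account.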