HIGH · XML External Entities · Django · Basic Auth

XML External Entities in Django with Basic Auth

XML External Entities in Django with Basic Auth: how this specific combination creates or exposes the vulnerability

XML External Entity (XXE) injection is a web security vulnerability that occurs when an application parses user-supplied XML without adequate safeguards, allowing an attacker to declare external entities in the document’s DTD and have the parser resolve them. Django itself does not parse XML request bodies; the vulnerability typically arises when a view hands the body to a library such as lxml with unsafe parser options enabled. (The standard library’s xml.etree does not fetch external entities in current Python versions, although entity-expansion attacks can still affect it depending on the bundled Expat version.) When Basic Authentication is used for access control, the presence of a credential check can create a false sense of security. Basic Auth only identifies the caller: the credentials are Base64-encoded rather than encrypted, so their confidentiality depends on HTTPS, and neither layer restricts what the server does with the request body. If an endpoint accepts XML configuration or other data and processes it with a vulnerable parser, the authentication layer does nothing to prevent malicious entity expansion by any caller who holds, or has stolen, valid credentials.
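To make the limits of Basic Auth concrete, here is a minimal standard-library sketch that decodes an Authorization header. The `decode_basic_auth` helper and the sample credentials are illustrative assumptions, not part of Django. It shows that the scheme merely Base64-encodes `username:password` and says nothing about the request body that follows.

```python
import base64
from typing import Optional, Tuple


def decode_basic_auth(header_value: str) -> Optional[Tuple[str, str]]:
    """Return (username, password) from an Authorization header, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        # ValueError covers invalid Base64 (binascii.Error) and bad UTF-8.
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


# Hypothetical credentials, encoded the way a client would send them.
header = "Basic " + base64.b64encode(b"alice:s3cret").decode("ascii")
print(decode_basic_auth(header))
```

Anyone who can observe the header can reverse it in the same way, which is why the scheme is only meaningful over HTTPS, and why it offers no protection once an authenticated request reaches the XML parser.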

Consider an API endpoint that checks the username and password supplied in the Basic Auth header and then parses an XML request body to extract configuration or other parameters. If the parser resolves external entities, an attacker can supply an XML document that references a file URL (e.g., file:///etc/passwd) or triggers server-side request forgery (SSRF) by fetching internal metadata services. With lxml, common triggers include parsing with resolve_entities=True (historically the default), enabling load_dtd=True, or setting no_network=False; more generally, any parser that accepts DOCTYPE declarations and expands their entities is at risk. The combination of Basic Auth and XXE is particularly insidious because developers may assume that requiring a username and password eliminates injection risks, while the vulnerability persists in the XML processing path. Attack techniques include reading sensitive files, performing SSRF against internal endpoints, and causing denial of service through nested internal entities (the "billion laughs" attack).
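The scenario above can be turned into a regression probe for an endpoint you control. The sketch below builds, but does not send, an authenticated POST carrying a classic XXE test document; the URL, the test account, and the `build_probe_request` helper are all hypothetical names chosen for illustration.

```python
import base64
import urllib.request

# Classic XXE probe: the DOCTYPE declares an external entity backed by a
# local file, and the document body references that entity.
XXE_PAYLOAD = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE config [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<config><name>&xxe;</name></config>
"""


def build_probe_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST carrying the XXE probe."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=XXE_PAYLOAD,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/xml",
        },
    )


# Hypothetical staging endpoint and test account.
req = build_probe_request("https://staging.example.test/api/config", "tester", "tester-pw")
```

A correctly hardened endpoint should reject this body or leave the entity unexpanded; the valid credentials on the request make no difference to how the parser treats the DOCTYPE.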

An illustrative vulnerable Django view might accept an XML payload and parse it with features that allow external entities:

from lxml import etree
from django.http import HttpResponse

def vulnerable_xxe_view(request):
    if request.method == 'POST':
        data = request.body
        # Dangerous: enables external entity resolution
        parser = etree.XMLParser(resolve_entities=True, no_network=False)
        tree = etree.fromstring(data, parser=parser)
        # Process the XML tree...
        return HttpResponse('Processed')
    return HttpResponse('Send XML', status=400)

In this example, even if the request includes a Basic Auth header, the parser is configured to resolve entities and potentially load external resources. An attacker could send an XML document with a DOCTYPE that defines an entity pointing to file:///etc/shadow, causing the parser to read and possibly exfiltrate that file if the application uses the parsed data in further operations. The risk is not theoretical; findings from scans using middleBrick have surfaced real-world endpoints where XML parsing logic intersects with weak access controls, leading to information disclosure. Mitigation requires explicit disabling of external entity resolution and, where possible, avoiding XML parsing entirely in favor of safer data formats.

Basic Auth-Specific Remediation in Django: concrete code fixes

Remediation addresses two layers: removing unsafe XML parser features, and making sure credentials are verified before any part of the request body is parsed. For XML parsing, disable DTD loading and external entity resolution. In lxml, set resolve_entities=False and no_network=True and leave load_dtd at False, so the parser neither expands entities nor fetches external resources. When possible, prefer JSON for data exchange, which avoids DTD-related risks altogether. If XML is required, reject documents that contain a DOCTYPE declaration, or use a hardened parser that does so for you. Schema validation alone is not a defense, because entities are resolved during parsing, before any schema is applied.
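One way to reject DOCTYPE declarations outright, sketched here with only the standard library, is to run the body through Expat with a handler that aborts as soon as a DOCTYPE starts. The `reject_doctype` function and `ForbiddenXML` exception are hypothetical names; third-party packages such as defusedxml offer hardened parsers along similar lines.

```python
from xml.parsers import expat


class ForbiddenXML(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def reject_doctype(data: bytes) -> None:
    """Raise ForbiddenXML on any DOCTYPE; expat.ExpatError if malformed."""
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called at the start of the DOCTYPE, before entities are declared.
        raise ForbiddenXML(f"DOCTYPE declarations are not allowed: {name!r}")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)  # True marks this as the final chunk


benign = b"<config><name>demo</name></config>"
hostile = (
    b'<!DOCTYPE config [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<config><name>&xxe;</name></config>"
)

reject_doctype(benign)  # passes silently: no DOCTYPE present
try:
    reject_doctype(hostile)
except ForbiddenXML as exc:
    print("rejected:", exc)
```

In a view, a check like this would run after the credentials are verified and before the body is handed to lxml, so a document with a DTD never reaches the main parser.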

Below are concrete Django code examples that combine safe XML parsing with Basic Auth validation. The first example shows a view that verifies the Basic Auth credentials with django.contrib.auth.authenticate before reading the body, and then parses the XML with entity resolution disabled:

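import base64

from django.contrib.auth import authenticate
from django.http import HttpResponse
from lxml import etree

def _basic_auth_user(request):
    # Decode "Authorization: Basic <base64 of username:password>"
    scheme, _, encoded = request.META.get('HTTP_AUTHORIZATION', '').partition(' ')
    if scheme.lower() != 'basic':
        return None
    try:
        decoded = base64.b64decode(encoded).decode('utf-8')
    except ValueError:  # invalid Base64 or invalid UTF-8
        return None
    username, _, password = decoded.partition(':')
    return authenticate(request, username=username, password=password)

def secure_xxe_view(request):
    # Verify Basic Auth credentials before touching the request body
    if _basic_auth_user(request) is None:
        response = HttpResponse('Invalid credentials', status=401)
        response['WWW-Authenticate'] = 'Basic realm="api"'  # realm name is illustrative
        return response
    if request.method == 'POST':
        data = request.body
        # Secure: never resolve entities, load DTDs, or fetch over the network
        parser = etree.XMLParser(resolve_entities=False, no_network=True,
                                 load_dtd=False, strip_cdata=True)
        try:
            tree = etree.fromstring(data, parser=parser)
            # Process the XML tree safely
            return HttpResponse('Processed securely')
        except etree.XMLSyntaxError:
            return HttpResponse('Invalid XML', status=400)
    return HttpResponse('Send XML', status=400)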

The second example demonstrates Django REST Framework with Basic Authentication and an lxml parser configured to leave external entities unresolved. This approach is useful when you need structured APIs and want to rely on DRF’s authentication and permission classes, which reject unauthenticated requests before post() runs:

from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework.permissions import IsAuthenticated
from rest_framework.authentication import BasicAuthentication
from lxml import etree

class SecureXMLView(APIView):
    authentication_classes = [BasicAuthentication]
    permission_classes = [IsAuthenticated]

    def post(self, request):
        data = request.body
        # Disable external entities and network access
        parser = etree.XMLParser(resolve_entities=False, no_network=True,
                                 load_dtd=False, strip_cdata=True)
        try:
            etree.fromstring(data, parser=parser)
            return Response({'status': 'ok'})
        except etree.XMLSyntaxError:
            return Response({'error': 'Invalid XML'}, status=400)

In both examples, the key remediation steps are: (1) use resolve_entities=False, (2) set no_network=True to prevent network-based entity fetches, (3) disable DTD loading with load_dtd=False, and (4) ensure that authentication is validated before processing the payload. middleBrick scans can surface remaining XML processing risks by correlating runtime behavior with OpenAPI specifications, helping teams verify that such unsafe configurations are not present in deployed endpoints.

Frequently Asked Questions

Does Basic Auth prevent XXE if credentials are transmitted over HTTPS?
No. HTTPS protects credentials in transit but does not affect XML parsing behavior. If the server processes XML with external entity resolution enabled, an attacker can still inject malicious entities and access local files or trigger SSRF, regardless of Basic Auth.

Can middleBrick detect XXE in APIs that require Basic Auth?
Yes. middleBrick performs unauthenticated black-box scanning and can identify endpoints that accept XML and exhibit unsafe parser configurations. Findings include specific guidance on disabling DTD and entity resolution in frameworks such as Django.