XML External Entities in Flask
How XML External Entities Manifest in Flask
XML External Entity (XXE) injection occurs when an application parses untrusted XML input without disabling external entity resolution. In Flask, this vulnerability typically appears in endpoints that accept XML payloads, which are common in legacy integrations, SOAP APIs, and third-party webhook handlers. Flask ships no XML parser of its own and therefore no XML hardening: the request body is handed to whatever library the developer chooses, and configuring that library safely is entirely the developer’s responsibility. Framework features aimed at other attack classes, such as CSRF protection, do nothing to prevent XXE.
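Whether an XXE exploit succeeds depends on how the chosen parser treats entities. lxml resolves external entities by default in releases before 5.0, so an unconfigured lxml parser on those versions can be made to read local files. Python’s xml.etree.ElementTree does not fetch external entities (it raises ParseError when it meets one), but it still expands internal entities and so can be exposed to entity-expansion attacks, depending on the underlying Expat version. For example, consider this Flask route, which parses the request body with no parser hardening: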
from flask import Flask, request, jsonify
import xml.etree.ElementTree as ET

app = Flask(__name__)


@app.route('/upload', methods=['POST'])
def upload():
    data = request.data.decode('utf-8')
    # Untrusted input is parsed with default settings and no hardening
    root = ET.fromstring(data)
    return jsonify({'status': 'ok', 'id': root.find('id').text})

Here, an attacker could submit XML that declares an external entity in an attempt to exfiltrate files. Against a parser that resolves external entities, the file contents come back in the id field:
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>
  <id>&xxe;</id>
</data>

Flask’s request.data and request.stream provide raw request bodies, making it easy to inadvertently parse untrusted XML. Other common patterns include:
- Using lxml.etree.fromstring() without resolve_entities=False
- Passing XML from request.form (e.g., Content-Type: text/xml in multipart forms)
- Integrating with SOAP services via zeep or spyne without disabling external DTDs
Because Flask applies no XML hardening of its own, every XML parser instantiation is a potential vector. Platforms that bundle XML handling can expose a central setting for it (for example, .NET’s XmlReaderSettings.DtdProcessing); Flask has no equivalent switch, so each parser must be configured where it is created.
Flask-Specific Detection
XXE in Flask is detectable through static analysis, runtime scanning, and dynamic probing. MiddleBrick identifies this risk via its Input Validation and Data Exposure checks, using black-box payloads that trigger XXE behavior without requiring source code access.
Key detection signals:
- XML responses containing file paths or system errors (e.g., <error>file:///etc/passwd not found</error>)
- Out-of-band (OOB) DNS exfiltration attempts when the target resolves attacker-controlled subdomains
- Timeouts or resource exhaustion from billion laughs attacks
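The first signal can be reproduced locally before any scanner is involved. The following is a minimal sketch, not part of MiddleBrick: it assumes a hypothetical helper named classify_parser and a hypothetical canary string, writes the canary to a temporary file, and asks a parser to resolve an external entity pointing at it. Only the standard library is used.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable

MARKER = "xxe-probe-marker"  # hypothetical canary string, chosen for this sketch


def classify_parser(parse: Callable[[bytes], ET.Element]) -> str:
    """Return 'resolved', 'rejected', or 'ignored' for an XXE file probe."""
    with tempfile.TemporaryDirectory() as tmp:
        canary = Path(tmp) / "canary.txt"
        canary.write_text(MARKER, encoding="utf-8")
        payload = (
            '<?xml version="1.0"?>\n'
            f'<!DOCTYPE data [<!ENTITY xxe SYSTEM "{canary.as_uri()}">]>\n'
            "<data><id>&xxe;</id></data>"
        ).encode("utf-8")
        try:
            root = parse(payload)
        except Exception:
            # Any parser error counts as a refusal for the purposes of the probe
            return "rejected"
        text = "".join(root.itertext())
        # The canary appearing in the tree means the external entity was read
        return "resolved" if MARKER in text else "ignored"


print(classify_parser(ET.fromstring))  # 'rejected' with the standard-library parser
```

Passing a differently configured parse function (for example one built on lxml) to the same helper shows whether that configuration reads the file, refuses the document, or silently drops the entity.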
MiddleBrick tests Flask endpoints by sending malicious XML payloads like:
<!DOCTYPE data [
  <!ENTITY a "xxxxxxxxxx">
  <!ENTITY b "&a;&a;&a;&a;&a;">
  <!ENTITY c "&b;&b;&b;&b;">
  <!ENTITY d "&c;&c;&c;">
  <!ENTITY e "&d;&d;">
  <!ENTITY f "&e;&e;">
  <!ENTITY g "&f;"> <!ENTITY h "&g;"> <!ENTITY i "&h;"> <!ENTITY j "&i;"> <!ENTITY k "&j;">
  <!ENTITY l "&k;"> <!ENTITY m "&l;"> <!ENTITY n "&m;"> <!ENTITY o "&n;"> <!ENTITY p "&o;">
  <!ENTITY q "&p;"> <!ENTITY r "&q;"> <!ENTITY s "&r;"> <!ENTITY t "&s;"> <!ENTITY u "&t;">
  <!ENTITY v "&u;"> <!ENTITY w "&v;"> <!ENTITY x "&w;"> <!ENTITY y "&x;"> <!ENTITY z "&y;">
]>
<data>&z;</data>

and checks for:
- HTTP 500/503 responses indicating parser exhaustion
- XML comments or error traces leaking stack traces (e.g., ElementTree.ParseError: undefined entity)
- External resource fetch attempts logged in server DNS queries
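Parser exhaustion comes from nested entities multiplying in size at each level. As a rough illustration, the sketch below estimates how large an entity becomes after full expansion without actually expanding it. The function name expanded_length and the regular expressions are assumptions made for this example; they handle only simple double-quoted internal entities and assume the declarations are not cyclic.

```python
import re
from functools import lru_cache

# Simplified patterns: double-quoted internal entities with word-character names
ENTITY_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r"&(\w+);")


def expanded_length(dtd: str, name: str) -> int:
    """Length in characters of entity `name` after full expansion."""
    entities = dict(ENTITY_DECL.findall(dtd))

    @lru_cache(maxsize=None)
    def length(entity: str) -> int:
        value = entities[entity]
        literal = len(ENTITY_REF.sub("", value))  # characters outside references
        nested = sum(length(ref) for ref in ENTITY_REF.findall(value))
        return literal + nested

    return length(name)


# Classic shape: a 3-character base entity, then 9 levels of 10 references each
levels = ['<!ENTITY lol0 "lol">']
for i in range(1, 10):
    refs = f"&lol{i - 1};" * 10
    levels.append(f'<!ENTITY lol{i} "{refs}">')

print(expanded_length("\n".join(levels), "lol9"))  # 3000000000
```

A payload of under a kilobyte therefore asks the parser to materialise about three billion characters, which is why timeouts and 500/503 responses are treated as detection signals.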
Manually, developers can use the middlebrick CLI to scan Flask endpoints:
middlebrick scan http://localhost:5000/upload \
  --header "Content-Type: text/xml" \
  --payload-file xxe-payload.xml

MiddleBrick reports XXE under the Input Validation category with critical severity and maps it to OWASP API Top 10 (A04:2023 – Insecure Design) and CWE-611.
Flask-Specific Remediation
Remediation in Flask requires configuring Python’s XML parsers explicitly. There are two robust strategies:
1. Use Safe Parsers with Entity Resolution Disabled
The standard-library xml.etree.ElementTree.XMLParser has no resolve_entities option (that keyword belongs to lxml), so the practical way to harden ElementTree-style code is to use defusedxml as a drop-in replacement:
from flask import Flask, request, jsonify
import defusedxml.ElementTree as ET  # Drop-in replacement for xml.etree.ElementTree

app = Flask(__name__)


@app.route('/upload', methods=['POST'])
def upload():
    data = request.data.decode('utf-8')
    root = ET.fromstring(data)  # defusedxml blocks external entities by default
    return jsonify({'status': 'ok', 'id': root.find('id').text})

defusedxml is the safest choice for Flask apps: it enforces secure defaults and addresses entity-expansion issues such as CVE-2013-1664.
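Where adding a dependency is not an option, the same idea can be approximated with the standard library alone. The sketch below is not a replacement for defusedxml; it is loosely similar in spirit to defusedxml's forbid_dtd option and is stricter than defusedxml's defaults, because it refuses any document that contains a DOCTYPE declaration at all. The names DTDForbidden and parse_untrusted are hypothetical and chosen for this example.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # Called by Expat as soon as it reaches a DOCTYPE declaration
    raise DTDForbidden(f"DOCTYPE declaration not allowed: {name!r}")


def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML bytes, refusing any document that declares a DTD."""
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _forbid_doctype
    # First pass: raises DTDForbidden, or expat.ExpatError for malformed XML
    scanner.Parse(data, True)
    # Second pass: without a DTD the document cannot declare any entities
    return ET.fromstring(data)


root = parse_untrusted(b"<data><id>7</id></data>")
print(root.findtext("id"))  # 7
```

The document is parsed twice, which trades some throughput for keeping the check independent of ElementTree internals. In a Flask view, both DTDForbidden and parse errors would typically be mapped to a 400 response.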
2. Disable DTDs Explicitly with lxml
If using lxml, configure the parser so that it neither resolves entities nor touches the network:
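from flask import Flask, request, jsonify
from lxml import etree

app = Flask(__name__)

# One explicitly configured parser, reused by the route below
parser = etree.XMLParser(resolve_entities=False, no_network=True, dtd_validation=False)


@app.route('/upload', methods=['POST'])
def upload():
    # Pass the raw bytes so lxml can honour the document's own encoding declaration
    root = etree.fromstring(request.data, parser=parser)
    return jsonify({'status': 'ok', 'id': root.find('id').text})

Key flags:
- resolve_entities=False: entity references are left unexpanded instead of being replaced with their content
- no_network=True: prevents network access when looking up external documents (this is also lxml's default)
- dtd_validation=False: the document is not validated against a DTD (also the default); on its own this does not reject documents that contain a DTD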
Avoid calling the standard library's ET.parse() or ET.fromstring() directly on untrusted input; route it through defusedxml or an explicitly configured lxml parser instead. Also add middleware that validates the Content-Type header so that non-XML requests are rejected early:
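@app.before_request
def validate_content_type():
    # Sketch: assumes every POST endpoint in this app expects an XML body
    if request.method != 'POST':
        return None
    if request.mimetype not in ('application/xml', 'text/xml'):
        return 'Expected an XML Content-Type', 415
    if not request.content_length:
        return 'Missing XML body', 400

For compliance, pin the XML libraries in use (for example defusedxml or lxml) in requirements.txt and install them from your Dockerfile so that every environment runs the same, safely configured parsers.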
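A substring test such as 'xml' in ct also matches headers where "xml" only appears inside a parameter. As a hedged alternative, the sketch below checks the media type itself; the helper name is_xml_media_type and the accepted set are assumptions for this example, and the +xml suffix rule is included so that types like application/soap+xml pass.

```python
from typing import Optional

# Assumed allow-list for this sketch; adjust to the types your endpoints accept
XML_MEDIA_TYPES = frozenset({"application/xml", "text/xml"})


def is_xml_media_type(content_type: Optional[str]) -> bool:
    """True if the Content-Type header names an XML media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in XML_MEDIA_TYPES or media_type.endswith("+xml")


print(is_xml_media_type("application/xml; charset=utf-8"))  # True
print(is_xml_media_type("text/plain; note=xml"))            # False
```

Inside a Flask before_request hook, the function could be called with request.content_type and a non-matching request answered with a 415 response.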
Frequently Asked Questions
Does Flask’s built-in request.get_json() protect against XXE?
request.get_json() only parses JSON, not XML. XXE occurs when XML is parsed via xml.etree.ElementTree, lxml, or SOAP libraries. Always check the Content-Type header and payload format before parsing.
Can using flask-wtf prevent XXE in form-submitted XML?
flask-wtf provides CSRF protection and form validation but does not sanitize or restrict XML content. If your form accepts XML (e.g., via request.form['xml_data']), you must still disable external entities during parsing.