XML External Entities on Docker
How XML External Entities Manifest in Docker
XML External Entity (XXE) attacks in Docker environments exploit the way XML parsers process external references, creating unique attack vectors when containerized applications handle untrusted XML input. In Docker contexts, XXE vulnerabilities become particularly dangerous because containers often process configuration files, API requests, or data from external sources without proper validation.
The most common Docker-specific XXE manifestation occurs in applications that parse XML configurations or API payloads. Consider a Node.js application running in a Docker container that processes XML uploads. If the application uses a vulnerable XML parser without disabling external entity processing, an attacker can craft a malicious XML document that reads sensitive files from the container's filesystem or even the host system if volumes are improperly mounted.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<foo>&xxe;</foo>
In a Docker container, this attack can be amplified by the container's filesystem layout. Any process, privileged or not, can typically read /proc/self/cgroup, which reveals container metadata. If the application runs as root or has elevated privileges, the payload can also read protected files inside the container and attempt to read from mounted volumes that expose host system files.
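To make the difference between a vulnerable and a non-resolving parser concrete, the following minimal sketch feeds the payload above to Python's standard-library ElementTree parser, which does not fetch external entities and reports the reference as a parse error instead. The function name try_parse is hypothetical and used only for illustration; a parser with external entities enabled would return the file contents at this point instead.

```python
import xml.etree.ElementTree as ET

# The payload from the text, kept as bytes so the encoding declaration applies.
XXE_PAYLOAD = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<foo>&xxe;</foo>"""


def try_parse(xml_bytes: bytes) -> str:
    """Parse untrusted XML and report whether the parser accepted it."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # ElementTree does not fetch the external entity; the reference is
        # reported as an error rather than replaced by file contents.
        return f"rejected: {exc}"
    return f"parsed: {root.text!r}"


if __name__ == "__main__":
    print(try_parse(XXE_PAYLOAD))
    print(try_parse(b"<foo>hello</foo>"))
```

Running the same payload against the parser your service actually uses, inside the same container image, is a quick way to confirm which of the two behaviors you have.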
Another Docker-specific scenario involves XML processing in build stages. Multi-stage Docker builds often run tools that consume XML for dependency management (Maven POMs, NuGet project files, and the Maven metadata that Gradle resolves). If these build tools process untrusted XML without proper validation, attackers could inject XXE payloads during the build process, potentially causing the build to fail or to leak build secrets.
Container orchestration platforms add another layer of complexity. When XML data flows between microservices in Kubernetes or Docker Swarm, an XXE vulnerability in one service can spread: malicious XML that is forwarded unchanged lets attackers compromise downstream services that parse it.
Docker-Specific Detection
Detecting XXE vulnerabilities in Docker environments requires both static analysis of container images and runtime scanning of running containers. The most effective approach combines automated scanning with manual code review.
middleBrick's Docker-specific XXE detection examines running containers for vulnerable XML processing patterns. When you scan a Docker-hosted API endpoint, middleBrick tests for XXE by sending crafted XML payloads and analyzing the responses for signs of successful entity expansion or file access attempts.
For manual detection, examine your Dockerfile and application code for XML processing libraries. The Dockerfile shows which runtime, dependencies, and entry point need review, for example:
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
In your application code, search for XML parser configurations that might be vulnerable:
// Vulnerable Java code
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(inputStream); // No security features enabled
middleBrick specifically tests for XXE by attempting to access well-known paths within the container, such as /proc/self/cgroup, /etc/hostname, and /proc/version. If the application responds with data from these paths, it indicates a successful XXE attack.
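The idea of recognizing file contents in a response can be sketched as a small heuristic. This is not middleBrick's actual logic; the signature table and the function name disclosed_paths are assumptions made for illustration, and /etc/hostname is left out because its contents have no recognizable shape.

```python
import re

# Hypothetical signatures: each pattern matches typical contents of the file.
# "(?:^|>)" allows the content to start a line or follow an XML start tag.
DISCLOSURE_SIGNATURES = {
    "/etc/passwd": re.compile(r"(?:^|>)root:[^:\n]*:0:0:", re.MULTILINE),
    "/proc/version": re.compile(r"Linux version \d+\.\d+"),
    "/proc/self/cgroup": re.compile(r"(?:^|>)\d+:[^:\n]*:/", re.MULTILINE),
}


def disclosed_paths(response_body: str) -> list[str]:
    """Return the well-known paths whose contents appear in the response."""
    return [
        path
        for path, pattern in DISCLOSURE_SIGNATURES.items()
        if pattern.search(response_body)
    ]


if __name__ == "__main__":
    sample = "<result>root:x:0:0:root:/root:/bin/ash</result>"
    print(disclosed_paths(sample))
```

A match is only a hint that an entity was expanded; a response that echoes the request body or contains similar-looking text would need manual confirmation.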
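The scanner also checks for XML parsers configured with external entity processing enabled. For example, in Python applications, it looks for patterns where xml.sax or xml.etree.ElementTree is used without disabling external entity loading. Current CPython releases do not resolve external entities in these modules by default (xml.sax since Python 3.7.1), so such findings matter most on older runtimes or where the feature has been explicitly re-enabled.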
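For the manual side of this review, a simple line-by-line pattern search can point out where XML parsers are created. The sketch below is an assumption-laden illustration, not a description of any scanner: the pattern table covers only the APIs named in this article, and the function name find_xml_parser_uses is hypothetical.

```python
import re

# Illustrative patterns only; each hit is a place to review, not a confirmed flaw.
REVIEW_PATTERNS = {
    "Java DOM factory": re.compile(r"DocumentBuilderFactory\.newInstance\(\)"),
    "Python xml.sax": re.compile(r"\bxml\.sax\b"),
    "Python ElementTree": re.compile(r"\bxml\.etree\.ElementTree\b"),
}


def find_xml_parser_uses(source: str) -> list[tuple[int, str]]:
    """Return (line number, label) pairs for lines that create an XML parser."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in REVIEW_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


if __name__ == "__main__":
    code = "import xml.sax\nparser = xml.sax.make_parser()\n"
    for lineno, label in find_xml_parser_uses(code):
        print(f"line {lineno}: {label}")
```

Each hit still needs a human check of how the parser is configured, since the same constructor call appears in both the vulnerable and the hardened Java examples.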
middleBrick's LLM/AI Security module adds another layer of detection by scanning for AI/ML endpoints that might process XML prompts or configuration files, testing for XXE in AI-specific contexts where XML might be used for model configuration or prompt templates.
Docker-Specific Remediation
Remediating XXE vulnerabilities in Docker environments requires a defense-in-depth approach that addresses both the XML processing configuration and the container security posture. The primary mitigation is configuring XML parsers to disable external entity processing.
For Java applications, modify your XML processing code:
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
// Do not resolve external general entities, parameter entities, or external DTDs
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(inputStream);
For Python applications, use safe XML parsing:
# Use defusedxml (third-party package) instead of the standard library parser
from defusedxml.ElementTree import parse

tree = parse('input.xml')
root = tree.getroot()
Integrate these security configurations directly into your Docker build process. Create a multi-stage build that includes security scanning:
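FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Fail the build on high-severity dependency advisories
RUN npm audit --audit-level=high

FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app ./
# Remove development dependencies from the copied node_modules
RUN npm prune --omit=dev
# Run as the unprivileged user that ships with the official Node image
USER node
EXPOSE 3000
CMD ["node", "server.js"]
# No server is listening during the image build, so run the scan from CI
# against a started container instead:
#   npm install -g middlebrick && middlebrick scan http://localhost:3000
Implement runtime security controls by running containers with minimal privileges: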
docker run -d \
  --read-only \
  --user 1000:1000 \
  --cap-drop ALL \
  --cap-add CHOWN \
  -e NODE_ENV=production \
  -p 3000:3000 \
  --name myapp \
  myapp:latest
For applications that must process XML but need to handle external entities safely, restrict which external resources the parser may resolve and apply input validation. Use allowlists for XML schemas and validate all incoming XML against known-good schemas before processing.
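One way to apply allowlisting before the main parser runs is a lightweight pre-parse check. The sketch below makes two assumptions that may not hold for every service: that incoming documents never legitimately need a DOCTYPE, and that the set of acceptable root elements is known in advance. The names RejectedXML and precheck_xml are hypothetical. It uses the standard-library expat bindings and does not replace full schema validation.

```python
from xml.parsers import expat


class RejectedXML(ValueError):
    """Raised when a document fails the pre-parse checks."""


def precheck_xml(xml_bytes: bytes, allowed_roots: frozenset) -> str:
    """Reject DOCTYPEs and unknown root elements; return the root name."""
    parser = expat.ParserCreate()
    seen_root = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DOCTYPE, so refusing it
        # rules out external entity declarations as well.
        raise RejectedXML(f"DOCTYPE {name!r} is not allowed")

    def on_start(name, attrs):
        if not seen_root:
            if name not in allowed_roots:
                raise RejectedXML(f"root element {name!r} is not on the allowlist")
            seen_root.append(name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    try:
        parser.Parse(xml_bytes, True)
    except expat.ExpatError as exc:
        raise RejectedXML(f"malformed XML: {exc}") from exc
    return seen_root[0]


if __name__ == "__main__":
    print(precheck_xml(b"<order><id>1</id></order>", frozenset({"order"})))
```

Documents that pass this check can then be handed to the hardened parser and validated against the full schema.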
middleBrick's continuous monitoring in the Pro plan can help maintain these security controls by regularly scanning your APIs for XXE vulnerabilities, even as your application evolves. The GitHub Action integration allows you to fail builds if XXE vulnerabilities are detected, ensuring that security regressions are caught before deployment.
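The runtime flags from the docker run command can also drift over time, so it may help to check them on deployed containers. The following sketch inspects one entry of the JSON that docker inspect prints and reports missing hardening. The function name audit_container and the wording of the findings are hypothetical, and the check assumes the capability is recorded as ALL, as passed on the command line.

```python
def audit_container(inspect_entry: dict) -> list:
    """Return hardening gaps found in one `docker inspect` entry."""
    host = inspect_entry.get("HostConfig", {})
    config = inspect_entry.get("Config", {})
    findings = []

    if not host.get("ReadonlyRootfs", False):
        findings.append("root filesystem is writable (no --read-only)")

    user = config.get("User", "")
    if user in ("", "0", "root") or user.startswith(("0:", "root:")):
        findings.append("container runs as root (no --user)")

    # CapDrop is null in the JSON when no capabilities were dropped.
    cap_drop = [cap.upper() for cap in (host.get("CapDrop") or [])]
    if "ALL" not in cap_drop:
        findings.append("capabilities are not dropped (no --cap-drop ALL)")

    if host.get("Privileged", False):
        findings.append("container is privileged")

    return findings


if __name__ == "__main__":
    hardened = {
        "Config": {"User": "1000:1000"},
        "HostConfig": {"ReadonlyRootfs": True, "CapDrop": ["ALL"]},
    }
    print(audit_container(hardened))
```

In practice the input would come from parsing the output of docker inspect myapp with json.loads and taking the first element of the resulting list.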