Severity: HIGH | Tags: XPath injection, Firestore

XPath Injection in Firestore

How XPath Injection Manifests in Firestore

Firestore is a NoSQL document store, but applications often store XML‑encoded data (e.g., SAML assertions, SOAP envelopes, or legacy configuration) inside Firestore documents. When the application later retrieves that XML and evaluates an XPath expression built from user‑supplied input, an injection vector appears.

Typical vulnerable flow:

  1. User input (e.g., a search term) is received via an API endpoint.
  2. The endpoint reads a Firestore document that contains an XML string.
  3. The application concatenates the raw input into an XPath query string and passes it to an XPath library (such as the xpath npm package).
  4. The library evaluates the expression against the XML, allowing the attacker to alter the query logic.

Example vulnerable Node.js code using the Firestore SDK:

const { getFirestore, doc, getDoc } = require('firebase/firestore');
const { select } = require('xpath');
const { DOMParser } = require('xmldom');

async function getUserPosts(req, res) {
  const userId = req.query.id; // <-- user‑controlled
  const firestore = getFirestore();
  const xmlDocRef = doc(firestore, 'xmlData', 'userPosts');
  const xmlSnap = await getDoc(xmlDocRef);
  if (!xmlSnap.exists()) {
    return res.status(404).send('Not found');
  }
  const xmlString = xmlSnap.data().payload; // XML stored in Firestore
  const dom = new DOMParser().parseFromString(xmlString, 'text/xml');
  // VULNERABLE: direct concatenation
  const xpathExpr = "//post[author='" + userId + "']";
  const nodes = select(xpathExpr, dom);
  res.json({ results: nodes.map(n => n.textContent) });
}

An attacker can supply an id value such as ' or '1'='1, which produces the XPath expression:

//post[author='' or '1'='1']

which matches every post node, effectively bypassing intended access controls and potentially exposing all stored XML data.

Because the injection occurs after the data is fetched from Firestore, the flaw lives entirely in the application layer. Firestore is not itself vulnerable; it is the channel that delivers the XML to the code that evaluates the tainted expression.
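To see why the predicate collapses, the concatenation step can be looked at in isolation, without Firestore or an XPath library. The following is a minimal sketch; buildPostQuery is a hypothetical helper that mirrors the vulnerable line in getUserPosts and is not part of any library.

```javascript
// Hypothetical helper mirroring the vulnerable concatenation in getUserPosts.
function buildPostQuery(userId) {
  return "//post[author='" + userId + "']";
}

const benign = buildPostQuery('alice');
const malicious = buildPostQuery("' or '1'='1");

console.log(benign);    // //post[author='alice']
console.log(malicious); // //post[author='' or '1'='1']
```

The second expression contains an `or` branch that is always true, so the author comparison no longer constrains the result.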

Firestore‑Specific Detection

Detecting XPath injection in a Firestore‑backed service requires observing how user input is combined with XML data retrieved from the database. middleBrick’s black‑box scanner performs active probing against unauthenticated endpoints, looking for classic XPath payloads that cause observable changes in responses (e.g., differing HTTP status codes, response length, or error messages).

During a scan, middleBrick will:

  • Identify endpoints that accept query parameters, JSON body fields, or headers that are later reflected in the response.
  • For each parameter, inject a set of XPath‑specific strings such as:
    • ' or '1'='1
    • ' or 1=1 or '
    • '] | //*[name()='
    • ' and substring(password,1,1)='a
  • Monitor the response for signs of successful injection:
    • Return of additional data that should not be visible (e.g., extra XML nodes).
    • Consistent differences in response content or timing between true and false conditions, indicating blind XPath injection.
    • Error messages that reveal XPath syntax problems (e.g., "Invalid XPath expression").
  • Correlate the finding with the presence of a Firestore read operation by checking if the endpoint returns data that matches known Firestore document structures (collection/document IDs, timestamps, etc.).

If the scanner detects a parameter where injection alters the returned XML payload, it flags the issue under the "Input Validation" category with a severity rating based on the impact (data exposure, authentication bypass, etc.). The finding includes the exact payload used, the affected parameter, and a snippet of the response showing the injected result.
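The response comparison described above can be sketched as a small classifier. This is an illustrative heuristic only and not middleBrick's actual logic; the function name classifyProbe, the error patterns, and the 1.5x length threshold are all assumptions chosen for the example.

```javascript
// Illustrative only: a hypothetical heuristic, not middleBrick's implementation.
// Patterns that suggest an XPath engine surfaced an error (assumed list).
const XPATH_ERROR_HINTS = [/xpath/i, /unterminated string/i, /invalid expression/i];

// baseline and probe are plain objects: { status: number, body: string }
function classifyProbe(baseline, probe) {
  // An error hint that appears only after the payload was sent.
  const newError = XPATH_ERROR_HINTS.some(
    (re) => re.test(probe.body) && !re.test(baseline.body)
  );
  if (newError) {
    return 'error-based';
  }
  // Same status but a much larger body: the predicate probably matched more nodes.
  if (probe.status === baseline.status &&
      probe.body.length > baseline.body.length * 1.5) {
    return 'data-expansion';
  }
  return 'no-signal';
}

const baseline = { status: 200, body: '{"results":["a"]}' };
console.log(classifyProbe(baseline, { status: 200, body: '{"results":["a","b","c","d","e"]}' }));
console.log(classifyProbe(baseline, { status: 500, body: 'Invalid XPath expression' }));
```

A real scanner would repeat each probe and compare against several baselines to reduce false positives from naturally varying responses.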

Because middleBrick does not require agents or credentials, it can test any publicly accessible API URL — ideal for scanning staging or production endpoints that interact with Firestore.

Firestore‑Specific Remediation

The most reliable fix is to avoid building XPath expressions from untrusted input altogether. Instead, use Firestore’s native querying capabilities to filter data at the source, which removes the need to fetch large XML blobs and filter them in application code.

If XML must be stored and queried, apply strict input validation and use an XPath library that supports parameterized queries or proper escaping. Below are two remediation patterns.

1. Replace XML‑in‑Firestore with Structured Firestore Fields

Store each piece of data as a separate Firestore field. Then query using Firestore’s where clauses. The filter value is passed as data rather than parsed as part of an expression, so XPath injection does not apply.

// Firestore document structure:
// collection: userPosts
// document id: auto‑generated
// fields: author (string), content (string), timestamp (timestamp)

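const { getFirestore, collection, query, where, getDocs } = require('firebase/firestore');

async function getUserPostsSafe(req, res) {
  const userId = req.query.id;
  // Reject non-string values (missing or repeated parameters) before pattern matching
  if (typeof userId !== 'string' || !/^[a-zA-Z0-9_-]{1,30}$/.test(userId)) {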
    return res.status(400).send('Invalid user ID');
  }
  const firestore = getFirestore();
  const postsCol = collection(firestore, 'userPosts');
  const q = query(postsCol, where('author', '==', userId));
  const snapshot = await getDocs(q);
  const results = [];
  snapshot.forEach(doc => {
    results.push({ id: doc.id, ...doc.data() });
  });
  res.json({ results });
}

This approach removes XML handling entirely, so XPath injection cannot occur.

2. Safe XPath with Whitelisting and Escaping

If XML storage is unavoidable, restrict user input to a known safe set (e.g., alphanumeric usernames) and escape any special characters before concatenation.

const { getFirestore, doc, getDoc } = require('firebase/firestore');
const { select } = require('xpath');
const { DOMParser } = require('xmldom');

async function getUserPostsEscaped(req, res) {
  const userId = req.query.id;
  // Whitelist: only allow a string of alphanumeric characters and underscores
  if (typeof userId !== 'string' || !/^[a-zA-Z0-9_]+$/.test(userId)) {
    return res.status(400).send('Invalid identifier');
  }
  const firestore = getFirestore();
  const xmlDocRef = doc(firestore, 'xmlData', 'userPosts');
  const xmlSnap = await getDoc(xmlDocRef);
  if (!xmlSnap.exists()) {
    return res.status(404).send('Not found');
  }
  const xmlString = xmlSnap.data().payload;
  const dom = new DOMParser().parseFromString(xmlString, 'text/xml');
  // Safe concatenation after validation
  const xpathExpr = `//post[author='${userId}']`;
  const nodes = select(xpathExpr, dom);
  res.json({ results: nodes.map(n => n.textContent) });
}

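// If input may legitimately contain quotes, build an XPath string literal instead:
function escapeXPath(value) {
  // Without single quotes, a plain single-quoted literal is safe
  if (!value.includes("'")) {
    return "'" + value + "'";
  }
  // XPath 1.0 has no escape character, so split on single quotes and
  // rejoin the pieces with concat(), passing each quote as "'"
  return "concat('" + value.split("'").join("', \"'\", '") + "')";
}
// Usage: xpathExpr = `//post[author=${escapeXPath(userId)}]`;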

The whitelist ensures only expected characters reach the XPath builder, while the escaping function (if needed) neutralizes quote‑based breakouts. Always prefer the first option — storing data in Firestore’s native structure — because it eliminates the injection surface and leverages Firestore’s built‑in security rules.
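As a quick sanity check, the whitelist can be run against the probe strings listed in the detection section. This is a minimal sketch; isSafeId is a hypothetical helper that applies the same pattern as the example above and additionally rejects non-string values.

```javascript
// Hypothetical helper: same pattern as the whitelist above, plus a type check.
const SAFE_ID = /^[a-zA-Z0-9_]+$/;

function isSafeId(value) {
  return typeof value === 'string' && SAFE_ID.test(value);
}

// Probe strings taken from the detection section.
const PROBES = [
  "' or '1'='1",
  "' or 1=1 or '",
  "'] | //*[name()='",
  "' and substring(password,1,1)='a",
];

for (const probe of PROBES) {
  console.log(JSON.stringify(probe), '->', isSafeId(probe) ? 'accepted' : 'rejected');
}
console.log('"alice_01" ->', isSafeId('alice_01') ? 'accepted' : 'rejected');
```

Every probe contains a quote character, so each one is rejected before it can reach the XPath builder, while an ordinary identifier passes.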

Frequently Asked Questions

Can middleBrick fix the XPath injection vulnerability in my Firestore‑backed API?
No. middleBrick is a detection‑only scanner. It identifies the vulnerability, provides details about the affected parameter and payload, and offers remediation guidance, but it does not modify or patch your code.
Is XPath injection only a problem when XML is stored in Firestore?
No. The injection occurs in the application layer whenever user input is used to build XPath expressions. Firestore is one common way for XML to reach the application, but the same flaw appears with any other XML source (files, external services, etc.).