XPath Injection in CockroachDB
How XPath Injection Manifests in CockroachDB
XPath Injection occurs when an application incorporates untrusted user input into an XPath expression without proper sanitization, allowing an attacker to manipulate the query logic. CockroachDB, unlike PostgreSQL, has no native XML data type or XPath functions, so this vulnerability arises in applications that store XML documents in CockroachDB's TEXT or VARCHAR columns and process them with application-layer XPath engines (e.g., libxml2, lxml, xmldom). The attack surface exists at the API layer, where user input influences XPath queries executed against XML data retrieved from CockroachDB.
A typical vulnerable pattern in a Node.js/Express API using the pg (node-postgres) driver for CockroachDB might look like this:
// Vulnerable: User input directly concatenated into XPath expression
app.get('/books/search', async (req, res) => {
  const { author } = req.query;
  // Fetch XML document from CockroachDB (the SQL itself is parameterized)
  const { rows } = await pool.query(
    "SELECT xml_content FROM books WHERE id = $1",
    [req.query.bookId]
  );
  const xmlDoc = rows[0].xml_content;
  // Build XPath using unsanitized user input
  const xpathExpr = `//book[author='${author}']`;
  const result = evaluateXpath(xmlDoc, xpathExpr); // Application-layer XPath eval
  res.json(result);
});
An attacker could supply author=' or '1'='1, which turns the expression into //book[author='' or '1'='1'] and returns every book regardless of author. A union payload such as author='] | //* | //book[author=' closes the predicate and appends an arbitrary path, extracting nodes outside the intended query. Since CockroachDB stores the XML as plain text, the database itself is not directly vulnerable; the flaw lies in how the application constructs XPath expressions over data stored in CockroachDB.
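To make the effect of such payloads concrete, the following minimal sketch reproduces the same string concatenation in Python and prints the expression an XPath engine would actually receive. The helper name build_xpath is hypothetical and not part of any library; the three inputs (a benign value, a Boolean payload, and a union-style payload) are chosen purely for illustration.

```python
def build_xpath(author: str) -> str:
    """Hypothetical helper mirroring the vulnerable template: no escaping at all."""
    return f"//book[author='{author}']"


if __name__ == "__main__":
    samples = [
        "Jane Doe",                    # benign value
        "' or '1'='1",                 # Boolean payload: predicate becomes always true
        "'] | //* | //book[author='",  # union payload: closes the predicate, appends //*
    ]
    for value in samples:
        # Show the input next to the expression that would be evaluated
        print(f"{value!r} -> {build_xpath(value)}")
```

In both attack cases the user-supplied quote ends the string literal early, so everything after it is parsed as XPath syntax rather than as data.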
CockroachDB-specific considerations include: (1) Distributed deployments may amplify impact if the same vulnerable code runs across multiple services accessing shared XML data; (2) CockroachDB's strong consistency means every service reads the same committed XML, so a single injectable endpoint can expose the full current document; (3) If the XML contains PII or API keys (common in configuration documents), an XPath injection can lead to compliance violations (GDPR, HIPAA) as mapped by middleBrick's scoring.
CockroachDB-Specific Detection
Detecting XPath Injection in APIs backed by CockroachDB requires testing endpoints that accept parameters influencing XPath expressions on stored XML. middleBrick's Input Validation check actively probes for this by injecting payloads designed to break XPath syntax or alter query logic. The scanner operates as a black-box tool, with no credentials or agents needed, submitting requests like:
' or '1'='1 – Boolean-based detection
' | //book – Union-style extraction
'] | //book[1] – Closing bracket injection
middleBrick analyzes responses for patterns indicating successful injection: identical response structures with different data, error messages revealing XPath context (e.g., Invalid expression), or unexpected XML node counts. Because APIs backed by CockroachDB often return JSON, middleBrick correlates response anomalies with the injected payloads to flag potential XPath Injection.
For example, scanning an endpoint like GET /api/books?search=author:John might reveal a vulnerability if a payload like author:' or '1'='1 returns all books instead of a filtered set. middleBrick's report includes the exact payload used, the observed behavioral change, and severity scoring mapped to the OWASP API Security Top 10:2019, where XPath Injection falls under API8:2019 (Injection). The scanner also cross-references any OpenAPI spec provided to identify parameters that might be used in XPath contexts.
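The differential check described here can be approximated in a few lines. The sketch below is not middleBrick's implementation; probe, fake_vulnerable_send, and the (status, items, body) response shape are hypothetical names introduced for illustration, and the simulated backend stands in for a real HTTP call.

```python
from typing import Callable, List, Tuple

# (HTTP status, parsed result items, raw response body) - an assumed shape
Response = Tuple[int, List[str], str]

# Probe payloads from the list above
PAYLOADS = ["' or '1'='1", "' | //book", "'] | //book[1]"]
# Substrings that suggest an XPath engine error leaked into the response
ERROR_MARKERS = ("invalid expression", "xpath")


def probe(send: Callable[[str], Response], baseline_value: str) -> List[dict]:
    """Compare each payload's response against a benign baseline request."""
    _, baseline_items, _ = send(baseline_value)
    findings = []
    for payload in PAYLOADS:
        _, items, body = send(payload)
        if len(items) > len(baseline_items):
            findings.append({"payload": payload, "signal": "more results than baseline"})
        elif any(marker in body.lower() for marker in ERROR_MARKERS):
            findings.append({"payload": payload, "signal": "XPath error leaked"})
    return findings


def fake_vulnerable_send(author: str) -> Response:
    """Simulated endpoint that behaves like the vulnerable handler."""
    books = {"John": ["Book A"], "Mary": ["Book B", "Book C"]}
    if author == "' or '1'='1":
        everything = [title for titles in books.values() for title in titles]
        return 200, everything, str(everything)
    if "'" in author:
        return 500, [], "Invalid expression"
    items = books.get(author, [])
    return 200, items, str(items)


if __name__ == "__main__":
    for finding in probe(fake_vulnerable_send, "John"):
        print(finding)
```

A real probe would also need to handle pagination, rate limits, and responses whose size varies for benign reasons, so treat a hit as a lead to verify rather than proof of injection.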
Unlike generic scanners, middleBrick's 12 parallel checks include specific logic for XPath pattern detection, so that even subtle injection points in CockroachDB-backed APIs are surfaced. The 5–15 second scan provides immediate feedback, which matters for CI/CD integration via middleBrick's GitHub Action, where you can fail builds if XPath Injection risks exceed a threshold.
CockroachDB-Specific Remediation
Remediation focuses on keeping user input out of XPath expression construction. The most robust approach is to avoid XPath entirely by storing data in CockroachDB's native JSONB type and querying it with JSONB operators (e.g., ->>) or, in CockroachDB versions that support jsonpath, functions such as jsonb_path_query, with user input passed only as bound parameters. If XML must be used, implement strict input validation and bind user input through the XPath engine's variable mechanism instead of concatenating it into the expression.
Example 1: Migrate from XML to JSONB in CockroachDB
-- XML text cannot be cast to JSONB directly: add a JSONB column and
-- backfill it by converting each document in application code
ALTER TABLE books ADD COLUMN data JSONB;
-- Query with the user value passed as a jsonpath variable ($author),
-- never spliced into the path (requires a version with jsonpath support)
SELECT jsonb_path_query(data, '$.author ? (@ == $author)', jsonb_build_object('author', $1::STRING))
FROM books WHERE id = $2;
In application code (Node.js):
// Safe: the author value is a bound SQL parameter, exposed to the path as $author
app.get('/books/search', async (req, res) => {
  const { author, bookId } = req.query;
  const { rows } = await pool.query(
    `SELECT jsonb_path_query(data, '$.author ? (@ == $author)', jsonb_build_object('author', $1::STRING))
     FROM books WHERE id = $2`,
    [author, bookId]
  );
  res.json(rows);
});
Example 2: If XML is unavoidable, use an allowlist and library-level handling of user input
With Python's lxml and the psycopg2 driver for CockroachDB:
import re
from lxml import etree

# Validate input against expected pattern (e.g., alphanumeric names only)
AUTHOR_PATTERN = re.compile(r'[a-zA-Z0-9 ]+')
# $author is an XPath variable bound at call time, not string interpolation
FIND_BY_AUTHOR = etree.XPath('//book[author=$author]')

def search_books(cur, author_input, book_id):
    # fullmatch rejects any input containing characters outside the allowlist
    if not AUTHOR_PATTERN.fullmatch(author_input):
        raise ValueError('Invalid author name')
    # Fetch XML from CockroachDB with a parameterized SQL query
    cur.execute("SELECT xml_content FROM books WHERE id = %s", (book_id,))
    row = cur.fetchone()
    if row is None:
        return []
    # Parse as bytes: lxml rejects str input that carries an encoding declaration
    doc = etree.fromstring(row[0].encode('utf-8'))
    # lxml passes the value as data, so quotes in it cannot alter the expression
    return FIND_BY_AUTHOR(doc, author=author_input)
Key CockroachDB-specific notes: Use parameterized statements ($1, $2 placeholders) for all SQL interactions to separate query structure from data. Never concatenate user input into SQL or XPath strings. If legacy XML data exists, consider a migration plan to JSONB to leverage CockroachDB's native query planning and indexing. middleBrick's remediation guidance in its reports will point to these exact patterns, mapping fixes to OWASP and compliance frameworks like PCI-DSS 6.5.1 (injection flaws, including XPath injection).
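Some XPath engines offer no way to bind variables. In that case a common fallback is to quote the value as an XPath 1.0 string literal, which has no escape sequences, so a value containing both quote characters has to be assembled with concat(). The helper below is a minimal sketch of that technique; xpath_literal is a hypothetical name, not a function provided by lxml or any other library.

```python
def xpath_literal(value: str) -> str:
    """Return an XPath 1.0 string literal (or concat() call) that evaluates to value."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types present: split on single quotes and rejoin the
    # pieces with a double-quoted "'" between them inside concat()
    pieces = []
    for index, part in enumerate(value.split("'")):
        if index > 0:
            pieces.append('"\'"')
        if part:
            pieces.append(f"'{part}'")
    return "concat(" + ", ".join(pieces) + ")"


if __name__ == "__main__":
    for user_input in ["Jane", "O'Brien", "' or '1'='1", 'a\'b"c']:
        # The payload stays inside the literal instead of becoming syntax
        print(f"//book[author={xpath_literal(user_input)}]")
```

Quoting of this kind complements, rather than replaces, allowlist validation: it keeps the value inside the literal but does nothing to limit which values are accepted.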