Log Injection in FastAPI with Firestore
Log Injection in FastAPI with Firestore: how this specific combination creates or exposes the vulnerability
Log injection occurs when untrusted data is written directly into application logs without validation or encoding, allowing an attacker to forge log entries, hide malicious activity, or poison log-based monitoring. In a FastAPI application that uses Google Cloud Firestore as a backend datastore, the risk emerges from two intersecting factors: the web framework’s request handling and Firestore’s document-oriented persistence.
FastAPI processes incoming HTTP requests, and handlers often log usernames, identifiers, or query parameters taken directly from request bodies, headers, or path parameters. If these values are interpolated into log messages without sanitization, attackers can inject newline characters or structured payloads that alter the log format. Because Firestore is frequently used to store user data, activity records, or audit trails, developers sometimes log Firestore document IDs, collection names, or query results to aid debugging. When those logs contain attacker-controlled values, the injected content can split log lines or embed additional entries, making it difficult to distinguish legitimate events from malicious ones.
For example, consider a FastAPI endpoint that retrieves a user profile from Firestore and logs the operation with the user-supplied identifier:
from fastapi import FastAPI, HTTPException
import logging
from google.cloud import firestore

app = FastAPI()
logger = logging.getLogger(__name__)
db = firestore.Client()


@app.get("/users/{user_id}")
async def get_user(user_id: str):
    doc_ref = db.collection("users").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        # Vulnerable: the raw path parameter is interpolated into the log message
        logger.warning(f"User not found: {user_id}")
        raise HTTPException(status_code=404, detail="User not found")
    # Vulnerable: same issue on the success path
    logger.info(f"Retrieved user: {user_id}")
    return doc.to_dict()
If user_id contains a newline (e.g., attacker%0A%20%20%20%20injected:true), the log line can be split, and the injected text may be interpreted as a separate log entry by log aggregation tools. Because Firestore document IDs are often used directly in log messages, similar injection can occur when IDs or query fields contain carriage returns or other control characters. This can complicate log analysis, trigger incorrect alerts, or obscure genuine security events.
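The effect can be reproduced without a running server. The following is a minimal sketch, assuming the handler receives the percent-decoded path value; render_log and the logger name log_injection_demo are hypothetical names used only for this illustration, and the log format is an arbitrary choice.

```python
import io
import logging
from urllib.parse import unquote


def render_log(user_id: str) -> str:
    """Log user_id with an f-string, as the endpoint above does, and return the raw log text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    demo_logger = logging.getLogger("log_injection_demo")  # hypothetical logger name
    demo_logger.handlers = [handler]
    demo_logger.setLevel(logging.INFO)
    demo_logger.propagate = False
    demo_logger.info(f"Retrieved user: {user_id}")
    return stream.getvalue()


# Assumption: %0A in the URL reaches the handler as a real newline character.
payload = unquote("attacker%0AINFO Retrieved user: admin")
lines = render_log(payload).splitlines()

# One request produced two lines, and the second looks like a genuine entry.
for line in lines:
    print(line)
```

A line-oriented log parser has no way to tell that the second line was supplied by the client rather than emitted by the application.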
Additionally, Firestore queries in FastAPI may include dynamic filters based on request parameters. Logging the constructed query or its results without sanitization amplifies the impact: a logged query dictionary or document snapshot carries every attacker-supplied field value into the log, and any of them can alter the log structure. The combination of FastAPI’s flexible parameter handling and Firestore’s rich data model increases the surface for log injection if input is not treated as untrusted.
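One way to keep query logging useful without echoing values is to log only the shape of the filter. This is a rough sketch, assuming the application holds its filters as a plain dict of field name to value; summarize_filters and FIELD_NAME_RE are hypothetical names, not part of FastAPI or the Firestore client.

```python
import re

# Hypothetical allowlist for field names: identifier-like, at most 64 chars.
FIELD_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")


def summarize_filters(filters: dict[str, object]) -> str:
    """Describe a filter dict by field name and value type, never by value."""
    parts = []
    for field, value in filters.items():
        # fullmatch rejects names containing newlines or other unexpected characters.
        name = field if FIELD_NAME_RE.fullmatch(field) else "<invalid_field>"
        parts.append(f"{name}:{type(value).__name__}")
    return ",".join(sorted(parts))


summary = summarize_filters(
    {"status": "active\nINFO forged entry", "age": 30, "bad\nfield": 1}
)
print(summary)
```

The summary still tells an operator which fields were queried, but the attacker-controlled values and any malformed field names never reach the log line.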
Firestore-Specific Remediation in FastAPI: concrete code fixes
To mitigate log injection when using FastAPI with Firestore, treat all data that enters log messages as untrusted. Apply normalization, strict validation, and structured logging to preserve log integrity.
- Validate and sanitize inputs before logging: Reject or encode newline and carriage return characters in identifiers and user-controlled fields. For IDs used with Firestore, enforce a strict allowlist pattern and avoid echoing raw values into logs.
- Use structured logging with safe serialization: Log structured objects rather than concatenated strings, so that control characters are escaped by the serializer. This avoids accidental line breaks and makes logs easier to parse securely.
- Avoid logging raw Firestore documents or queries: Extract only necessary, sanitized fields. Do not directly serialize document IDs or query filters into log messages.
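The structured-logging point can be illustrated with the standard library alone. This is a minimal sketch rather than a complete formatter: JsonFormatter, the "fields" key passed through extra, and the logger name structured_demo are conventions assumed for this example, and exception details are not handled.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "event": record.getMessage()}
        # "fields" is this sketch's own convention, supplied via extra={...}.
        payload.update(getattr(record, "fields", {}))
        # json.dumps escapes newlines and other control characters inside strings.
        return json.dumps(payload, default=str)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
audit_logger = logging.getLogger("structured_demo")  # hypothetical logger name
audit_logger.handlers = [handler]
audit_logger.setLevel(logging.INFO)
audit_logger.propagate = False

audit_logger.info(
    "user_lookup", extra={"fields": {"user_id": "attacker\nINFO forged entry"}}
)
logged = stream.getvalue()
print(logged, end="")
```

The injected newline survives only as an escaped sequence inside the user_id value, so the entry stays on a single line and the original value can still be recovered by parsing the JSON.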
Here is a hardened version of the previous endpoint that validates the identifier before it reaches Firestore or the logs and routes all logging through a single helper:
from fastapi import FastAPI, HTTPException
import logging
import re
from google.cloud import firestore

app = FastAPI()
logger = logging.getLogger(__name__)
db = firestore.Client()

# Strict allowlist for user IDs: alphanumeric and underscores, 3–64 chars
USER_ID_RE = re.compile(r"[A-Za-z0-9_]{3,64}")


def safe_log_user(message: str, user_id: str, level: int = logging.INFO) -> None:
    """Log user-related events with sanitized identifiers."""
    # fullmatch checks the whole string; "$" with match() would accept a trailing newline
    if not USER_ID_RE.fullmatch(user_id):
        # Mask invalid IDs; do not log the raw value
        logger.warning("%s [invalid_id_masked]", message)
        return
    logger.log(level, "%s user_id=%s", message, user_id)


@app.get("/users/{user_id}")
async def get_user(user_id: str):
    if not USER_ID_RE.fullmatch(user_id):
        raise HTTPException(status_code=400, detail="Invalid user identifier")
    doc_ref = db.collection("users").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        safe_log_user("User not found", user_id, logging.WARNING)
        raise HTTPException(status_code=404, detail="User not found")
    safe_log_user("Retrieved user", user_id)
    # Return only safe, intended fields; avoid logging full document
    data = doc.to_dict() or {}
    return {"id": doc.id, "name": data.get("name")}
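In this example, input validation ensures that only identifiers matching the allowlist reach Firestore or a log message, and the safe_log_user helper centralizes logging so that raw user input is never echoed. The pattern must be applied to the whole string (for example with re.fullmatch), because in Python a "$" anchor also matches just before a final newline and would let a value ending in a newline through. Since newline and control characters cannot pass a whole-string allowlist check, they cannot split or forge log entries.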
For broader protection, configure your logging formatter to escape or drop control characters, and in production pair this with monitoring rules that detect anomalous log patterns. middleBrick focuses on detection and reporting: it scans endpoints and returns security risk scores with prioritized findings and remediation guidance, helping you identify log injection issues among other API security concerns. You can run a quick check from the CLI with middlebrick scan <url>, integrate scans into CI/CD with the GitHub Action, or use the MCP Server to scan APIs directly from your AI coding assistant. In every case, middleBrick provides findings and guidance without attempting to fix or block issues itself.
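As a sketch of the formatter-level escaping suggested above, the class below rewrites control characters after the record has been rendered. ControlCharEscapingFormatter and CONTROL_CHARS_RE are hypothetical names, and the escape style is an arbitrary choice. Note one trade-off: because it processes the fully rendered text, multi-line tracebacks are also flattened onto one line.

```python
import logging
import re

# C0 and C1 control characters plus the Unicode line and paragraph separators.
CONTROL_CHARS_RE = re.compile(r"[\x00-\x1f\x7f-\x9f\u2028\u2029]")


class ControlCharEscapingFormatter(logging.Formatter):
    """Escape control characters in the rendered log line as \\uXXXX sequences."""

    def format(self, record: logging.LogRecord) -> str:
        rendered = super().format(record)
        return CONTROL_CHARS_RE.sub(
            lambda match: f"\\u{ord(match.group()):04x}", rendered
        )


formatter = ControlCharEscapingFormatter("%(levelname)s %(message)s")
record = logging.LogRecord(
    "demo", logging.INFO, "demo.py", 0,
    "Retrieved user: %s", ("attacker\nINFO forged",), None,
)
escaped = formatter.format(record)
print(escaped)
```

Attaching such a formatter to every handler acts as a last line of defense for log calls that bypass the validation helper, while the escaped sequences keep the attempted injection visible to whoever reviews the logs.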