Format String in FastAPI with DynamoDB
Format String in FastAPI with DynamoDB: how this specific combination creates or exposes the vulnerability
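A format string vulnerability occurs when user-controlled input is used as the format string itself (the template) rather than as an argument to it. In FastAPI, this commonly arises when route parameters, query strings, or request bodies are concatenated into a template that is later used for logging, error messages, or DynamoDB attribute values. Unlike C’s printf, Python’s % operator and str.format() do not read from or write to raw memory, so the realistic impact is different. Stray specifiers such as %s or %x in a % template raise TypeError or ValueError, which usually surfaces as a 500 response. Replacement fields in a str.format() template, such as {0.__class__}, let an attacker traverse attributes and indexes of the objects passed to format(), which can expose configuration values or other internal state.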
Combining FastAPI and DynamoDB introduces a specific risk pattern: user input intended for a DynamoDB operation (e.g., a table key or filter expression) is first concatenated into a logging or error template, and that template is then formatted. Passing the value as an argument, as in f"Fetching item with id {user_id}" or "Fetching item with id %s" % user_id, is not a format string flaw, because the input is never parsed for specifiers (although unescaped newlines can still allow log injection). The flaw appears when the input lands in the template, as in ("Fetching item " + user_id + " from %s") % table_name. Although DynamoDB itself does not interpret format specifiers, the vulnerability occurs in the application layer before the request reaches the database. An attacker can supply a payload like "%s %s %s" to raise an exception and fail the request, or a replacement field such as {err.response} to disclose object contents when the template is passed to .format().
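The difference between input used as an argument and input used as the template can be seen without any web framework. The following is a minimal sketch: the Config class, its secret attribute, and the three function names are hypothetical and exist only for illustration.

```python
class Config:
    """Hypothetical object holding a value that should never reach a user."""

    def __init__(self):
        self.secret = "s3cr3t"


def format_as_argument(user_input: str) -> str:
    # Safe: the template is a constant literal; user_input is only a value
    return "Fetching item with id %s" % (user_input,)


def format_as_percent_template(user_input: str) -> str:
    # Unsafe: user_input is concatenated into the template given to %
    return ("Fetching item " + user_input + " from %s") % ("Items",)


def format_as_brace_template(user_input: str, config: Config) -> str:
    # Unsafe: user_input is concatenated into the template given to str.format()
    return ("Fetching item " + user_input).format(cfg=config)


# Specifiers in an argument are left untouched
print(format_as_argument("%s %s %s"))

# Specifiers in the template demand arguments that were never supplied
try:
    format_as_percent_template("%s %s %s")
except TypeError as exc:
    print("TypeError:", exc)

# A replacement field in the template can reach into the objects passed to format()
print(format_as_brace_template("{cfg.secret}", Config()))
```

The first call returns the payload unchanged, the second raises TypeError, and the third renders the value of config.secret into the output.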
Consider a FastAPI endpoint that retrieves a DynamoDB item using a user-supplied ID:
from fastapi import FastAPI, HTTPException
import boto3
from botocore.exceptions import ClientError

app = FastAPI()
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Items')


@app.get("/items/{item_id}")
def read_item(item_id: str):
    # Unsafe: user input is concatenated into the template before % formatting,
    # so "%s" or "%x" inside item_id is parsed as a format specifier
    print(("Fetching item with id: " + item_id + " from table %s") % table.name)
    try:
        response = table.get_item(Key={"id": item_id})
        item = response.get('Item')
        if not item:
            raise HTTPException(status_code=404, detail="Item not found")
        return item
    except ClientError as e:
        # Unsafe: user input is part of the .format() template, so a field such
        # as "{err.response}" inside item_id is evaluated against the exception
        template = "DB error for item " + item_id + ": {err}"
        raise HTTPException(status_code=500, detail=template.format(err=e))

In this example, item_id is concatenated into the template of a % operation and into the template of a .format() call. An attacker sending item_id="%x%x%x" does not leak stack memory, as would be possible with C’s printf. Instead, the % operator raises a TypeError because the template now expects arguments that were never supplied, and the request fails with a 500 response. If the DynamoDB call fails, item_id="{err.response}" causes .format() to render the full SDK error dictionary, including response metadata, into the HTTP response. The DynamoDB-specific harm is indirect: the vulnerability does not exploit DynamoDB parsing, but rather the logging and error-handling layer surrounding the database call, which can lead to denial of service or information disclosure that aids further attacks.
Additionally, if user input is used to build DynamoDB filter expressions or expression attribute values through string formatting, the same class of problem applies. For instance, building a filter with "FilterExpression": "#status = %s" % status places attacker-controlled text directly into the expression, where it can alter the condition. Any format specifiers it carries are also parsed if that string is later reused as a logging or error template. The core issue is insecure formatting, not DynamoDB itself, but the combination amplifies impact because database operations often involve sensitive data that may be exposed through logs or error messages.
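To make the filter example concrete, the sketch below builds the expression text the unsafe way and shows what a caller-chosen value does to it. No DynamoDB call is made. The helper name build_unsafe_filter is hypothetical, and the payload is only an illustration of how the expression text changes.

```python
def build_unsafe_filter(status: str) -> dict:
    # Unsafe: the user-supplied value becomes part of the expression text
    return {
        "FilterExpression": "#status = %s" % status,
        "ExpressionAttributeNames": {"#status": "status"},
    }


benign = build_unsafe_filter("active")
hostile = build_unsafe_filter("#status OR attribute_exists(id)")

print(benign["FilterExpression"])   # #status = active
print(hostile["FilterExpression"])  # #status = #status OR attribute_exists(id)
```

In the second case the condition is no longer the one the developer wrote: the caller has supplied part of the expression rather than a value to compare against.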
DynamoDB-Specific Remediation in FastAPI: concrete code fixes
Remediation centers on keeping data separate from templates: the format string is always a constant literal, and user input is only ever passed as an argument. For DynamoDB operations in FastAPI, use the SDK’s parameterized methods and structured logging. Never concatenate user input into a format string or log template. Instead, use Python’s logging module with argument passing, and rely on DynamoDB’s placeholders for expression attribute values and expression attribute names.
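As a small illustration of argument passing with the logging module, the sketch below logs a hostile value through a constant template. The ListHandler class and the logger name are assumptions made for this demo; they only capture output in memory so the result can be inspected.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler used only to inspect formatted output."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        # getMessage() applies record.args to the constant template record.msg
        self.messages.append(record.getMessage())


demo_logger = logging.getLogger("format_string_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
handler = ListHandler()
demo_logger.addHandler(handler)

hostile_id = "%s %x {0.__class__}"

# The template is a literal; hostile_id is passed as an argument and is
# substituted as plain text, never parsed for specifiers
demo_logger.info("Fetching item with id %s", hostile_id)

print(handler.messages[0])  # Fetching item with id %s %x {0.__class__}
```

Passing the value as an argument also defers formatting until a handler actually processes the record, so messages below the configured level cost almost nothing.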
Below is a secure version of the earlier endpoint. It uses table.get_item(Key={"id": item_id}) with no string interpolation for user data, and structured logging that avoids format string risks:
import logging

from fastapi import FastAPI, HTTPException
import boto3
from botocore.exceptions import ClientError

app = FastAPI()
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Items')
logger = logging.getLogger(__name__)


@app.get("/items/{item_id}")
def read_item(item_id: str):
    # Safe: no format string interpolation with user input
    logger.info("Fetching item with id", extra={"item_id": item_id})
    try:
        response = table.get_item(Key={"id": item_id})
        item = response.get('Item')
        if not item:
            raise HTTPException(status_code=404, detail="Item not found")
        return item
    except ClientError as e:
        # Safe: error message is from the SDK, not user input
        error_code = e.response['Error']['Code']
        error_message = e.response['Error']['Message']
        logger.error("DynamoDB error", extra={"error_code": error_code, "error_message": error_message})
        raise HTTPException(status_code=500, detail="Internal server error")
When constructing DynamoDB expression attribute names or values, use the SDK’s built-in mechanisms. For example, when a field name is dynamic, use expression attribute names; when data is dynamic, use expression attribute values:
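def scan_by_status(status: str):
    # scan() is used here because query() also requires a KeyConditionExpression
    response = table.scan(
        # Constant expression text: placeholders only, no user data
        FilterExpression="#status = :val",
        ExpressionAttributeNames={"#status": "status"},
        ExpressionAttributeValues={":val": status},
    )
    return response['Items']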
This approach ensures that user input is never part of an expression or format string. The status value is passed as a parameter in ExpressionAttributeValues, and dynamic field names are mapped through ExpressionAttributeNames. Logging passes user data as structured fields (e.g., the extra dict) alongside a constant message, so the input is never parsed as a template. Regular security scans with tools like middleBrick can help detect any remaining instances of unsafe string usage around DynamoDB calls.
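When the field name itself comes from the request, one way to apply the same idea is to check it against an allowlist before mapping it through ExpressionAttributeNames. The sketch below only builds the keyword arguments and makes no DynamoDB call. ALLOWED_FILTER_FIELDS and build_filter_kwargs are hypothetical names, and the allowed fields are assumptions about the table.

```python
ALLOWED_FILTER_FIELDS = frozenset({"status", "category"})  # assumed attribute names


def build_filter_kwargs(field: str, value: str) -> dict:
    """Return keyword arguments that could be passed to Table.scan(**kwargs)."""
    if field not in ALLOWED_FILTER_FIELDS:
        # Reject unknown names instead of placing them anywhere in the request
        raise ValueError("unsupported filter field")
    return {
        # Constant expression text: only placeholders, no user data
        "FilterExpression": "#field = :val",
        "ExpressionAttributeNames": {"#field": field},
        "ExpressionAttributeValues": {":val": value},
    }


kwargs = build_filter_kwargs("status", "active OR attribute_exists(id)")
print(kwargs["FilterExpression"])  # #field = :val
```

Under these assumptions, a call such as table.scan(**build_filter_kwargs("status", "active")) sends the hostile-looking text only as a value to compare against, and the expression itself stays constant.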