Formula Injection in Django with DynamoDB
Formula Injection in Django with DynamoDB: how this specific combination creates or exposes the vulnerability
Formula Injection occurs when untrusted input is concatenated into a computed expression that is later evaluated, often leading to unintended behavior or code execution. In Django applications that use Amazon DynamoDB as a backend, this risk arises when user-controlled data is used to construct query expressions, filter values, or condition parameters that are passed to the DynamoDB API. Although DynamoDB does not evaluate arbitrary code, formula injection can still manifest through malformed expressions that affect query logic, data retrieval, or access control decisions.
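Django’s ORM does not interact with DynamoDB directly; developers typically call the AWS SDK (boto3) through custom managers, repositories, or raw calls. When building DynamoDB request dictionaries, untrusted input may be interpolated into key condition expressions, filter expressions, projection expressions, or attribute name mappings. Values bound through ExpressionAttributeValues are always treated as data, so the danger lies in the expression strings themselves and in client-chosen attribute names placed in ExpressionAttributeNames without validation. For example, if the application builds a FilterExpression such as owner_id = :uid by concatenation and a client supplies :uid OR attribute_exists(aws), the resulting expression no longer restricts results to the intended owner and can expose data belonging to other users. A KeyConditionExpression is narrower: it accepts only an equality test on the partition key plus an optional sort key comparison joined by AND, so an injected OR there is rejected with a validation error instead of silently widening the query, which still gives an attacker a way to trigger errors the application may not handle.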
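To make the difference concrete, the following sketch contrasts an interpolated filter expression with a parameterized one. It only builds the keyword arguments that would be passed to a boto3 Table.scan or Table.query call and never contacts AWS. The function names, the status attribute, and the :active placeholder are hypothetical and chosen purely for illustration.

```python
from typing import Any, Dict


def build_filter_unsafe(status: str) -> Dict[str, Any]:
    """Anti-pattern: the client value becomes part of the expression text."""
    return {
        "FilterExpression": f"#st = {status}",
        "ExpressionAttributeNames": {"#st": "status"},
        "ExpressionAttributeValues": {":active": "active"},
    }


def build_filter_safe(status: str) -> Dict[str, Any]:
    """The expression text is constant; the client value is bound as data."""
    return {
        "FilterExpression": "#st = :st",
        "ExpressionAttributeNames": {"#st": "status"},
        "ExpressionAttributeValues": {":st": status},
    }


# A client-supplied "status" that smuggles in expression syntax.
payload = ":active OR attribute_exists(user_id)"

print(build_filter_unsafe(payload)["FilterExpression"])
# #st = :active OR attribute_exists(user_id)

print(build_filter_safe(payload)["FilterExpression"])
# #st = :st
```

In the unsafe variant the payload rewrites the filter so that it matches every item with a user_id attribute. In the safe variant the same payload is only ever compared against the status attribute as an ordinary string.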
DynamoDB’s type system and expression syntax introduce additional risks. Values that are interpolated directly, rather than bound through placeholders, can change the structure of the request. If an attacker provides a string like 1; DROP TABLE in a field that is bound as a value, DynamoDB treats it as a literal string, but surrounding application logic might misinterpret the input and produce unsafe downstream behavior. If the application instead assembles request payloads for raw boto3 calls with string formatting, injection can affect pagination, filtering, or conditional updates. For instance, using Python’s .format() or f-strings to build an UpdateItem request can allow an attacker to inject additional attribute assignments or alter the condition expression, so that the update changes more attributes than intended or succeeds when it should have been refused.
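One way to keep client text out of an update expression is to generate every placeholder on the server and to accept only allowlisted attribute names. The sketch below builds the keyword arguments for a boto3 Table.update_item call without contacting AWS. The allowlist contents, the function name, and the key attributes user_id and sort_key are assumptions made for this example.

```python
from typing import Any, Dict

# Hypothetical allowlist of attributes a client is permitted to change.
UPDATABLE_ATTRIBUTES = frozenset({"display_name", "bio", "timezone"})


def build_update_kwargs(
    user_id: str, sort_key: str, changes: Dict[str, Any]
) -> Dict[str, Any]:
    """Build Table.update_item keyword arguments from allowlisted changes."""
    if not changes:
        raise ValueError("No attributes to update")
    unknown = set(changes) - UPDATABLE_ATTRIBUTES
    if unknown:
        raise ValueError(f"Attributes not updatable: {sorted(unknown)}")

    names: Dict[str, str] = {}
    values: Dict[str, Any] = {}
    assignments = []
    # Placeholders come from a counter, never from client-supplied text.
    for index, attribute in enumerate(sorted(changes)):
        names[f"#a{index}"] = attribute
        values[f":v{index}"] = changes[attribute]
        assignments.append(f"#a{index} = :v{index}")

    return {
        "Key": {"user_id": user_id, "sort_key": sort_key},
        "UpdateExpression": "SET " + ", ".join(assignments),
        # Refuse to create a new item if the key does not already exist.
        "ConditionExpression": "attribute_exists(user_id)",
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


kwargs = build_update_kwargs("u1", "profile", {"bio": "x, is_admin = :true"})
print(kwargs["UpdateExpression"])
# SET #a0 = :v0
```

The attempted injection in the bio value ends up stored as the literal text of that attribute, and a request that names is_admin directly is rejected before any expression is built.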
Another vector arises when untrusted input is used in sort key comparisons. Suppose a Django view builds a KeyConditionExpression such as partition_key = :pk AND sort_key > :sk and the sort key value comes from the client, who supplies :sk OR attribute_exists(aws). If the value is bound through ExpressionAttributeValues, DynamoDB compares it as an ordinary string and the query logic is unchanged. If it is concatenated into the expression instead, the request becomes malformed, which can lead to unhandled errors or unexpected query results. Because DynamoDB responses are not rendered as HTML, reflected XSS is less of a concern here, but formula injection can still expose sensitive data when query filters are bypassed.
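When the client is also allowed to choose the comparison itself, a fixed mapping from operator names to expression fragments keeps the expression text entirely server-defined. The following is a minimal sketch that builds keyword arguments for a boto3 Table.query call. The operator names, the function name, and the attribute names partition_key and sort_key are assumptions taken from the example above.

```python
from typing import Any, Dict

# Hypothetical mapping from client-facing operator names to fixed fragments.
SORT_KEY_OPERATORS = {
    "eq": "#sk = :sk",
    "lt": "#sk < :sk",
    "lte": "#sk <= :sk",
    "gt": "#sk > :sk",
    "gte": "#sk >= :sk",
    "prefix": "begins_with(#sk, :sk)",
}


def build_query_kwargs(
    partition_value: str, operator: str, sort_value: str
) -> Dict[str, Any]:
    """Build Table.query keyword arguments with a server-chosen expression."""
    try:
        sort_condition = SORT_KEY_OPERATORS[operator]
    except KeyError:
        raise ValueError(f"Unsupported sort key operator: {operator!r}") from None
    return {
        # Both parts of the expression are constants defined in this module.
        "KeyConditionExpression": "#pk = :pk AND " + sort_condition,
        "ExpressionAttributeNames": {"#pk": "partition_key", "#sk": "sort_key"},
        # Client data appears only here, as bound values.
        "ExpressionAttributeValues": {":pk": partition_value, ":sk": sort_value},
    }


kwargs = build_query_kwargs("tenant-1", "gt", ":sk OR attribute_exists(aws)")
print(kwargs["KeyConditionExpression"])
# #pk = :pk AND #sk > :sk
```

The client can select among the listed comparisons but cannot introduce new syntax, and the injected string is compared against the sort key as plain data.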
DynamoDB offers placeholders for attribute names and values, but it cannot tell whether an expression string was assembled from untrusted input. Developers must therefore validate all inputs that influence expression construction. Because Django provides no database abstraction layer for DynamoDB, these safeguards must be implemented at the integration layer. MiddleBrick’s scans can detect patterns where user input directly influences DynamoDB request structures, highlighting areas where formula injection may occur in unauthenticated or authenticated attack surfaces.
DynamoDB-Specific Remediation in Django: concrete code fixes
To prevent formula injection when using DynamoDB with Django, ensure that all user input is strictly validated and never directly interpolated into expression strings. Use DynamoDB expression attribute names and values correctly, and rely on the SDK’s built-in parameterization rather than string formatting. Below are concrete patterns and code examples to secure the integration.
1. Use ExpressionAttributeNames and ExpressionAttributeValues
Always use placeholders for attribute names and values. Do not concatenate user input into key condition expressions.
import boto3
from django.conf import settings

dynamodb = boto3.resource('dynamodb', region_name=settings.AWS_REGION)
table = dynamodb.Table('UserProfiles')


def get_user_profile(user_id: str, sort_key_condition: str) -> dict:
    # Names (#...) and values (:...) are bound separately from the expression text.
    response = table.query(
        KeyConditionExpression='#uid = :uid AND #sk > :sk',
        ExpressionAttributeNames={'#uid': 'user_id', '#sk': 'sort_key'},
        ExpressionAttributeValues={
            ':uid': user_id,
            ':sk': sort_key_condition,
        },
    )
    return response
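This ensures that the supplied user ID and sort key value are treated as literal values, not as expression syntax. Every name declared in ExpressionAttributeNames must also be referenced in an expression; otherwise DynamoDB rejects the request.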
2. Validate and sanitize input types
Ensure that numeric fields remain numeric and that strings do not contain unexpected control characters or expression fragments.
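import re


def sanitize_sort_key(value: str) -> str:
    # Allow only ASCII letters, digits, underscore, and hyphen.
    # fullmatch also rejects a trailing newline, which '$' with re.match accepts.
    if not re.fullmatch(r'[a-zA-Z0-9_-]+', value):
        raise ValueError('Invalid sort key')
    return value


# Usage (user_input is the raw value taken from the request)
sort_key = sanitize_sort_key(user_input)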
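The string check above covers text fields. For numeric fields, a comparable step is to convert the raw text into a bounded number before it is bound as a value. The sketch below uses Decimal, which is the type boto3's resource layer uses for DynamoDB numbers. The function name and the range limits are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def parse_numeric_field(raw: str, minimum: Decimal, maximum: Decimal) -> Decimal:
    """Convert client text to a bounded Decimal or raise ValueError."""
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        raise ValueError("Expected a number") from None
    # Decimal also parses 'NaN' and 'Infinity'; reject them explicitly.
    if not value.is_finite():
        raise ValueError("Expected a finite number")
    if not minimum <= value <= maximum:
        raise ValueError("Number out of range")
    return value


# Usage: a page size that must be between 1 and 100.
page_size = parse_numeric_field("25", Decimal(1), Decimal(100))
```

Inputs such as 1; DROP TABLE or 10 OR 1 fail the conversion and never reach the request dictionary.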
3. Avoid dynamic expression building with string formatting
Never use f-strings or .format() to place user input into a KeyConditionExpression or an UpdateItem update expression.
# Unsafe
expr = f"user_id = '{user_id}' AND sort_key = '{sort_key}'"
# Safe
expr = 'user_id = :uid AND sort_key = :sk'
params = {':uid': user_id, ':sk': sort_key}
4. Use strict schema validation
Define expected attribute types and enforce them before constructing requests.
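from pydantic import BaseModel, constr


class QueryInput(BaseModel):
    user_id: constr(min_length=1, max_length=100)
    # Pydantic v1 syntax; Pydantic v2 renamed this argument to `pattern`.
    sort_key: constr(regex=r'^[a-zA-Z0-9_-]+$')


def query_profile(data: dict) -> dict:
    # Raises pydantic.ValidationError if any field fails validation.
    validated = QueryInput(**data)
    # Proceed with validated data only; `table` is the boto3 Table defined earlier.
    return table.query(
        KeyConditionExpression='user_id = :uid AND sort_key = :sk',
        ExpressionAttributeValues={
            ':uid': validated.user_id,
            ':sk': validated.sort_key,
        },
    )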
5. Leverage middleware for request inspection
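In Django, middleware can log or reject requests whose DynamoDB-bound parameters look suspicious, for example numeric fields that contain keywords such as OR or AND, or semicolons. Treat this as defense in depth: keyword checks are easy to evade and do not replace placeholders and input validation.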
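A minimal sketch of such middleware follows. It implements the logging variant only and follows Django's callable middleware protocol (a class constructed with get_response and called with each request) without importing Django itself. The parameter names in NUMERIC_PARAMS and the class name are hypothetical.

```python
import logging
import re
from typing import List, Mapping

logger = logging.getLogger(__name__)

# Hypothetical names of query parameters that must be plain integers.
NUMERIC_PARAMS = ("page_size", "min_score")
_INTEGER = re.compile(r"-?[0-9]+")


def find_suspicious_params(params: Mapping[str, str]) -> List[str]:
    """Return the names of numeric parameters whose values are not integers."""
    return [
        name
        for name in NUMERIC_PARAMS
        if name in params and not _INTEGER.fullmatch(params[name])
    ]


class DynamoParamAuditMiddleware:
    """Django-style middleware that logs suspicious numeric parameters."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        suspicious = find_suspicious_params(request.GET)
        if suspicious:
            logger.warning(
                "Suspicious numeric parameters on %s: %s", request.path, suspicious
            )
        # Logging only: the request continues to the view unchanged.
        return self.get_response(request)
```

The class would be enabled by adding its dotted path to the MIDDLEWARE setting. To reject instead of log, the branch could return django.http.HttpResponseBadRequest rather than calling get_response.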
By combining strict input validation, proper use of expression attributes, and schema enforcement, Django applications using DynamoDB can effectively mitigate formula injection risks while maintaining compatibility with DynamoDB’s query model.