Out-of-Bounds Write in Django with DynamoDB
Out-of-Bounds Write in Django with DynamoDB — how this specific combination creates or exposes the vulnerability
An out-of-bounds write occurs when data is written outside the intended memory boundaries. In the context of Django using Amazon DynamoDB as a backend, this typically manifests not as memory corruption but as unchecked user input controlling key construction, item size, or conditional update expressions in ways that exceed service limits or overwrite adjacent data structures. Because DynamoDB is a schemaless NoSQL service, the boundary is defined by service constraints (the 400 KB item size limit, partition key size restrictions, and expression length limits) and by application-level assumptions about record structure.
Consider a Django model that stores user profiles with a composite key made from tenant_id and user_id. If tenant_id is taken directly from an HTTP request without validation, an attacker can supply a tenant_id containing special characters or extremely long strings. DynamoDB will accept the request, but downstream logic that builds index names, cache keys, or S3 object paths from this identifier can overflow buffers or access unintended resources. For example, concatenating tenant_id with a fixed suffix to form a DynamoDB Stream ARN or a Lambda event context may produce an ARN that exceeds length limits or points to another tenant’s stream, enabling data exposure or privilege escalation across tenants.
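A minimal sketch of that pattern (the ARN format is real, but the helper, account ID, and naming scheme are hypothetical): an unvalidated tenant_id flows straight into a derived resource name.

```python
def build_stream_arn(tenant_id: str) -> str:
    # Hypothetical helper: no validation, so the caller fully controls the
    # table segment of the ARN, including its length and any delimiters.
    return f"arn:aws:dynamodb:us-east-1:123456789012:table/{tenant_id}/stream"

# A crafted tenant_id redirects the derived ARN at another tenant's table:
crafted = build_stream_arn("other-tenant")
```

Any downstream code that trusts this ARN (subscribing to the stream, granting IAM access) now operates on a resource the attacker chose.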
Another common scenario involves update expressions that dynamically include attribute names based on user input. If an attacker controls the attribute name in a SET expression, they can overwrite attributes the application treats as reserved (key fields, internal bookkeeping attributes), collide with DynamoDB's reserved words, or induce sparse attribute growth that changes partition efficiency and error rates. This can lead to inconsistent state across items and logical data corruption, even though no memory is overwritten. Because Django's ORM has no native DynamoDB support, developers often construct expression attribute names with string interpolation, which bypasses any framework safeguards and introduces boundary violations.
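The interpolation hazard can be illustrated in a few lines (the function name is hypothetical): the attribute name becomes part of the expression grammar, so crafted input rewrites the expression itself instead of merely naming a field.

```python
def unsafe_set_expression(attr_name: str) -> str:
    # Vulnerable: the attribute name is interpolated directly into the
    # UpdateExpression instead of being passed via ExpressionAttributeNames.
    return f"SET {attr_name} = :v"

benign = unsafe_set_expression("nickname")                  # one assignment
crafted = unsafe_set_expression("nickname = :v, is_admin")  # smuggles a clause
```

The crafted call yields `SET nickname = :v, is_admin = :v`, an expression with two assignments where the developer intended one.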
An illustrative vulnerable pattern is direct string formatting of a key condition expression. Suppose a view builds a filter like partition_key = '{user_input}'. If user_input is a long string or contains delimiter characters, the resulting expression may scan more items than intended or fail validation in unexpected ways, causing the service to return partial or incorrect results. While this does not corrupt storage, it violates the intended access boundary and can be chained with other flaws to achieve unauthorized read or write.
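Sketched concretely (names hypothetical), a quote or operator in the input changes the shape of the formatted expression rather than being treated as data:

```python
def unsafe_key_condition(user_input: str) -> str:
    # Vulnerable: the value is formatted into the expression string, so
    # quotes and keywords in user_input alter the condition's structure.
    return f"partition_key = '{user_input}'"

# The single equality test has become two clauses joined by OR:
expr = unsafe_key_condition("abc' OR begins_with(partition_key, 'a")
```

This is the NoSQL analogue of SQL injection: the fix is never to build the string at all, but to use the parameterized forms shown in the remediation section below.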
In DynamoDB, item size is capped at 400 KB. A Django serializer that accepts nested lists or JSON blobs from user input can create items that approach or exceed this limit. If the application performs in-place updates by replacing large attributes, the service rejects oversized writes, and a multi-step update sequence can be abandoned partway through, effectively creating an out-of-bounds condition where partial writes leave related items in an inconsistent state. This is particularly risky when combined with conditional writes that do not validate size beforehand, as the conditional check may pass while a later write in the sequence fails.
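One way to guard that boundary before writing is a pre-flight size check. This is a sketch: real DynamoDB item sizing counts attribute names plus their encoded values, so the JSON-length estimate below is only a conservative proxy.

```python
import json

DYNAMODB_ITEM_LIMIT = 400 * 1024  # hard per-item cap, in bytes

def estimated_item_size(item: dict) -> int:
    # Rough proxy: length of the UTF-8 JSON encoding of the item.
    return len(json.dumps(item, default=str).encode("utf-8"))

def fits_item_limit(item: dict, margin: int = 4096) -> bool:
    # Keep a safety margin under the cap to absorb encoding differences
    # between JSON and DynamoDB's own attribute encoding.
    return estimated_item_size(item) + margin <= DYNAMODB_ITEM_LIMIT
```

Rejecting oversized payloads before any write is attempted avoids the partial-update failure mode entirely.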
DynamoDB-Specific Remediation in Django — concrete code fixes
Remediation focuses on strict input validation, bounded key construction, safe expression building, and size enforcement. Treat all user-controlled data as untrusted and never directly interpolate it into DynamoDB expression attribute names or key condition strings.
Validated key construction
Ensure tenant and entity identifiers are normalized and bounded before use. Use a whitelist of allowed characters and enforce length limits.
import re

def normalize_tenant_id(tenant_id: str) -> str:
    # Allow only alphanumeric characters and hyphens, max 32 chars
    cleaned = re.sub(r'[^a-zA-Z0-9\-]', '', tenant_id)[:32]
    if not cleaned:
        raise ValueError('Invalid tenant_id')
    return cleaned

tenant_id = normalize_tenant_id(request.GET.get('tenant', ''))
user_id = normalize_tenant_id(request.GET.get('user', ''))
partition_key = f'{tenant_id}#{user_id}'  # bounded and safe
Safe update expressions with boto3
Use ExpressionAttributeNames to avoid injecting attribute names and ExpressionAttributeValues to pass data. Validate payload size before sending updates.
import boto3
from django.core.exceptions import ValidationError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('UserProfiles')

def safe_update_profile(profile_id: str, updates: dict):
    # Validate size: ensure the item stays under 400 KB after the update.
    current = table.get_item(Key={'profile_id': profile_id}).get('Item', {})
    # Shallow size estimate for demo purposes; in production, compute the
    # serialized size of the merged item before writing.
    if len(str(current)) + len(str(updates)) > 400 * 1024:
        raise ValidationError('Update would exceed item size limit')
    expr_attr_names = {}
    expr_attr_values = {}
    clauses = []
    for idx, (key, value) in enumerate(updates.items()):
        name_placeholder = f'#attr{idx}'
        value_placeholder = f':val{idx}'
        expr_attr_names[name_placeholder] = key
        expr_attr_values[value_placeholder] = value
        clauses.append(f'{name_placeholder} = {value_placeholder}')
    table.update_item(
        Key={'profile_id': profile_id},
        UpdateExpression='SET ' + ', '.join(clauses),
        ExpressionAttributeNames=expr_attr_names,
        ExpressionAttributeValues=expr_attr_values,
        ConditionExpression='attribute_exists(profile_id)',
    )
Parameterized key condition expressions
Never format key condition strings. Use the boto3 condition builders (Key), which generate the placeholder names and bound values internally.
from boto3.dynamodb.conditions import Key

response = table.query(
    KeyConditionExpression=(
        Key('tenant_id').eq(tenant_id) & Key('entity_id').eq(entity_id)
    ),
    Limit=100,
)
Size and schema enforcement in serializers
Validate payload size and field types before attempting writes. Reject fields that map to reserved attribute names.
import json

from django.core.exceptions import ValidationError

# Attribute names this application reserves for its own key and bookkeeping
# fields; extend with DynamoDB's reserved words if they appear in expressions.
RESERVED = {'id', 'partition_key', 'tenant_id'}

def validate_item(data: dict):
    if any(k in RESERVED for k in data):
        raise ValidationError('Reserved attribute name used')
    if len(json.dumps(data)) > 400 * 1024:
        raise ValidationError('Item exceeds the 400 KB item size limit')
    # Further type checks
    if 'email' in data and '@' not in data['email']:
        raise ValidationError('Invalid email')
Middleware for request normalization
Add a lightweight middleware that cleans identifiers early, preventing boundary issues from reaching the ORM layer.
class TenantNormalizationMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        request.tenant = normalize_tenant_id(request.GET.get('tenant', 'default'))
        return self.get_response(request)
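For the middleware to take effect, it must be registered in settings.py; the dotted path below is a hypothetical placement that depends on where the class actually lives.

```python
# settings.py -- adjust the dotted path to the module containing the class
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "myapp.middleware.TenantNormalizationMiddleware",  # hypothetical path
    # ... remaining middleware ...
]
```

Placing it early in the list ensures identifiers are normalized before any view or data-access code runs.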