Severity: HIGH

Out Of Bounds Write in Django with Dynamodb

Out Of Bounds Write in Django with Dynamodb — how this specific combination creates or exposes the vulnerability

An Out Of Bounds Write occurs when data is written outside the intended memory boundaries. In the context of Django using Amazon DynamoDB as a backend, this typically manifests not as memory corruption but as unchecked user input that controls key construction, item size, or conditional update expressions in ways that exceed service limits or overwrite adjacent data structures. Because DynamoDB is a schemaless NoSQL service, the boundary is defined by service constraints (the 400 KB item size cap, the 2048-byte partition key and 1024-byte sort key length limits, and the 4 KB expression length limit) and by application-level assumptions about record structure.

Consider a Django model that stores user profiles with a composite key made from tenant_id and user_id. If tenant_id is taken directly from an HTTP request without validation, an attacker can supply a tenant_id containing special characters or extremely long strings. DynamoDB will accept the request, but downstream logic that builds index names, cache keys, or S3 object paths from this identifier can exceed length limits or resolve to unintended resources. For example, concatenating tenant_id with a fixed suffix to form a DynamoDB Stream ARN or a Lambda event context may produce an ARN that exceeds length limits or points to another tenant’s stream, enabling data exposure or privilege escalation across tenants.
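As a minimal sketch of that ARN problem (the account ID, region, and the `<tenant>-events` naming scheme are invented for illustration; `stream_arn_for_tenant` is a hypothetical helper, not a real API):

```python
# Hypothetical sketch: an unvalidated tenant_id flows straight into an ARN.
# Account ID, region, and the "<tenant>-events" naming are invented.
def stream_arn_for_tenant(tenant_id: str) -> str:
    # Unsafe: tenant_id is concatenated verbatim, so nothing binds the
    # caller to their own tenant's stream.
    return (
        'arn:aws:dynamodb:us-east-1:123456789012:'
        f'table/{tenant_id}-events/stream/2024-01-01T00:00:00.000'
    )

# A legitimate caller gets their own stream...
own = stream_arn_for_tenant('acme')
# ...but an attacker who submits another tenant's identifier gets theirs,
# and an oversized identifier can push the ARN past length limits.
other = stream_arn_for_tenant('victim')
```

Nothing in the helper ties the caller's session to the tenant it names, which is exactly the boundary violation described above.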

Another common scenario involves update expressions that dynamically include attribute names based on user input. If an attacker controls the attribute name in a SET expression, they can overwrite attributes the application treats as trusted, such as role or permission flags, collide with DynamoDB reserved words, or induce sparse attribute growth that changes partition efficiency and error rates. This can lead to inconsistent state across items and logical data corruption, even though no memory is overwritten. Because Django’s ORM has no native DynamoDB backend, developers often construct expression attribute names using string interpolation, which bypasses framework safeguards and introduces boundary violations.
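A minimal sketch of that interpolation hazard (`build_update_unsafe` is a hypothetical helper, not a Django or boto3 API):

```python
def build_update_unsafe(attr_name: str) -> str:
    # Vulnerable: the attacker-controlled attribute name is interpolated
    # directly into the UpdateExpression string.
    return f'SET {attr_name} = :val'

# Intended use writes exactly one attribute:
benign = build_update_unsafe('nickname')
# -> 'SET nickname = :val'

# Crafted input smuggles a second assignment into the same expression:
evil = build_update_unsafe('nickname = :val, is_admin')
# -> 'SET nickname = :val, is_admin = :val'
```

The second call writes an attribute the developer never intended to expose, which is the logical equivalent of writing past a boundary.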

An illustrative vulnerable pattern is direct string formatting of a key condition expression. Suppose a view builds a filter like partition_key = '{user_input}'. If user_input is a long string or contains delimiter characters, the resulting expression may scan more items than intended or fail validation in unexpected ways, causing the service to return partial or incorrect results. While this does not corrupt storage, it violates the intended access boundary and can be chained with other flaws to achieve unauthorized read or write.
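Sketched concretely (`build_condition_unsafe` is hypothetical; DynamoDB caps expression strings at 4 KB, so oversized input fails at the service rather than at the application):

```python
def build_condition_unsafe(user_input: str) -> str:
    # Vulnerable: the value is inlined into the expression string instead
    # of being passed through ExpressionAttributeValues.
    return f"partition_key = '{user_input}'"

# Delimiter characters reshape the expression...
widened = build_condition_unsafe("x' OR begins_with(partition_key, 'a")
# ...and long input can exceed DynamoDB's 4 KB expression length limit.
oversized = build_condition_unsafe('a' * 5000)
```

Whether the malformed expression is rejected, partially evaluated, or widened depends on service-side validation the application no longer controls.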

In DynamoDB, item size is capped at 400 KB. A Django serializer that accepts nested lists or JSON blobs from user input can create items that approach or exceed this limit. If the application performs in-place updates by replacing large attributes, the update may fail or truncate data in unexpected ways, effectively creating an out-of-bounds condition where partial writes leave items in an inconsistent state. This is particularly risky when combined with conditional writes that do not validate size beforehand, as the conditional check may pass while the write fails mid-operation.
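A hedged sketch of a pre-write size guard follows. JSON length is only an approximation of DynamoDB's serialized size (which counts attribute-name bytes plus value bytes), and the helper names are invented:

```python
import json

DYNAMODB_ITEM_LIMIT = 400 * 1024  # hard 400 KB cap per item

def approximate_item_size(item: dict) -> int:
    # Approximation: compact JSON in UTF-8 tracks the real serialized
    # size closely enough to reject obviously oversized items up front.
    return len(json.dumps(item, separators=(',', ':')).encode('utf-8'))

def update_fits(current: dict, updates: dict) -> bool:
    # Merge before measuring so a replaced attribute is not double-counted.
    merged = {**current, **updates}
    return approximate_item_size(merged) <= DYNAMODB_ITEM_LIMIT
```

Running a guard like this before update_item avoids the mid-operation failure described above: a conditional write can still fail, but never because the item silently crossed the size boundary.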

DynamoDB-Specific Remediation in Django — concrete code fixes

Remediation focuses on strict input validation, bounded key construction, safe expression building, and size enforcement. Treat all user-controlled data as untrusted and never directly interpolate it into DynamoDB expression attribute names or key condition strings.

Validated key construction

Ensure tenant and entity identifiers are normalized and bounded before use. Use a whitelist of allowed characters and enforce length limits.

import re
def normalize_tenant_id(tenant_id: str) -> str:
    # Allow only alphanumeric and hyphens, max 32 chars
    cleaned = re.sub(r'[^a-zA-Z0-9\-]', '', tenant_id)[:32]
    if not cleaned:
        raise ValueError('Invalid tenant_id')
    return cleaned

tenant_id = normalize_tenant_id(request.GET.get('tenant', ''))
user_id = normalize_tenant_id(request.GET.get('user', ''))
partition_key = f'{tenant_id}#{user_id}'  # bounded and safe

Safe update expressions with boto3 and Django model manager

Use ExpressionAttributeNames to avoid injecting attribute names directly, and ExpressionAttributeValues for data. Validate item size before sending updates.

import boto3
from django.core.exceptions import ValidationError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('UserProfiles')

def safe_update_profile(profile_id: str, updates: dict):
    # Validate size: ensure item stays under 400 KB after update
    current = table.get_item(Key={'profile_id': profile_id}).get('Item', {})
    # Rough size estimate for demonstration; a production check should
    # measure the item as DynamoDB serializes it (see validate_item below)
    current_size = len(str(current))
    new_data = str(updates)
    if current_size + len(new_data) > 400 * 1024:
        raise ValidationError('Update would exceed item size limit')

    set_clauses = []
    expr_attr_names = {}
    expr_attr_values = {}
    for idx, (key, value) in enumerate(updates.items()):
        attr_name = f'#attr{idx}'
        expr_attr_names[attr_name] = key
        expr_attr_values[f':val{idx}'] = value
        set_clauses.append(f'{attr_name} = :val{idx}')
    update_expr = 'SET ' + ', '.join(set_clauses)

    table.update_item(
        Key={'profile_id': profile_id},
        UpdateExpression=update_expr,
        ExpressionAttributeNames=expr_attr_names,
        ExpressionAttributeValues=expr_attr_values,
        ConditionExpression='attribute_exists(profile_id)'
    )

Parameterized key condition expressions

Never format key condition strings. Use the boto3 condition builder (boto3.dynamodb.conditions.Key), which generates expression placeholders internally; for hand-written expressions, pass names through ExpressionAttributeNames and values through ExpressionAttributeValues.

from boto3.dynamodb.conditions import Key

response = table.query(
    # The condition builder emits placeholders internally, so tenant_id and
    # entity_id are never interpolated into the expression string
    KeyConditionExpression=Key('tenant_id').eq(tenant_id) &
                           Key('entity_id').eq(entity_id),
    Limit=100
)

Size and schema enforcement in serializers

Validate payload size and field types before attempting writes. Reject fields that map to reserved attribute names.

import json

RESERVED = {'aws:requestid', 'dynamodb:approximateitemcollectioncount', 'id', 'partition_key'}

def validate_item(data: dict):
    if any(k in RESERVED for k in data.keys()):
        raise ValidationError('Reserved attribute name used')
    # JSON length approximates the serialized item size
    if len(json.dumps(data)) > 400 * 1024:
        raise ValidationError('Item exceeds 400 KB limit')
    # further type checks
    if 'email' in data and '@' not in data['email']:
        raise ValidationError('Invalid email')

Middleware for request normalization

Add a lightweight middleware that cleans identifiers early, preventing boundary issues from reaching the ORM layer.

from django.http import HttpResponseBadRequest

class TenantNormalizationMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            request.tenant = normalize_tenant_id(request.GET.get('tenant', 'default'))
        except ValueError:
            # Reject identifiers that normalize to nothing instead of raising a 500
            return HttpResponseBadRequest('Invalid tenant identifier')
        return self.get_response(request)

Frequently Asked Questions

Can an Out Of Bounds Write in DynamoDB lead to cross-tenant data access?
Yes. If tenant identifiers are not validated and are used directly to construct DynamoDB keys, expression attribute names, or ARNs, an attacker can manipulate these values to reference another tenant’s items or streams, resulting in unauthorized access across boundaries.
Does middleBrick detect Out Of Bounds Write risks in DynamoDB-backed APIs?
middleBrick scans unauthenticated attack surfaces and includes checks such as Property Authorization and Input Validation that can identify unsafe key construction and oversized payload risks. Findings include severity, impact description, and remediation guidance, though middleBrick does not fix or block issues directly.