HIGH | symlink attack | django | dynamodb

Symlink Attack in Django with DynamoDB

How this specific combination creates or exposes the vulnerability

In a Django application backed by Amazon DynamoDB, a symlink attack occurs when the application writes user-controlled data to a file or key path that an attacker can redirect through a symbolic link or an equivalent indirection. DynamoDB is a managed NoSQL service with no POSIX filesystem, so the risk sits at the application layer: Django code may touch the local filesystem (for example, to stage uploads, cache objects, or serialize data) using paths derived from the same identifiers it stores in DynamoDB. If an attacker can influence temporary file locations, cache keys, or object keys derived from user input, they may trick the application into reading or overwriting files outside the intended directory. On the DynamoDB side, the equivalent outcome is a read or write against an item the application never meant to expose, most often when primary key attributes are built from user input without strict validation or tenant isolation.

Consider a Django service that stores user-generated assets on the local filesystem and records their metadata in DynamoDB. If the application uses an unvalidated, user-supplied identifier to build both the DynamoDB item key and the local filename, an attacker can supply a value such as ../../../etc/passwd. When the application joins that value onto its storage directory and writes a cached object, the traversal segments (or a symlink planted at the resulting location) can redirect the write outside the intended directory. Because the DynamoDB metadata records the same manipulated key, later requests may read or overwrite whatever file that key now maps to. Primary key construction in DynamoDB therefore deserves the same rigor as filesystem path handling: enforce strict allowlists, avoid concatenating user input directly into keys, and isolate tenant or user contexts.
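
The filesystem half of that check can be sketched as follows. This is a minimal illustration, not a Django API: the helper name `resolve_inside` and the idea of a single staging directory are assumptions made for the example. It resolves the candidate path, following any symlinks, and rejects anything that lands outside the base directory. It narrows the window rather than closing it, since a link created between the check and the write can still redirect the write.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_supplied_name: str) -> Path:
    """Resolve user_supplied_name under base_dir (hypothetical helper).

    Raises ValueError if the resolved path is the base directory itself
    or falls outside it.
    """
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so both a
    # traversal string and a planted link show up in the final path.
    candidate = (base / user_supplied_name).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError('Path escapes the staging directory') from None
    if candidate == base:
        raise ValueError('Path does not name a file')
    return candidate
```

Calling `resolve_inside(base, '../../../etc/passwd')` raises `ValueError`, as does a name that passes through a symlink pointing outside `base`.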

Another scenario involves DynamoDB Streams and Lambda triggers. If a Django application consumes stream records and acts on attributes such as S3 object keys without verifying that each key sits under an authorized prefix, an attacker who can influence earlier stages of data entry may craft keys that cross logical boundaries. DynamoDB itself does not follow symlinks, but the surrounding infrastructure (temporary storage, backups, or export processes) may map logical keys to filesystem paths. Without robust input validation and strict key design, that mapping can be abused to read or write sensitive artifacts, which creates a symlink-like attack vector across the storage boundary.
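
One way to express the prefix check is sketched below. The function name `is_key_within_prefix` and the sample prefix are hypothetical, and the rule it applies (reject any key that changes under path normalization) is a conservative assumption rather than an S3 or DynamoDB requirement. S3 treats `..` segments as literal characters, so the concern is what happens once a key is mapped onto a filesystem path.

```python
import posixpath


def is_key_within_prefix(key: str, allowed_prefix: str) -> bool:
    """Return True only for a normalized object key under allowed_prefix.

    allowed_prefix is expected to end with "/" (for example
    "tenant-a/uploads/"), so that "tenant-ab/..." cannot match "tenant-a".
    """
    if not key or key.startswith('/') or '\\' in key or '\x00' in key:
        return False
    # normpath collapses "..", "." and "//". A key that changes under
    # normalization contains segments that could cross a boundary once
    # it is turned into a filesystem path.
    if posixpath.normpath(key) != key:
        return False
    return key.startswith(allowed_prefix)
```

A key such as `tenant-a/uploads/../../tenant-b/secret` passes a plain `startswith` test but is rejected here because normalization rewrites it.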

DynamoDB-Specific Remediation in Django: concrete code fixes

To mitigate symlink-style risks when using DynamoDB with Django, focus on strict key design, input validation, and isolation of tenant or user contexts. Avoid deriving DynamoDB keys directly from concatenated user input. Instead, use a deterministic, application-controlled identifier and encode user input safely. The following examples demonstrate secure patterns for primary key construction and attribute validation.

Secure DynamoDB key construction with UUIDs

Generate a unique, non-guessable identifier for each logical entity and store user-supplied values as a separate, validated attribute. This prevents key manipulation attacks and limits lateral access across users.

import uuid
import boto3
from django.conf import settings
from django.core.validators import validate_email

dynamodb = boto3.resource('dynamodb', region_name=settings.AWS_REGION)
table = dynamodb.Table(settings.DYNAMODB_TABLE_USERS)

def create_user_item(username: str, email: str) -> str:
    # Enforce username constraints before storing it as an attribute.
    # str.isalnum() alone also accepts non-ASCII letters and digits.
    if not (username.isascii() and username.isalnum()):
        raise ValueError('Invalid username')
    # Raises django.core.exceptions.ValidationError on a malformed address
    validate_email(email)
    item_id = str(uuid.uuid4())
    table.put_item(Item={
        'user_id': item_id,          # Partition key, application-controlled
        'username': username,        # Stored as an attribute, not a key
        'email': email,
        # Add ownership or tenant context if needed
    })
    return item_id

Validating and namespacing keys for multi-tenant designs

When multi-tenancy is required, embed a tenant or namespace prefix in the key and validate it rigorously. Do not allow user input to dictate the prefix directly. Use allowlists and strict regex patterns to ensure keys remain within expected boundaries.

import re
import boto3
from django.conf import settings

dynamodb = boto3.resource('dynamodb', region_name=settings.AWS_REGION)
table = dynamodb.Table(settings.DYNAMODB_TABLE_DATA)

def put_tenant_data(tenant: str, external_id: str, payload: dict):
    # Strict tenant allowlist or pattern — never concatenate raw tenant
    if not re.fullmatch(r'[a-z0-9\-]{1,32}', tenant):
        raise ValueError('Invalid tenant')
    if not re.fullmatch(r'[a-f0-9]{8}-[a-f0-9]{4}-[1-5][a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}', external_id):
        raise ValueError('Invalid external_id')
    partition = f'{tenant}:data:{external_id}'  # Controlled composition
    table.put_item(Item={
        'pk': partition,   # Designated partition key
        'data': payload,
    })

Safe consumption of DynamoDB Streams in Django consumers

When processing stream records, validate and scope keys against allowed prefixes and avoid using raw key values as filesystem paths. Treat all attributes as untrusted input.

import os
import tempfile
import boto3
from django.conf import settings

s3_client = boto3.client('s3', region_name=settings.AWS_REGION)
ALLOWED_PREFIX = 'tenant-a/uploads/'

def handle_stream_record(record):
    # Only INSERT and MODIFY events carry a NewImage
    if record.get('eventName') not in ('INSERT', 'MODIFY'):
        return
    new_image = record['dynamodb'].get('NewImage', {})
    key = new_image.get('s3_key', {}).get('S', '')
    # Verify the key prefix and reject traversal segments before any further action
    if not key.startswith(ALLOWED_PREFIX) or '..' in key.split('/'):
        raise PermissionError('Invalid key prefix')
    # Never build the local path from the key. A fixed name under /tmp is
    # predictable, so a local attacker could plant a symlink there first.
    with tempfile.TemporaryDirectory() as workdir:  # Fresh, owner-only directory
        local_path = os.path.join(workdir, 'object')
        s3_client.download_file(settings.AWS_S3_BUCKET, key, local_path)
        # Process the file here, before the directory is removed
General operational practices

Enforce least privilege IAM policies for the Django application’s DynamoDB access, restricting actions to specific table prefixes and operations.
Log key construction patterns and monitor for unexpected key formats that may indicate probing or abuse.
Apply validation on both create and read paths; treat primary key design as a security boundary.
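
The last two practices can share one code path. The sketch below assumes the `tenant:data:uuid` key layout used earlier in this section; the helper names `build_partition_key` and `get_tenant_data` are hypothetical. The `table` argument is expected to be a boto3 `Table` resource, or anything exposing a compatible `get_item`.

```python
import logging
import re

logger = logging.getLogger(__name__)

TENANT_RE = re.compile(r'[a-z0-9-]{1,32}')
UUID_RE = re.compile(
    r'[a-f0-9]{8}-[a-f0-9]{4}-[1-5][a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}'
)


def build_partition_key(tenant: str, external_id: str) -> str:
    """Validate both parts and compose the key (hypothetical helper)."""
    if not TENANT_RE.fullmatch(tenant) or not UUID_RE.fullmatch(external_id):
        # %r escapes control characters, and the slices cap the log size.
        # Repeated rejections are the signal to alert on.
        logger.warning(
            'Rejected key parts: tenant=%r external_id=%r',
            tenant[:64], external_id[:64],
        )
        raise ValueError('Invalid key component')
    return f'{tenant}:data:{external_id}'


def get_tenant_data(table, tenant: str, external_id: str):
    """Read path: goes through the same validator as the write path."""
    key = {'pk': build_partition_key(tenant, external_id)}
    response = table.get_item(Key=key)
    return response.get('Item')  # None when no item matches
```

Routing reads and writes through a single builder means a malformed key is rejected and logged before any request reaches DynamoDB.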

Frequently Asked Questions

Why is it unsafe to use raw user input as DynamoDB primary keys in Django?

Using raw user input in keys enables key manipulation and unintended access across users or tenants. It can expose or overwrite items if input is not strictly validated, and it complicates isolation and auditing. Always use application-controlled identifiers and store user input as validated attributes.

How does DynamoDB’s lack of a filesystem reduce but not eliminate symlink risks?

DynamoDB itself does not support symlinks or filesystem paths, so traditional symlink attacks on the database layer are not possible. However, risks arise when applications map DynamoDB keys to local storage, caches, or external services. If key construction or temporary file paths are influenced by user input, attackers may traverse logical boundaries or trigger unintended reads/writes through the surrounding infrastructure.
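
As an illustration of keeping that mapping safe, the sketch below derives a cache filename from a hash of the DynamoDB key, so no part of the key reaches the path, and it refuses to write through a symlink. The helper names and the `.cache` suffix are assumptions for the example. `O_NOFOLLOW` is not available on every platform, so the code falls back to no flag where it is missing.

```python
import hashlib
import os
from pathlib import Path


def cache_path_for_key(cache_dir: str, item_key: str) -> Path:
    """Map an arbitrary item key to a fixed-shape filename in cache_dir."""
    digest = hashlib.sha256(item_key.encode('utf-8')).hexdigest()
    return Path(cache_dir) / f'{digest}.cache'


def write_cache_entry(cache_dir: str, item_key: str, data: bytes) -> Path:
    """Write data for item_key, refusing to follow a symlink at the target."""
    path = cache_path_for_key(cache_dir, item_key)
    # With O_NOFOLLOW, open() fails if the final path component is a
    # symlink instead of writing to whatever the link points at.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, 'O_NOFOLLOW', 0)
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, 'wb') as handle:
        handle.write(data)
    return path
```

Because the filename is always 64 hexadecimal characters plus a suffix, a key such as `../../etc/passwd` produces an ordinary file inside the cache directory.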
