PII Leakage in Django with DynamoDB
PII Leakage in Django with DynamoDB: how this specific combination creates or exposes the vulnerability
Django applications that use Amazon DynamoDB as a persistence layer can unintentionally expose personally identifiable information (PII) when data access patterns, serialization logic, and DynamoDB’s schema-less design interact in insecure ways. DynamoDB enforces a schema only for key attributes, so every other attribute on an item is returned unless the application asks for less. Developers must therefore enforce access controls and data handling in application code; if those controls are incomplete, queries that return user data can also return sensitive fields such as email, phone, or IAM-related attributes.
One common pattern is using DynamoDB’s low-level client or higher-level abstractions to perform get_item or query operations without explicitly projecting only the required attributes. If a view deserializes the full DynamoDB JSON response, including internal attributes like user_subscriptions or internal_flags, and passes it to a template or API response, PII can be leaked to clients or logs. In addition, DynamoDB Streams or export-to-S3 features can replicate sensitive data to locations that lack encryption or access controls, increasing exposure risk.
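To make the failure mode concrete, the following minimal sketch contrasts a full item with a copy restricted to an allow-list. The item contents, the `PUBLIC_PROFILE_FIELDS` tuple, and the `pick_allowed` helper are all hypothetical names chosen for illustration; they are not part of Django or boto3.

```python
from typing import Any, Mapping

# Hypothetical allow-list; adjust to the attributes the endpoint really needs.
PUBLIC_PROFILE_FIELDS = ("user_id", "display_name", "created_at")


def pick_allowed(item: Mapping[str, Any], allowed=PUBLIC_PROFILE_FIELDS) -> dict:
    """Copy only allow-listed attributes; everything else is dropped."""
    return {name: item[name] for name in allowed if name in item}


# An item shaped like one the boto3 resource API might return (illustrative data).
raw_item = {
    "user_id": "u-123",
    "display_name": "Ada",
    "created_at": "2024-01-01T00:00:00Z",
    "email": "ada@example.com",
    "phone": "+15555550100",
    "internal_flags": ["beta"],
}

# Returning raw_item directly would expose email, phone, and internal_flags.
safe_item = pick_allowed(raw_item)
```

An allow-list fails closed: an attribute added to the table later stays hidden until someone deliberately adds it to the tuple, which a deny-list would not guarantee.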
Serializer code may also contribute to leakage when developers map DynamoDB items directly to dictionaries and feed them into JsonResponse or third-party packages without redaction. For example, a serializer that includes fields such as ssn, date_of_birth, or password_hash without conditional logic can expose credentials or government identifiers. Logging of requests and responses in Django, combined with DynamoDB debug output, can further amplify inadvertent PII leakage in server logs or error traces.
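One way to limit the logging side of this problem is a `logging.Filter` that masks sensitive keys before a record is formatted. The sketch below uses only the standard library; the `SENSITIVE_KEYS` set, the `redact` helper, and the `RedactPIIFilter` class are hypothetical names, and the key list is an assumption that should match your own data model.

```python
import logging

# Hypothetical set of attribute names treated as sensitive.
SENSITIVE_KEYS = frozenset({"ssn", "date_of_birth", "password_hash", "email", "phone"})
MASK = "[REDACTED]"


def redact(value):
    """Return a copy of value with sensitive dict keys masked, recursively."""
    if isinstance(value, dict):
        return {
            key: MASK if key in SENSITIVE_KEYS else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [redact(inner) for inner in value]
    if isinstance(value, tuple):
        return tuple(redact(inner) for inner in value)
    return value


class RedactPIIFilter(logging.Filter):
    """Mask sensitive keys in structured log arguments before formatting."""

    def filter(self, record: logging.LogRecord) -> bool:
        # logging stores a single mapping argument directly in record.args
        if isinstance(record.args, dict):
            record.args = redact(record.args)
        elif isinstance(record.args, tuple):
            record.args = tuple(redact(arg) for arg in record.args)
        return True  # never drop the record, only rewrite its arguments
```

Attach the filter to a handler with `handler.addFilter(RedactPIIFilter())`. It only sees arguments passed separately, as in `logger.info("item=%s", item)`; PII already interpolated into the message string is not covered, so this complements rather than replaces careful logging.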
The OWASP API Security Top 10 (2023) category API3:2023 Broken Object Property Level Authorization, which absorbed the 2019 category Excessive Data Exposure, is directly relevant here: it covers improper filtering of returned data and excessive exposure of object properties in API responses. Findings from continuous scans have linked this pattern to misconfigured IAM policies and missing field-level validation, which can allow an unauthenticated or low-privilege caller to retrieve sensitive attributes through enumeration or malformed queries.
DynamoDB-Specific Remediation in Django: concrete code fixes
Remediation focuses on strict attribute selection, server-side filtering, and disciplined serialization. Always prefer projection expressions in DynamoDB queries so only required, non-sensitive fields are returned. Combine this with explicit field whitelisting in Django serializers and avoid dumping raw DynamoDB responses to clients.
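A practical detail when building projection expressions is that attribute names colliding with DynamoDB reserved words (for example `name` or `status`) must be aliased through `ExpressionAttributeNames`. The sketch below builds both arguments from an allow-list; `build_projection` is a hypothetical helper, not a boto3 function, while `ProjectionExpression` and `ExpressionAttributeNames` are the real `get_item` and `query` parameters.

```python
from typing import Iterable


def build_projection(attributes: Iterable[str]) -> dict:
    """Build get_item/query keyword arguments that project only `attributes`.

    Every name is aliased with a "#a<n>" placeholder so that reserved words
    cannot break the expression.
    """
    names = list(dict.fromkeys(attributes))  # de-duplicate, keep order
    if not names:
        raise ValueError("at least one attribute is required")
    aliases = {f"#a{i}": name for i, name in enumerate(names)}
    return {
        "ProjectionExpression": ", ".join(aliases),
        "ExpressionAttributeNames": aliases,
    }


kwargs = build_projection(["user_id", "display_name", "status"])
# Intended use, assuming `table` is a boto3 Table resource:
#   table.get_item(Key={"user_id": "u-123"}, **kwargs)
```

A projection trims what DynamoDB sends back, but the service still reads the whole item, so it is a data-minimization control rather than a cost optimization.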
Example: Safe query with projection and serializer whitelisting
import boto3
from django.core.exceptions import PermissionDenied
from django.http import JsonResponse

# Let boto3 resolve credentials from its default chain (environment variables,
# shared config, or an attached IAM role); never hard-code keys in source.
session = boto3.Session(region_name='us-east-1')
dynamodb = session.resource('dynamodb')
table = dynamodb.Table('users')

# Attributes this endpoint is allowed to return
SAFE_FIELDS = ('user_id', 'display_name', 'email', 'created_at')


def _can_view_profile(requester_id, user_id):
    # Placeholder policy: only the owner may view a profile.
    # Replace with your own ownership or role checks.
    return requester_id is not None and requester_id == user_id


def get_user_profile(user_id: str, requester_id):
    # Enforce ownership or role checks before querying
    if not _can_view_profile(requester_id, user_id):
        raise PermissionDenied('You cannot view this profile')
    response = table.get_item(
        Key={'user_id': user_id},
        ProjectionExpression=', '.join(SAFE_FIELDS),
    )
    item = response.get('Item')
    if not item:
        return None
    # Explicitly return only safe fields, tolerating missing attributes
    return {field: item.get(field) for field in SAFE_FIELDS}


def user_profile_view(request, user_id):
    # Anonymous users have no user_id attribute, so the check in
    # get_user_profile rejects them and Django responds with 403.
    requester_id = getattr(request.user, 'user_id', None)
    profile = get_user_profile(user_id, requester_id)
    if profile is None:
        return JsonResponse({'error': 'Not found'}, status=404)
    return JsonResponse(profile)
Example: Django REST Framework serializer with field validation
from django.http import JsonResponse
from rest_framework import serializers


class SafeUserProfileSerializer(serializers.Serializer):
    user_id = serializers.CharField(max_length=255)
    display_name = serializers.CharField(max_length=255)
    email = serializers.EmailField()
    created_at = serializers.DateTimeField()

    def to_representation(self, instance):
        # Serializer only emits declared fields; the filter below is a
        # defensive second check in case fields are added later.
        data = super().to_representation(instance)
        allowed = {'user_id', 'display_name', 'email', 'created_at'}
        return {k: v for k, v in data.items() if k in allowed}


# Usage in a view; `table` is the boto3 Table from the previous example.
# Run the same authorization check as in that example before reading the item.
def profile_view(request, user_id):
    item = table.get_item(
        Key={'user_id': user_id},
        ProjectionExpression='user_id, display_name, email, created_at',
    ).get('Item')
    if not item:
        return JsonResponse({'error': 'Not found'}, status=404)
    serializer = SafeUserProfileSerializer(item)
    return JsonResponse(serializer.data)
Operational and configuration practices
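- Use least-privilege IAM policies, and restrict which attributes a role can read at the DynamoDB level with fine-grained access control (the dynamodb:Attributes condition key).
- DynamoDB encrypts tables at rest by default; use a customer managed KMS key where you need control over key access, send all DynamoDB traffic over HTTPS, and avoid exporting raw backups that contain PII without masking or redaction.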
- Audit logs: monitor CloudTrail and application logs for unusual query patterns that may indicate enumeration or scraping of PII.
- Regularly scan your API surface with middleBrick to detect PII leakage across endpoints; the dashboard can track findings over time and the Pro plan supports continuous monitoring with alerts for new sensitive data exposure patterns.
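The attribute-level IAM restriction from the first item above can be sketched as a policy document built in Python. The account ID, region, table name, and attribute list are placeholder assumptions; the `dynamodb:Attributes` and `dynamodb:Select` condition keys are the ones DynamoDB documents for fine-grained access control, but the exact statement should be validated against your own account before use.

```python
import json

# Hypothetical values: replace the account ID, region, and table name.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/users"
READABLE_ATTRIBUTES = ["user_id", "display_name", "email", "created_at"]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOnlyNonSensitiveAttributes",
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": TABLE_ARN,
            "Condition": {
                # Every requested attribute must be in the allowed list
                "ForAllValues:StringEquals": {
                    "dynamodb:Attributes": READABLE_ATTRIBUTES
                },
                # Queries must ask for specific attributes, not whole items
                "StringEqualsIfExists": {
                    "dynamodb:Select": "SPECIFIC_ATTRIBUTES"
                },
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```

With a statement like this attached to the application role, a read that omits a projection requests all attributes and should be denied, which turns a forgotten `ProjectionExpression` into a visible error instead of a silent leak. Treat that behaviour as something to confirm in a test environment.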
Related CWEs: data exposure
| CWE ID | Name | Severity |
|---|---|---|
| CWE-200 | Exposure of Sensitive Information to an Unauthorized Actor | HIGH |
| CWE-209 | Generation of Error Message Containing Sensitive Information | MEDIUM |
| CWE-213 | Exposure of Sensitive Information Due to Incompatible Policies | HIGH |
| CWE-215 | Insertion of Sensitive Information Into Debugging Code | MEDIUM |
| CWE-312 | Cleartext Storage of Sensitive Information | HIGH |
| CWE-359 | Exposure of Private Personal Information to an Unauthorized Actor | HIGH |
| CWE-522 | Insufficiently Protected Credentials | CRITICAL |
| CWE-532 | Insertion of Sensitive Information into Log File | MEDIUM |
| CWE-538 | Insertion of Sensitive Information into Externally-Accessible File or Directory | HIGH |
| CWE-540 | Inclusion of Sensitive Information in Source Code | HIGH |