Prompt Injection in Flask with Dynamodb
Prompt Injection in Flask with Dynamodb — how this specific combination creates or exposes the vulnerability
Prompt injection becomes a practical concern when an LLM endpoint is built with a web framework such as Flask and relies on DynamoDB for data access. In this stack, Flask routes often construct and forward user-influenced input directly into LLM prompts, and DynamoDB is used to retrieve documents, configuration, or user context. If user input is concatenated into a system or user message without validation or isolation, an attacker can inject instructions that alter the LLM behavior. At the same time, DynamoDB patterns common in Flask services—such as building query expressions from request parameters—can expose metadata or error paths that aid prompt injection or information leakage.
Consider a Flask endpoint that retrieves a user’s stored instruction profile from DynamoDB and then asks an LLM to summarize content in the style of that profile. If the profile is fetched by a simple key (e.g., user_id from a token) and then embedded into the prompt as raw text, an authenticated attacker who can control the profile may store crafted instructions that cause the LLM to ignore safety guidelines or exfiltrate data. More broadly, prompt injection in this combination is not only about the LLM; it is about how Flask request handling, DynamoDB access patterns, and LLM input construction interact. For example, insufficient input validation on query parameters used to select DynamoDB keys can allow an attacker to force reads of arbitrary items, which may then be included in prompts. Error messages from DynamoDB—such as conditional check failures or provisioned throughput exceptions—may also be surfaced to the LLM or returned directly to the client, providing clues for injection attempts.
The LLM/AI Security checks in middleBrick specifically target these cross-layer risks by detecting system prompt leakage and performing active prompt injection testing with sequential probes (system prompt extraction, instruction override, DAN jailbreak, data exfiltration, cost exploitation). When scanning a Flask service that uses DynamoDB, the scanner checks whether unauthentinated endpoints expose LLM interfaces, whether user-influenced data reaches prompts without sanitization, and whether DynamoDB interactions can be abused to influence prompt context. Findings include severity-ranked guidance to separate data retrieval from prompt construction, enforce strict allowlists on dynamic values, and avoid including raw user data or internal error details in LLM inputs.
Dynamodb-Specific Remediation in Flask — concrete code fixes
Remediation focuses on strict separation between data retrieval and prompt assembly, controlled exposure of data to the LLM, and safe handling of DynamoDB responses in Flask. Below are concrete, safe patterns you can adopt.
1. Parameterized DynamoDB access with strict key validation
Always validate and sanitize inputs used to construct DynamoDB key expressions. Do not concatenate user input into key names or conditional expressions. Use bounded allowlists for identifiers and treat user input as data, not code.
import boto3
from flask import Flask, request, jsonify
import re
app = Flask(__name__)
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('UserProfiles')
@app.route('/profile')
def get_profile():
user_id = request.args.get('user_id')
# Strict validation: alphanumeric and underscores, 3–64 chars
if not re.match(r'^[A-Za-z0-9_]{3,64}$', user_id):
return jsonify({'error': 'invalid user_id'}), 400
response = table.get_item(Key={'user_id': user_id})
item = response.get('Item')
if not item:
return jsonify({'error': 'not found'}), 404
return jsonify({'profile': item.get('profile_text', '')})
2. Isolate data from prompts; do not embed raw user data
Retrieve data from DynamoDB for business logic or context, but do not directly embed potentially mutable fields into LLM prompts. Instead, pass only vetted, non-instructional fields and enforce a strict prompt template on the server side.
import boto3
from flask import Flask, jsonify
app = Flask(__name__)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('UserPreferences')
@app.route('/summarize')
def summarize():
user_id = request.args.get('user_id')
if not re.match(r'^[A-Za-z0-9_]{3,64}$', user_id):
return jsonify({'error': 'invalid user_id'}), 400
pref = table.get_item(Key={'user_id': user_id}).get('Item', {})
# Use only safe, non-instructional fields as context; do not embed raw profile_text
safe_context = pref.get('preferred_language', 'English')
# Server-side prompt template; user data is never part of the instruction
prompt = f"Summarize the following text in {safe_context}. Text: {{user_text}}"
# llm_response = call_llm(prompt, user_text=user_supplied_text)
return jsonify({'prompt_template': prompt})
3. Handle DynamoDB errors safely and avoid leaking metadata to LLM
Ensure that DynamoDB exceptions do not become part of LLM input. Map errors to generic responses and log details securely for investigation. This prevents attackers from using error conditions to infer table structures or inject via error payloads.
import botocore
@app.route('/data')
def get_data():
try:
resp = table.get_item(Key={'id': request.args.get('id')})
except botocore.exceptions.ClientError as e:
# Log full exception internally; return generic message
app.logger.error('dynamodb_error: %s', e.response['Error'])
return jsonify({'error': 'internal error'}), 500
item = resp.get('Item')
if not item:
return jsonify({'error': 'not found'}), 404
return jsonify({'data': item})
4. Apply allowlists and schema checks on retrieved data
Even after retrieval, validate shapes and content of DynamoDB items before any optional use in prompts. Define a strict schema for items you expect to use, and reject items that deviate.
from jsonschema import validate
profile_schema = {
'type': 'object',
'properties': {
'user_id': {'type': 'string', 'pattern': '^[A-Za-z0-9_]{3,64}$'},
'profile_text': {'type': 'string', 'maxLength': 2000}
},
'required': ['user_id']
}
@app.route('/profile-safe')
def get_profile_safe():
user_id = request.args.get('user_id')
if not re.match(r'^[A-Za-z0-9_]{3,64}$', user_id):
return jsonify({'error': 'invalid user_id'}), 400
item = table.get_item(Key={'user_id': user_id}).get('Item')
if not item:
return jsonify({'error': 'not found'}), 404
try:
validate(instance=item, schema=profile_schema)
except Exception:
return jsonify({'error': 'invalid data'}), 500
# Only pass explicitly allowed fields to LLM context
return jsonify({'user_id': item['user_id']})
Related CWEs: llmSecurity
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |