Zip Slip in Flask with DynamoDB
Zip Slip in Flask with DynamoDB: how this specific combination creates or exposes the vulnerability
Zip Slip is a path traversal vulnerability that occurs when an application builds extraction paths by joining archive entry names, which the archive's creator controls, directly onto a base extraction directory. In a Flask application that uses Amazon DynamoDB, the risk comes not from DynamoDB itself but from unsafe file handling before or after data is stored or retrieved. If a Flask endpoint accepts archive uploads, uses values from client requests to build file paths, and then extracts archives without validating each entry's path components, an attacker can craft entries such as ../../../etc/passwd to escape the intended directory.
DynamoDB widens the exposure when application code maps user input to primary keys or attribute values that later feed archive names or storage references. For example, if a user-controlled value becomes part of a filename that is later archived and extracted, or a DynamoDB primary key is placed in an archive path without validation, Zip Slip can lead to arbitrary file writes outside the intended directory, and the same unvalidated paths on the retrieval side can expose files for reading. A security scan that tests unauthenticated attack surfaces, including input validation and file handling checks, can flag unsafe patterns where DynamoDB identifiers or request parameters influence local filesystem paths in Flask routes.
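To make the traversal concrete, the following minimal sketch shows how joining an entry name onto a base directory can resolve outside that directory. The directory BASE_DIR and the helper names resolved_target and escapes_base are hypothetical and chosen only for illustration; the printed paths assume a POSIX filesystem.

```python
import os

BASE_DIR = "/srv/app/extracted"  # hypothetical extraction directory


def resolved_target(base_dir, entry_name):
    # Vulnerable pattern: the entry name is joined onto the base as-is.
    # abspath collapses any ".." segments, showing where a write would land.
    return os.path.abspath(os.path.join(base_dir, entry_name))


def escapes_base(base_dir, entry_name):
    # True when the resolved target is no longer under base_dir.
    base = os.path.abspath(base_dir)
    return os.path.commonpath([base, resolved_target(base, entry_name)]) != base


for name in ("reports/q1.txt", "../../../etc/passwd", "/etc/passwd"):
    print(name, "->", resolved_target(BASE_DIR, name), escapes_base(BASE_DIR, name))
```

Note that an absolute entry name is as dangerous as a relative one with ".." segments, because os.path.join discards the base directory when the second argument is absolute.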
DynamoDB-Specific Remediation in Flask: concrete code fixes
To mitigate Zip Slip in Flask when working with DynamoDB, ensure that any user input used to construct file paths, archive entries, or keys is sanitized and confined to a safe directory. Use strict path normalization and reject entries that attempt directory traversal. Below are concrete code examples for a Flask route that stores and retrieves objects while referencing DynamoDB, demonstrating safe practices.
import os

import boto3
from flask import Flask, request, jsonify, send_file
from werkzeug.utils import secure_filename

app = Flask(__name__)

# Initialize DynamoDB client
client = boto3.client('dynamodb', region_name='us-east-1')
TABLE_NAME = os.getenv('TABLE_NAME', 'secure_files')

# Ensure the target base directory exists
UPLOAD_BASE = '/safe/extract/dir'
os.makedirs(UPLOAD_BASE, exist_ok=True)

def is_within_directory(directory, target):
    # Normalize both paths and check that target remains inside directory
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory
@app.route('/file', methods=['POST'])
def handle_file():
    # Validate and sanitize inputs
    file = request.files.get('file')
    if not file:
        return jsonify({'error': 'missing file'}), 400
    # secure_filename strips path separators and traversal sequences
    filename = secure_filename(file.filename or '')
    if not filename:
        return jsonify({'error': 'invalid filename'}), 400
    # Build target path safely; do not use user input directly in paths
    target_path = os.path.join(UPLOAD_BASE, filename)
    if not is_within_directory(UPLOAD_BASE, target_path):
        return jsonify({'error': 'invalid path'}), 400
    # Save file to safe location
    file.save(target_path)
    # Store metadata in DynamoDB using a sanitized key, not raw user input
    item = {
        'file_id': {'S': filename},  # or use a UUID
        'original_name': {'S': filename},
        'size': {'N': str(os.path.getsize(target_path))}
    }
    client.put_item(TableName=TABLE_NAME, Item=item)
    return jsonify({'status': 'stored', 'file_id': filename}), 201
@app.route('/file/<file_id>', methods=['GET'])
def get_file(file_id):
    # Retrieve metadata from DynamoDB
    resp = client.get_item(TableName=TABLE_NAME, Key={'file_id': {'S': file_id}})
    item = resp.get('Item')
    if not item:
        return jsonify({'error': 'not found'}), 404
    # Treat the stored name as untrusted: sanitize it again before joining
    safe_name = secure_filename(item['original_name']['S'])
    if not safe_name:
        return jsonify({'error': 'invalid reference'}), 400
    target_path = os.path.join(UPLOAD_BASE, safe_name)
    if not is_within_directory(UPLOAD_BASE, target_path):
        return jsonify({'error': 'invalid reference'}), 400
    # isfile also rejects directories, which send_file cannot serve
    if not os.path.isfile(target_path):
        return jsonify({'error': 'file missing'}), 404
    return send_file(target_path, as_attachment=True, download_name=safe_name)
@app.route('/extract', methods=['POST'])
def extract_archive():
    import zipfile
    archive = request.files.get('archive')
    if not archive:
        return jsonify({'error': 'missing archive'}), 400
    archive_name = secure_filename(archive.filename or '')
    if not archive_name:
        return jsonify({'error': 'invalid filename'}), 400
    tmp_path = os.path.join(UPLOAD_BASE, archive_name)
    archive.save(tmp_path)
    try:
        with zipfile.ZipFile(tmp_path, 'r') as zf:
            members = zf.infolist()
            # Validate every entry before extracting any, so a hostile
            # archive is rejected without leaving partial output behind
            for member in members:
                member_path = os.path.normpath(member.filename)
                # Covers both ".." traversal and absolute entry names
                if not is_within_directory(UPLOAD_BASE, os.path.join(UPLOAD_BASE, member_path)):
                    return jsonify({'error': 'unsafe archive entry'}), 400
            for member in members:
                zf.extract(member, path=UPLOAD_BASE)
    except Exception:
        return jsonify({'error': 'extraction failed'}), 400
    finally:
        # Always remove the uploaded archive, whether or not extraction succeeded
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return jsonify({'status': 'extracted'}), 200
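One way to exercise the member check without running the Flask app is to build archives in memory and list the entries that would be rejected. The sketch below repeats is_within_directory from the listing above so that it runs on its own; build_zip and unsafe_members are hypothetical helpers introduced only for this illustration, and nothing is written to disk.

```python
import io
import os
import zipfile


def is_within_directory(directory, target):
    # Same containment check as in the route code above.
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def unsafe_members(zf, base_dir):
    # Names of entries whose resolved path would land outside base_dir.
    return [
        member.filename
        for member in zf.infolist()
        if not is_within_directory(base_dir, os.path.join(base_dir, member.filename))
    ]


def build_zip(entries):
    # Build an in-memory archive from a {entry name: bytes} mapping.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf


benign = build_zip({"docs/readme.txt": b"hello"})
hostile = build_zip({"docs/readme.txt": b"hello", "../../evil.txt": b"x"})

with zipfile.ZipFile(benign) as zf:
    print(unsafe_members(zf, "/safe/extract/dir"))   # []
with zipfile.ZipFile(hostile) as zf:
    print(unsafe_members(zf, "/safe/extract/dir"))   # ['../../evil.txt']
```

Collecting every offending name before extracting anything makes it straightforward to reject the whole archive rather than stopping partway through.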
Key remediation points specific to the Flask + DynamoDB context:
- Do not use raw DynamoDB attribute values (such as primary keys or filenames stored in the table) to construct filesystem paths without normalization and allowlist validation.
- Apply secure_filename or a strict allowlist for filenames derived from user input or DynamoDB attributes before path joins.
- Always verify extracted archive members with a path traversal check (e.g., is_within_directory) rather than relying on the archive library’s default behavior.
- Keep user-controlled data separate from filesystem decisions; use opaque identifiers (e.g., UUIDs) as keys in DynamoDB, and map them to safe filenames server-side.
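The last point can be sketched as follows. This is an illustration under assumptions, not a prescribed schema: the stored_name attribute, the ALLOWED_EXTENSIONS allowlist, and the helpers new_file_record and path_for are hypothetical. The item uses the same low-level attribute format as the put_item call above, but no DynamoDB call is made here.

```python
import os
import uuid

UPLOAD_BASE = "/safe/extract/dir"
ALLOWED_EXTENSIONS = {".zip", ".txt", ".pdf"}  # hypothetical allowlist


def new_file_record(original_name):
    # Only the extension of the client-supplied name influences the disk name.
    ext = os.path.splitext(original_name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    file_id = uuid.uuid4().hex  # opaque, server-generated key
    return {
        "file_id": {"S": file_id},
        "original_name": {"S": original_name},  # display only, never joined into a path
        "stored_name": {"S": file_id + ext},
    }


def path_for(file_id, ext):
    # Rebuild the on-disk path from a validated ID instead of stored free text.
    canonical = uuid.UUID(hex=file_id).hex  # raises ValueError if malformed
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    return os.path.join(UPLOAD_BASE, canonical + ext)


record = new_file_record("../../etc/passwd.txt")
print(record["stored_name"]["S"])  # 32 hex characters followed by ".txt"
```

Because the on-disk name is derived from a server-generated UUID, a hostile original_name can be stored and displayed without ever reaching os.path.join.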