Zip Slip in Flask with Firestore
Zip Slip in Flask with Firestore — how this specific combination creates or exposes the vulnerability
Zip Slip is a path traversal vulnerability that occurs when a filename from a zip archive is used to construct a filesystem path without proper validation. In a Flask application that interacts with Google Cloud Firestore, the risk is not that Firestore itself is exploited through Zip Slip, but that an attacker can manipulate file paths during archive extraction before any Firestore operations occur. If your Flask service accepts uploaded archives, extracts them, and uses extracted filenames to build document IDs or file paths that are later stored in Firestore, malicious paths (e.g., ../../../etc/passwd) can escape the intended directory and overwrite arbitrary files on the host. This becomes a supply-chain or data-integrity issue when the compromised files or their metadata are later read and written to Firestore, potentially corrupting or exfiltrating sensitive data.
Consider a Flask route that receives a zip file, extracts it, and uses each entry’s name as a Firestore document ID or to form a Cloud Storage object path. If the extraction logic does not sanitize path segments, an entry like ../../malicious.json can traverse outside the extraction root. When the resulting file is read and its contents are pushed to Firestore, an attacker could inject unexpected data structures or overwrite critical documents. Additionally, if your Firestore documents store references to files on disk (for example, a profilePhoto field pointing to a user-specific path), tampered files may lead to unauthorized information disclosure or code execution when those files are later served by the application.
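To make the traversal concrete, the following minimal sketch builds an archive in memory with a hostile entry name and prints where a naive path join would place it. The extraction root /srv/app/uploads, the helper names, and the entry contents are illustrative assumptions, not part of any real service; nothing is written to disk.

```python
import io
import os
import zipfile

# Hypothetical extraction root, used for illustration only.
EXTRACT_ROOT = "/srv/app/uploads"


def build_archive(entry_name: str) -> bytes:
    """Create an in-memory zip holding one entry with the given name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr(entry_name, '{"role": "admin"}')
    return buffer.getvalue()


def naive_destination(root: str, entry_name: str) -> str:
    """Path an extractor would write to if it joined the name unchecked."""
    return os.path.normpath(os.path.join(root, entry_name))


archive = build_archive("../../malicious.json")
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    for name in zf.namelist():
        # The entry name is stored verbatim in the archive, "../" included.
        print(name, "->", naive_destination(EXTRACT_ROOT, name))
```

The printed destination sits two levels above the extraction root, which is exactly the escape that the validation in the next section is meant to reject.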
Note that middleBrick’s unauthenticated scan can flag such issues under the Property Authorization and Input Validation checks when it observes unsafe handling of user-supplied archive contents, even though the scan does not run authenticated workflows. The findings will include severity and remediation guidance, helping you to prioritize fixes that ensure extracted paths remain confined to a safe directory and that Firestore writes are guarded against malicious input.
Firestore-Specific Remediation in Flask — concrete code fixes
To prevent Zip Slip in a Flask and Firestore workflow, you must validate and sanitize filenames before using them to influence filesystem paths or Firestore document identities. Below are concrete, safe patterns.
Secure extraction with path validation
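Resolve each destination path (normalizing ".." segments and symlinks) and confirm that it stays within the intended directory before extracting. Normalizing a name with os.path.normpath alone is not sufficient, because a normalized path can still point outside the root. Do not trust archive member names directly.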
import os
import zipfile

from flask import Flask, request, jsonify

app = Flask(__name__)


def is_within_directory(directory, target):
    # Resolve symlinks and relative segments in both paths
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    # commonpath compares whole path components; commonprefix compares
    # characters and would accept a sibling such as "directory-evil"
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


@app.route('/upload-zip', methods=['POST'])
def upload_zip():
    if 'archive' not in request.files:
        return jsonify({'error': 'no file'}), 400
    file = request.files['archive']
    extract_to = '/safe/extraction/directory'
    try:
        with zipfile.ZipFile(file.stream, 'r') as zf:
            members = zf.infolist()
            # Validate every member before writing anything to disk
            for member in members:
                destination = os.path.join(extract_to, member.filename)
                if not is_within_directory(extract_to, destination):
                    return jsonify({'error': f'invalid path in archive: {member.filename}'}), 400
            for member in members:
                zf.extract(member, extract_to)
    except zipfile.BadZipFile:
        return jsonify({'error': 'not a valid zip archive'}), 400
    return jsonify({'status': 'ok'}), 200
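One subtlety in the containment check is worth illustrating. os.path.commonprefix compares strings character by character, so a sibling directory whose name merely starts with the extraction root's name passes a prefix test, whereas os.path.commonpath compares whole path components. The sketch below contrasts the two; the function names and the "directory-evil" entry are hypothetical and chosen only to show the difference.

```python
import os


def prefix_check(directory: str, target: str) -> bool:
    """Character-based comparison: unreliable for path containment."""
    directory = os.path.abspath(directory)
    target = os.path.abspath(target)
    return os.path.commonprefix([directory, target]) == directory


def component_check(directory: str, target: str) -> bool:
    """Component-based comparison on fully resolved paths."""
    directory = os.path.realpath(directory)
    target = os.path.realpath(target)
    return os.path.commonpath([directory, target]) == directory


root = "/safe/extraction/directory"
# An entry named "../directory-evil/payload.json" resolves to a sibling
# directory that shares the root's leading characters.
sibling = os.path.normpath(os.path.join(root, "../directory-evil/payload.json"))
inside = os.path.join(root, "reports", "a.json")

for label, path in (("sibling", sibling), ("inside", inside)):
    print(label, "prefix:", prefix_check(root, path),
          "component:", component_check(root, path))
```

The sibling path is accepted by the prefix test and rejected by the component test, while the path that is genuinely inside the root is accepted by both.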
Safe Firestore document IDs and field values
When creating or updating Firestore documents from extracted files, avoid using raw filenames as document IDs. Instead, derive sanitized IDs or use server-assigned IDs. Also, avoid storing raw filesystem paths in Firestore fields; store references in a controlled manner.
import os
import re

import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.ApplicationDefault()
firebase_admin.initialize_app(cred)
db = firestore.client()

# Same directory the archive was safely extracted into
extract_to = '/safe/extraction/directory'


def sanitize_document_id(name: str) -> str:
    # Allow only alphanumeric, dash, and underscore; truncate length
    sanitized = re.sub(r'[^a-zA-Z0-9\-_]', '_', name).strip('_')
    if not sanitized:
        # Nothing usable remains, so refuse rather than write an empty ID
        raise ValueError(f'Cannot derive a document ID from: {name!r}')
    return sanitized[:100]


# After safe extraction, process files
for filename in os.listdir(extract_to):
    if filename.endswith('.json'):
        doc_id = sanitize_document_id(filename)
        doc_ref = db.collection('uploads').document(doc_id)
        with open(os.path.join(extract_to, filename), 'r', encoding='utf-8') as f:
            data = f.read()
        # Store minimal metadata; avoid storing raw file paths as-is
        doc_ref.set({
            'original_filename': filename,
            'stored_at': firestore.SERVER_TIMESTAMP,
            'content_preview': data[:200]
        })
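Character replacement can map distinct filenames to the same document ID (for example, a.json and a_json both become a_json). Where human-readable IDs are not required, one alternative is to hash the original name. The following is a minimal sketch under that assumption; hashed_document_id and its length parameter are hypothetical names introduced here, not part of any Firestore API.

```python
import hashlib


def hashed_document_id(name: str, length: int = 32) -> str:
    """Derive a deterministic, hex-only ID from an untrusted filename."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # Hex output contains no "/", ".", or "_" characters, so path-like
    # or reserved-looking names cannot leak into the document ID.
    return digest[:length]


for candidate in ("a.json", "a_json", "../../malicious.json"):
    print(candidate, "->", hashed_document_id(candidate))
```

The same input always yields the same ID, so repeated uploads of a file overwrite one document rather than creating duplicates. If that is not the desired behavior, calling document() with no argument on a collection reference produces an auto-generated ID instead.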
Middleware and framework integrations
If you use the middleBrick CLI (middlebrick scan <url>) or GitHub Action, you can integrate scanning into your CI/CD to catch insecure archive handling before deployment. The scans highlight risky patterns such as unchecked user input flowing into filesystem and Firestore operations, and the dashboard lets you track these findings over time.