Prototype Pollution in Django with Firestore

Prototype Pollution in Django with Firestore — how this specific combination creates or exposes the vulnerability

Prototype pollution in Django applications using Google Cloud Firestore typically arises when user-controlled input is merged into server-side dictionaries that later drive Firestore document writes, updates, or query construction. Unlike traditional SQL databases, Firestore is a schemaless document store, so application code often manipulates nested dictionaries that map directly to document fields. Python dictionaries have no prototype chain, so the Django process itself is not polluted. The danger is that, with weak input validation, attacker-chosen keys are stored verbatim and later change how downstream code, particularly JavaScript consumers, interprets and writes the data.

In a Django view, developers commonly construct Firestore document payloads by deserializing request data (e.g., JSON) and passing it to Firestore client methods. Consider a profile update endpoint that builds an update map from request data:

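from django.http import HttpResponse
from google.cloud import firestore

def update_profile(request):
    # Unsafe: the document ID comes straight from the query string
    user_id = request.GET.get('user_id')
    # Unsafe: attacker-controlled keys and values
    updates = request.POST.dict()
    db = firestore.Client()
    doc_ref = db.collection('profiles').document(user_id)
    # update() treats dots in keys as nested field paths, so 'a.b' writes field b inside map a
    doc_ref.update(updates)
    return HttpResponse('ok')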

If the request contains keys such as __proto__, constructor, or prototype, which JavaScript code can interpret as prototype modifiers, they mean nothing special to Python and pass through the view untouched. Firestore itself does not use JavaScript prototypes. The risk materializes when such keys are stored or echoed back and the data is later consumed by client-side JavaScript, or by Node.js services that deep-merge or dynamically build objects from it, for example after the application serializes Firestore data into templates or API responses.
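
The point that Python is unaffected while the key still travels onward can be seen in a minimal sketch. The function name `round_trip` and the sample request body are illustrative assumptions, not part of any Django or Firestore API.

```python
import json


def round_trip(raw_json: str) -> str:
    """Parse a JSON body the way a view might, then re-serialize it."""
    payload = json.loads(raw_json)
    # To Python, "__proto__" is an ordinary dictionary key with no special meaning.
    return json.dumps(payload, sort_keys=True)


body = '{"display_name": "eve", "__proto__": {"isAdmin": true}}'
echoed = round_trip(body)

# The key survives untouched, ready to be stored or sent to a JavaScript consumer.
print("__proto__" in json.loads(echoed))  # prints True
# Parsing had no side effect on other Python objects.
print(hasattr({}, "isAdmin"))  # prints False
```

Nothing goes wrong on the server here, which is why this class of bug is easy to miss in Django code review: the damage only appears in whichever consumer merges the echoed data into its own objects.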

Another scenario involves query construction. An attacker may supply keys that map to Firestore field paths, attempting to manipulate query behavior or access restricted fields. For example:

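import json

def list_documents(request):
    filter_arg = request.GET.dict()  # Unsafe: attacker chooses the filtered fields
    db = firestore.Client()
    query = db.collection('records')
    for field, value in filter_arg.items():
        query = query.where(field, '==', value)  # where() takes field path, operator, value
    docs = query.stream()
    return HttpResponse(json.dumps([d.to_dict() for d in docs]))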

If filter_arg names fields or nested paths the endpoint was never meant to filter on, an attacker can probe sensitive values one guess at a time: a filter on a hypothetical reset_token field, for instance, reveals whether any document holds a given token. Firestore's server-side filtering behaves as designed here. The flaw is in the application layer, which lets the caller pick the fields and then re-serializes whole documents, sensitive fields included, for clients or internal services.

Additionally, Django’s forms and model layer offer no protection against prototype-style injection when code bypasses them with dynamic attribute assignment. If code does something like:

profile = ProfileModel()
for key, value in request.POST.items():
    setattr(profile, key, value)
profile.save()

an attacker can set any attribute on the object (a mass-assignment flaw), including attributes that map to Firestore document fields during a subsequent sync operation, potentially polluting the logical structure of stored data.

Because Firestore documents are often consumed directly by JavaScript frontends (e.g., via the Firebase SDK), keys that Django accepts unchecked can end up polluting client-side objects when that code merges or copies the data without sanitization. This chain (Django accepting unchecked input, Firestore storing it, and client-side code consuming it) creates a realistic attack surface even though the vulnerability originates in input handling and not in Firestore itself.
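
One way to break the chain at the last hop is to clean documents before they are serialized for clients. The following is a minimal sketch under stated assumptions: the helper name `strip_dangerous_keys` and the contents of `DANGEROUS_KEYS` are hypothetical choices, not a Firestore or Django feature.

```python
from typing import Any

# Key names commonly abused in JavaScript prototype pollution.
# This list is an assumption; adjust it to what your consumers are sensitive to.
DANGEROUS_KEYS = frozenset({"__proto__", "constructor", "prototype"})


def strip_dangerous_keys(value: Any) -> Any:
    """Return a copy of value with dangerous keys removed at every nesting level."""
    if isinstance(value, dict):
        return {
            key: strip_dangerous_keys(item)
            for key, item in value.items()
            if key not in DANGEROUS_KEYS
        }
    if isinstance(value, list):
        return [strip_dangerous_keys(item) for item in value]
    # Scalars (strings, numbers, booleans, None) are returned unchanged.
    return value
```

In a view this could wrap each document before serialization, for example `strip_dangerous_keys(doc.to_dict())`. Note that it would also drop a legitimate field that happens to be called `constructor`, so output filtering is a backstop rather than a replacement for validating input.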

Firestore-Specific Remediation in Django — concrete code fixes

Defending against prototype pollution when integrating Django with Firestore centers on strict input validation, explicit field mapping, and avoiding direct passthrough of user-controlled dictionaries to Firestore operations. The following patterns demonstrate secure approaches.

1. Explicit allowlist for Firestore updates

Define which fields are permitted for updates and construct the update payload programmatically:

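from django.http import HttpResponse
from google.cloud import firestore

# Only these fields may be changed through this endpoint
ALLOWED_PROFILE_FIELDS = {'display_name', 'email', 'timezone'}

def safe_update_profile(request, user_id):
    db = firestore.Client()
    doc_ref = db.collection('profiles').document(user_id)
    updates = {}
    for key, value in request.POST.items():
        # Keys outside the allowlist are silently dropped
        if key in ALLOWED_PROFILE_FIELDS:
            updates[key] = value
    # Skip the write entirely when nothing valid was submitted
    if updates:
        doc_ref.update(updates)
    return HttpResponse('ok')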

This prevents attacker-supplied keys like __proto__ or nested paths from reaching Firestore.

2. Use Pydantic or Django forms for validation and serialization

Define a schema that explicitly describes expected fields and types:

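from django.http import HttpResponse, HttpResponseBadRequest
from google.cloud import firestore
# EmailStr requires the optional email-validator package
from pydantic import BaseModel, EmailStr, ValidationError

class ProfileUpdate(BaseModel):
    display_name: str
    email: EmailStr
    timezone: str = 'UTC'

def validated_update(request, user_id):
    data = request.POST.dict()
    try:
        # Unknown keys are ignored by default; missing or mistyped fields raise
        validated = ProfileUpdate(**data)
    except ValidationError:
        return HttpResponseBadRequest('invalid profile data')
    db = firestore.Client()
    doc_ref = db.collection('profiles').document(user_id)
    # .dict() is the Pydantic v1 spelling; Pydantic v2 prefers model_dump()
    doc_ref.update(validated.dict())
    return HttpResponse('ok')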

This ensures only known, correctly typed fields are passed to Firestore.

3. Avoid dynamic field assignment and setattr loops

Replace dynamic attribute setting with explicit mapping:

profile, created = ProfileModel.objects.get_or_create(user_id=user_id)
profile.display_name = request.POST.get('display_name', profile.display_name)
profile.email = request.POST.get('email', profile.email)
profile.timezone = request.POST.get('timezone', profile.timezone)
profile.save()

This prevents unexpected fields from being assigned and later synced to Firestore.
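
When the list of editable fields grows, the explicit mapping can be factored into a small helper without reintroducing the setattr-over-request-data problem. This is a sketch only: `apply_allowed_fields` and the stand-in `Profile` class are hypothetical names, and the key design choice is that the loop iterates over the allowlist rather than over the submitted data.

```python
from typing import Iterable, List, Mapping


def apply_allowed_fields(obj: object, data: Mapping[str, str], allowed: Iterable[str]) -> List[str]:
    """Copy only allowlisted keys from data onto obj and return the names that were set."""
    changed = []
    # Iterate over the allowlist, so attacker-supplied key names never reach setattr().
    for name in allowed:
        if name in data:
            setattr(obj, name, data[name])
            changed.append(name)
    return changed


class Profile:
    """Stand-in for a model instance, used only for this demonstration."""


profile = Profile()
submitted = {"display_name": "Eve", "is_staff": "1", "__class__": "x"}
changed = apply_allowed_fields(profile, submitted, ("display_name", "email", "timezone"))
print(changed)  # prints ['display_name']
```

With a real model, `request.POST` would take the place of `submitted` and `profile.save()` would follow the call.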

4. Sanitize nested field paths

If your application supports nested Firestore paths, validate path components to prevent traversal beyond intended document structures:

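import json
import re

from django.http import HttpResponse
from google.cloud import firestore

def is_valid_field_path(path):
    # Allow dot-separated identifiers only (letters, digits, underscores); no brackets or backticks
    if re.match(r'^[a-zA-Z_][a-zA-Z0-9_]*(\.[a-zA-Z_][a-zA-Z0-9_]*)*$', path) is None:
        return False
    # The pattern alone would accept __proto__, so reject double-underscore segments explicitly
    return not any(part.startswith('__') and part.endswith('__') for part in path.split('.'))

def safe_where_construct(request):
    filters = {}
    for key, value in request.GET.items():
        if is_valid_field_path(key):
            filters[key] = value
    db = firestore.Client()
    query = db.collection('records')
    # Apply one equality filter per validated parameter
    for key, value in filters.items():
        query = query.where(key, '==', value)
    results = query.stream()
    return HttpResponse(json.dumps([d.to_dict() for d in results]))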
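
The view above accepts any syntactically valid path, which still lets a caller filter on fields the endpoint never intended to expose. A stricter variant combines it with the allowlist idea from step 1. The sketch below is an assumption-laden illustration: `FILTERABLE_FIELDS` and `build_filters` are hypothetical names, and the field names are placeholders.

```python
from typing import List, Mapping, Tuple

# Hypothetical set of fields this endpoint agrees to filter on.
FILTERABLE_FIELDS = frozenset({"status", "category", "owner_id"})


def build_filters(params: Mapping[str, str]) -> List[Tuple[str, str, str]]:
    """Turn query parameters into (field, operator, value) triples, dropping unknown fields."""
    return [
        (field, "==", value)
        # Sorting keeps the filter order stable regardless of parameter order.
        for field, value in sorted(params.items())
        if field in FILTERABLE_FIELDS
    ]
```

Each triple can then be applied with `query.where(field, op, value)` exactly as in the loop above, so the only change is which parameters are allowed to become filters.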

These measures reduce the risk of prototype-style injection and ensure that Firestore operations remain predictable and restricted to intended document structures.

Frequently Asked Questions

Does Firestore itself perform prototype pollution checks on data written from Django?
Firestore stores data as schemaless documents and does not interpret JavaScript prototype chains, so it has no notion of a key such as __proto__ being dangerous. Whatever naming rules Firestore enforces on fields, they are not a security control. Responsibility for sanitization and validation rests with the application layer (Django) before data reaches Firestore.
Can middleBrick detect prototype pollution risks in Django-Firestore integrations?
middleBrick scans unauthenticated attack surfaces and can identify issues such as missing input validation and unsafe data handling patterns. Use the middlebrick CLI (middlebrick scan ) or GitHub Action to include these checks in your CI/CD pipeline; findings include remediation guidance aligned with OWASP API Top 10 and compliance frameworks.