
Excessive Data Exposure in FastAPI with DynamoDB

Excessive Data Exposure occurs when an API returns more data than an operation requires, often revealing sensitive fields that should remain restricted. In a FastAPI application backed by DynamoDB, this typically arises from two patterns: serializing entire DynamoDB items directly to the client, and constructing query responses that include unintended attributes. DynamoDB’s schemaless design encourages storing multiple logical entity types in a single table, which increases the chance that a scan or query returns internal identifiers, secrets, PII, or operational metadata alongside the data the client actually needs.

Consider a user profile endpoint in FastAPI that retrieves a user item from a DynamoDB table. If the implementation performs a get_item and returns the full item as JSON without filtering, fields such as password_hash, api_key, or internal_status may be exposed. Even with an ORM-like layer, an incomplete projection or the accidental inclusion of all attributes can propagate sensitive data. The risk is compounded when the API supports filtering or search features that inadvertently expose related records or metadata through verbose responses.

DynamoDB’s attribute data types further amplify exposure risks. For example, an item may contain nested maps or lists that include sensitive values. If the FastAPI response serializes these structures without careful whitelisting, nested sensitive data can be returned to the client. Additionally, sparse indexes or global secondary indexes (GSIs) with different attribute projections can expose different sets of data depending on which index is queried, especially if the application logic does not consistently enforce field-level restrictions across all access paths.
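One way to handle nested maps and lists is a recursive allow-list applied before serialization. The sketch below is a minimal illustration, not a prescribed implementation: the helper name pick_fields, the PUBLIC_PROFILE_FIELDS structure, and the attribute names inside it are all hypothetical.

```python
from typing import Any, Mapping

# Hypothetical allow-list: a key maps to True (expose the value as-is)
# or to a nested allow-list (descend into a map or a list of maps).
PUBLIC_PROFILE_FIELDS = {
    "user_id": True,
    "display_name": True,
    "preferences": {"theme": True, "locale": True},
    "addresses": {"city": True, "country": True},
}


def pick_fields(value: Any, allowed: Mapping[str, Any]) -> Any:
    """Return a copy of `value` that contains only allow-listed keys."""
    if isinstance(value, list):
        # Apply the same allow-list to every element of a list.
        return [pick_fields(element, allowed) for element in value]
    if not isinstance(value, Mapping):
        # A scalar where a map was expected: drop it rather than leak it.
        return None
    result = {}
    for key, rule in allowed.items():
        if key not in value:
            continue  # sparse attribute: simply absent from the output
        if rule is True:
            result[key] = value[key]
        else:
            result[key] = pick_fields(value[key], rule)
    return result
```

Because the output is built from the allow-list rather than by deleting known-bad keys from the item, an attribute added to the table later stays hidden until someone deliberately exposes it.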

Another common vector involves endpoint designs that return lists of items with insufficient scoping. A search or list endpoint that queries a DynamoDB table without strict projection expressions may return every stored attribute for each matched item. Attackers can leverage such endpoints to enumerate usernames, email addresses, or other personally identifiable information, especially when the API lacks complementary protections like field-level filtering or strict pagination controls.
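Strict pagination for such a list endpoint can be sketched with the Limit and ExclusiveStartKey parameters of the DynamoDB Query operation. The helper names, the MAX_PAGE_SIZE value, and the category-index layout below are assumptions made for illustration, and the sketch assumes string key attributes so the key is JSON-serializable.

```python
import base64
import json
from typing import Any, Dict, Optional

MAX_PAGE_SIZE = 50  # assumed upper bound; tune per endpoint


def encode_cursor(last_evaluated_key: Optional[Dict[str, Any]]) -> Optional[str]:
    """Turn DynamoDB's LastEvaluatedKey into an opaque string for the client."""
    if not last_evaluated_key:
        return None
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: Optional[str]) -> Optional[Dict[str, Any]]:
    """Reverse encode_cursor; returns None when no cursor was supplied."""
    if not cursor:
        return None
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))


def build_query_kwargs(category: str, page_size: int,
                       cursor: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for table.query() with a capped page size."""
    kwargs: Dict[str, Any] = {
        "IndexName": "category-index",
        "KeyConditionExpression": "category = :cat",
        "ExpressionAttributeValues": {":cat": category},
        # `name` is a DynamoDB reserved word, so it is aliased as #n.
        "ProjectionExpression": "product_id, #n, price",
        "ExpressionAttributeNames": {"#n": "name"},
        # Clamp the client-supplied size so one call cannot dump the table.
        "Limit": max(1, min(page_size, MAX_PAGE_SIZE)),
    }
    start_key = decode_cursor(cursor)
    if start_key:
        kwargs["ExclusiveStartKey"] = start_key
    return kwargs
```

An endpoint would call table.query(**build_query_kwargs(...)) and return encode_cursor(response.get("LastEvaluatedKey")) alongside the items. The cursor here is only encoded, not signed, so a production service would likely want to authenticate or encrypt it.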

To detect this category, middleBrick runs 12 security checks in parallel, including Property Authorization and Data Exposure assessments. These checks examine how responses are constructed, whether projections are used, and whether returned data aligns with the principle of least privilege. Findings typically include severity-ranked guidance on tightening serialization, applying projection expressions, and validating that indexes do not leak unintended attributes.

DynamoDB-Specific Remediation in FastAPI

Remediation focuses on ensuring that only the intended subset of attributes is serialized and returned to the client. In FastAPI, this involves explicit response models, careful query construction, and disciplined handling of DynamoDB’s attribute format. The examples below demonstrate secure patterns for common scenarios.

Secure get_item with Projection

When retrieving a single item, use a projection expression to fetch only required attributes and construct a clean response model. This prevents accidental exposure of fields like password_hash or internal metadata.

from fastapi import FastAPI, HTTPException
import boto3
from pydantic import BaseModel

app = FastAPI()
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("users")

class UserProfile(BaseModel):
    user_id: str
    email: str
    display_name: str

@app.get("/users/{user_id}", response_model=UserProfile)
def get_user(user_id: str):
    response = table.get_item(
        Key={"user_id": user_id},
        ProjectionExpression="user_id, email, display_name"
    )
    item = response.get("Item")
    if not item:
        raise HTTPException(status_code=404, detail="User not found")
    return UserProfile(**item)

Query with Projection and Filtering

For queries that return multiple items, apply a projection expression and validate attributes before serialization. Avoid returning DynamoDB’s raw typed format (the S and N type descriptors that the low-level client returns) to API consumers; map items to domain models instead.

from fastapi import FastAPI
import boto3
from pydantic import BaseModel
from typing import List

app = FastAPI()
table = boto3.resource("dynamodb", region_name="us-east-1").Table("products")

class ProductSummary(BaseModel):
    product_id: str
    name: str
    price: float

@app.get("/categories/{category}/products", response_model=List[ProductSummary])
def list_products(category: str):
    response = table.query(
        IndexName="category-index",
        KeyConditionExpression="category = :cat",
        ExpressionAttributeValues={":cat": category},
        # "name" is a DynamoDB reserved word, so it must be aliased
        ProjectionExpression="product_id, #n, price",
        ExpressionAttributeNames={"#n": "name"}
    )
    return [ProductSummary(**item) for item in response.get("Items", [])]
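Projection expressions are easy to get wrong when an attribute name collides with a DynamoDB reserved word; name, status, and total are all on the reserved list, and an expression that mentions one directly is rejected unless it is aliased through ExpressionAttributeNames. A minimal sketch of a helper that aliases every field unconditionally follows. The name build_projection is hypothetical, and the helper handles top-level attributes only, not nested document paths.

```python
from typing import Dict, Iterable, Tuple


def build_projection(fields: Iterable[str]) -> Tuple[str, Dict[str, str]]:
    """Alias every attribute so reserved words cannot break the expression.

    Returns the ProjectionExpression string and the matching
    ExpressionAttributeNames mapping.
    """
    names: Dict[str, str] = {}
    placeholders = []
    for index, field in enumerate(fields):
        placeholder = f"#f{index}"
        names[placeholder] = field
        placeholders.append(placeholder)
    return ", ".join(placeholders), names
```

The two return values can be passed to get_item or query as ProjectionExpression and ExpressionAttributeNames. Deriving the field list from the response model's declared fields keeps the projection and the response schema from drifting apart.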

Handling Sparse Attributes and GSIs

If your table uses sparse attributes or GSIs with different projections, an item may carry attributes that other items lack, and the same item may look different depending on the index queried. Declare only the fields you intend to expose in the response model, handle missing attributes explicitly, and avoid echoing back raw item contents.

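from fastapi import FastAPI, HTTPException
import boto3
from pydantic import BaseModel

app = FastAPI()
# Table name is assumed for this example
table = boto3.resource("dynamodb", region_name="us-east-1").Table("orders")

class SafeOrder(BaseModel):
    order_id: str
    total: float
    status: str

def build_order_item(raw):
    # Only include fields we intend to expose
    return {
        "order_id": raw["order_id"],
        # boto3 returns DynamoDB numbers as Decimal
        "total": float(raw["total"]),
        # status is sparse; fall back to a default when it is absent
        "status": raw.get("status", "pending")
    }

@app.get("/orders/{order_id}", response_model=SafeOrder)
def get_order(order_id: str):
    resp = table.get_item(
        Key={"order_id": order_id},
        # total and status are DynamoDB reserved words, so alias them
        ProjectionExpression="order_id, #t, #s",
        ExpressionAttributeNames={"#t": "total", "#s": "status"}
    )
    item = resp.get("Item")
    if not item:
        raise HTTPException(status_code=404, detail="Order not found")
    return SafeOrder(**build_order_item(item))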
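build_order_item converts total with float() because the boto3 resource API returns DynamoDB numbers as decimal.Decimal, which the standard json module cannot serialize. When the exposed fields include nested maps or lists, a recursive conversion may be more convenient. The helper below is a sketch under that assumption; the name plain_numbers is hypothetical.

```python
from decimal import Decimal
from typing import Any


def plain_numbers(value: Any) -> Any:
    """Recursively replace Decimal values with int or float."""
    if isinstance(value, Decimal):
        # Whole numbers become int, everything else becomes float.
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    if isinstance(value, dict):
        return {key: plain_numbers(item) for key, item in value.items()}
    if isinstance(value, (list, set)):
        # DynamoDB set types arrive as Python sets; JSON has only arrays.
        return [plain_numbers(item) for item in value]
    return value
```

Converting to float can lose precision, so monetary amounts may be better returned as strings or integer minor units. Apply a helper like this after allow-listing fields, not instead of it.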

These patterns emphasize explicit selection of attributes, disciplined serialization, and avoiding the direct exposure of DynamoDB’s raw format. They align with the Property Authorization and Data Exposure checks that middleBrick performs, which evaluate whether responses are scoped correctly and whether sensitive fields are inadvertently returned.

Additionally, consider operational safeguards such as verifying that the indexes used in queries do not project unintended attributes and ensuring that scan operations are restricted or avoided in production environments. middleBrick’s findings often highlight where projections are missing or overly permissive, guiding targeted fixes rather than broad changes.
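The index check can be scripted against the output of the DescribeTable operation, whose Table description lists each secondary index with its ProjectionType (ALL, KEYS_ONLY, or INCLUDE) and, for INCLUDE, its NonKeyAttributes. The function name and the SENSITIVE_ATTRIBUTES set below are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical list of attributes that should never be projected into an index.
SENSITIVE_ATTRIBUTES = {"password_hash", "api_key", "internal_status"}


def risky_index_projections(
    table_description: Dict[str, Any],
    sensitive: Iterable[str] = SENSITIVE_ATTRIBUTES,
) -> List[str]:
    """Report secondary indexes whose projection may expose sensitive data.

    `table_description` is the "Table" dict from a DescribeTable response.
    """
    findings = []
    indexes = (
        table_description.get("GlobalSecondaryIndexes", [])
        + table_description.get("LocalSecondaryIndexes", [])
    )
    for index in indexes:
        projection = index.get("Projection", {})
        kind = projection.get("ProjectionType")
        if kind == "ALL":
            # Every attribute of the base item is readable through this index.
            findings.append(f"{index['IndexName']}: projects ALL attributes")
        elif kind == "INCLUDE":
            leaked = sorted(
                set(projection.get("NonKeyAttributes", [])) & set(sensitive)
            )
            if leaked:
                findings.append(f"{index['IndexName']}: includes {', '.join(leaked)}")
    return findings
```

With boto3, the input could come from boto3.client("dynamodb").describe_table(TableName="users")["Table"]. An ALL projection is not a vulnerability by itself, but it means every query against that index depends on its own ProjectionExpression and response model for protection.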

Related CWEs: Property Authorization

CWE ID	Name	Severity
CWE-915	Mass Assignment	HIGH

Frequently Asked Questions

How can I verify that my FastAPI endpoints are not exposing extra fields from DynamoDB?

Use an explicit ProjectionExpression in get_item and query calls, and validate responses against a strict Pydantic model that includes only the intended fields. middleBrick’s Data Exposure checks can highlight mismatches between spec-defined responses and actual payloads.
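A lightweight complement is an automated test that compares each decoded JSON response against the set of fields the endpoint is meant to return. The helper below is a sketch; the name unexpected_fields is hypothetical, and it checks top-level keys only.

```python
from typing import Any, List, Set


def unexpected_fields(payload: Any, allowed: Set[str]) -> List[str]:
    """Return the sorted top-level keys in `payload` that are not allowed.

    `payload` is a decoded JSON body: one object or a list of objects.
    """
    if isinstance(payload, list):
        extras: Set[str] = set()
        for element in payload:
            extras.update(unexpected_fields(element, allowed))
        return sorted(extras)
    if isinstance(payload, dict):
        return sorted(set(payload) - allowed)
    return []
```

In a test suite, asserting that unexpected_fields(response.json(), {"user_id", "email", "display_name"}) is empty catches a leaked attribute as soon as a handler or model changes.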

Is it enough to rely on client-side filtering to hide sensitive fields?

No. Client-side filtering does not prevent the server from retrieving and transmitting sensitive data. Apply server-side projections and response models to ensure sensitive fields are never included in the payload.
