
Hallucination Attacks in FastAPI with CockroachDB

Hallucination Attacks in FastAPI with CockroachDB: how this specific combination creates or exposes the vulnerability

A Hallucination Attack in a FastAPI service backed by CockroachDB occurs when an endpoint accepts unstructured or minimally validated user input and uses it to build database queries or schema operations dynamically. Because CockroachDB speaks the PostgreSQL wire protocol and standard SQL, FastAPI applications typically reach it through an ORM or raw SQL, often with dynamic identifiers, table names, or condition keys. If these inputs are not strictly allowlisted or canonicalized, an attacker can supply values that shift query intent. For example, a crafted identifier can change which rows are returned or which columns are projected, leading to data exposure or inconsistent results that the application treats as a valid response.

In FastAPI, common patterns that increase risk include using user-controlled strings as column names in ORDER BY clauses, SELECT lists, or WHERE key lookups, and embedding table or schema names into raw SQL via Python string formatting. CockroachDB’s compatibility with PostgreSQL drivers means the usual SQL injection mitigation, parameterized queries, protects data values, but placeholders cannot stand in for identifiers or schema objects. An attacker may exploit this to perform BOLA/IDOR-like enumeration by requesting records they should not see, or to trigger excessive data exposure by manipulating which columns are returned, effectively causing the application to hallucinate additional information or structure in the response.
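The difference between binding a value and supplying an identifier can be seen in a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a cluster; this is an assumption for illustration, and psycopg against CockroachDB uses %s placeholders rather than ?, although it binds values in the same way. The users table and its rows are hypothetical.

```python
import sqlite3

# Stand-in database: sqlite3 replaces CockroachDB here only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
    [(1, "ada", "ada@example.com"), (2, "bob", "bob@example.com")],
)

payload = "ada' OR '1'='1"

# 1) A bound parameter is always treated as a value, so the payload stays inert.
bound_rows = conn.execute(
    "SELECT id FROM users WHERE name = ?", (payload,)
).fetchall()

# 2) The same payload interpolated into the SQL text rewrites the WHERE clause.
interpolated_rows = conn.execute(
    f"SELECT id FROM users WHERE name = '{payload}' ORDER BY id"
).fetchall()

# 3) A placeholder cannot act as an identifier: binding "email" selects the
#    literal string 'email' for every row, not the email column.
literal_rows = conn.execute(
    "SELECT ? FROM users ORDER BY id", ("email",)
).fetchall()

print(bound_rows)         # no rows match the payload as a plain value
print(interpolated_rows)  # every row matches once the payload becomes SQL
print(literal_rows)       # the bound string comes back as a constant
```

The third case does not raise an error, which is why column and table names need their own validation step instead of relying on placeholders.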

The LLM/AI Security checks in middleBrick are particularly effective at identifying these patterns when a FastAPI endpoint interacts with CockroachDB. The scanner looks for indicators such as unvalidated field parameters used in SQL generation, missing authorization checks on resource identifiers, and dynamic query construction that can lead to information leakage. Because the scan runs against the live API without credentials, it can show how an unauthenticated or low-privilege interaction with the CockroachDB layer may expose more data than intended, aligning with checks such as Property Authorization, Input Validation, and Data Exposure.
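As an illustration of what an authorization check on a resource identifier can look like, the following sketch scopes a lookup to the caller. Everything in it is hypothetical: the documents table, the owner_id column, and the fetch_owned_document helper are assumptions, and sqlite3 again stands in for CockroachDB so the example is runnable on its own.

```python
import sqlite3
from typing import Optional


def fetch_owned_document(conn: sqlite3.Connection, doc_id: int, caller_id: int) -> Optional[dict]:
    """Return the document only if it belongs to caller_id, otherwise None."""
    row = conn.execute(
        # Both the resource id and the owner are bound values; the owner
        # condition is what prevents enumeration of other users' records.
        "SELECT id, title FROM documents WHERE id = ? AND owner_id = ?",
        (doc_id, caller_id),
    ).fetchone()
    if row is None:
        return None
    return {"id": row[0], "title": row[1]}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, owner_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO documents (id, owner_id, title) VALUES (?, ?, ?)",
    [(1, 10, "alpha"), (2, 20, "beta")],
)

own = fetch_owned_document(conn, doc_id=1, caller_id=10)    # caller owns document 1
other = fetch_owned_document(conn, doc_id=2, caller_id=10)  # document 2 belongs to someone else
```

In a FastAPI handler, a None result would typically be turned into a 404 so that the response does not reveal whether the record exists for another user.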

CockroachDB-Specific Remediation in FastAPI: concrete code fixes

Remediation centers on strict allowlisting, avoiding dynamic identifiers, and using database features safely. Never concatenate raw user input into SQL strings for identifiers or schema objects. Instead, check column and table names against a strict allowlist, and use parameterized queries for all data values. Below are concrete examples for FastAPI with CockroachDB.

Safe column selection with allowlist

from fastapi import FastAPI, HTTPException, Query
from psycopg import sql
from psycopg.rows import dict_row
import psycopg_pool

app = FastAPI()
# Use a connection pool configured for CockroachDB
pool = psycopg_pool.ConnectionPool(conninfo="postgresql://user:password@host:26257/db?sslmode=require")

ALLOWED_COLUMNS = {"name", "email", "created_at", "status"}

@app.get("/users")
def list_users(
    column: str = Query(..., description="Field to sort by")
):
    if column not in ALLOWED_COLUMNS:
        raise HTTPException(status_code=400, detail="Invalid column")
    # Safe: column is validated against the allowlist, then quoted as an identifier
    query = sql.SQL("SELECT id, {col} FROM users ORDER BY {col}").format(
        col=sql.Identifier(column)
    )
    with pool.connection() as conn:
        # dict_row returns each row as a dict keyed by column name;
        # the default tuple rows cannot be converted with dict(row)
        with conn.cursor(row_factory=dict_row) as cur:
            cur.execute(query)
            rows = cur.fetchall()
    return rows
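The allowlist check above can also be moved into the type system. The sketch below is one possible variant, not part of the original example: SortColumn, SortDirection, and build_list_users_query are hypothetical names. When an Enum like this is used as the type of a FastAPI query parameter, FastAPI rejects any value outside the enumeration with a 422 response before the handler runs. The block itself uses only the standard library so it can be run on its own.

```python
from enum import Enum


class SortColumn(str, Enum):
    # Member values are the only column names that can ever reach the SQL text.
    NAME = "name"
    EMAIL = "email"
    CREATED_AT = "created_at"
    STATUS = "status"


class SortDirection(str, Enum):
    ASC = "asc"
    DESC = "desc"


def build_list_users_query(
    column: SortColumn, direction: SortDirection = SortDirection.ASC
) -> str:
    """Build the query text from enum members only, never from raw input."""
    return (
        f"SELECT id, {column.value} FROM users "
        f"ORDER BY {column.value} {direction.value.upper()}"
    )


# Converting raw input goes through the Enum constructor, which raises
# ValueError for anything that is not a declared member value.
query = build_list_users_query(SortColumn("email"), SortDirection("desc"))

try:
    SortColumn("email; DROP TABLE users")
    rejected = False
except ValueError:
    rejected = True
```

In a handler, the parameter would be declared as `column: SortColumn`, which also makes the accepted values visible in the generated OpenAPI schema.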

Parameterized query with whitelisted sort direction

from fastapi import FastAPI, HTTPException, Query
from psycopg.rows import dict_row
import psycopg_pool

app = FastAPI()
pool = psycopg_pool.ConnectionPool(conninfo="postgresql://user:password@host:26257/db?sslmode=require")

@app.get("/users/search")
def search_users(
    status: str = Query(..., description="User status filter"),
    order_dir: str = Query("asc", description="Sort direction")
):
    if order_dir.lower() not in {"asc", "desc"}:
        raise HTTPException(status_code=400, detail="Invalid order direction")
    with pool.connection() as conn:
        # dict_row returns each row as a dict keyed by column name
        with conn.cursor(row_factory=dict_row) as cur:
            # Safe: status is bound as a parameter (psycopg uses %s placeholders, not $1),
            # and order_dir can only be ASC or DESC
            cur.execute(
                "SELECT id, name, email FROM users WHERE status = %s ORDER BY created_at " + order_dir.upper(),
                (status,)
            )
            rows = cur.fetchall()
    return rows

Avoid dynamic identifiers; use predefined views or prepared statements

from fastapi import FastAPI, HTTPException
from psycopg import sql
from psycopg.rows import dict_row
import psycopg_pool

app = FastAPI()
pool = psycopg_pool.ConnectionPool(conninfo="postgresql://user:password@host:26257/db?sslmode=require")

# Prefer a stored view or a static query; if a dynamic table is unavoidable, use strict validation
def get_table_name(requested: str) -> str:
    mapping = {
        "current": "user_activity_current",
        "archive": "user_activity_archive",
    }
    if requested not in mapping:
        # Return a client error instead of an unhandled exception (HTTP 500)
        raise HTTPException(status_code=404, detail="Invalid table key")
    return mapping[requested]

@app.get("/activity/{table_key}")
def get_activity(table_key: str):
    safe_table = get_table_name(table_key)
    # Safe: table name comes from the fixed mapping, not raw user input,
    # and is additionally quoted as an identifier
    query = sql.SQL("SELECT * FROM {tbl}").format(tbl=sql.Identifier(safe_table))
    with pool.connection() as conn:
        # dict_row returns each row as a dict keyed by column name
        with conn.cursor(row_factory=dict_row) as cur:
            cur.execute(query)
            rows = cur.fetchall()
    return rows

Use SQLAlchemy with an explicit response model and guarded queries

from fastapi import FastAPI, HTTPException
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from pydantic import BaseModel

app = FastAPI()
# The cockroachdb:// scheme requires the sqlalchemy-cockroachdb dialect package to be installed
engine = create_engine("cockroachdb://user:password@host:26257/db?sslmode=require")
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

class User(BaseModel):
    id: int
    name: str
    email: str

@app.get("/users/{user_id}", response_model=User)
def read_user(user_id: int):
    with SessionLocal() as session:
        # Safe: user_id is typed int, used as parameter, not identifier
        result = session.execute(text("SELECT id, name, email FROM users WHERE id = :uid"), {"uid": user_id})
        row = result.fetchone()
        if row is None:
            raise HTTPException(status_code=404, detail="User not found")
        return User(id=row[0], name=row[1], email=row[2])

These patterns reduce the attack surface by ensuring that user input never directly shapes identifiers or schema structure, while still allowing flexible queries against CockroachDB from FastAPI.

Related CWEs: llmSecurity

CWE ID | Name | Severity
CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM

Frequently Asked Questions

Can middleBrick detect Hallucination Attacks in a FastAPI + CockroachDB setup without credentials?
Yes. middleBrick scans the unauthenticated attack surface and can identify patterns such as dynamic identifiers or missing authorization checks that may enable Hallucination Attacks, even when the backend uses CockroachDB.
Does input validation alone prevent Hallucination Attacks in FastAPI endpoints that query CockroachDB?
Not fully. Validation of data values must be combined with strict allowlists for identifiers and schema objects. Parameterized queries bind values only, so they do not protect table or column names used in dynamic SQL, and CockroachDB’s PostgreSQL-compatible drivers behave the same way.