Severity: HIGH | Tags: hallucination attacks, Flask, HMAC signatures

Hallucination Attacks in Flask with HMAC Signatures

Hallucination Attacks in Flask with HMAC Signatures — how this specific combination creates or exposes the vulnerability

A hallucination attack in this context occurs when an API endpoint returns fabricated or unverifiable data and the client (or a downstream system) treats that output as authoritative. When a Flask API uses HMAC signatures only for request integrity (commonly, to verify that a payload originates from a trusted source) but does not also verify the integrity and authenticity of the response, an attacker can induce the server to generate responses that the client wrongly treats as valid and signed.

Consider a Flask endpoint that accepts a JSON payload, computes an HMAC over selected fields, and uses that to decide whether to process the request. If the server's response includes data derived from user-supplied input and the server does not sign or otherwise protect that response, an attacker can supply carefully crafted input that causes the server to produce misleading output. A client that trusts the server's output simply because the incoming request was HMAC-verified may assume the response is consistent with the request context, even when the server's processing logic has been tricked into generating an inconsistent or fabricated reply.
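To make the gap concrete, here is a minimal, self-contained sketch (the key and field names are illustrative, not the article's endpoint) showing that an HMAC computed over selected fields says nothing about the fields it does not cover:

```python
import hashlib
import hmac
import json

SECRET = b'demo-secret'  # illustrative key only

def sign_fields(payload, fields):
    # Sign only the listed fields; everything else rides along unauthenticated
    subset = {k: payload[k] for k in fields}
    canonical = json.dumps(subset, separators=(',', ':'), sort_keys=True).encode('utf-8')
    return hmac.new(SECRET, canonical, hashlib.sha256).hexdigest()

legit = {'item_id': 'A1', 'price_cents': 999}
sig = sign_fields(legit, ['item_id'])

# An attacker keeps the signed field but rewrites the unsigned one
tampered = {'item_id': 'A1', 'price_cents': 1}
assert sign_fields(tampered, ['item_id']) == sig  # verification still succeeds
```

Any response the server derives from price_cents here is attacker-controlled, even though the request "verified".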

For example, an endpoint that applies business logic based on a price field and an HMAC-covered, client-supplied identifier can be coerced, via injection or parameter manipulation, into returning a different price or a non-existent resource. The client verifies the HMAC on the request and assumes the response corresponds to that verified request, a logical gap that can be exploited for privilege escalation, incorrect billing, or unauthorized data access. This is especially relevant when the API design implicitly couples request authentication with response trust without explicitly securing the response path.

Flask applications that rely on HMAC signatures must therefore treat the response as an independent security boundary. Endpoints that feed verified request context into an LLM to generate textual summaries or auto-completions are particularly susceptible: an attacker can inject prompts that cause the model to hallucinate information that appears consistent with the verified request but is entirely fabricated. The server must validate and scope any data used to generate responses, and must not assume that HMAC-protected requests alone prevent output-level manipulation.
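One way to scope model-derived output before it reaches clients is a deny-list gate on the generated text. The patterns below are illustrative assumptions, not an exhaustive filter:

```python
import re

# Illustrative output gate: reject model text that looks like it leaks
# credentials, key material, or PII before it is returned to a client
DENY_PATTERNS = [
    re.compile(r'(?i)api[_-]?key\s*[:=]'),              # credential assignments
    re.compile(r'-----BEGIN [A-Z ]*PRIVATE KEY-----'),  # PEM key material
    re.compile(r'\b\d{3}-\d{2}-\d{4}\b'),               # US SSN shape
]

def is_safe_output(text: str) -> bool:
    return not any(p.search(text) for p in DENY_PATTERNS)

assert is_safe_output('Your order A1 was processed.')
assert not is_safe_output('Sure! api_key: sk-abc123')
```

A real deployment would combine such checks with allow-listing of expected response fields rather than relying on pattern matching alone.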

HMAC Signature-Specific Remediation in Flask — concrete code fixes

Remediation focuses on ensuring that both request validation and response generation are explicitly secured. Do not rely on the presence of a valid HMAC to imply trust in response content; instead, scope all data used to construct responses and verify integrity where appropriate.

import hmac
import hashlib
import json
from flask import Flask, request, jsonify

app = Flask(__name__)
SECRET_KEY = b'your-secure-secret-key-change-this-in-production'  # load from env/secret store in production

def verify_hmac(data, received_signature):
    computed = hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()
    return hmac.compare_digest(computed, received_signature)

@app.route('/process', methods=['POST'])
def process():
    payload = request.get_json(force=True, silent=True)
    if payload is None:
        return jsonify({'error': 'invalid json'}), 400

    # Expect signature in header
    signature = request.headers.get('X-Request-Signature')
    if not signature:
        return jsonify({'error': 'missing signature'}), 401

    # Recompute the HMAC over the same canonical form the sender signed;
    # both sides must agree on this canonicalization (sorted keys, compact separators)
    body_bytes = json.dumps(payload, separators=(',', ':'), sort_keys=True).encode('utf-8')
    if not verify_hmac(body_bytes, signature):
        return jsonify({'error': 'invalid signature'}), 401

    # Explicitly validate and scope data used for downstream operations
    # Do not trust fields that should have been signed if they are missing or malformed
    try:
        item_id = payload['item_id']
        price_cents = int(payload['price_cents'])
        # Ensure business rules are enforced server-side
        if price_cents <= 0:
            return jsonify({'error': 'invalid price'}), 400
    except (KeyError, ValueError, TypeError):
        return jsonify({'error': 'missing or invalid fields'}), 400

    # Construct response using only validated server-side data
    response_data = {
        'item_id': item_id,
        'price_cents': price_cents,
        'status': 'processed'
    }
    # Sign the canonical response body so clients can verify response
    # integrity as well (a complement to, not a replacement for, TLS);
    # clients must re-canonicalize the parsed JSON the same way
    response_body = json.dumps(response_data, separators=(',', ':'), sort_keys=True).encode('utf-8')
    response = jsonify(response_data)
    response.headers['X-Response-Signature'] = hmac.new(SECRET_KEY, response_body, hashlib.sha256).hexdigest()
    return response, 200

if __name__ == '__main__':
    app.run(debug=False)
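For completeness, a matching client-side signer, assuming the same shared key and canonicalization as the endpoint above (the transport call itself is elided):

```python
import hashlib
import hmac
import json

SECRET_KEY = b'your-secure-secret-key-change-this-in-production'  # must match the server

payload = {'item_id': 'A1', 'price_cents': 999}
# Canonicalize exactly as the server does before verifying
body = json.dumps(payload, separators=(',', ':'), sort_keys=True).encode('utf-8')
signature = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
# POST `body` with header X-Request-Signature: <signature>
```

If client and server serialize differently (key order, whitespace), verification fails even for honest requests, which is why both sides must share one canonical form.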

Key practices illustrated:

  • Validate the HMAC over a canonical representation of the request payload to prevent encoding-based confusion.
  • Never use HMAC verification as a substitute for output validation; enforce business rules on server-derived values (e.g., price_cents) rather than echoing unchecked client data.
  • Ensure that any data used to generate responses is scoped and validated; avoid constructing responses that depend on fields that were not explicitly verified.
  • If responses must be verifiable by clients, consider adding a server-signed integrity token or timestamp, but do not rely on this alone to prevent hallucination attacks.
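A sketch of such a server-signed, timestamped response token (the field names, separate response key, and freshness window are assumptions; this complements rather than replaces TLS):

```python
import hashlib
import hmac
import json
import time

RESPONSE_KEY = b'server-response-key'  # illustrative; keep distinct from the request key

def sign_response(data: dict) -> dict:
    # Embed a timestamp, then sign the canonical form of the whole body
    body = dict(data, ts=int(time.time()))
    canonical = json.dumps(body, separators=(',', ':'), sort_keys=True).encode('utf-8')
    body['sig'] = hmac.new(RESPONSE_KEY, canonical, hashlib.sha256).hexdigest()
    return body

def verify_response(body: dict, max_age: int = 300) -> bool:
    unsigned = {k: v for k, v in body.items() if k != 'sig'}
    canonical = json.dumps(unsigned, separators=(',', ':'), sort_keys=True).encode('utf-8')
    expected = hmac.new(RESPONSE_KEY, canonical, hashlib.sha256).hexdigest()
    fresh = time.time() - unsigned.get('ts', 0) <= max_age
    return fresh and hmac.compare_digest(expected, body.get('sig', ''))

signed = sign_response({'item_id': 'A1', 'status': 'processed'})
assert verify_response(signed)
assert not verify_response(dict(signed, status='refunded'))  # tampering is detected
```

The timestamp bounds replay of old signed responses; `hmac.compare_digest` avoids timing side channels when comparing signatures.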

For endpoints that incorporate LLM-generated content, apply strict prompt and output validation: do not allow model outputs to directly substitute for validated business data, and inspect outputs for PII, code, or credentials. middleBrick’s LLM/AI Security checks can help detect system prompt leakage, prompt injection attempts, and unsafe consumption patterns in such integrations.

Related CWEs

CWE ID | Name | Severity
CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM

Frequently Asked Questions

Why does verifying an HMAC-signed request not guarantee the response is trustworthy?
Because request integrity and response integrity are separate security boundaries. An HMAC over the request ensures the server processed a specific, authenticated input, but it does not prevent the server from generating fabricated or inconsistent responses unless the response is also explicitly validated or signed.
Can middleBrick detect hallucination risks in Flask APIs that use HMAC signatures?
Yes. middleBrick scans the unauthenticated attack surface and includes checks, such as Input Validation and LLM/AI Security, that can surface cases where endpoints produce unverified or model-generated outputs that may hallucinate data, even when request authentication uses HMAC signatures.