Unicode Normalization in Fastapi with Hmac Signatures
Unicode Normalization in Fastapi with Hmac Signatures — how this specific combination creates or exposes the vulnerability
Unicode normalization becomes a security concern in FastAPI when an HMAC signature is computed over a request payload or a subset of headers that may contain canonically equivalent Unicode representations. If a FastAPI application normalizes incoming data differently (or not at all) compared to how the client generates the signature, the same logical content can produce different byte sequences, causing signature verification to fail or to be evaluated against a subtly altered string.
Consider a JSON body containing a user-supplied identifier or a query parameter that includes characters such as é, which can be represented as a single code point U+00E9 or as a decomposed sequence e + U+0301. Without enforcing a canonical normalization form (e.g., NFC or NFD) before computing or verifying the HMAC, an attacker can submit semantically identical but byte-wise different inputs to bypass expected signature checks, potentially leading to bypasses in authentication or integrity checks.
For example, an HMAC computed over the string "id=123" will not match an HMAC computed over its decomposed variant if normalization is applied on only one side. In FastAPI, middleware or route logic that parses and re-serializes JSON may inadvertently change the encoding of Unicode strings, especially when combining frameworks that internally normalize with those that do not. This inconsistency can weaken the integrity guarantees that HMAC is designed to provide.
In the context of API security scanning, such inconsistencies are detectable as findings related to Input Validation and Authentication, where semantically equivalent but distinct payloads produce different signature outcomes. The scanner evaluates whether the application normalizes inputs consistently before signing or verification, and flags deviations that could enable tampering or spoofing.
Because HMAC relies on exact byte-for-byte matching, any unchecked variation in Unicode representation introduces a deviation in the signed data. This is especially critical when signatures cover query strings, header values, or JSON fields that may include international characters. Ensuring deterministic normalization across all layers of the FastAPI stack is essential to preserve the integrity guarantees of HMAC-based schemes.
Hmac Signatures-Specific Remediation in Fastapi — concrete code fixes
To mitigate Unicode normalization issues when using HMAC signatures in FastAPI, enforce a consistent normalization form before computing or verifying the signature. Apply the same normalization (typically NFC or NFD) on both the client and server sides, and ensure that any data included in the signed string is normalized in exactly the same way.
Below is a server-side FastAPI example that validates an HMAC signature after normalizing relevant input. The example uses unicodedata.normalize to canonicalize the payload before verification, ensuring that equivalent Unicode representations are treated identically.
import hmac
import hashlib
from fastapi import FastAPI, Header, HTTPException
from unicodedata import normalize
app = FastAPI()
SECRET_KEY = b"super-secret-key"
def verify_hmac_signature(payload: str, signature: str) -> bool:
normalized = normalize("NFC", payload)
expected = hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
return hmac.compare_digest(expected, signature)
@app.post("/items/")
async def create_item(
data: str = Header(...),
signature: str = Header(
...,
description="HMAC-SHA256 signature of the normalized data"
),
):
if not verify_hmac_signature(data, signature):
raise HTTPException(status_code=401, detail="Invalid signature")
return {"status": "ok"}
On the client side, ensure the same normalization is applied before generating the signature. The following snippet demonstrates a matching client workflow:
import hmac
import hashlib
from unicodedata import normalize
import requests
SECRET_KEY = b"super-secret-key"
def sign_payload(payload: str) -> str:
normalized = normalize("NFC", payload)
return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
data = "id=123"
signature = sign_payload(data)
headers = {"data": data, "signature": signature}
requests.post("http://localhost:8000/items/", headers=headers)
For JSON payloads, normalize string values for specific keys included in the signature, or normalize the serialized JSON string if the signature covers the entire body. The following example shows how to handle a JSON object with potentially non-ASCII fields:
import hmac
import hashlib
import json
from unicodedata import normalize
def normalize_json(data: dict) -> str:
# Deterministic serialization with sorted keys
serialized = json.dumps(data, sort_keys=True, ensure_ascii=False)
return normalize("NFC", serialized)
def generate_signature(data: dict, key: bytes) -> str:
normalized = normalize_json(data)
return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
payload = {"name": "café", "role": "admin"}
key = b"super-secret-key"
sig = generate_signature(payload, key)
# Transmit both payload and sig; server recomputes using the same normalization
Additionally, validate that any query parameters or path segments included in the signed string are normalized consistently. FastAPI’s dependency injection can be used to centralize this logic, ensuring that all routes requiring HMAC verification apply the same normalization rules.
By combining deterministic Unicode normalization with rigorous HMAC verification, you reduce the risk of signature bypass due to canonically equivalent inputs. This approach aligns with secure coding practices for integrity checks and helps maintain robust API authentication in the presence of internationalized data.