Unicode Normalization in Chi with Basic Auth
Unicode Normalization in Chi with Basic Auth — how this combination creates or exposes the vulnerability
Unicode normalization is a text-processing operation that ensures equivalent character sequences are represented in a single, canonical form. In the web layer, a request path or header value can reach a Chi application in multiple byte representations that render identically yet compare unequal as raw strings. When Basic Authentication credentials are derived from a header that has not been normalized consistently, a server that normalizes and a client that does not, or two services in the chain that apply different normalization forms, can break authentication or expose account enumeration risks.
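The mismatch is easy to see at a REPL. A minimal Clojure sketch (the namespace name is illustrative) comparing composed and decomposed forms of the same character:

```clojure
(ns demo.normalize
  (:import [java.text Normalizer Normalizer$Form]))

;; "é" as a single composed code point (U+00E9, NFC) versus
;; "e" plus a combining acute accent (U+0065 U+0301, NFD):
(def composed "\u00e9")
(def decomposed "e\u0301")

(= composed decomposed)
;; => false  (the raw strings differ byte for byte)

(= (Normalizer/normalize composed Normalizer$Form/NFC)
   (Normalizer/normalize decomposed Normalizer$Form/NFC))
;; => true   (normalizing both to NFC restores equivalence)
```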
Chi routes are typically compiled into a match tree based on the request path. If a router uses a normalized representation for route matching while the client sends a non-normalized path (for example, UTF-8 NFC vs NFD), the intended route may not match. This mismatch can cause the request to fall through to a catch‑all or default handler, which might still attempt Basic Auth parsing. An attacker can exploit this by sending crafted normalization variants to probe whether different paths lead to the same authentication check, effectively performing subtle account or resource enumeration without triggering obvious route‑based defenses.
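The fall-through behavior can be illustrated with a plain map standing in for a compiled route table (the path and handler keyword are hypothetical):

```clojure
;; Route table keyed on the NFC form of the path:
(def routes {"/caf\u00e9" :menu-handler})

;; The NFC path matches:
(get routes "/caf\u00e9")
;; => :menu-handler

;; The NFD variant of the same visible path does not, so the request
;; falls through to whatever default or catch-all handler is registered:
(get routes "/cafe\u0301")
;; => nil
```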
Basic Auth credentials are transmitted in the Authorization header as Basic base64(credentials), where credentials are username:password. If the Authorization header value is normalized differently at different layers (e.g., by a load balancer, reverse proxy, or within Chi’s middleware pipeline), the base64 string can change in ways that invalidate the credentials or cause decoding errors. More critically, if the username or password contains Unicode characters, normalization mismatches can make two visually identical credentials compare differently. An attacker may supply a mixed‑normalization payload to test whether the server leaks information through error messages or behaves differently for semantically equivalent credentials, which can facilitate account enumeration or bypass attempts.
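A short sketch (the helper name and credential values are illustrative) showing that the two normalization forms of the same visible password produce different Authorization header values:

```clojure
(ns demo.basic-token
  (:import [java.util Base64]))

(defn basic-token [user pass]
  ;; Build the header value exactly as a client would: Basic base64(user:pass)
  (str "Basic "
       (.encodeToString (Base64/getEncoder)
                        (.getBytes (str user ":" pass) "UTF-8"))))

;; Same rendered password, different byte sequences:
(def nfc-token (basic-token "alice" "s3cr\u00e9t"))   ;; composed é (U+00E9)
(def nfd-token (basic-token "alice" "s3cre\u0301t"))  ;; e + combining acute

(= nfc-token nfd-token)
;; => false  (the base64 payloads differ)
```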
Chi itself does not apply normalization automatically. Developers must ensure consistent normalization before using credentials for comparison or session derivation. For example, normalizing both the username and password to NFC before computing a key or before a constant‑time compare prevents equivalence‑based bypasses. Without this, an attacker can supply NFC and NFD variants of the same Unicode credentials and observe behavioral differences—such as different HTTP 401 responses or timing variations—that leak information about the existence of an account or the validity of a token.
To detect this class of issue, middleBrick’s Unicode Normalization and Basic Auth checks run in parallel with its 12 security scans. The scanner submits normalization variants and observes whether authentication behavior diverges, producing findings mapped to the OWASP API Top 10 and compliance frameworks. This helps teams identify inconsistent normalization in Chi routes and header handling before an attacker can weaponize subtle encoding differences.
Basic Auth-Specific Remediation in Chi — concrete code fixes
Remediation centers on normalizing the username and password to a canonical Unicode form before any comparison or cryptographic operation, and ensuring the Authorization header is handled consistently across the stack. Below are concrete, idiomatic examples for a Chi application in Clojure using the Ring-style request map.
1. Normalize credentials before comparison
Use java.text.Normalizer to convert both the provided and stored credentials to NFC (or NFD, as long as the choice is consistent) before comparison, and compare the results in constant time. This ensures that equivalent strings with different code point compositions are treated identically, without leaking timing information.

(ns myapp.auth
  (:import [java.text Normalizer Normalizer$Form]
           [java.security MessageDigest]))

(defn normalize-unicode [s]
  ;; Normalizer exposes a static normalize method (there is no getInstance).
  (when s (Normalizer/normalize s Normalizer$Form/NFC)))

(defn- ct= [a b]
  ;; Constant-time comparison of the UTF-8 bytes via MessageDigest/isEqual.
  (MessageDigest/isEqual (.getBytes (str a) "UTF-8") (.getBytes (str b) "UTF-8")))

(defn valid-credentials? [provided-user provided-pass stored-user stored-pass]
  (and (ct= (normalize-unicode provided-user) (normalize-unicode stored-user))
       (ct= (normalize-unicode provided-pass) (normalize-unicode stored-pass))))
2. Parse and normalize in a Chi middleware
Add a small middleware that inspects the Authorization header, extracts the Basic token, decodes it, normalizes the components, and either replaces the header with a normalized version or stores the normalized credentials in a request map for downstream handlers.
(ns myapp.middleware
  (:require [clojure.string :as str])
  (:import [java.util Base64]
           [java.text Normalizer Normalizer$Form]))

(defn normalize-unicode [s]
  (when s (Normalizer/normalize s Normalizer$Form/NFC)))

(defn basic-auth-middleware [handler]
  (fn [request]
    (let [auth-header (get-in request [:headers "authorization"])]
      (if (and auth-header (str/starts-with? auth-header "Basic "))
        ;; "Basic " is six characters, so the token starts at index 6.
        (let [token   (subs auth-header 6)
              decoded (String. (.decode (Base64/getDecoder) token) "UTF-8")
              ;; Split on the first colon only; passwords may contain colons.
              [user pass]       (str/split decoded #":" 2)
              norm-user         (normalize-unicode user)
              norm-pass         (normalize-unicode pass)
              normalized-header (str "Basic "
                                     (.encodeToString
                                      (Base64/getEncoder)
                                      (.getBytes (str norm-user ":" norm-pass) "UTF-8")))]
          (handler (assoc request
                          :normalized/basic-user norm-user
                          :normalized/basic-pass norm-pass
                          :headers (assoc (:headers request)
                                          "authorization" normalized-header))))
        (handler request)))))
3. Use normalized credentials for secure comparisons
In your route handlers or authentication functions, rely on the normalized values stored in the request map rather than raw input. This prevents bypass via mixed normalization forms.
(defn login-handler [request]
  ;; Literal credentials are for illustration only; store salted hashes in practice.
  (let [stored-user "admin"
        stored-pass "s3crët"]
    (if (valid-credentials? (:normalized/basic-user request)
                            (:normalized/basic-pass request)
                            stored-user
                            stored-pass)
      {:status 200 :body "OK"}
      {:status 401 :body "Unauthorized"})))
4. Consistent header handling across proxies
If your Chi service sits behind a load balancer or API gateway, ensure that the proxy does not alter the Authorization header in a way that changes Unicode normalization. Configure the proxy to pass the header through verbatim and perform normalization on the backend only. This removes ambiguity and prevents the server from seeing different forms for the same credential.
5. Test with Unicode variants
Validate your fix by sending requests with intentionally composed and decomposed forms. For example, the character "ë" can be U+00EB (single code point, NFC) or U+0065 U+0308 (e + combining diaeresis, NFD). Ensure both forms authenticate successfully when credentials are normalized to NFC before comparison.
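Such a check can be scripted as a small test (the namespace, helper, and credential values are illustrative); both spellings of the stored password should be accepted once everything is normalized to NFC:

```clojure
(ns demo.auth-test
  (:import [java.text Normalizer Normalizer$Form]))

(defn nfc [s] (Normalizer/normalize s Normalizer$Form/NFC))

;; Stored password containing "ë"; normalize once at rest:
(def stored-pass (nfc "s3cr\u00ebt"))

;; "ë" as U+00EB (composed) and as U+0065 U+0308 (decomposed)
;; should both match after normalization:
(= stored-pass (nfc "s3cr\u00ebt"))
;; => true
(= stored-pass (nfc "s3cre\u0308t"))
;; => true
```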
These steps align with middleBrick’s findings by providing concrete remediation guidance. The scanner can verify whether normalization is applied consistently and whether Basic Auth handling introduces equivalence-based weaknesses.