
Unicode Normalization in Axum with Basic Auth

Unicode Normalization in Axum with Basic Auth — how this specific combination creates or exposes the vulnerability

When an Axum service validates user credentials using Basic Authentication, the username and password extracted from the Authorization header are often compared directly to stored values. If the stored credentials are normalized into a canonical form (such as NFC or NFD) but the runtime comparison does not apply the same normalization to the submitted credentials, canonically equivalent inputs are handled inconsistently: two byte sequences that represent the same text match in one code path and not in another. An attacker can exploit this mismatch to subvert checks that assume a single canonical spelling, for example by registering or submitting an alternate Unicode representation of an existing identifier.

For example, the character é can be represented as a single code point U+00E9 (LATIN SMALL LETTER E WITH ACUTE) or as the two-code-point sequence U+0065 U+0301 (e followed by combining acute accent). These two forms are canonically equivalent but have different byte-level representations, so a raw byte comparison treats them as distinct while a normalized comparison treats them as identical. If one check in the stack normalizes and another does not — say, a uniqueness check at registration normalizes but the authentication comparison works on raw bytes — an attacker can use the combining form to craft a credential that collides with an existing one under one check but not the other, leading to authentication bypass or inconsistent authorization decisions.

This inconsistency is particularly risky when Axum services integrate with external identity stores or legacy systems that store credentials in a different normalization form. An attacker can enumerate valid users by probing endpoints with specially crafted Unicode credentials that produce the same normalized identifier but differ in raw representation, enabling account enumeration without triggering outright failed-authentication alerts. In some configurations, this can also intersect with path-based routing or middleware that relies on header values for routing or logging, where normalization differences can affect how requests are processed or logged.

The risk is compounded when Basic Auth credentials are decoded and used in security-sensitive contexts such as token issuance, session creation, or role assignment. If the application normalizes only one side of the comparison, it may inadvertently treat two semantically identical credentials as distinct, leading to authorization errors or fallback to less secure authentication flows. Attackers can exploit this by chaining Unicode-based bypass with other techniques, such as credential stuffing using homograph or confusable characters, to increase the likelihood of successful unauthorized access.

Because middleBrick tests unauthenticated attack surfaces, it can surface inconsistencies in how endpoints handle Unicode input within authentication flows. Findings may highlight missing normalization on both client and server sides, or mismatched normalization policies across services, providing prioritized remediation guidance that maps to relevant authentication weaknesses in frameworks such as OWASP API Top 10.

Basic Auth-Specific Remediation in Axum — concrete code fixes

To mitigate Unicode normalization issues in Axum with Basic Authentication, ensure that both the submitted credentials and the stored credentials are normalized using the same Unicode form before comparison. Use a well-maintained Unicode normalization library to perform this transformation consistently across all authentication paths.

The following example demonstrates a secure Axum Basic Auth handler that normalizes the user-provided username and password using the unicode-normalization crate before comparing them to pre-normalized stored values:

// Dependencies (Cargo.toml): axum = "0.7", tokio (features = ["full"]),
// base64 = "0.21", unicode-normalization = "1"
use axum::{
    http::{HeaderMap, StatusCode},
    routing::get,
    Router,
};
use base64::Engine as _;
use unicode_normalization::UnicodeNormalization;

/// Decode a Basic Auth header value and return NFC-normalized credentials.
fn normalize_basic_auth_credentials(header_value: &str) -> Option<(String, String)> {
    // Basic Auth format: "Basic base64(username:password)"
    let encoded = header_value.strip_prefix("Basic ")?;
    let decoded = base64::engine::general_purpose::STANDARD
        .decode(encoded)
        .ok()?;
    let decoded_str = String::from_utf8(decoded).ok()?;
    let (user, pass) = decoded_str.split_once(':')?;
    // Normalize both username and password using NFC
    let username: String = user.nfc().collect();
    let password: String = pass.nfc().collect();
    Some((username, password))
}

/// Placeholder for the credential store lookup; stored values must already
/// be NFC-normalized (see below).
fn get_stored_credentials() -> Option<(String, String)> {
    Some(("admin".to_string(), "p\u{00E4}ssword".nfc().collect()))
}

async fn protected(headers: HeaderMap) -> (StatusCode, &'static str) {
    let authenticated = headers
        .get("authorization")
        .and_then(|v| v.to_str().ok())
        .and_then(normalize_basic_auth_credentials)
        .zip(get_stored_credentials())
        // Both sides are now in NFC, so equality is well-defined. For the
        // password, prefer a constant-time comparison (e.g. the subtle crate).
        .map_or(false, |((user, pass), (stored_user, stored_pass))| {
            user == stored_user && pass == stored_pass
        });

    if authenticated {
        (StatusCode::OK, "authenticated")
    } else {
        (StatusCode::UNAUTHORIZED, "unauthorized")
    }
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/protected", get(protected));
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(app, listener).await.unwrap();
}

In this pattern, both the username and password are normalized using NFC before comparison. The stored credentials must also be normalized using the same form when they are initially saved or indexed, ensuring that equality checks are meaningful and resistant to Unicode-based bypass techniques.

Additionally, consider rejecting credentials that contain disallowed code points or non-standard normalization forms at the parsing stage. This reduces the risk of edge cases where the normalization library may not handle certain legacy or malformed input gracefully. middleBrick can detect whether endpoints consistently apply normalization by analyzing authentication behavior across equivalent Unicode representations, surfacing findings that indicate missing or inconsistent handling.

For deployments using the middleBrick CLI (middlebrick scan <url>) or GitHub Action integration, scans will flag authentication endpoints where Unicode normalization inconsistencies are detectable, allowing teams to align their handling with secure comparison practices. Teams on the Pro plan can enable continuous monitoring to detect regressions in normalization handling after code changes.

Frequently Asked Questions

Why is Unicode normalization relevant for Basic Auth in Axum?
Unicode normalization matters because equivalent characters can have multiple binary representations. If stored credentials are normalized but runtime comparisons are not, attackers can bypass authentication using canonically equivalent but differently encoded usernames or passwords.
Does middleBrick test for Unicode normalization bypass in Basic Auth?
Yes. middleBrick runs authentication checks that can surface inconsistencies in how endpoints handle equivalent Unicode representations, and findings are reported with remediation guidance mapped to authentication weaknesses.