
Unicode Normalization in Fiber with Firestore

Unicode Normalization in Fiber with Firestore — how this specific combination creates or exposes the vulnerability

Unicode normalization matters in web APIs whenever user-controlled strings are compared, indexed, or used as document keys. In a Fiber application backed by Firestore, normalization mismatches can create authorization bypasses or data integrity problems. Two strings that look identical on screen can have different byte representations: one may use the composed character é (U+00E9), the other the decomposed sequence e (U+0065) followed by a combining acute accent (U+0301). Firestore treats these as different values, so if document IDs or query values are compared without first being normalized, an attacker can exploit the mismatch to access or modify records they should not reach.
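The difference between the two forms is easy to see at the byte level. The following is a minimal sketch using only the Go standard library; the sample word and the helper name sameBytes are illustrative and do not come from any Fiber or Firestore API.

```go
package main

import "fmt"

// Two spellings of "café" that render identically on screen.
const (
	composed   = "caf\u00e9"  // é as the single code point U+00E9
	decomposed = "cafe\u0301" // e (U+0065) followed by combining acute accent (U+0301)
)

// sameBytes reports whether two strings are byte-for-byte identical,
// which is exactly the comparison Go's == operator performs on strings.
func sameBytes(a, b string) bool {
	return a == b
}

func main() {
	fmt.Printf("% x\n", composed)                // 63 61 66 c3 a9
	fmt.Printf("% x\n", decomposed)              // 63 61 66 65 cc 81
	fmt.Println(sameBytes(composed, decomposed)) // false
}
```

Any lookup keyed on the raw bytes treats these two inputs as distinct keys, even though a user cannot tell them apart.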

When Firestore rules or handler code rely on string equality checks against user input, and the layers of the stack do not normalize consistently, an attacker can supply a crafted value that passes a check in one form but resolves to a different internal key in another. This intersects with BOLA/IDOR checks that compare raw strings rather than canonical resource identifiers. Because Fiber routes often bind path parameters directly to Firestore document paths, missing normalization can lead to path confusion in which one user’s data is exposed under another user’s namespace.

The risk is compounded when queries filter or sort on user-controlled strings. Firestore orders and matches string values by their UTF-8 byte sequence and does not normalize them, so visually identical values can land in different places in an index. The result is unexpected query output, or filter-bypass confusion in which an attacker picks a normalization form that slips past an intended equality or range filter. In an API security scan, such issues can appear as insecure direct object references or property authorization weaknesses, especially when document IDs or filter fields are derived from unvalidated input.
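The effect on range filters can be simulated without a database. The sketch below assumes string values are ordered by their UTF-8 bytes, which is how Go compares strings; the helper inPrefixRange and the sample names are illustrative stand-ins for a query such as name >= "e" AND name < "f", not Firestore API calls.

```go
package main

import (
	"fmt"
	"sort"
)

// inPrefixRange reports whether s falls in the half-open byte-order range
// [lo, hi), mimicking a range filter such as name >= lo AND name < hi.
func inPrefixRange(s, lo, hi string) bool {
	return s >= lo && s < hi
}

func main() {
	composed := "\u00e9lan"    // starts with é (U+00E9), bytes c3 a9 ...
	decomposed := "e\u0301lan" // starts with e + U+0301, bytes 65 cc 81 ...

	names := []string{composed, decomposed, "frank", "eve"}
	sort.Strings(names) // byte-wise order: the composed form sorts after "frank"
	fmt.Printf("%q\n", names)

	fmt.Println(inPrefixRange(decomposed, "e", "f")) // true
	fmt.Println(inPrefixRange(composed, "e", "f"))   // false
}
```

Two names that render identically fall on opposite sides of the same filter, which is why values should be stored and queried in one normalization form.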

middleBrick detects these patterns by correlating OpenAPI specs with runtime behavior, including how string-based identifiers are handled across authentication and authorization checks. Findings may highlight missing normalization in parameters that map to Firestore document paths or collection/query keys. Remediation involves canonicalizing inputs before they are used in Firestore operations and ensuring consistent normalization across the application stack.

Firestore-Specific Remediation in Fiber — concrete code fixes

To prevent Unicode normalization issues in a Fiber application using Firestore, normalize all incoming string data before using it in document paths, queries, or values that security rules compare. Pick one normalization form, such as NFC, and apply it consistently in the backend and in every other component that writes to Firestore. Below is an example in Go using the Fiber framework and the Firebase Admin SDK, where user-supplied identifiers are normalized before being used in Firestore operations.

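package main

import (
    "context"
    "log"

    "cloud.google.com/go/firestore"
    firebase "firebase.google.com/go"
    "github.com/gofiber/fiber/v2"
    "golang.org/x/text/unicode/norm"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

// client is created once at startup and shared by all handlers.
var client *firestore.Client

// normalizeNFC returns the NFC form of a string.
func normalizeNFC(s string) string {
    return norm.NFC.String(s)
}

func GetUserProfile(c *fiber.Ctx) error {
    // Canonicalize the path parameter before it becomes a document ID.
    safeID := normalizeNFC(c.Params("userID"))

    docSnap, err := client.Collection("users").Doc(safeID).Get(c.UserContext())
    if status.Code(err) == codes.NotFound {
        return c.Status(fiber.StatusNotFound).JSON(fiber.Map{"error": "not found"})
    }
    if err != nil {
        return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{"error": "internal error"})
    }

    var profile map[string]interface{}
    if err := docSnap.DataTo(&profile); err != nil {
        return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{"error": "internal error"})
    }
    return c.JSON(profile)
}

func main() {
    ctx := context.Background()
    app, err := firebase.NewApp(ctx, nil)
    if err != nil {
        log.Fatalf("firebase init: %v", err)
    }
    if client, err = app.Firestore(ctx); err != nil {
        log.Fatalf("firestore init: %v", err)
    }

    // The route and port are illustrative.
    srv := fiber.New()
    srv.Get("/users/:userID", GetUserProfile)
    log.Fatal(srv.Listen(":3000"))
}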

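In this example, the user-supplied userID parameter is passed through normalizeNFC before being used as a Firestore document key. Provided that document IDs were also stored in NFC, composed and decomposed forms of the same identifier resolve to the same document, so an attacker cannot use normalization differences to reach a different key. Normalization does not replace authorization: the handler must still verify that the authenticated caller is allowed to read the normalized ID.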

Firestore security rules see strings exactly as they are stored, so apply normalization on the server side or enforce it in your data ingestion pipeline so that stored document IDs are consistently normalized. If you store usernames or handles as document IDs, normalize them at write time and require the same normalization when reading or querying. This approach aligns with findings from middleBrick scans that flag inconsistent string handling as a potential BOLA/IDOR risk.

In a Pro deployment, continuous monitoring can alert you when new endpoints introduce unnormalized parameters that map to Firestore paths. The CLI can be integrated into your development workflow to catch these patterns early, and the GitHub Action can fail builds if normalization is missing in API definitions that interact with Firestore.

Frequently Asked Questions

Why does Unicode normalization matter when using Firestore document IDs?
Firestore document IDs are strings that are matched byte for byte. If your API accepts user input to construct document paths without normalization, attackers can use different Unicode representations of the same visual string to access or modify records they should not reach, leading to IDOR or BOLA issues.

Can Firestore security rules handle normalization automatically?
No. The security rules language does not provide a built-in Unicode normalization function, so rules compare strings exactly as they were written. Normalize data before writing documents and when constructing queries. This ensures consistent behavior across reads, writes, and rule evaluations, reducing the risk of bypasses due to encoding differences.