MEDIUM: unicode normalization, feathersjs, firestore

Unicode Normalization in FeathersJS with Firestore

Unicode Normalization in FeathersJS with Firestore: how this specific combination creates or exposes the vulnerability

FeathersJS is a framework for building JavaScript APIs. When it receives user input such as identifiers, search fields, or document keys, it often passes those values to a database like Google Cloud Firestore. Firestore stores strings as UTF‑8 and compares them by their encoded bytes; it does not normalize them, so whether two strings match depends on the form in which each was written. Unicode normalization matters because visually identical characters can have multiple binary representations. For example, é can be the single code point U+00E9 or the sequence U+0065 followed by U+0301 (combining acute accent). If a FeathersJS service uses user input to look up Firestore documents without normalizing it, equivalent representations are treated as different strings, and an attacker can evade lookups or trigger inconsistent authorization checks by supplying a different but equivalent representation.
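The two representations of é can be compared directly in Node.js. The following is a minimal sketch that uses only the built-in String.prototype.normalize and Buffer; the sample word and variable names are illustrative and do not come from any particular application.

```javascript
// Two visually identical strings with different code point sequences.
const composed = 'caf\u00E9';     // é as a single code point (U+00E9)
const decomposed = 'cafe\u0301';  // e (U+0065) followed by combining acute accent (U+0301)

// Strict equality compares code units, so the raw strings differ.
const rawEqual = composed === decomposed;

// After normalizing both sides to NFC, they compare equal.
const nfcEqual = composed.normalize('NFC') === decomposed.normalize('NFC');

// The UTF-8 bytes that a byte-wise comparison would see.
const composedBytes = Buffer.from(composed, 'utf8').toString('hex');
const decomposedBytes = Buffer.from(decomposed, 'utf8').toString('hex');

console.log({ rawEqual, nfcEqual, composedBytes, decomposedBytes });
```

The differing byte sequences are the reason an exact-match lookup on one form cannot find a value stored in the other.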

Consider an endpoint that retrieves a user profile by username. If the client sends one normalization form and the stored document uses another, the query returns no results even though the username appears to match. An attacker could exploit this mismatch to enumerate valid usernames or to escalate privileges by causing the application to fall back to a default or guest identity. In addition, because Firestore indexes hold the bytes that were stored, equality and range queries on such fields can return different results for equivalent inputs, a difference that may be observable to an attacker.

Another risk arises when strings are used in map keys or document IDs. If normalization is not applied consistently before generating or looking up these keys, the application might create duplicate entries or fail to locate existing data, leading to logic flaws such as insecure direct object references (BOLA/IDOR). For example, two client submissions that appear identical visually might map to different Firestore document IDs, allowing one user to access another’s data if the application does not enforce a canonical form.
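The duplicate-key problem can be reproduced without a database. The sketch below uses a plain Map as a stand-in for a collection keyed by document ID; it is not the Firestore API, and canonicalId is a hypothetical helper introduced only for illustration.

```javascript
// In-memory stand-in for a collection keyed by document ID (not the Firestore API).
const profiles = new Map();

// Hypothetical helper: derive a canonical document ID from user input.
function canonicalId(input) {
  return input.normalize('NFC');
}

const first = 'jos\u00E9';    // composed é
const second = 'jose\u0301';  // decomposed é, renders identically

// Without canonicalization, the two inputs create two separate entries.
profiles.set(first, { owner: 'user-1' });
profiles.set(second, { owner: 'user-2' });
const rawCount = profiles.size;

// With canonicalization, the second input resolves to the existing key.
const canonicalProfiles = new Map();
canonicalProfiles.set(canonicalId(first), { owner: 'user-1' });
const collides = canonicalProfiles.has(canonicalId(second));

console.log({ rawCount, collides });
```

Detecting the collision at write time gives the application a chance to reject the second registration instead of silently creating a look-alike record.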

The issue is compounded when FeathersJS hooks transform input before it reaches Firestore. If a hook lowercases or modifies strings without normalizing, it can introduce subtle discrepancies. Moreover, searching through array fields or object properties that contain user-controlled strings can yield inconsistent results if the normalization form varies across entries. Because Firestore queries are limited to indexed fields and specific comparison rules, failing to normalize early can lead to unpredictable query results and potential authorization gaps.
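To illustrate the lowercasing case, here is a small sketch assuming a hook that only calls toLowerCase. The canonicalize helper is hypothetical; it normalizes to NFC, lowercases, and normalizes again in case the case mapping changed the composition. toLowerCase is used only because the text mentions lowercasing; it is not full Unicode case folding.

```javascript
// Hypothetical canonicalization helper: NFC, lowercase, then NFC again.
function canonicalize(value) {
  return value.normalize('NFC').toLowerCase().normalize('NFC');
}

const composedUpper = '\u00C9ric';     // É as the single code point U+00C9
const decomposedUpper = 'E\u0301ric';  // E followed by combining acute accent

// Lowercasing alone keeps the two forms distinct.
const lowercaseOnlyMatch =
  composedUpper.toLowerCase() === decomposedUpper.toLowerCase();

// Normalizing as part of canonicalization makes them equal.
const canonicalMatch =
  canonicalize(composedUpper) === canonicalize(decomposedUpper);

console.log({ lowercaseOnlyMatch, canonicalMatch });
```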

To detect this class of issue, security scans can check whether FeathersJS services normalize incoming strings before using them in Firestore operations. The absence of explicit normalization at the service or hook layer is a sign that equivalent representations might be treated as distinct, creating confusion in lookup logic and access control. Remediation involves applying a consistent Unicode normalization form, such as NFC or NFD, at the point where user input enters the FeathersJS application, and ensuring that all Firestore reads, writes, and queries use the same normalized values.
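One way to check a lookup path for this behavior is a differential probe: send the NFC and NFD forms of the same value and compare the results. The sketch below is an assumption about how such a check could look, not a description of any scanner; probeNormalization and both lookup functions are hypothetical, and the Map stands in for the data store.

```javascript
// Hypothetical probe: calls `lookup` with the NFC and NFD forms of `value`
// and reports whether both forms resolve to the same result.
async function probeNormalization(lookup, value) {
  const nfcResult = await lookup(value.normalize('NFC'));
  const nfdResult = await lookup(value.normalize('NFD'));
  return {
    consistent: JSON.stringify(nfcResult) === JSON.stringify(nfdResult),
    nfcResult,
    nfdResult
  };
}

// In-memory stand-ins for a service's lookup method.
const store = new Map([['ren\u00E9e', { id: 1 }]]);
const rawLookup = async name => store.get(name) ?? null;
const normalizingLookup = async name => store.get(name.normalize('NFC')) ?? null;

probeNormalization(rawLookup, 'ren\u00E9e')
  .then(result => console.log('raw lookup consistent:', result.consistent));
probeNormalization(normalizingLookup, 'ren\u00E9e')
  .then(result => console.log('normalizing lookup consistent:', result.consistent));
```

A lookup that returns a record for one form and nothing for the other treats equivalent strings as distinct and should be reviewed.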

Firestore-Specific Remediation in FeathersJS: concrete code fixes

Remediation centers on normalizing strings before they are used in Firestore operations. Choose a single normalization form—NFC is generally recommended for compatibility—and apply it consistently in FeathersJS hooks and services. Below is a complete, realistic example showing how to integrate Unicode normalization into a FeathersJS service that stores and retrieves user profiles in Firestore.

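First, install a normalization utility. The third-party unorm package is suitable for this purpose; it is not built into Node.js, although current Node.js versions also provide the built-in String.prototype.normalize, which performs the same NFC conversion: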

npm install unorm

Next, define a hook that normalizes relevant fields. This hook runs before create, update, and patch operations to ensure stored data uses the chosen normalization form:

// src/hooks/normalize.js
const unorm = require('unorm');

module.exports = function normalizeFields(fields) {
  return context => {
    if (context.data && typeof context.data === 'object') {
      fields.forEach(field => {
        if (context.data[field] && typeof context.data[field] === 'string') {
          // Apply NFC normalization to the string
          context.data[field] = unorm.nfc(context.data[field]);
        }
      });
    }
    return context;
  };
};
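As an alternative sketch, the same hook can be written with the built-in String.prototype.normalize, which removes the dependency on unorm. This variant also normalizes the listed fields in context.params.query so that the hook can be registered for find as well. It assumes plain string values; array payloads and query operators such as $in are not handled, and the hand-built context object below is only a stand-in for what Feathers passes to hooks.

```javascript
// Variant of the hook using the built-in String.prototype.normalize.
// It normalizes the listed fields in both context.data and context.params.query.
function normalizeFields(fields, form = 'NFC') {
  const apply = target => {
    if (!target || typeof target !== 'object') return;
    for (const field of fields) {
      if (typeof target[field] === 'string') {
        target[field] = target[field].normalize(form);
      }
    }
  };
  return context => {
    apply(context.data);
    apply(context.params && context.params.query);
    return context;
  };
}

// Usage with a hand-built context object (a stand-in for a real hook context).
const hook = normalizeFields(['username', 'email']);
const context = hook({
  data: { username: 'jose\u0301' },
  params: { query: { username: 'jose\u0301' } }
});
console.log(context.data.username === 'jos\u00E9'); // true
```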

Apply this hook to your FeathersJS service. The following example configures a user profile service that normalizes the username and email fields before writing to Firestore. It also ensures queries use normalized values by normalizing the query parameter before calling Firestore:

// src/services/users/users.class.js
const unorm = require('unorm');
// Placeholder import: `Service` stands for the base class exported by the
// Firestore adapter in use; replace 'feathersjs' with that adapter's package.
const { Service } = require('feathersjs');
// The hook file lives in src/hooks, two directories above this file.
const normalizeFields = require('../../hooks/normalize');

// Normalize string values to NFC and leave other values untouched
const nfc = value => (typeof value === 'string' ? unorm.nfc(value) : value);

class UsersService extends Service {
  constructor(options) {
    super(options);
    this.options = options;
  }

  async find(params = {}) {
    const normalizedQuery = { ...params.query };
    if (normalizedQuery.username) {
      normalizedQuery.username = nfc(normalizedQuery.username);
    }
    if (normalizedQuery.email) {
      normalizedQuery.email = nfc(normalizedQuery.email);
    }
    // Keep the remaining params (provider, user, and so on) and swap in the normalized query
    return super.find({ ...params, query: normalizedQuery });
  }

  async get(id, params) {
    // IDs may be non-string values; only strings are normalized
    return super.get(nfc(id), params);
  }

  async create(data, params) {
    // Hooks will also normalize, but normalizing here ensures consistency
    if (data.username) data.username = nfc(data.username);
    if (data.email) data.email = nfc(data.email);
    return super.create(data, params);
  }
}

module.exports = function init(app) {
  const options = {
    Model: require('./models/user.model'),
    paginate: { default: 10, max: 50 }
  };

  app.use('/users', new UsersService(options));
  app.service('/users').hooks({
    before: {
      create: [normalizeFields(['username', 'email'])],
      update: [normalizeFields(['username', 'email'])],
      patch: [normalizeFields(['username', 'email'])]
    }
  });
};

In this setup, the username and email fields are normalized to NFC before being sent to Firestore, as are IDs passed to get. The query normalization in the find and get methods ensures that lookups use the same canonical form as the stored data, preventing mismatches that could lead to BOLA/IDOR or data exposure. Applying normalization at the hook layer and within the service methods covers both external API requests and internal service calls, reducing the risk of inconsistent handling.

For Firestore specifically, make sure the stored data and its indexes hold the normalized form. Firestore indexes the exact bytes that were written, so documents created before normalization was introduced must be rewritten in the chosen form before queries on normalized input can match them; once they are, single-field and composite indexes on those fields work as expected. Do not expect Firestore or its client libraries to reconcile equivalent forms; enforce normalization explicitly within FeathersJS.
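Data written before normalization was introduced may still be stored in another form. The following sketch shows one way to find such records, assuming the documents have already been read into plain objects with an id property; findNonNormalized is a hypothetical helper, and the sample documents are made up for illustration.

```javascript
// Hypothetical helper: report fields whose stored value is not in the chosen form.
function findNonNormalized(docs, fields, form = 'NFC') {
  const findings = [];
  for (const doc of docs) {
    for (const field of fields) {
      const value = doc[field];
      if (typeof value === 'string' && value !== value.normalize(form)) {
        findings.push({ id: doc.id, field, normalized: value.normalize(form) });
      }
    }
  }
  return findings;
}

// Sample documents: 'a' is already NFC, 'b' is stored in decomposed form.
const docs = [
  { id: 'a', username: 'jos\u00E9' },
  { id: 'b', username: 'jose\u0301' }
];
const findings = findNonNormalized(docs, ['username']);
console.log(findings);
```

Each finding carries the normalized value, so a follow-up migration can write it back to the corresponding document.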

Frequently Asked Questions

Why is Unicode normalization necessary when using FeathersJS with Firestore?
Unicode normalization ensures that visually identical strings with different binary representations are treated as equal. Without normalization, FeathersJS applications may create duplicate entries or fail to locate documents in Firestore, leading to inconsistent authorization and potential BOLA/IDOR vulnerabilities.

Can normalization be applied globally in FeathersJS to avoid per-method checks?
Yes. A hook that normalizes specific fields can be registered for create, update, and patch on a service, or once for all services through app.hooks, so that stored values always use one normalization form. Read paths need the same treatment: normalize query parameters and IDs for find and get before they reach Firestore so that lookups match the stored form.