Unicode Normalization in Feathersjs with DynamoDB

Unicode Normalization in Feathersjs with DynamoDB: how this specific combination creates or exposes the vulnerability

Unicode normalization inconsistencies between Feathersjs service logic and Amazon DynamoDB storage can lead to authentication bypass, IDOR, and data exposure. When a Feathersjs application accepts user input (e.g., username, handle, or resource identifier) and stores it in DynamoDB without normalizing to a canonical form, equivalent strings may hash or compare differently depending on whether comparison happens in application code or during conditional writes in DynamoDB.

For example, the final character of the string café can be encoded as the single precomposed code point U+00E9 (LATIN SMALL LETTER E WITH ACUTE) or as the decomposed sequence e + U+0301 (the letter e followed by COMBINING ACUTE ACCENT). If a Feathersjs service normalizes incoming identifiers to NFC before constructing DynamoDB keys, but does not enforce the same normalization on queries or conditional checks, an attacker can supply the decomposed form to bypass equality checks, match different rows, or exploit weakly designed ownership checks that rely on raw string equality rather than normalized canonical IDs.
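The mismatch is easy to reproduce. The following is a minimal sketch that uses the built-in String.prototype.normalize instead of a library; the variable names are illustrative only.

```javascript
// Two encodings of the same visible string "café".
const composed = 'caf\u00e9';    // ends with precomposed U+00E9
const decomposed = 'cafe\u0301'; // ends with "e" + U+0301 COMBINING ACUTE ACCENT

// A raw comparison treats them as different strings.
const rawEqual = composed === decomposed; // false

// After normalizing both sides to NFC they compare equal.
const nfcEqual = composed.normalize('NFC') === decomposed.normalize('NFC'); // true

console.log({ rawEqual, nfcEqual });
```

Any equality check, lookup key, or ownership comparison that sees only one side normalized behaves like the rawEqual case.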

DynamoDB does not enforce Unicode normalization on its own; comparisons for key conditions, filter expressions, and attribute existence are byte-wise, based on the encoded UTF-8 bytes. This means two logically equivalent Unicode strings with different binary representations are treated as distinct. In Feathersjs, if service hooks do not normalize identifiers used in params.query before building DynamoDB expressions, an IDOR may occur when one user's normalized ID resolves to another user's record because combining marks or compatibility characters were handled inconsistently. Note that Unicode normalization does not fold case: if identifiers are meant to be case-insensitive, case folding has to be applied as a separate, equally consistent step. Additionally, normalization affects uniqueness constraints: a composite key built from user-controlled fields may unintentionally allow collisions or duplicates if normalization is applied inconsistently across index definitions and runtime queries.
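To make the byte-level difference concrete, the sketch below encodes both forms of café as UTF-8 with Node's Buffer and also checks that normalization leaves letter case alone. It is an illustration only and does not call DynamoDB.

```javascript
// UTF-8 encodings of the two forms of "café".
const composedBytes = Buffer.from('caf\u00e9', 'utf8');    // 63 61 66 c3 a9
const decomposedBytes = Buffer.from('cafe\u0301', 'utf8'); // 63 61 66 65 cc 81

// A byte-wise comparison sees two different values.
const sameBytes = composedBytes.equals(decomposedBytes); // false

// Normalization does not fold case: "Café" and "café" stay distinct.
const caseFolded = 'Caf\u00e9'.normalize('NFKC') === 'caf\u00e9'.normalize('NFKC'); // false

console.log(composedBytes.toString('hex'), decomposedBytes.toString('hex'), sameBytes, caseFolded);
```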

These issues intersect with the 12 security checks run by middleBrick, particularly Input Validation and Property Authorization. An unauthenticated scan can surface discrepancies between documented schemas and runtime behavior when equivalent identifiers resolve to different resources. For LLM/AI Security, system prompt leakage patterns may include user-controlled identifiers that are not normalized, increasing the risk of indirect prompt injection via crafted strings that exploit comparison logic.

DynamoDB-Specific Remediation in Feathersjs: concrete code fixes

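Remediation centers on enforcing a single Unicode normalization form at the boundary where user input enters Feathersjs services and ensuring all DynamoDB expressions use the same form. Pick one form and use it for keys, query conditions, and filter expressions alike. NFC is sufficient when only canonical equivalence matters (composed vs. decomposed characters); the examples below use NFKC, which additionally folds compatibility characters such as full-width letters and ligatures. Apply normalization before constructing keys, before conditional writes, and before comparison logic. Below are concrete examples for Feathersjs using the AWS SDK for JavaScript v3.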

1. Normalize on input in a Feathersjs hook

Use the nfkc function from unorm in a before hook to ensure consistent storage and query keys.

// src/hooks/normalize-identifier.hook.js
const unorm = require('unorm');

module.exports = function normalizeIdentifier(options = {}) {
  return async context => {
    const { data, params } = context;
    // Normalize identifiers in payload and query
    if (data && typeof data.userId === 'string') {
      data.userId = unorm.nfkc(data.userId);
    }
    if (params.query && typeof params.query.userId === 'string') {
      params.query.userId = unorm.nfkc(params.query.userId);
    }
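    // Also normalize any attribute used in a DynamoDB key condition,
    // in both the payload and the query, and only when it is a string
    if (data && typeof data.handle === 'string') {
      data.handle = unorm.nfkc(data.handle);
    }
    if (params.query && typeof params.query.handle === 'string') {
      params.query.handle = unorm.nfkc(params.query.handle);
    }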
    return context;
  };
};
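The hook above hard-codes its field names and assumes data is a single object. As a hedged variation, the sketch below takes a configurable field list and also handles array payloads. The names normalizeFields and normalizeIdentifiers, the default field list, and the mock context are assumptions made for illustration; the sketch uses the built-in String.prototype.normalize so it runs without unorm.

```javascript
// Hypothetical helper: normalize the listed string fields of a plain object in place.
function normalizeFields(target, fields, form) {
  if (!target || typeof target !== 'object') return;
  for (const field of fields) {
    if (typeof target[field] === 'string') {
      target[field] = target[field].normalize(form);
    }
  }
}

// Hook factory with the same shape as a Feathers before hook.
function normalizeIdentifiers({ fields = ['userId', 'handle'], form = 'NFKC' } = {}) {
  return async context => {
    const { data, params } = context;
    // create() may receive an array of records, so treat data uniformly as a list.
    const records = Array.isArray(data) ? data : [data];
    records.forEach(record => normalizeFields(record, fields, form));
    normalizeFields(params && params.query, fields, form);
    return context;
  };
}

// Exercise the hook with a mock context instead of a running Feathers app.
const mockContext = {
  data: [{ userId: 'jose\u0301', handle: '\uff4a\uff4f\uff45' }], // decomposed + full-width input
  params: { query: { handle: 'cafe\u0301' } }
};
normalizeIdentifiers()(mockContext).then(ctx => console.log(ctx.data, ctx.params.query));
```

In an application this would typically be registered with something like app.service('items').hooks({ before: { all: [normalizeIdentifiers()] } }). Nested query operators such as $in are not covered by this sketch and would need their own handling.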

2. Use normalized keys in DynamoDB expressions

Construct key conditions using the normalized value to avoid mismatches between query form and stored form.

// src/services/items/service.js (using AWS SDK v3)
const { DynamoDBClient, QueryCommand } = require('@aws-sdk/client-dynamodb');
const unorm = require('unorm');
const client = new DynamoDBClient({ region: 'us-east-1' });

async function getItemsByHandle(userId, handle) {
  const normalizedUserId = unorm.nfkc(userId);
  const normalizedHandle = unorm.nfkc(handle);

  const command = new QueryCommand({
    TableName: 'Items',
    KeyConditionExpression: 'userId = :uid AND begins_with(handle, :hdl)',
    ExpressionAttributeValues: {
      ':uid': { S: normalizedUserId },
      ':hdl': { S: normalizedHandle }
    }
  });

  const response = await client.send(command);
  return response.Items;
}
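One caveat with prefix conditions such as begins_with: a prefix that matches the raw input does not necessarily match after normalization, because a base letter and a following combining mark are merged into one code point. The sketch below illustrates this with JavaScript's startsWith standing in for begins_with; that substitution is an assumption made to keep the example local, and the sample values are hypothetical.

```javascript
// Raw, decomposed input and a prefix that ends on the base letter "e".
const rawHandle = 'cafe\u0301-menu';
const rawPrefix = 'cafe';

// Before normalization, the prefix matches the raw code units.
const rawMatch = rawHandle.startsWith(rawPrefix); // true

// After NFKC, "e" + U+0301 becomes the single code point U+00E9,
// so the stored value no longer starts with the four letters "cafe".
const storedHandle = rawHandle.normalize('NFKC');
const normalizedMatch = storedHandle.startsWith(rawPrefix.normalize('NFKC')); // false

console.log({ rawMatch, normalizedMatch });
```

This is usually the desired behavior, since "cafe" and "café" are different identifiers, but it is worth keeping in mind when prefix searches are exposed to users.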

3. Normalize before conditional writes to enforce uniqueness

When using conditional writes (e.g., preventing overwrites), normalize both the stored key and the condition key to the same form.

// PutItemCommand is the low-level command that accepts attribute values
// in the { S: '...' } form used below.
const { DynamoDBClient, PutItemCommand } = require('@aws-sdk/client-dynamodb');
const unorm = require('unorm');
const client = new DynamoDBClient({ region: 'us-east-1' });

async function createItemIfNotExists(item) {
  const normalizedOwner = unorm.nfkc(item.owner);
  const normalizedSlug = unorm.nfkc(item.slug);
  const compositeKey = `${normalizedOwner}#${normalizedSlug}`;

  const command = new PutItemCommand({
    TableName: 'Items',
    Item: {
      compositeKey: { S: compositeKey },
      owner: { S: normalizedOwner },
      slug: { S: normalizedSlug },
      data: { S: item.data }
    },
    ConditionExpression: 'attribute_not_exists(compositeKey)'
  });

  try {
    await client.send(command);
    return { created: true };
  } catch (err) {
    if (err.name === 'ConditionalCheckFailedException') {
      return { created: false };
    }
    throw err;
  }
}
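The effect of normalizing before building the composite key can be checked without touching DynamoDB. The helper name buildCompositeKey and the sample values below are hypothetical, and the built-in String.prototype.normalize is used in place of unorm.

```javascript
// Hypothetical helper mirroring the key construction in createItemIfNotExists.
function buildCompositeKey(owner, slug, form = 'NFKC') {
  return `${owner.normalize(form)}#${slug.normalize(form)}`;
}

// The same logical owner and slug, once composed and once decomposed.
const keyA = buildCompositeKey('jos\u00e9', 'caf\u00e9');
const keyB = buildCompositeKey('jose\u0301', 'cafe\u0301');

// Without normalization the two inputs would produce two distinct keys,
// and attribute_not_exists(compositeKey) would let both writes succeed.
const rawKeyA = 'jos\u00e9#caf\u00e9';
const rawKeyB = 'jose\u0301#cafe\u0301';

console.log(keyA === keyB, rawKeyA === rawKeyB);
```

With normalized keys the second write hits the existing item and the conditional check fails as intended.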

4. Apply normalization across related indices

If you use global or local secondary indexes, ensure the index key expressions reference normalized attributes or are computed from normalized source attributes. DynamoDB does not re-normalize values stored in indexes; they are stored as provided. Therefore, design your index keys to use normalized source fields to prevent index misalignment and query mismatches.
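One possible way to follow this advice is to derive every key attribute from the normalized value at write time and keep the user's original spelling in a separate, non-key attribute. The sketch below assumes a hypothetical displayHandle attribute for the original input and treats userId and handle as the key attributes; neither the attribute name nor the function name comes from the original examples.

```javascript
// Hypothetical item builder: key attributes hold normalized values,
// while the original spelling is kept only for display.
function toIndexedItem({ userId, handle }, form = 'NFKC') {
  return {
    userId: { S: userId.normalize(form) },  // table or index key
    handle: { S: handle.normalize(form) },  // table or index key
    displayHandle: { S: handle }            // never used in key conditions
  };
}

const indexedItem = toIndexedItem({ userId: 'u-1', handle: 'cafe\u0301' });
console.log(indexedItem);
```

Because the index only ever sees the normalized attribute, queries that normalize their input the same way line up with what the index stores.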

Frequently Asked Questions

Why does Unicode normalization matter when DynamoDB stores strings as UTF-8?

DynamoDB compares UTF-8 bytes directly for keys and expressions. Equivalent Unicode characters can have multiple binary representations (e.g., composed vs. decomposed forms). Without normalization, logically identical identifiers may not match, enabling IDOR or allowing duplicate keys that violate uniqueness constraints.

Can middleBrick detect Unicode normalization issues in a Feathersjs + DynamoDB setup?

Yes. middleBrick scans the unauthenticated attack surface and can surface inconsistencies in input validation and property authorization that suggest normalization mismatches. Findings include risk scores and remediation guidance to enforce canonical forms like NFC.