LLM Data Leakage in AdonisJS with DynamoDB
LLM Data Leakage in AdonisJS with DynamoDB — how this specific combination creates or exposes the vulnerability
AdonisJS applications that integrate with Amazon DynamoDB can inadvertently expose sensitive data through LLM endpoints when response handling, prompt construction, or model outputs are not carefully controlled. The risk arises when application logic passes raw DynamoDB query results, item attributes, or debug information into LLM-related functions or logs without sanitization. Because DynamoDB stores schemaless attribute-value pairs, developers may serialize nested or sensitive fields (e.g., internal IDs, email addresses, or tokens) into context that is later supplied to an LLM without realizing it.
In AdonisJS, route handlers or service classes commonly retrieve items using the AWS SDK for JavaScript (v3). If these items are directly forwarded to an LLM client—such as for generating responses or enriching prompts—fields like user_id, api_key, or internal metadata can leak into LLM inputs. This can lead to system prompt leakage if the LLM echoes or reflects back tokens or configuration details present in the supplied context. For example, passing a full DynamoDB item into a system prompt can expose the structure of your data layer or reveal operational details through the LLM’s output.
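To make the pattern concrete, here is a condensed sketch of the risky flow. The handler name, the Users table, and the callLlm helper are illustrative assumptions, not part of any specific SDK:

// Risky pattern (illustrative): the full DynamoDB item, including any
// sensitive attributes, is serialized straight into the system prompt.
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const client = new DynamoDBClient({ region: "us-east-1" });

export async function answerAboutUser(userId, question) {
  const response = await client.send(
    new GetItemCommand({ TableName: "Users", Key: { user_id: { S: userId } } })
  );
  // May contain api_key, internal metadata, or PII alongside public fields.
  const item = unmarshall(response.Item || {});
  // Every attribute on the item, sensitive or not, becomes LLM context.
  return callLlm({
    system: `You are a support assistant. User record: ${JSON.stringify(item)}`,
    user: question,
  });
}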
LLM output scanning is also necessary because models can return sensitive data they were never intended to generate, including API keys or PII that was inadvertently present in prompt context derived from DynamoDB. If an application uses an LLM to summarize or transform data retrieved from DynamoDB and the output is not validated, credentials or personal information can be exposed to end users or logged insecurely. This is why advanced LLM security checks include scanning model output for API keys and executable code.
Another vector is excessive agency. If an AdonisJS service configures an LLM client with tool-use capabilities (e.g., function calling or LangChain agent patterns) and passes DynamoDB-derived parameters that control tool selection or execution, an attacker may be able to manipulate inputs to induce the LLM to call unintended functions or access additional data. Even unauthenticated LLM endpoints can be probed, and poorly scoped DynamoDB queries can supply the contextual data needed to guide those probes.
Finally, because DynamoDB queries in AdonisJS often rely on parameters such as partition keys and sort keys constructed from user input, improper validation can lead to IDOR-like conditions that indirectly inform LLM behavior. An LLM security assessment should include system prompt leakage detection using regex patterns aligned with ChatML, Llama 2, Mistral, and Alpaca formats, active prompt injection testing, and output validation to ensure no sensitive data derived from DynamoDB traverses the LLM pipeline.
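As a minimal illustration of that last point, a handler can reject malformed or out-of-scope keys before they ever shape a query or an LLM prompt. The key format and function name below are assumptions for the sketch:

// Hypothetical guard: only accept partition keys matching the expected format
// and owned by the authenticated caller, before any DynamoDB access happens.
const USER_ID_PATTERN = /^user-[a-z0-9]{3,32}$/;

function assertQueryableUserId(requestedId, authenticatedUserId) {
  if (!USER_ID_PATTERN.test(requestedId)) {
    throw new Error("Invalid user id format.");
  }
  if (requestedId !== authenticatedUserId) {
    // Blocks IDOR-style probing of other users' records.
    throw new Error("Caller may only query their own record.");
  }
  return requestedId;
}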
DynamoDB-Specific Remediation in AdonisJS — concrete code fixes
To mitigate LLM data leakage when using DynamoDB in AdonisJS, apply strict output filtering, context minimization, and validation before any data reaches an LLM client. Below are concrete, DynamoDB-specific remediation steps with working code examples.
1. Retrieve only necessary attributes
Use ProjectionExpression to limit DynamoDB responses to fields required for the LLM task, excluding sensitive metadata. This reduces the data footprint that could be exposed through prompts or logs.
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const client = new DynamoDBClient({ region: "us-east-1" });

export async function getUserPublicProfile(userId) {
  const command = new GetItemCommand({
    TableName: "Users",
    Key: { user_id: { S: userId } },
    // Fetch only the public fields the LLM task needs; sensitive attributes
    // such as api_key or internal_token are never returned from DynamoDB.
    ProjectionExpression: "user_id, display_name, avatar_url",
  });
  const response = await client.send(command);
  return unmarshall(response.Item || {});
}
2. Sanitize data before LLM context construction
Strip or hash sensitive fields before incorporating DynamoDB results into prompts. Avoid passing raw API keys, internal IDs, or PII into system or user messages.
function sanitizeForLlm(item) {
  // Destructure the sensitive attributes out so only the remainder is kept.
  const { api_key, internal_token, ...safeItem } = item;
  return safeItem;
}

const rawItem = await getUserPublicProfile("user-123");
const safeContext = sanitizeForLlm(rawItem);

// Safe prompt construction: only sanitized fields reach the model.
const prompt = `Summarize the profile: ${JSON.stringify(safeContext)}`;
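Where an identifier is still needed for traceability, hashing it keeps the prompt useful without exposing the raw value. This sketch uses Node's built-in crypto module; the field name and truncation length are illustrative:

import { createHash } from "node:crypto";

// Replace a raw identifier with a stable, non-reversible token so the LLM
// can still correlate records without ever seeing the real value.
function pseudonymize(value) {
  return createHash("sha256").update(String(value)).digest("hex").slice(0, 12);
}

// A stable hex token stands in for the raw user_id in the prompt context.
const promptSafeId = pseudonymize(rawItem.user_id);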
3. Validate and scan LLM outputs
Implement output scanning to detect accidental inclusion of API keys, PII, or executable code. Use regex-based checks aligned with common data patterns and enforce allowlists for expected output formats.
function containsApiKey(text) {
  // Match common key-like prefixes followed by a long token. Note: no `g`
  // flag, so test() is stateless and safe to call repeatedly.
  const apiKeyPattern = /(ai|sk|pk)-[a-zA-Z0-9]{20,}/;
  return apiKeyPattern.test(text);
}

const llmResponse = await generateWithLlm(prompt);
if (containsApiKey(llmResponse)) {
  throw new Error("LLM output contains potential API key; reject and log.");
}
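The same gate can be extended to the PII and executable-code cases mentioned above. The patterns below are deliberately simple starting points, not exhaustive detectors, and should be tuned to your own data shapes:

// Illustrative extra checks layered on top of containsApiKey above.
const EMAIL_PATTERN = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/;
const CODE_PATTERN = /```|<script\b/i;

function scanLlmOutput(text) {
  const findings = [];
  if (containsApiKey(text)) findings.push("api_key");
  if (EMAIL_PATTERN.test(text)) findings.push("email");
  if (CODE_PATTERN.test(text)) findings.push("executable_code");
  return findings; // an empty array means the output passed all checks
}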
4. Limit tool-use scope and parameterization
When using LLM functions that rely on DynamoDB-derived parameters, tightly constrain which tools can be called and validate all inputs. Avoid dynamic tool selection based on raw user data.
const allowedTools = ["search_users", "get_profile"];

function validateToolSelection(toolName, userSupplied) {
  if (!allowedTools.includes(toolName)) {
    throw new Error("Unauthorized tool call.");
  }
  // Reject unexpected argument shapes before anything reaches DynamoDB.
  if (typeof userSupplied !== "object" || userSupplied === null) {
    throw new Error("Tool arguments must be a plain object.");
  }
}
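A hypothetical dispatch loop then gates every model-proposed call through this check before anything touches DynamoDB; modelReply, toolCall, and executeTool are illustrative names, not a specific SDK's API:

// Illustrative wiring: validate each proposed call before executing it.
for (const toolCall of modelReply.toolCalls ?? []) {
  validateToolSelection(toolCall.name, toolCall.arguments);
  await executeTool(toolCall.name, toolCall.arguments); // runs only after validation
}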
5. Enforce least-privilege IAM and logging hygiene
Ensure the IAM role associated with your AdonisJS service has read-only permissions for tables used in LLM contexts, and avoid logging full DynamoDB responses. Log only sanitized identifiers and request metadata.
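As a sketch, a least-privilege policy for the Users table might look like the following; the account ID, region, and table name are placeholders:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Users"
    }
  ]
}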
6. Integrate with middleBrick for continuous detection
Use the CLI to scan endpoints that interact with DynamoDB and LLMs: middlebrick scan <url>. The CLI emits JSON output that can be consumed by scripts or by the GitHub Action to fail builds when risky patterns are detected. For ongoing assurance, the Pro plan supports continuous monitoring and can enforce security thresholds in CI/CD pipelines.
Related CWEs: LLM Security
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |
Frequently Asked Questions
How can I test if my AdonisJS + DynamoDB endpoint leaks sensitive data to an LLM?
Run middlebrick scan https://your-api.example.com/llm. Review the findings for system prompt leakage, output exposure, and prompt injection risks. Combine this with manual validation by checking whether DynamoDB query results are sanitized before being included in LLM prompts.