Prompt Injection in Grape with Bearer Tokens
Prompt Injection in Grape with Bearer Tokens — how this specific combination creates or exposes the vulnerability
Grape is a Ruby DSL for building REST-like APIs. When an endpoint accepts a Bearer token in the Authorization header and passes user-controlled data into an LLM call, the combination can enable prompt injection. A typical vulnerable pattern is building a prompt from headers, params, or parsed tokens without treating the header as untrusted input.
Consider an endpoint that forwards an Authorization header value to an LLM client:
require 'grape'
require 'openai'
class MyAPI < Grape::API
format :json
resource :chat do
desc 'Chat with an LLM, passing the Bearer token into the prompt'
params do
requires :message, type: String, desc: 'User message'
end
post do
token = request.env['HTTP_AUTHORIZATION']&.to_s.gsub(/^Bearer\s+/, '')
client = OpenAI::Client.new(access_token: token)
response = client.chat(
parameters: {
model: 'gpt-3.5-turbo',
messages: [
{ role: 'user', content: "User token context: #{token}. #{params[:message]}" }
]
}
)
{ response: response.dig('choices', 0, 'message', 'content') }
end
end
end
In this example, the Bearer token is extracted from the header and interpolated directly into the LLM prompt. An attacker who can influence the request can supply a token designed to leak the system prompt or override instructions (e.g., by including specific jailbreak patterns or role instructions). Because the token is used both for authentication and as part of the LLM input, the boundary between authentication context and LLM context blurs, creating a prompt injection surface.
middleBrick’s LLM/AI Security checks detect this scenario by probing for system prompt leakage, attempting instruction overrides, and identifying when tokens appear in model inputs. The scanner runs sequential probes including system prompt extraction, DAN jailbreak attempts, and data exfiltration tries, flagging cases where authentication data improperly influences LLM behavior. Unauthenticated LLM endpoint detection also highlights endpoints that accept tokens without enforcing strict validation, further widening the injection risk.
Even when tokens are validated for API access, using them in prompts can expose sensitive patterns or enable token harvesting via crafted outputs. Output scanning for API keys and PII helps identify whether leaked tokens or other secrets appear in LLM responses. Because Grape endpoints often integrate multiple services, failing to isolate authentication state from LLM prompts can lead to chained exploits, such as using a stolen token to escalate abuse across integrated systems.
Bearer Tokens-Specific Remediation in Grape — concrete code fixes
Remediation focuses on strict separation: treat the Authorization header solely for authentication, and never include raw header values in LLM prompts. Validate and scope the token early, then use a controlled, sanitized identity representation for downstream logic.
First, validate the token against your authentication service and extract a minimal identity (e.g., user ID or scopes). Do not forward the raw token to the LLM:
require 'grape'
require 'openai'
class MyAPI < Grape::API
format :json
helpers do
def current_user
auth_header = request.env['HTTP_AUTHORIZATION']&.to_s
return nil unless auth_header&.start_with?('Bearer ')
token = auth_header.slice(7..)
# Validate token with your auth provider; return user or nil
User.find_by_token(token)
end
end
resource :chat do
desc 'Chat with an LLM, using identity, not raw token'
params do
requires :message, type: String, desc: 'User message'
end
before { error!('Unauthorized', 401) unless current_user }
post do
user = current_user
# Build prompt without embedding raw token
messages = [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: "User ID: #{user.id}. #{params[:message]}" }
]
client = OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
response = client.chat(
parameters: {
model: 'gpt-3.5-turbo',
messages: messages
}
)
{ response: response.dig('choices', 0, 'message', 'content') }
end
end
end
This approach keeps authentication data out of the LLM context. The token is used only to authenticate and authorize the request, while the prompt includes a safe user identifier. For production, rotate OPENAI_API_KEY via environment management and avoid logging raw tokens or full headers.
middleBrick’s CLI and GitHub Action integrations can help enforce this separation in CI/CD. Use middlebrick scan <url> to detect prompt injection risks, and add the GitHub Action to fail builds if findings include LLM prompt injection patterns. The dashboard tracks scans over time, and the Pro plan provides continuous monitoring so new regressions in header handling are flagged promptly.
When using the MCP Server in AI coding assistants, you can scan endpoints directly from your IDE to catch risky prompt construction before code is merged. These workflows complement runtime protections like strict input validation and output scanning for PII or credentials, which are also part of middleBrick’s LLM/AI Security checks.
Related CWEs: llmSecurity
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |