HIGH prompt injectiongrapebearer tokens

Prompt Injection in Grape with Bearer Tokens

Prompt Injection in Grape with Bearer Tokens — how this specific combination creates or exposes the vulnerability

Grape is a Ruby DSL for building REST-like APIs. When an endpoint accepts a Bearer token in the Authorization header and passes user-controlled data into an LLM call, the combination can enable prompt injection. A typical vulnerable pattern is building a prompt from headers, params, or parsed tokens without treating the header as untrusted input.

Consider an endpoint that forwards an Authorization header value to an LLM client:

require 'grape'
require 'openai'

class MyAPI < Grape::API
  format :json

  resource :chat do
    desc 'Chat with an LLM, passing the Bearer token into the prompt'
    params do
      requires :message, type: String, desc: 'User message'
    end
    post do
      token = request.env['HTTP_AUTHORIZATION']&.to_s.gsub(/^Bearer\s+/, '')
      client = OpenAI::Client.new(access_token: token)
      response = client.chat(
        parameters: {
          model: 'gpt-3.5-turbo',
          messages: [
            { role: 'user', content: "User token context: #{token}. #{params[:message]}" }
          ]
        }
      )
      { response: response.dig('choices', 0, 'message', 'content') }
    end
  end
end

In this example, the Bearer token is extracted from the header and interpolated directly into the LLM prompt. An attacker who can influence the request can supply a token designed to leak the system prompt or override instructions (e.g., by including specific jailbreak patterns or role instructions). Because the token is used both for authentication and as part of the LLM input, the boundary between authentication context and LLM context blurs, creating a prompt injection surface.

middleBrick’s LLM/AI Security checks detect this scenario by probing for system prompt leakage, attempting instruction overrides, and identifying when tokens appear in model inputs. The scanner runs sequential probes including system prompt extraction, DAN jailbreak attempts, and data exfiltration tries, flagging cases where authentication data improperly influences LLM behavior. Unauthenticated LLM endpoint detection also highlights endpoints that accept tokens without enforcing strict validation, further widening the injection risk.

Even when tokens are validated for API access, using them in prompts can expose sensitive patterns or enable token harvesting via crafted outputs. Output scanning for API keys and PII helps identify whether leaked tokens or other secrets appear in LLM responses. Because Grape endpoints often integrate multiple services, failing to isolate authentication state from LLM prompts can lead to chained exploits, such as using a stolen token to escalate abuse across integrated systems.

Bearer Tokens-Specific Remediation in Grape — concrete code fixes

Remediation focuses on strict separation: treat the Authorization header solely for authentication, and never include raw header values in LLM prompts. Validate and scope the token early, then use a controlled, sanitized identity representation for downstream logic.

First, validate the token against your authentication service and extract a minimal identity (e.g., user ID or scopes). Do not forward the raw token to the LLM:

require 'grape'
require 'openai'

class MyAPI < Grape::API
  format :json

  helpers do
    def current_user
      auth_header = request.env['HTTP_AUTHORIZATION']&.to_s
      return nil unless auth_header&.start_with?('Bearer ')
      token = auth_header.slice(7..)
      # Validate token with your auth provider; return user or nil
      User.find_by_token(token)
    end
  end

  resource :chat do
    desc 'Chat with an LLM, using identity, not raw token'
    params do
      requires :message, type: String, desc: 'User message'
    end
    before { error!('Unauthorized', 401) unless current_user }
    post do
      user = current_user
      # Build prompt without embedding raw token
      messages = [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: "User ID: #{user.id}. #{params[:message]}" }
      ]
      client = OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
      response = client.chat(
        parameters: {
          model: 'gpt-3.5-turbo',
          messages: messages
        }
      )
      { response: response.dig('choices', 0, 'message', 'content') }
    end
  end
end

This approach keeps authentication data out of the LLM context. The token is used only to authenticate and authorize the request, while the prompt includes a safe user identifier. For production, rotate OPENAI_API_KEY via environment management and avoid logging raw tokens or full headers.

middleBrick’s CLI and GitHub Action integrations can help enforce this separation in CI/CD. Use middlebrick scan <url> to detect prompt injection risks, and add the GitHub Action to fail builds if findings include LLM prompt injection patterns. The dashboard tracks scans over time, and the Pro plan provides continuous monitoring so new regressions in header handling are flagged promptly.

When using the MCP Server in AI coding assistants, you can scan endpoints directly from your IDE to catch risky prompt construction before code is merged. These workflows complement runtime protections like strict input validation and output scanning for PII or credentials, which are also part of middleBrick’s LLM/AI Security checks.

Related CWEs: llmSecurity

CWE IDNameSeverity
CWE-754Improper Check for Unusual or Exceptional Conditions MEDIUM

Frequently Asked Questions

Why is including a Bearer token in an LLM prompt considered a prompt injection risk?
Including a raw Bearer token in a prompt blurs the boundary between authentication context and LLM instructions. An attacker who can influence the prompt may craft inputs that cause the model to reveal the token or follow injected instructions, leading to system prompt leakage or unauthorized actions.
How can I safely use authentication data with LLM endpoints in Grape?
Validate the token early, extract a minimal identity (e.g., user ID or scopes), and avoid interpolating raw header values into prompts. Use environment variables for LLM API keys and ensure tokens are never exposed to the LLM context or logged inadvertently.