
XML External Entities in Rails with DynamoDB

XML External Entities in Rails with DynamoDB: how this specific combination creates or exposes the vulnerability

An XML External Entity (XXE) attack occurs when an application processes XML input that references external entities, allowing an attacker to read local files, perform SSRF, or cause denial of service. In a Ruby on Rails application that uses the AWS SDK for Ruby to interact with Amazon DynamoDB, XXE can be introduced if the app parses untrusted XML data before constructing requests to DynamoDB.

Consider a scenario where Rails receives an XML upload or SOAP message and parses it with a parser that has been configured to resolve entities (for example, Nokogiri with the NOENT or DTDLOAD parse options enabled; Nokogiri's default options do not substitute external entities). If the XML includes a malicious DOCTYPE that defines external entities, the parser resolves those entities during processing. This can lead to file reads on the Rails server or SSRF to internal metadata services. DynamoDB itself does not process XML, so the vulnerability lives entirely in the Rails layer, before any data is sent to DynamoDB. For example, an attacker might provide an XML payload that causes the Rails app to read /etc/passwd or, where network access is permitted, to make internal HTTP requests to the instance metadata service at 169.254.169.254, potentially discovering the credentials used by the DynamoDB client.
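To make the attack concrete, the sketch below shows a typical XXE payload and a conservative pre-parse guard that refuses any document carrying a DOCTYPE or ENTITY declaration. The names XXE_PAYLOAD, BENIGN_XML, and dtd_declared? are illustrative assumptions, not part of Rails or any library. The guard is a coarse first filter and does not replace parser hardening.

```ruby
# Illustrative XXE payload: the DOCTYPE defines an external entity that points
# at a local file, and the element body references that entity.
XXE_PAYLOAD = <<~XML
  <?xml version="1.0"?>
  <!DOCTYPE item [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
  ]>
  <item><name>&xxe;</name></item>
XML

BENIGN_XML = '<item><name>widget</name></item>'

# Hypothetical helper: returns true when the raw XML declares a DTD or an
# entity. API payloads rarely need either, so rejecting them outright is a
# cheap check to run before the document reaches any parser.
def dtd_declared?(raw_xml)
  # .b treats the input as raw bytes so invalid encodings cannot raise here
  raw_xml.to_s.b.match?(/<!(DOCTYPE|ENTITY)/i)
end

puts dtd_declared?(XXE_PAYLOAD) # true
puts dtd_declared?(BENIGN_XML)  # false
```

A byte-level pattern like this can be sidestepped by unusual encodings such as UTF-16, so it is best treated as defense in depth in front of a parser that already refuses to resolve entities.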

Moreover, if the Rails application uses XML-based protocols or document-style SOAP APIs that eventually store or query data in DynamoDB, unsafe parsing of the inbound XML can expose sensitive data or enable server-side request forgery into internal services. The DynamoDB requests built from parsed XML content may inadvertently include sensitive information extracted via external entities, or the SSRF induced by XXE can allow probing of internal AWS metadata endpoints where temporary DynamoDB credentials are available.

The risk is compounded when Rails parses XML parameters automatically, for example through the actionpack-xml_parser gem (which restores the XML parameter parsing that was removed from Rails core in 4.0) or through legacy SOAP handlers, because developers might not realize that untrusted XML reaches the parsing layer. This creates a path where XXE exposes environment variables or configuration files that contain AWS access keys, which can then be abused to make unauthorized DynamoDB calls. The combination of XML parsing in Rails and a DynamoDB integration therefore widens the attack surface unless input validation and parser hardening are applied.

DynamoDB-Specific Remediation in Rails: concrete code fixes

To prevent XXE in a Rails application that interacts with DynamoDB, ensure XML parsing is hardened and untrusted XML is never processed by a parser with entity substitution, DTD loading, or network access enabled. Use safe parsing options and avoid enabling external entities or DTDs entirely.

1. Safe XML parsing with Nokogiri

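Keep entity substitution and network access disabled in Nokogiri. Its default parse options already include NONET and leave out NOENT and DTDLOAD; the danger comes from code that sets NOENT (which, despite its name, turns entity substitution on) or DTDLOAD. For untrusted input, pass explicit options to Nokogiri::XML, or use the SAX parser with a handler that only records what the application expects.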

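# Safe parsing for untrusted XML in a Rails controller or service
require 'nokogiri'

# Minimal SAX handler: records parse errors and ignores everything else
class MySafeHandler < Nokogiri::XML::SAX::Document
  def errors
    @errors ||= []
  end

  # Nokogiri reports each parse error through this callback
  def error(message)
    errors << message
  end
end

xml_input = params[:xml] || request.body.read
handler = MySafeHandler.new
# Stream the document through the handler instead of building a DOM
Nokogiri::XML::SAX::Parser.new(handler).parse(xml_input)
if handler.errors.any?
  Rails.logger.error("Invalid XML: #{handler.errors.first}")
  head :bad_request
end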

# Example safe DOM parsing
secure_doc = Nokogiri::XML(xml_input) do |config|
  # Never set NOENT or DTDLOAD here: despite its name, NOENT turns entity substitution ON
  # strict raises Nokogiri::XML::SyntaxError on malformed input; nonet forbids network access
  config.strict.nonet.noblanks
end
# Validate/sanitize before using data to build DynamoDB requests

2. Avoid XML parsing when possible

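If your API uses JSON, remove XML parsing from the Rails app to eliminate the attack surface. Rails 4.0 and later parse XML request bodies only when the actionpack-xml_parser gem is installed, so drop that gem if it is present. Configure your controllers to respond to JSON only, and ensure you do not rely on XML-based parameter parsing.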

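# app/controllers/api_controller.rb
class ApiController < ApplicationController
  before_action :reject_xml_requests
  before_action :set_request_format

  private

  # Refuse XML request bodies outright instead of parsing them
  def reject_xml_requests
    head :unsupported_media_type if request.media_type.to_s.include?('xml')
  end

  # Always respond with JSON, regardless of the requested format
  def set_request_format
    request.format = :json
  end
end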

# In routes, prefer JSON-only endpoints
Rails.application.routes.draw do
  namespace :api do
    resources :items, only: [:index, :show], defaults: { format: 'json' }
  end
end
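The same JSON-only policy can also be enforced before a request reaches any controller. The following is a minimal sketch of a Rack middleware that answers 415 for XML content types; the class name RejectXmlRequests and the error body are assumptions made for illustration. In a Rails app it could be registered with config.middleware.use RejectXmlRequests in config/application.rb.

```ruby
# Hypothetical Rack middleware: responds with 415 to any request whose
# Content-Type mentions XML (application/xml, text/xml, application/soap+xml),
# so XML bodies never reach Rails parameter parsing or controller code.
class RejectXmlRequests
  XML_TYPE = /xml/i

  def initialize(app)
    @app = app
  end

  def call(env)
    content_type = env['CONTENT_TYPE'].to_s
    if content_type.match?(XML_TYPE)
      body = '{"error":"XML request bodies are not accepted"}'
      [415, { 'content-type' => 'application/json' }, [body]]
    else
      @app.call(env)
    end
  end
end

# Stand-in downstream app for demonstration purposes only.
app = RejectXmlRequests.new(->(_env) { [200, {}, ['ok']] })
puts app.call('CONTENT_TYPE' => 'application/xml').first  # 415
puts app.call('CONTENT_TYPE' => 'application/json').first # 200
```

Rejecting by content type is deliberately blunt: endpoints that genuinely need XML would have to be exempted and routed through a hardened parser instead.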

3. Secure DynamoDB interactions

When sending data to DynamoDB, validate and sanitize all inputs. Do not directly use raw parsed XML fields in DynamoDB requests. Use strong parameter filtering and schema validation.

# Example service object that safely writes to DynamoDB using the AWS SDK for Ruby
require 'aws-sdk-dynamodb'

class ItemSyncService
  def initialize
    @dynamodb = Aws::DynamoDB::Client.new(region: 'us-east-1')
  end

  def save_item(params)
    # Whitelist and validate attributes; do not trust XML-derived hashes
    item = {
      id: params.require(:id),
      name: params.fetch(:name, 'unnamed'),
      metadata: sanitize_metadata(params[:metadata])
    }

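    @dynamodb.put_item(
      table_name: 'Items',
      # Aws::DynamoDB::Client marshals plain Ruby values by default
      # (simple_attributes: true), so pass strings directly rather than { s: ... } wrappers
      item: {
        'id' => item[:id].to_s,
        'name' => item[:name].to_s,
        'metadata' => item[:metadata]
      }
    )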
  rescue Aws::DynamoDB::Errors::ServiceError => e
    Rails.logger.error("DynamoDB error: #{e.message}")
    raise
  end

  private

  def sanitize_metadata(raw)
    return '{}' unless raw.is_a?(String)
    # Basic sanitization example: strip anything that looks like a markup tag.
    # Prefer strict schema validation over pattern stripping in practice.
    raw.gsub(/<[^>]*>/, '')
  end
end
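The schema validation mentioned above could look like the following sketch, which checks XML-derived fields against an allowlist before they are handed to put_item. The module name ItemAttributes, the id format, and the length limit are assumptions chosen for illustration; the input is assumed to be a string-keyed hash such as one produced by walking a parsed document.

```ruby
# Hypothetical allowlist validator for attributes extracted from parsed XML.
module ItemAttributes
  ID_FORMAT = /\A[A-Za-z0-9_-]{1,64}\z/
  MAX_NAME_LENGTH = 256

  # Returns a hash of plain Ruby strings suitable for put_item, or raises
  # ArgumentError when a field is missing or malformed.
  def self.validate!(raw)
    id = raw['id'].to_s
    raise ArgumentError, 'invalid id' unless id.match?(ID_FORMAT)

    name = raw.fetch('name', 'unnamed').to_s.strip
    if name.empty? || name.length > MAX_NAME_LENGTH
      raise ArgumentError, 'invalid name'
    end

    # Only allowlisted keys are returned; anything else is dropped.
    { 'id' => id, 'name' => name }
  end
end

clean = ItemAttributes.validate!('id' => 'abc-1', 'name' => ' Widget ', 'role' => 'admin')
puts clean.inspect

begin
  # Content that looks like a leaked /etc/passwd line fails the id format.
  ItemAttributes.validate!('id' => 'root:x:0:0:root:/root:/bin/bash')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

A strict format on each field means that even if an entity were expanded upstream, file contents or metadata responses would be unlikely to pass validation and end up stored in the table.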

4. CI/CD and scanning integration

Use the middleBrick GitHub Action to add API security checks to your CI/CD pipeline. This helps catch insecure XML handling or missing parser hardening before deployment. Pair this with the middleBrick CLI to scan endpoints during development and the middleBrick Web Dashboard to track security scores over time.

# Example GitHub Action step
- name: Run middleBrick API security scan
  uses: middlebrick/action@v1
  with:
    url: ${{ secrets.STAGING_API_URL }}
    threshold: C

Frequently Asked Questions

Does middleBrick fix XXE vulnerabilities in Rails apps?
middleBrick detects and reports XXE and related findings with severity, guidance, and framework mappings (e.g., OWASP API Top 10). It does not fix or patch; developers must apply safe parsing and input validation based on the remediation guidance.

Can middleBrick scan APIs that integrate with DynamoDB?
Yes. middleBrick scans the unauthenticated attack surface of any reachable API endpoint. If your Rails app exposes endpoints that eventually write to DynamoDB, middleBrick will test those endpoints for issues such as XXE, regardless of the backend datastore.