
LLM Data Leakage in Fiber

How LLM Data Leakage Manifests in Fiber

LLM data leakage in Fiber applications typically occurs through misconfigured API endpoints that expose sensitive model parameters, system prompts, or training data. Fiber's lightweight design and rapid development workflow can inadvertently create exposure points that attackers exploit.

The most common manifestation involves Fiber endpoints that return raw LLM responses without proper sanitization. Consider a chatbot endpoint that logs and returns the full conversation history including system prompts:

router := fiber.New()
router.Post("/chat", func(c *fiber.Ctx) error {
    var request ChatRequest
    if err := c.BodyParser(&request); err != nil {
        return c.Status(400).JSON(fiber.Map{"error": err.Error()})
    }
    
    response := llm.Chat(request.Message, request.Context)
    log.Println("Chat response:", response) // Logs full system prompt
    return c.JSON(fiber.Map{"response": response})
})

This pattern exposes the system prompt and conversation history to anyone who can access the endpoint. Attackers can extract proprietary instructions, API keys embedded in prompts, or sensitive training data through repeated requests.
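A simple structural defense is to never serialize the full conversation back to the client in the first place. The sketch below shows the idea in plain Go; the `Turn` type and `lastAssistantMessage` helper are illustrative, not part of Fiber or any LLM SDK:

```go
package main

import "fmt"

// Turn is a single message in a conversation (hypothetical type).
type Turn struct {
	Role    string // "system", "user", or "assistant"
	Content string
}

// lastAssistantMessage returns only the model's final reply, so the
// system prompt and earlier turns never leave the server.
func lastAssistantMessage(history []Turn) string {
	for i := len(history) - 1; i >= 0; i-- {
		if history[i].Role == "assistant" {
			return history[i].Content
		}
	}
	return ""
}

func main() {
	history := []Turn{
		{Role: "system", Content: "You are an internal support bot."},
		{Role: "user", Content: "Hi"},
		{Role: "assistant", Content: "Hello! How can I help?"},
	}
	fmt.Println(lastAssistantMessage(history))
}
```

Returning a narrow, purpose-built value instead of the raw LLM response means a logging or serialization mistake elsewhere cannot leak the system prompt.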

Another Fiber-specific vulnerability involves improper error handling that reveals model configuration. When exceptions occur during LLM processing, Fiber applications might return stack traces containing model names, version information, or API endpoint details:

router := fiber.New()
router.Post("/generate", func(c *fiber.Ctx) error {
    var request GenerateRequest
    if err := c.BodyParser(&request); err != nil {
        return err // Returns the raw parser error to the client
    }
    
    result, err := llm.Generate(request.Prompt, request.Options)
    if err != nil {
        // Leaks model names, versions, and API endpoint details
        return c.Status(500).JSON(fiber.Map{"error": err.Error()})
    }
    return c.JSON(fiber.Map{"result": result})
})

Improper authorization in Fiber's middleware chain can also lead to data leakage. If authentication middleware is missing from certain routes or registered in the wrong order, unauthenticated users can reach endpoints that should be protected:

router := fiber.New()
router.Post("/admin/generate", func(c *fiber.Ctx) error {
    // Unprotected: registered before the auth middleware below
    result := llm.GenerateAdminPrompt()
    return c.JSON(fiber.Map{"result": result})
})
router.Use(authMiddleware) // Too late: Fiber runs handlers in registration order, so the route above skips it

Fiber's default CORS configuration can also contribute to data leakage when not properly restricted. Wide-open CORS policies allow any origin to make requests, potentially exposing sensitive LLM responses to unauthorized domains.
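The core of a restrictive CORS policy is an exact-match origin check. The plain-Go sketch below illustrates the idea; the `originAllowed` helper is illustrative, not part of Fiber:

```go
package main

import "fmt"

// originAllowed reports whether a request's Origin header is on an
// explicit allowlist -- the opposite of a wide-open CORS policy.
func originAllowed(origin string, allowlist []string) bool {
	for _, allowed := range allowlist {
		if origin == allowed {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"https://yourdomain.com"}
	fmt.Println(originAllowed("https://yourdomain.com", allow)) // true
	fmt.Println(originAllowed("https://evil.example", allow))   // false
}
```

In practice you would configure this through Fiber's CORS middleware rather than hand-rolling the check; the remediation section below shows that configuration.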

Fiber-Specific Detection

Detecting LLM data leakage in Fiber applications requires both static code analysis and dynamic runtime scanning. Start with a comprehensive code review focusing on LLM-related endpoints and their response handling.

Static analysis should identify these patterns:

# Search for raw LLM response returns
grep -r "return c\.JSON(" . | grep -E "(llm|model|prompt)"

# Find endpoints without proper authentication
grep -rE "router\.(Get|Post|Put|Delete)" . | grep -v "authMiddleware"

# Look for exposed system prompts
grep -r "Chat(" . | grep -E "(System|system)"

Dynamic scanning with middleBrick provides automated detection of LLM-specific vulnerabilities. The scanner tests for system prompt leakage using 27 regex patterns that match common LLM prompt formats:

{
  "llm_security": {
    "system_prompt_leakage": {
      "detected": true,
      "patterns_matched": ["ChatML", "Llama-2", "Alpaca"],
      "severity": "high",
      "remediation": "Sanitize LLM responses before returning to clients"
    }
  }
}

middleBrick's active prompt injection testing is particularly effective for Fiber applications. The scanner sends five sequential probes to test for prompt injection vulnerabilities:

1. System prompt extraction probe
2. Instruction override test
3. DAN jailbreak attempt
4. Data exfiltration check
5. Cost exploitation test

During runtime testing, monitor Fiber application logs for unexpected patterns. Enable detailed logging and watch for:

# Check logs for prompt injection attempts
tail -f /var/log/fiber-app.log | grep -E "(SYSTEM|INJECT|PROMPT)"

# Monitor for excessive agency patterns
journalctl -u fiber-app | grep -E "(tool_call|function_call|agent)"

API specification analysis is crucial for Fiber applications using OpenAPI specs. middleBrick resolves $ref references and cross-references spec definitions with runtime findings to identify mismatches between documented and actual behavior.

Fiber-Specific Remediation

Remediating LLM data leakage in Fiber requires a multi-layered approach combining input validation, response sanitization, and proper authentication controls. Start by implementing comprehensive response filtering before sending LLM data to clients.

// Requires: import "regexp"
func sanitizeLLMResponse(response string) string {
    // Redact system prompts and sensitive content.
    // Go's regexp package (RE2) does not support lookaheads such as (?=...),
    // so these patterns match through to the end of the line instead.
    patterns := []string{
        `(?i)(system|instruction):.*`,
        `(?i)(dan|jailbreak|ignore previous).*`,
        `\b(API_KEY|SECRET|TOKEN)\b\S*`,
    }
    
    sanitized := response
    for _, pattern := range patterns {
        re := regexp.MustCompile(pattern)
        sanitized = re.ReplaceAllString(sanitized, "[REDACTED]")
    }
    return sanitized
}

router := fiber.New()
router.Post("/chat", func(c *fiber.Ctx) error {
    var request ChatRequest
    if err := c.BodyParser(&request); err != nil {
        return c.Status(400).JSON(fiber.Map{"error": "Invalid request"})
    }
    
    // Process LLM request
    rawResponse := llm.Chat(request.Message, request.Context)
    
    // Sanitize before returning
    sanitized := sanitizeLLMResponse(rawResponse)
    
    return c.JSON(fiber.Map{"response": sanitized})
})

Implement proper authentication middleware for all LLM endpoints. Use Fiber's Next() method to ensure middleware chains are properly configured:

func authMiddleware(c *fiber.Ctx) error {
    token := c.Get("Authorization")
    if token == "" || !validateToken(token) {
        return c.Status(401).JSON(fiber.Map{"error": "Unauthorized"})
    }
    return c.Next()
}

router := fiber.New()
router.Use(authMiddleware) // Apply to all routes

// All routes now require authentication
router.Post("/chat", chatHandler)
router.Post("/generate", generateHandler)
router.Post("/admin/generate", adminGenerateHandler)

Configure strict CORS policies to prevent unauthorized cross-origin requests:

router := fiber.New()
// github.com/gofiber/fiber/v2/middleware/cors expects comma-separated strings
router.Use(cors.New(cors.Config{
    AllowOrigins:  "https://yourdomain.com",
    AllowMethods:  "GET,POST,PUT,DELETE",
    AllowHeaders:  "Authorization,Content-Type",
    ExposeHeaders: "Content-Length",
    MaxAge:        86400, // cache preflight responses for 24 hours
}))

Implement rate limiting to prevent automated data extraction attempts:

// Uses github.com/gofiber/fiber/v2/middleware/limiter
router := fiber.New()
router.Use(limiter.New(limiter.Config{
    Next: func(c *fiber.Ctx) bool {
        // Skip limiting for non-LLM endpoints and authenticated admin users
        return !strings.Contains(c.Path(), "llm") || isAdmin(c)
    },
    Max:        10,              // at most 10 requests...
    Expiration: 1 * time.Minute, // ...per minute per client
}))

Add comprehensive error handling that doesn't reveal sensitive information:

func safeGenerateHandler(c *fiber.Ctx) error {
    var request GenerateRequest
    if err := c.BodyParser(&request); err != nil {
        return c.Status(400).JSON(fiber.Map{"error": "Invalid input"})
    }
    
    defer func() {
        if r := recover(); r != nil {
            log.Printf("Recovered from panic in generate handler: %v", r)
            c.Status(500).JSON(fiber.Map{"error": "Internal server error"})
        }
    }()
    
    result, err := llm.Generate(request.Prompt, request.Options)
    if err != nil {
        log.Printf("LLM error: %v", err) // Log details server-side only
        return c.Status(500).JSON(fiber.Map{"error": "Generation failed"})
    }
    
    return c.JSON(fiber.Map{"result": result})
}

Integrate middleBrick scanning into your CI/CD pipeline to catch these issues before deployment:

name: Security Scan
on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Run middleBrick Scan
      run: |
        npm install -g middlebrick
        middlebrick scan https://staging.yourapp.com/api/llm --threshold B
    - name: Fail on high risk
      if: failure()
      run: exit 1

Related CWEs

CWE ID    Name                                                   Severity
CWE-754   Improper Check for Unusual or Exceptional Conditions   MEDIUM

Frequently Asked Questions

How can I test my Fiber application for LLM data leakage without exposing production data?
Use middleBrick's self-service scanning on a staging environment. The scanner tests unauthenticated attack surfaces without requiring credentials or access to production data. Run scans against your staging API endpoints to identify vulnerabilities before they reach production.
Does middleBrick's LLM security scanning work with all LLM providers in Fiber applications?
Yes, middleBrick's scanner tests the HTTP API endpoints regardless of the underlying LLM provider (OpenAI, Anthropic, local models, etc.). The scanner focuses on the exposed API surface and response patterns, not the specific model implementation.