
LLM Data Leakage in Echo Go with CockroachDB

LLM Data Leakage in Echo Go with CockroachDB — how this specific combination creates or exposes the vulnerability

When building a Go API service with Echo and CockroachDB, data leakage to an LLM can occur through both accidental exposure in application behavior and insecure integration patterns. The risk is not that CockroachDB itself sends data to an LLM, but that developer workflows, debugging endpoints, or unredacted logging inadvertently surface sensitive database records in contexts where an LLM service is reachable or invoked.

Echo is a lightweight HTTP framework; if routes such as /debug/requests or /health reflect query parameters or full request bodies into logs or responses, and those logs or responses are ingested by an LLM service (for example, through a connected agent or telemetry pipeline), sensitive data can be exposed. CockroachDB, as a distributed SQL database, often stores structured personal or financial information. If application code constructs log messages that include rows returned from CockroachDB without redaction, and those logs are forwarded to an LLM endpoint, the LLM may receive real data.

A second vector is through generated code or configuration. Some developer tools that integrate with LLMs may capture database schema or sample queries to suggest improvements. If such tools connect to a CockroachDB instance and transmit schema information or queries containing sensitive values to an external LLM, leakage occurs. For example, a tool that introspects tables and sends SHOW CREATE TABLE output to an LLM for optimization advice can expose table structures and column semantics, especially if comments contain sensitive context.

Common insecure patterns include:

  • Printing request parameters and database rows to standard output in development mode without filtering (see the insecure sketch after this list).
  • Using verbose logging middleware that captures full payloads and response bodies, including fields that may contain personally identifiable information (PII).
  • Connecting a code suggestion tool to CockroachDB and allowing it to contact an external LLM without sanitization.
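
For illustration, here is a minimal, deliberately insecure sketch of the first two patterns: an Echo handler that reflects raw query parameters into logs and dumps a full CockroachDB row, SSN included, to standard output. The users table and its columns are assumptions for the example; if this process's stdout is shipped to a telemetry pipeline with an LLM consumer, every field leaks.

// INSECURE EXAMPLE: demonstrates the anti-patterns above; do not use.
package main

import (
	"database/sql"
	"log"
	"net/http"

	"github.com/labstack/echo/v4"
)

var db *sql.DB // assumed to be connected to CockroachDB elsewhere

func insecureUserHandler(c echo.Context) error {
	// Anti-pattern: reflecting raw query parameters into logs.
	log.Printf("request params: %v", c.QueryParams())

	var id int
	var email, ssn string
	row := db.QueryRow("SELECT id, email, ssn FROM users WHERE id = $1", c.Param("id"))
	if err := row.Scan(&id, &email, &ssn); err != nil {
		return c.JSON(http.StatusInternalServerError, map[string]string{"error": "lookup failed"})
	}

	// Anti-pattern: printing the full row, including the SSN, to stdout.
	log.Printf("fetched row: id=%d email=%s ssn=%s", id, email, ssn)
	return c.JSON(http.StatusOK, map[string]interface{}{"id": id, "email": email})
}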

These patterns can cause credentials, connection strings, or row data to appear in LLM training data or logs, violating data minimization principles and potentially exposing regulated data. middleBrick’s LLM/AI Security checks detect system prompt leakage, unsafe tool usage patterns, and outputs that may contain PII or secrets, helping identify such integration risks in an API’s observable behavior.

CockroachDB-Specific Remediation in Echo Go — concrete code fixes

Remediation focuses on preventing sensitive data from reaching LLM-related pathways and ensuring database interactions follow least-privilege and redaction practices. Below are concrete Go examples using the Echo framework and a PostgreSQL-compatible driver (github.com/lib/pq or github.com/jackc/pgx/v5/stdlib), both of which work with CockroachDB's wire protocol.

1. Redact sensitive fields before logging or external transmission

Do not log full rows. Instead, log only necessary, sanitized fields.

package main

import (
	"database/sql"
	"log"
	"net/http"

	"github.com/labstack/echo/v4"
)

// db is assumed to be a *sql.DB connected to CockroachDB, initialized at startup.
var db *sql.DB

type User struct {
	ID       int    `json:"id"`
	Email    string `json:"email"`
	SSN      string `json:"-"` // exclude from JSON
	Password string `json:"-"` // never log or return
}

// sanitizeUser returns a safe version for logging/output
func sanitizeUser(u User) map[string]interface{} {
	return map[string]interface{}{
		"id":    u.ID,
		"email": u.Email,
	}
}

func getUserHandler(c echo.Context) error {
	var user User
	row := db.QueryRow("SELECT id, email, ssn, password FROM users WHERE id = $1", c.Param("id"))
	if err := row.Scan(&user.ID, &user.Email, &user.SSN, &user.Password); err != nil {
		return c.JSON(http.StatusInternalServerError, map[string]string{"error": "unable to fetch user"})
	}

	// Safe: only sanitized data is logged or sent outward
	log.Printf("user accessed: %+v", sanitizeUser(user))
	return c.JSON(http.StatusOK, sanitizeUser(user))
}

2. Use request-logging middleware that excludes sensitive headers and bodies

Avoid logging request bodies that may contain tokens or PII. Echo's RequestLogger middleware records only the fields you explicitly enable, so restrict it to request metadata and keep body-capturing middleware such as BodyDump away from sensitive routes.

package main

import (
	"log"

	"github.com/labstack/echo/v4"
	"github.com/labstack/echo/v4/middleware"
)

func setupServer() *echo.Echo {
	e := echo.New()

	// Security headers with minimal data exposure
	e.Use(middleware.SecureWithConfig(middleware.SecureConfig{
		XSSProtection:      "1; mode=block",
		ContentTypeNosniff: "nosniff",
		XFrameOptions:      "DENY",
		ReferrerPolicy:     "no-referrer-when-downgrade",
	}))

	// Request logging limited to metadata (URI, status, latency).
	// Bodies and headers are never captured, so tokens and PII in
	// payloads cannot reach logs forwarded to telemetry or LLM pipelines.
	e.Use(middleware.RequestLoggerWithConfig(middleware.RequestLoggerConfig{
		LogURI:     true,
		LogStatus:  true,
		LogLatency: true,
		LogValuesFunc: func(c echo.Context, v middleware.RequestLoggerValues) error {
			log.Printf("uri=%s status=%d latency=%s", v.URI, v.Status, v.Latency)
			return nil
		},
	}))

	return e
}

3. Apply principle of least privilege to CockroachDB connections

Ensure the Go service's database user has only the permissions needed, and avoid embedding secrets in code or logs. A dedicated read-only role can be provisioned up front (see the bootstrap sketch after the connection example).

package main

import (
	"database/sql"
	"log"
	"os"

	_ "github.com/jackc/pgx/v5/stdlib" // PostgreSQL-compatible driver; works with CockroachDB
)

func main() {
	// Read the DSN from the environment instead of hardcoding credentials,
	// e.g. postgresql://readonly_user@host:26257/dbname?sslmode=verify-full
	connStr := os.Getenv("DATABASE_URL")
	if connStr == "" {
		log.Fatal("DATABASE_URL is not set")
	}

	db, err := sql.Open("pgx", connStr)
	if err != nil {
		log.Fatalf("failed to open database handle: %v", err)
	}
	defer db.Close()

	// Verify connectivity with minimal privileges
	var version string
	if err := db.QueryRow("SELECT version()").Scan(&version); err != nil {
		log.Fatalf("db query failed: %v", err)
	}
	log.Printf("connected to: %s", version)
}
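
To provision that least-privilege role, a one-time administrative bootstrap can create a read-only user and grant it SELECT only. A minimal sketch, assuming an admin-privileged DSN in an ADMIN_DATABASE_URL environment variable; the role and table names are illustrative, not a fixed convention.

package main

import (
	"database/sql"
	"fmt"
	"log"
	"os"

	_ "github.com/jackc/pgx/v5/stdlib"
)

func main() {
	// ADMIN_DATABASE_URL is an assumption: an admin-privileged DSN,
	// used only for this one-time bootstrap, never by the API service.
	admin, err := sql.Open("pgx", os.Getenv("ADMIN_DATABASE_URL"))
	if err != nil {
		log.Fatalf("failed to open admin handle: %v", err)
	}
	defer admin.Close()

	if err := createReadOnlyUser(admin); err != nil {
		log.Fatalf("bootstrap failed: %v", err)
	}
}

// createReadOnlyUser grants the service account read-only access.
func createReadOnlyUser(admin *sql.DB) error {
	stmts := []string{
		`CREATE USER IF NOT EXISTS readonly_user`,
		`GRANT SELECT ON TABLE users TO readonly_user`, // no INSERT/UPDATE/DELETE or DDL
	}
	for _, stmt := range stmts {
		if _, err := admin.Exec(stmt); err != nil {
			return fmt.Errorf("executing %q: %w", stmt, err)
		}
	}
	return nil
}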

4. Disable or sandbox external LLM tooling in production

If your development workflow uses LLM-assisted tooling that introspects CockroachDB, ensure those tools run in isolated environments and never transmit production data. Prefer local or on-premise instances for sensitive schema analysis, or disable the integration entirely in production services.
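
As a belt-and-suspenders measure, any code path that can reach an external LLM endpoint can be gated on the runtime environment, so production builds refuse to transmit data even when the integration is misconfigured. A minimal sketch; the APP_ENV variable, llmClient type, and Summarize method are illustrative assumptions, not a real SDK.

package main

import (
	"errors"
	"log"
	"os"
)

// llmClient stands in for whatever LLM integration the tooling uses.
type llmClient struct{}

func (llmClient) Summarize(text string) (string, error) {
	// The actual call to the external LLM service would go here.
	return "summary of: " + text, nil
}

// summarizeSchema refuses to contact the LLM outside sandboxed environments.
func summarizeSchema(client llmClient, schema string) (string, error) {
	if os.Getenv("APP_ENV") == "production" {
		return "", errors.New("LLM integrations are disabled in production")
	}
	return client.Summarize(schema)
}

func main() {
	out, err := summarizeSchema(llmClient{}, "CREATE TABLE users (...)") // schema text is illustrative
	if err != nil {
		log.Println(err)
		return
	}
	log.Println(out)
}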

Related CWEs (category: llmSecurity)

CWE ID     Name                                                   Severity
CWE-754    Improper Check for Unusual or Exceptional Conditions   MEDIUM

Frequently Asked Questions

Can middleBrick detect LLM data leakage risks in an Echo + CockroachDB API?
Yes. middleBrick's LLM/AI Security checks look for system prompt leakage, unsafe tool usage patterns, and outputs that may contain PII or secrets, which can indicate data exposure in integrations.

Does middleBrick fix data leakage issues automatically?
No. middleBrick detects and reports findings with remediation guidance. It does not fix, patch, block, or remediate. You should apply secure coding practices and redaction based on its guidance.