Excessive Data Exposure in CockroachDB
How Excessive Data Exposure Manifests in CockroachDB
Excessive Data Exposure in CockroachDB environments typically occurs when applications inadvertently return more data than necessary through API endpoints. The risk comes less from the database itself than from how applications use its flexible query features, such as wildcard selects, JSONB columns, and historical reads.
One common manifestation involves SELECT * queries that return entire table rows when only specific columns are needed. For example, a user profile endpoint might execute:
SELECT * FROM users WHERE id = $1
This returns all columns, including sensitive fields like password hashes, internal IDs, or audit timestamps that should never reach the client. CockroachDB's distribution across nodes does not change this: the query returns every column it names, regardless of which node stores the row.
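The difference between the two query styles can be seen in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a CockroachDB connection (an assumption, so the example runs without a cluster); the schema and the `fetch_profile_*` helper names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for CockroachDB here; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT, name TEXT, "
    "password_hash TEXT, internal_notes TEXT)"
)
conn.execute(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    ("u1", "user@example.com", "Ada", "$2b$12$abc", "flagged for review"),
)


def fetch_profile_unsafe(user_id: str) -> dict:
    """SELECT * hands every column, sensitive or not, to the serializer."""
    row = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()
    return dict(row)


def fetch_profile_safe(user_id: str) -> dict:
    """An explicit column list keeps sensitive fields out of the result set."""
    row = conn.execute(
        "SELECT id, email, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return dict(row)


unsafe = fetch_profile_unsafe("u1")
safe = fetch_profile_safe("u1")
```

If the handler serializes the returned dictionary directly, the unsafe variant ships `password_hash` and `internal_notes` to the client, while the safe variant cannot.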
Another common pattern in CockroachDB applications involves JSONB column expansion. When applications serialize entire JSONB objects without filtering:
SELECT id, profile FROM users WHERE id = $1
Developers often forget that JSONB columns can contain nested sensitive data like API keys, internal configuration, or PII that was stored for operational purposes but shouldn't be exposed.
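One way to keep nested JSONB content under control is to copy only allow-listed key paths before serializing. The sketch below assumes the JSONB value has already been decoded into a Python dictionary; `pick_paths`, the sample profile, and the path list are hypothetical names for illustration.

```python
from typing import Any


def pick_paths(document: dict[str, Any], allowed: set[tuple[str, ...]]) -> dict[str, Any]:
    """Return a new dict containing only the allow-listed key paths of `document`."""
    result: dict[str, Any] = {}
    for path in allowed:
        if not path:
            continue  # An empty path selects nothing.
        source: Any = document
        for key in path:
            if not isinstance(source, dict) or key not in source:
                break  # Path is absent in this document; skip it.
            source = source[key]
        else:
            # Every key on the path existed: rebuild the nesting in the result.
            target = result
            for key in path[:-1]:
                target = target.setdefault(key, {})
            target[path[-1]] = source
    return result


profile = {
    "display_name": "Ada",
    "settings": {"theme": "dark", "api_key": "sk-live-123"},
    "ssn": "000-00-0000",
}
PUBLIC_PATHS = {("display_name",), ("settings", "theme")}
public_profile = pick_paths(profile, PUBLIC_PATHS)
```

Because the filter is an allow-list, a field added to the JSONB document later stays private until someone deliberately adds its path.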
CockroachDB's legacy INTERLEAVE functionality can create another exposure vector on older clusters (interleaved tables were deprecated in v20.2 and removed in v21.2). When tables are interleaved for performance:
CREATE TABLE users (id UUID PRIMARY KEY, email STRING) INTERLEAVE IN PARENT accounts (id)
Queries that join the interleaved child to its parent can inadvertently return parent table data alongside child table results, exposing account-level information when only user-level data was intended.
Time-travel queries (AS OF SYSTEM TIME) present unique risks in CockroachDB. Developers might use these for debugging:
SELECT * FROM orders AS OF SYSTEM TIME '-30s' WHERE order_id = $1
without realizing that this exposes historical data, which may include PII or sensitive business information from previous states.
Range queries on indexed columns can also leak excessive data. A seemingly innocuous:
SELECT * FROM products WHERE price BETWEEN 10 AND 20
might return internal cost data, supplier information, or margin calculations stored alongside product data.
CockroachDB-Specific Detection
Detecting Excessive Data Exposure in CockroachDB requires both static analysis of query patterns and runtime monitoring of data flows. middleBrick's API security scanner targets these issues through its black-box scanning approach.
middleBrick tests endpoints backed by CockroachDB by sending requests and analyzing responses for excessive data exposure. The scanner looks for:
- Unexpected columns in JSON responses that match CockroachDB's internal schemas
- Timestamp fields with full sub-second precision (CockroachDB stores timestamps with up to microsecond precision)
- UUIDs in predictable patterns (CockroachDB's distributed UUID generation)
- Array fields containing more data than the API contract specifies
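A rough version of this kind of response inspection can be sketched in a few lines. This is not middleBrick's implementation; the name patterns and the `find_sensitive_keys` helper are assumptions chosen for illustration.

```python
import json
import re

# Hypothetical name patterns; tune these to your own schema.
SENSITIVE_KEY = re.compile(r"(password|secret|token|api_key|internal|ssn)", re.IGNORECASE)


def find_sensitive_keys(payload, prefix=""):
    """Recursively collect dotted paths of keys whose names look sensitive."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            path = f"{prefix}.{key}" if prefix else key
            if SENSITIVE_KEY.search(key):
                hits.append(path)
            hits.extend(find_sensitive_keys(value, path))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_sensitive_keys(item, f"{prefix}[{index}]"))
    return hits


body = json.loads(
    '{"id": "abc", "email": "user@example.com", '
    '"password_hash": "x", "meta": {"internal_notes": "y"}}'
)
flagged = find_sensitive_keys(body)
```

A name-based heuristic like this produces false positives and misses renamed fields, so it works best as a regression check alongside a schema comparison rather than as the only control.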
The scanner's 12 parallel security checks include Property Authorization testing specifically designed to catch when CockroachDB queries return unauthorized data. For example, it might detect that a user profile endpoint returns:
{
  "id": "uuid-here",
  "email": "user@example.com",
  "password_hash": "$2b$12$abc...",
  "created_at": "2024-01-15 10:30:45.123456Z",
  "updated_at": "2024-01-20 14:22:11.987654Z",
  "internal_notes": "Sensitive internal data..."
}
middleBrick's LLM/AI Security checks are particularly relevant for CockroachDB applications using AI features. The scanner tests for system prompt leakage that might contain database credentials or CockroachDB-specific configuration data.
For OpenAPI spec analysis, middleBrick cross-references your API definitions with actual runtime responses. If your spec defines a minimal user object but the CockroachDB query returns 20+ fields, the scanner flags this discrepancy.
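The idea behind such a cross-check can be illustrated with a minimal sketch that compares a response against the `properties` of an OpenAPI schema object. It only inspects top-level keys, and the `undocumented_fields` helper and sample data are hypothetical, not part of any scanner's API.

```python
def undocumented_fields(spec_schema: dict, response: dict) -> set[str]:
    """Return top-level response keys that the schema's `properties` do not declare."""
    declared = set(spec_schema.get("properties", {}))
    return set(response) - declared


# Hypothetical OpenAPI schema object describing a minimal user.
user_schema = {
    "type": "object",
    "properties": {"id": {"type": "string"}, "email": {"type": "string"}},
}
api_response = {
    "id": "uuid-here",
    "email": "user@example.com",
    "password_hash": "$2b$12$abc...",
    "internal_notes": "Sensitive internal data...",
}
extra = undocumented_fields(user_schema, api_response)
```

Running a check like this in integration tests turns a silent schema drift (for example, a new column picked up by `SELECT *`) into a failing test.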
Continuous monitoring in the Pro plan automatically rescans your CockroachDB APIs on a schedule, alerting you when new excessive data exposure vulnerabilities appear due to schema changes or query modifications.
CockroachDB-Specific Remediation
Remediating Excessive Data Exposure in CockroachDB requires both query-level fixes and architectural changes. Here are CockroachDB-specific remediation strategies:
1. Explicit Column Selection
-- Bad: SELECT *
-- Good: Explicit columns only
SELECT id, email, name, created_at FROM users WHERE id = $1
-- For JSONB columns, use explicit field selection
SELECT id, profile->'public_info' AS profile FROM users WHERE id = $1
2. CockroachDB's Computed Columns for Data Masking
CREATE TABLE users (
  id UUID PRIMARY KEY,
  email STRING,
  sensitive_data STRING,
  -- Computed column that exposes only public fields.
  -- Computed expressions must be immutable; if your CockroachDB version
  -- rejects jsonb_build_object here, define a view with the same projection instead.
  public_profile JSONB AS (
    jsonb_build_object(
      'id', id,
      'email', email
    )
  ) STORED
);
-- Query only the computed column
SELECT public_profile FROM users WHERE id = $1
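3. Row-Level Security (RLS) with CockroachDB
-- Requires a CockroachDB version with row-level security support.
-- RLS limits which rows a query can see; it does not hide columns.
ALTER TABLE users
  ENABLE ROW LEVEL SECURITY;
-- Policy restricting each SQL user to their own row.
-- Assumes the SQL user name equals the row's id rendered as text;
-- adapt this to your own identity mapping.
CREATE POLICY user_access ON users
  FOR SELECT
  USING (id::STRING = current_user);
-- Alternatively, use application-based filtering
CREATE POLICY app_filter ON users
  FOR SELECT
  USING (email NOT LIKE '%internal%');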
4. CockroachDB's INTERLEAVE Data Isolation
-- Avoid exposing parent table data
CREATE TABLE user_profiles (
  user_id UUID PRIMARY KEY,
  email STRING,
  -- Only include necessary fields
  profile_data JSONB,
  -- No interleaving with accounts table
  CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id)
);
-- Query the child table directly when parent data isn't needed
SELECT up.email, up.profile_data
FROM user_profiles up
WHERE up.user_id = $1;
5. Time-Travel Query Restrictions
-- Wrapper function with a fixed column list. Callers go through this function
-- instead of writing ad hoc queries, so the API layer never attaches AS OF SYSTEM TIME itself.
-- Assumes a CockroachDB version whose PL/pgSQL support includes RETURNS TABLE and RETURN QUERY.
CREATE OR REPLACE FUNCTION safe_select_user(p_user_id UUID)
RETURNS TABLE (
  id UUID,
  email STRING,
  name STRING
) AS $$
BEGIN
  -- Qualify columns so they are not confused with the output column names
  RETURN QUERY
    SELECT u.id, u.email, u.name
    FROM users AS u
    WHERE u.id = p_user_id;
END;
$$ LANGUAGE plpgsql;
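A complementary guard can live in the application's data-access layer, rejecting historical reads before a statement is sent. The sketch below is an assumption about how such a check might look; `assert_no_time_travel` and its `allow_historical` flag are hypothetical names, and the regular expression also matches the phrase inside string literals.

```python
import re

# Matches CockroachDB's historical-read clause, ignoring case and extra whitespace.
_TIME_TRAVEL = re.compile(r"\bAS\s+OF\s+SYSTEM\s+TIME\b", re.IGNORECASE)


def assert_no_time_travel(sql: str, *, allow_historical: bool = False) -> str:
    """Return `sql` unchanged, or raise if it contains AS OF SYSTEM TIME."""
    if not allow_historical and _TIME_TRAVEL.search(sql):
        raise ValueError("historical reads are not allowed on this code path")
    return sql


current_read = assert_no_time_travel("SELECT id, email FROM users WHERE id = $1")
```

Debugging tools that legitimately need history can pass `allow_historical=True`, which keeps the exception explicit and easy to audit.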
6. API Response Filtering
-- Use CockroachDB's JSON functions to filter responses
CREATE OR REPLACE FUNCTION filter_user_response(user_id UUID)
RETURNS JSONB AS $$
DECLARE
  user_data JSONB;
BEGIN
  SELECT row_to_json(t) INTO user_data
  FROM (
    SELECT id, email, name, created_at
    FROM users
    WHERE id = user_id
  ) t;
  -- Defense in depth: the subquery already omits these fields, but strip them
  -- anyway in case the column list is widened later
  RETURN user_data - 'password_hash' - 'internal_notes';
END;
$$ LANGUAGE plpgsql;
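The same allow-list idea can be applied in the application layer by mapping each row onto a fixed response type before serialization. The sketch below is a minimal illustration; the `PublicUser` class and the sample row are hypothetical.

```python
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class PublicUser:
    """The only fields the API contract promises to return."""

    id: str
    email: str
    name: str

    @classmethod
    def from_row(cls, row: dict) -> "PublicUser":
        # Copy only the declared fields; anything else in the row is dropped.
        return cls(**{f.name: row[f.name] for f in fields(cls)})


row = {
    "id": "u1",
    "email": "user@example.com",
    "name": "Ada",
    "password_hash": "$2b$12$abc",
    "internal_notes": "flagged for review",
}
response_body = asdict(PublicUser.from_row(row))
```

Because the response type is declared once, adding a column to the table never changes the API output until the dataclass itself is edited.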
Related CWEs: Property Authorization
| CWE ID | Name | Severity |
|---|---|---|
| CWE-915 | Mass Assignment | HIGH |