Logging and Monitoring Failures in CockroachDB
How Logging and Monitoring Failures Manifest in CockroachDB
When a CockroachDB‑backed API lacks sufficient logging and monitoring, attackers can operate unseen. Typical manifestations include:
- Brute‑force or credential‑stuffing attempts against the SQL interface are not recorded, allowing an attacker to guess passwords without triggering alerts.
- Data‑exfiltration via SQL injection or unauthorized SELECT statements leaves no trace in audit logs, so the breach is discovered only after data appears elsewhere.
- Privilege‑escalation through misconfigured roles or unauthorized DDL (e.g., CREATE ROLE, GRANT) goes unnoticed because the cluster setting that records audit events is disabled.
- Long‑running or runaway queries that consume CPU or I/O are not monitored, enabling a denial‑of‑service condition that degrades service before anyone notices.
- Node failures or network partitions raise no alert, so ranges that lose quorum become unavailable and the loss of availability goes unnoticed.
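These issues map to API10:2019 (Insufficient Logging & Monitoring) in the OWASP API Security Top 10. In CockroachDB the relevant controls are the cluster settings that govern SQL audit, authentication, and statement logging, together with the system.eventlog table, which records cluster-level events such as schema changes and cluster-setting changes. Statement logging (sql.trace.log_statement_execute, named sql.log.all_statements.enabled in recent releases) and authentication logging (server.auth_log.sql_connections.enabled and server.auth_log.sql_sessions.enabled) default to false, and no table or role is audited until you configure it, so an untuned cluster offers little security visibility. Setting names vary between releases; SHOW ALL CLUSTER SETTINGS lists the ones your version supports.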
CockroachDB-Specific Detection
Detecting insufficient logging and monitoring in a CockroachDB deployment involves checking both the database configuration and the observable telemetry exposed to the outside world. Key indicators are:
- Audit logging not configured: Run SHOW CLUSTER SETTING sql.log.user_audit (role-based auditing, available in v23.1 and later); an empty value means no role is audited. Table-level auditing is likewise off until ALTER TABLE ... EXPERIMENTAL_AUDIT has been applied to a table.
- Statement logging off: SHOW CLUSTER SETTING sql.trace.log_statement_execute returning false indicates that individual SQL statements are not written to the SQL execution log.
- Missing eventlog entries: Query SELECT count(*) FROM system.eventlog WHERE timestamp > now() - INTERVAL '1 hour'; a count of zero after a known schema or cluster-setting change signals a logging gap.
- Absent Prometheus metrics: The /_status/vars endpoint on each node's HTTP address should expose metrics such as sql_service_latency and liveness_livenodes. If these metrics are absent, or nothing scrapes them, monitoring is not functioning.
- No alerts on node health: Compare cockroach node status output, or the liveness_livenodes metric from /_status/vars, against your alert history; a sudden drop with no corresponding notification reveals a monitoring failure.
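The configuration checks above can be automated. The following is a minimal sketch, not a definitive implementation: it assumes the results of SHOW CLUSTER SETTING have already been collected into a map of setting name to string value, and the list of required settings (including the two server.auth_log settings) is an illustrative assumption that should be adjusted to your CockroachDB version. The names requiredSettings and loggingGaps are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// requiredSettings lists the cluster settings this sketch expects to be "true".
// The selection is an assumption for illustration; confirm the names with
// SHOW ALL CLUSTER SETTINGS on your own cluster.
var requiredSettings = []string{
	"sql.trace.log_statement_execute",
	"server.auth_log.sql_connections.enabled",
	"server.auth_log.sql_sessions.enabled",
}

// loggingGaps returns, in sorted order, every required setting that is
// missing from the snapshot or has a value other than "true".
func loggingGaps(settings map[string]string) []string {
	var gaps []string
	for _, name := range requiredSettings {
		if settings[name] != "true" {
			gaps = append(gaps, name)
		}
	}
	sort.Strings(gaps)
	return gaps
}

func main() {
	// Hypothetical snapshot; in practice, fill it from SHOW CLUSTER SETTING queries.
	snapshot := map[string]string{
		"sql.trace.log_statement_execute":         "true",
		"server.auth_log.sql_connections.enabled": "false",
	}
	for _, g := range loggingGaps(snapshot) {
		fmt.Println("logging gap:", g)
	}
}
```

Keeping the evaluation separate from the database queries makes the check easy to unit-test and to run from a CI job against a settings export.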
When middleBrick scans an API endpoint that interacts with CockroachDB, it looks for these signs indirectly. For example, if the API exposes a health check that returns CockroachDB status as JSON, middleBrick verifies whether fields indicating audit logging or statement tracing are present and set to true. Missing or false values trigger a finding under the “Insufficient Logging & Monitoring” category, with severity mapped to the OWASP API10:2019 risk level.
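To make the health-check idea concrete, here is a small sketch of that kind of check. It is not middleBrick's implementation: the JSON field names audit_logging and statement_tracing, the healthReport type, and the findings function are all hypothetical, chosen only to show how "missing" and "false" can be told apart.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// healthReport models a hypothetical health-check payload. Pointer fields
// let the code distinguish an absent field (nil) from an explicit false.
type healthReport struct {
	AuditLogging     *bool `json:"audit_logging"`
	StatementTracing *bool `json:"statement_tracing"`
}

// findings decodes the payload and returns one entry per field that is
// either missing or set to false.
func findings(body []byte) ([]string, error) {
	var r healthReport
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, fmt.Errorf("decode health report: %w", err)
	}
	var out []string
	check := func(name string, v *bool) {
		switch {
		case v == nil:
			out = append(out, name+": missing")
		case !*v:
			out = append(out, name+": false")
		}
	}
	check("audit_logging", r.AuditLogging)
	check("statement_tracing", r.StatementTracing)
	return out, nil
}

func main() {
	got, err := findings([]byte(`{"audit_logging": true}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Treating an absent field as a finding, rather than silently assuming a default, mirrors the rule described above: visibility has to be demonstrated, not inferred.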
CockroachDB-Specific Remediation
Remediation focuses on enabling CockroachDB’s native logging and monitoring features and then wiring them into alerting systems. The examples below are starting points; flag and setting names differ between CockroachDB releases, so check each one against the version you run.
Enable Audit Logging (Enterprise)
-- Role-based audit logging (Enterprise, v23.1 and later): audit all statements from all roles
SET CLUSTER SETTING sql.log.user_audit = 'ALL ALL';
-- Table-level audit logging: record every read and write on one table
ALTER TABLE accounts EXPERIMENTAL_AUDIT SET READ WRITE;
After enabling, audit records are written to the SQL audit log (the SENSITIVE_ACCESS logging channel) and can be forwarded to a SIEM.
Enable Statement Tracing (All editions)
-- Log every executed SQL statement to the SQL execution log (SQL_EXEC channel)
SET CLUSTER SETTING sql.trace.log_statement_execute = true;
-- To reduce verbosity, log only statements slower than a threshold
SET CLUSTER SETTING sql.log.slow_query.latency_threshold = '500ms';
These settings cause CockroachDB to emit log lines that include the statement text, placeholder values, and execution time. Logging every statement is expensive on busy clusters, so prefer the slow-query threshold for routine use.
Configure Logging to File (instead of stderr)
# In the cockroach start command. These flags are deprecated since v21.1 in
# favor of a YAML logging configuration passed with --log or --log-config-file.
cockroach start \
  --certs-dir=$HOME/certs \
  --store=path/to/store \
  --logtostderr=NONE \
  --log-dir=/var/log/cockroachdb \
  --log-file-max-size=100MiB \
  --log-group-max-size=500MiB \
  --log-file-verbosity=INFO
Persisted logs can be shipped to Elasticsearch, Splunk, or any log‑aggregation tool.
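Before a full log pipeline is in place, a quick severity count over a persisted log file can show whether errors are being recorded at all. The sketch below assumes CockroachDB's text log formats (crdb-v1 and crdb-v2), in which each entry starts with a severity letter (I, W, E, or F) followed by a six-digit date; JSON-formatted logs would need a different parser. The function names and the sample lines in main are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// classify returns the severity letter of a log entry's first line, or false
// if the line does not look like the start of an entry (for example, a
// continuation line of a multi-line message).
func classify(line string) (byte, bool) {
	if len(line) < 7 || !strings.ContainsRune("IWEF", rune(line[0])) {
		return 0, false
	}
	for i := 1; i < 7; i++ {
		if line[i] < '0' || line[i] > '9' {
			return 0, false
		}
	}
	return line[0], true
}

// countSeverities tallies log entries by severity letter.
func countSeverities(r io.Reader) (map[byte]int, error) {
	counts := make(map[byte]int)
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if sev, ok := classify(sc.Text()); ok {
			counts[sev]++
		}
	}
	return counts, sc.Err()
}

func main() {
	// Synthetic sample lines, not real CockroachDB output.
	sample := "I240101 10:00:00.000000 1 example.go:1  synthetic info\n" +
		"W240101 10:00:01.000000 1 example.go:2  synthetic warning\n" +
		"E240101 10:00:02.000000 1 example.go:3  synthetic error\n" +
		"  continuation of the previous entry\n"
	counts, err := countSeverities(strings.NewReader(sample))
	if err != nil {
		panic(err)
	}
	fmt.Printf("info=%d warning=%d error=%d fatal=%d\n",
		counts['I'], counts['W'], counts['E'], counts['F'])
}
```

In practice the reader would come from os.Open on a file under the configured log directory; a dedicated log shipper remains the better long-term choice.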
Expose and Monitor Prometheus Metrics
# CockroachDB serves Prometheus-format metrics at /_status/vars on its HTTP
# address; no separate Prometheus flag is needed.
cockroach start \
  --http-addr=0.0.0.0:8080
# Example alerting rules (Prometheus), assuming a scrape job named "cockroachdb"
groups:
  - name: cockroachdb.rules
    rules:
      - alert: NodeDown
        expr: up{job="cockroachdb"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "CockroachDB node {{ $labels.instance }} is down"
          description: "Prometheus could not scrape this node for more than 2 minutes."
      - alert: HighSQLLatency
        # sql_service_latency is recorded in nanoseconds, so 5e8 is 500ms
        expr: histogram_quantile(0.95, sum(rate(sql_service_latency_bucket[5m])) by (le, instance)) > 5e8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High 95th percentile SQL latency on {{ $labels.instance }}"
          description: "SQL queries are slower than 500ms for the last 5 minutes."
Routed through Alertmanager, these rules turn operational anomalies into notifications via email, Slack, PagerDuty, and similar channels.
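It is also worth confirming that the metrics endpoint actually returns the series your alerts depend on. The sketch below parses Prometheus text exposition output, as returned by /_status/vars, and extracts one sample by name. It is a simplified parser written for illustration: metricValue is a hypothetical helper, the sample payload is synthetic, and the expected node count of 3 is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// metricValue returns the first sample for the named metric in Prometheus
// text exposition format. Comment lines (# HELP, # TYPE) are skipped and any
// label set is ignored.
func metricValue(exposition, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the label set or at the first space.
		i := strings.IndexAny(line, "{ ")
		if i < 0 || line[:i] != name {
			continue
		}
		rest := line[i:]
		if strings.HasPrefix(rest, "{") {
			// Skip the label set; label values may contain spaces.
			j := strings.LastIndexByte(rest, '}')
			if j < 0 {
				continue
			}
			rest = rest[j+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		if v, err := strconv.ParseFloat(fields[0], 64); err == nil {
			return v, true
		}
	}
	return 0, false
}

func main() {
	// Synthetic payload; a real check would fetch /_status/vars over HTTP.
	payload := "# TYPE liveness_livenodes gauge\nliveness_livenodes 2\n"
	const expectedNodes = 3 // assumption: a three-node cluster
	live, ok := metricValue(payload, "liveness_livenodes")
	switch {
	case !ok:
		fmt.Println("metric missing: monitoring is not functioning")
	case live < expectedNodes:
		fmt.Printf("only %.0f of %d nodes are live\n", live, expectedNodes)
	default:
		fmt.Println("all nodes live")
	}
}
```

A missing series is reported separately from a low value, since an absent metric points to a monitoring gap rather than a node failure.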
Application‑Level Assurance (Go example)
package main

import (
	"context"
	"database/sql"

	_ "github.com/lib/pq" // registers the "postgres" driver that sql.Open uses
)

func main() {
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		panic(err)
	}
	defer db.Close()
	ctx := context.Background()
	// Cluster settings apply to the whole cluster, not to this connection, and
	// changing them requires admin privileges. In production, set this once
	// through operations tooling rather than from every application instance.
	if _, err := db.ExecContext(ctx, "SET CLUSTER SETTING sql.trace.log_statement_execute = true"); err != nil {
		panic(err)
	}
	// Example query: it will now appear in the SQL execution log
	rows, err := db.QueryContext(ctx, "SELECT id, balance FROM accounts WHERE id = $1", 42)
	if err != nil {
		panic(err)
	}
	defer rows.Close()
	// …process rows…
}
Similar patterns apply to Java, Python, or any language with a driver that speaks the PostgreSQL wire protocol.
By activating audit logging, statement tracing, file‑based logging, and Prometheus‑based monitoring, and then verifying those settings through health‑checks or direct SQL queries, you eliminate the logging and monitoring gap that attackers could exploit.
Frequently Asked Questions
Does enabling audit logging in CockroachDB impact performance?
system.eventlog table. In most workloads the impact is negligible (<5% latency increase). For high‑throughput scenarios you can limit auditing to specific tables or enable it only during investigation windows.
How can I verify that middleBrick detected a logging‑monitoring failure in my API?
sql.audit.log.enabled or sql.trace.log_statement_execute, and will provide remediation guidance specific to CockroachDB.