HIGH | symlink attack | cassandra

Symlink Attack in Cassandra

How Symlink Attack Manifests in Cassandra

Cassandra's distributed architecture relies heavily on filesystem operations for data persistence, including commit logs, SSTables, and snapshots. A symlink attack exploits this by manipulating symbolic links to redirect Cassandra's file operations to unintended locations. This typically occurs when an application layer (often an API) accepts user-controlled file paths without proper validation and passes them to Cassandra's administrative operations.

Attack Vector via API Endpoints: Consider a REST API endpoint designed to restore a database snapshot. A vulnerable implementation might build a filesystem path from a user-supplied parameter, copy files from that path into a table directory, and then run nodetool refresh or sstableloader to load them. An attacker could supply a traversal path such as /var/lib/cassandra/data/keyspace1/../../../../../etc/passwd or, more insidiously, replace a legitimate snapshot directory with a symlink that points to a sensitive system file. When the restore code or Cassandra reads the snapshot, it follows the symlink and may expose sensitive data (e.g., the contents of /etc/shadow) in error messages or logs, or overwrite critical system files if the operation involves writes.

Cassandra-Specific Code Paths: The danger surfaces in several Cassandra subsystems:

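Snapshot Operations: Restoring a snapshot usually means copying snapshot files into a table directory and calling nodetool refresh (exposed over JMX as StorageService.loadNewSSTables, which takes keyspace and table names rather than a path), or pointing sstableloader at a directory. If the application derives the source or destination directory from attacker-controlled input, symlinks can lead to arbitrary file reads or writes.
Commit Log Recovery: During startup, Cassandra replays commit logs from commitlog_directory. If that directory, or an entry inside it, is a symlink, Cassandra reads and writes commit log segments at the link target, which can fill a critical partition or overwrite files there.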
SSTable Loading: When adding a new data directory or during repair, Cassandra scans directories for SSTables. A symlink within a data directory could cause Cassandra to read files outside the intended data path, leading to data corruption or information disclosure.

Realistic Attack Scenario: An API endpoint POST /api/v1/snapshots/restore accepts JSON: { "keyspace": "user_data", "snapshot_name": "backup_2024" }. The backend constructs a path: String snapshotPath = "/var/lib/cassandra/data/" + keyspace + "/snapshots/" + snapshotName; and passes it to its restore routine, written here as restoreSnapshot(keyspace, snapshotPath). An application-level helper is assumed, since stock Cassandra restores snapshots through file copies followed by nodetool refresh or sstableloader. An attacker submits snapshot_name as ../../../../../../etc/cron.d/malicious, which resolves to /etc/cron.d/malicious. The restore routine might then try to parse that file as an SSTable (causing errors that leak its path or contents) or, in a write scenario, overwrite the cron file. Alternatively, an attacker who can write to the snapshots directory can plant a symlink named backup_2024 that points at a sensitive location and then submit the legitimate-looking name, so no traversal sequence is needed. If the API also accepts an absolute path (e.g., via a misconfigured snapshot_path parameter), the target can be named directly.
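
The effect of the string concatenation in this scenario can be checked without touching a real cluster. The following is a minimal sketch, not code from any Cassandra API: the class and method names (NaiveSnapshotPath, effectivePath, escapesDataDir) are hypothetical, and the directory layout is the simplified one used in the scenario. It normalizes the concatenated path lexically and reports whether the result has left the data directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that reproduces the vulnerable path construction.
final class NaiveSnapshotPath {
    // Base directory taken from the scenario (an assumption about the deployment).
    static final Path DATA_DIR = Paths.get("/var/lib/cassandra/data");

    // Concatenates the request fields exactly as the vulnerable backend does,
    // then collapses "." and ".." segments lexically. This matches what the
    // operating system resolves when no symlinks are involved.
    static Path effectivePath(String keyspace, String snapshotName) {
        String raw = DATA_DIR + "/" + keyspace + "/snapshots/" + snapshotName;
        return Paths.get(raw).normalize();
    }

    // True when the normalized path is no longer under the data directory.
    static boolean escapesDataDir(String keyspace, String snapshotName) {
        return !effectivePath(keyspace, snapshotName).startsWith(DATA_DIR);
    }
}
```

Calling escapesDataDir("user_data", "backup_2024") reports no escape, while a snapshot name with enough ../ segments to reach the filesystem root does. Because normalize() is purely lexical, this check says nothing about symlinks planted on disk; those require resolving the real path.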

Cassandra-Specific Detection

Detecting symlink vulnerabilities in Cassandra requires examining both the API layer and Cassandra's operational behavior. The core issue is a lack of canonical path validation before filesystem operations. middleBrick's security scan includes an Input Validation check that actively probes API endpoints for path traversal and symlink handling weaknesses. During a scan, middleBrick tests endpoints that accept file paths or identifiers that influence storage operations (e.g., snapshot names, data directory parameters, import/export paths).

Scanning with middleBrick: Submit the target API URL to middleBrick. The scanner will:

Identify endpoints that accept path-like parameters (e.g., via OpenAPI path parameters or query strings named path, file, directory).
Send payloads containing path traversal and symlink indicators (e.g., ../ sequences, absolute paths like /tmp/malicious_link) and observe responses.
Analyze error messages, HTTP status codes, and response bodies for signs of filesystem interaction (e.g., NoSuchFileException, InvalidSSTableException, or unexpected data leakage).
Cross-reference findings with the OpenAPI spec to map vulnerable parameters to Cassandra operations (e.g., a parameter named snapshot_name linked to a restoreSnapshot action in the spec).

Manual Detection Indicators: Look for API responses that reveal internal filesystem paths, or for side effects on the host, such as:

File /var/lib/cassandra/data/keyspace1/../../../../../etc/passwd not found
Invalid SSTable magic number at /tmp/uploaded_file (indicating a non-SSTable file was read)
Unexpected file creation in system directories (check /tmp or /etc after API calls).

Cassandra Log Review: Examine system.log and debug.log for IOException or SecurityException when processing API-triggered operations. Logs showing access to paths outside cassandra.yaml configured directories (data_file_directories, commitlog_directory) are strong indicators.
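
The log review can be partly automated. The sketch below is one possible approach, not a Cassandra or middleBrick feature: LogPathAuditor and pathsOutside are hypothetical names, and the regular expression is a rough assumption about what an absolute path looks like in a log line. It extracts absolute paths, normalizes them, and returns those that fall outside the directories you pass in (for example the values of data_file_directories and commitlog_directory).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper: flags absolute paths outside the allowed directories.
final class LogPathAuditor {
    // Absolute POSIX-style path that starts at a token boundary
    // (not preceded by a word character, dot, or slash).
    private static final Pattern ABS_PATH =
            Pattern.compile("(?<![\\w./])(?:/[\\w.\\-]+)+");

    static List<String> pathsOutside(List<String> logLines, List<Path> allowedDirs) {
        List<String> findings = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = ABS_PATH.matcher(line);
            while (m.find()) {
                // Collapse ".." so traversal payloads show their real target.
                Path p = Paths.get(m.group()).normalize();
                boolean allowed = false;
                for (Path dir : allowedDirs) {
                    if (p.startsWith(dir)) {
                        allowed = true;
                        break;
                    }
                }
                if (!allowed) {
                    findings.add(p.toString());
                }
            }
        }
        return findings;
    }
}
```

Expect false positives: Cassandra prints node addresses with a leading slash (such as /127.0.0.1), and the pattern will match those too. Treat the output as a list of lines worth reading, not as proof of an attack.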

Cassandra-Specific Remediation

Remediation focuses on strict path validation at the application layer and hardening Cassandra's filesystem permissions. Cassandra itself does not provide built-in symlink protection for all operations; thus, the application must ensure paths are canonical and within allowed directories.

1. Application-Level Path Canonicalization: Before passing any user-supplied path to nodetool, a JMX operation, or your own file-copy code, resolve it to its canonical path and verify it resides within an approved base directory. Example in Java:

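import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CassandraPathValidator {
    private static final Path ALLOWED_BASE = Paths.get("/var/lib/cassandra/data");

    public static Path validateAndResolve(String userSuppliedPath) throws SecurityException {
        try {
            // toRealPath() with no options follows symlinks and removes "." and "..".
            // Passing LinkOption.NOFOLLOW_LINKS would leave symlinks unresolved,
            // so a link inside the data directory could still point elsewhere.
            Path resolved = Paths.get(userSuppliedPath).toRealPath();
            // Resolve the base too, in case the data directory is itself a symlink
            Path base = ALLOWED_BASE.toRealPath();
            // Ensure the resolved path is within the allowed base directory
            if (!resolved.startsWith(base)) {
                throw new SecurityException("Path outside allowed data directory: " + resolved);
            }
            return resolved;
        } catch (IOException e) {
            // Thrown when the path does not exist or cannot be accessed
            throw new SecurityException("Invalid path or symlink target inaccessible", e);
        }
    }
}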

// Usage in API handler:
String userSnapshot = request.getParameter("snapshot_name");
Path safePath = CassandraPathValidator.validateAndResolve("/var/lib/cassandra/data/keyspace1/snapshots/" + userSnapshot);
// Pass safePath to Cassandra admin API

2. Restrict Filesystem Permissions: Ensure Cassandra's data directories (data_file_directories, commitlog_directory) are owned by the cassandra user and have strict permissions (750 or 700). Prevent other users from creating symlinks within these directories. Example on Linux:

sudo chown -R cassandra:cassandra /var/lib/cassandra
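# Directories get 750, regular files get 640 (X adds execute only where it already applies)
sudo chmod -R u=rwX,g=rX,o= /var/lib/cassandra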
# Ensure no world-writable directories
find /var/lib/cassandra -type d -perm -o=w -exec chmod o-w {} \;
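
Permissions prevent new links from being planted, but they do not reveal links that already exist. As a complementary check, the sketch below walks a directory tree and lists symlinks whose targets resolve outside it. SymlinkAudit and escapingSymlinks are hypothetical names, and the approach is an assumption about how you might audit a data directory, not something Cassandra provides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical audit helper for a Cassandra data directory.
final class SymlinkAudit {
    // Returns every symlink under root whose target resolves outside root.
    static List<Path> escapingSymlinks(Path root) throws IOException {
        Path realRoot = root.toRealPath();
        // Files.walk does not follow symlinks unless FOLLOW_LINKS is passed,
        // so links are reported as entries instead of being descended into.
        try (Stream<Path> walk = Files.walk(realRoot)) {
            return walk.filter(Files::isSymbolicLink)
                       .filter(link -> escapes(link, realRoot))
                       .collect(Collectors.toList());
        }
    }

    private static boolean escapes(Path link, Path realRoot) {
        try {
            return !link.toRealPath().startsWith(realRoot);
        } catch (IOException e) {
            // Dangling or unreadable links are reported as well.
            return true;
        }
    }
}
```

Run it as the cassandra user against each configured data directory, for example SymlinkAudit.escapingSymlinks(Paths.get("/var/lib/cassandra/data")). A link that stays inside the tree is not reported; one pointing to /etc or /tmp is.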

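3. Disable Symlink Following in Cassandra (if possible): Cassandra does not offer a global switch to disable symlink following. Setting disk_failure_policy: stop in cassandra.yaml makes the node shut down on filesystem errors, which limits the damage from corrupted reads but does not stop a symlink from being followed. Mount options such as noatime and nodiratime only affect access-time updates and offer no symlink protection. On Linux 5.10 and later, mounting the data volume with the nosymfollow option blocks symlink traversal on that filesystem, provided the deployment does not itself rely on symlinks there. Chroot jails or containers for the Cassandra process further confine what a followed link can reach, though they add complexity in distributed setups.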

4. API Design Hardening: Avoid accepting raw filesystem paths in APIs. Instead, use abstract identifiers (e.g., snapshot IDs) and map them to internal paths server-side. If paths are necessary, allow only a fixed set of characters and reject any value containing .. or a path separator. Still canonicalize the result before use, because a symlink cannot be detected from the string alone.
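
A minimal sketch of the identifier approach follows, assuming snapshot and keyspace names are restricted to letters, digits, underscores, and hyphens. SnapshotIds and snapshotDir are hypothetical names, the 64-character limit is an arbitrary assumption, and the directory layout is the simplified one used in this article (a real Cassandra data directory also has a per-table level between the keyspace and snapshots).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical server-side mapping from snapshot identifiers to paths.
final class SnapshotIds {
    // Allowlist: letters, digits, underscore, hyphen; 1 to 64 characters.
    // No '/', '\' or '.', so traversal sequences cannot be expressed.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_-]{1,64}");
    private static final Path DATA_DIR = Paths.get("/var/lib/cassandra/data");

    // Builds the snapshot directory from validated identifiers only.
    static Path snapshotDir(String keyspace, String snapshotId) {
        requireSafe("keyspace", keyspace);
        requireSafe("snapshot id", snapshotId);
        return DATA_DIR.resolve(keyspace).resolve("snapshots").resolve(snapshotId);
    }

    private static void requireSafe(String what, String value) {
        if (value == null || !SAFE_ID.matcher(value).matches()) {
            throw new IllegalArgumentException("Rejected " + what);
        }
    }
}
```

The allowlist stops traversal strings at the API boundary, but it cannot tell whether backup_2024 on disk is a real directory or a planted link, so the returned path should still go through a real-path check before any file operation.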

5. Monitoring and Alerts: Use middleBrick's Pro plan with continuous monitoring to track Input Validation scores over time. Set up GitHub Action gates to fail builds if new endpoints with risky path parameters are introduced. Integrate alerts (Slack/Teams) for score drops indicating potential regression.

Frequently Asked Questions

How does a symlink attack in Cassandra differ from traditional filesystem symlink attacks?

In Cassandra, symlink attacks are often chained through API endpoints that interact with distributed storage operations (snapshots, commit logs, SSTables). The attacker typically does not have direct shell access to the Cassandra nodes; instead, they exploit an application-layer API that passes user-controlled paths to Cassandra's admin functions. The impact is amplified because Cassandra may follow symlinks during distributed repair or bootstrap, potentially affecting multiple nodes. Traditional filesystem symlink attacks usually require local access to create the symlink; here, the attacker may only control a path parameter in an HTTP request.

Can middleBrick detect symlink vulnerabilities in Cassandra APIs without authenticated access?

Yes. middleBrick performs unauthenticated black-box scanning, testing any publicly accessible API endpoint. It sends crafted payloads containing symlink and path traversal patterns to endpoints that accept file-related parameters. The scanner then analyzes responses for signs of improper filesystem interaction (e.g., error messages revealing internal paths, unexpected data exposure). These findings map to the OWASP Top 10:2021 categories Broken Access Control (A01) and Security Misconfiguration (A05). The scan does not require Cassandra credentials because it tests the application layer that wraps Cassandra operations.
