Integrity Failures in Cassandra
How Integrity Failures Manifest in Cassandra
In Apache Cassandra, integrity failures usually appear as divergent replica states, lost updates, or the resurrection of stale data. Because Cassandra relies on eventual consistency and timestamp‑based conflict resolution, an attacker who can influence write timestamps, consistency levels, or transactional guards can cause the database to converge on incorrect data.
- Timestamp manipulation (write skew): If a client can supply a future timestamp, the write will win over legitimate updates that use the current time. An attacker can repeatedly send writes with increasingly large timestamps to hide legitimate data or to cause a later read to return an older value.
- Insufficient lightweight transaction guards: Cassandra's lightweight transactions (LWT) use Paxos to provide linearizable writes when an IF clause is present. Omitting the IF clause or using a weak condition turns the operation into a regular write, losing the guarantee and allowing concurrent updates to overwrite each other.
- Improper batch usage: Batching logged and unlogged statements, or mixing different consistency levels inside a batch, can lead to partial application. If one part of the batch fails while another succeeds, the replica set ends up in an inconsistent state.
- Low consistency levels for critical writes: Writing with CONSISTENCY ONE (or reading at ONE) increases the chance that a replica misses the update. Subsequent reads from a different replica may return stale data, which the application may interpret as a successful operation.
- Missing counter idempotency guards: Counters are not idempotent; replaying a counter increment due to a retry can over‑increment the value, corrupting metrics or financial totals.
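To make the counter hazard concrete, the toy model below (plain JavaScript, not driver code; the class and function names are illustrative) replays an increment after a simulated lost acknowledgement, the way a naive retry loop would:

```javascript
// Toy model of a Cassandra counter column: increments are applied
// blindly, so a replayed increment is applied a second time.
class CounterCell {
  constructor() { this.value = 0; }
  increment(delta) { this.value += delta; }
}

// Naive client retry: the first attempt actually reached the replica,
// but the acknowledgement timed out, so the client sends it again.
function incrementWithNaiveRetry(cell, delta, ackLost) {
  cell.increment(delta);               // attempt 1 (applied)
  if (ackLost) cell.increment(delta);  // retry (applied again!)
}

const balance = new CounterCell();
incrementWithNaiveRetry(balance, 100, true);
console.log(balance.value); // 200, not the intended 100
```

This is exactly why the remediation section below recommends either tolerating duplicate increments or avoiding retries on counter updates.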
The following code snippet shows a vulnerable write path that accepts a user‑supplied timestamp and uses consistency level ONE without any lightweight transaction guard:
// Pseudocode for a vulnerable Cassandra write endpoint
function writeSensorData(userId, value, clientTs) {
  const query = "INSERT INTO sensor_data (user_id, ts, value) VALUES (?, ?, ?)";
  // clientTs comes directly from the request – no validation
  execute(query, [userId, clientTs, value], { consistency: 'one' });
}
An attacker can set clientTs far in the future, causing the write to dominate all legitimate updates.
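The attack works because replicas reconcile divergent cells by comparing write timestamps (last‑write‑wins). A minimal model of that reconciliation rule, with illustrative values, shows why a far‑future timestamp keeps winning:

```javascript
// Minimal model of Cassandra's cell reconciliation: the cell with the
// higher write timestamp wins, regardless of arrival order.
function reconcile(cellA, cellB) {
  return cellA.ts >= cellB.ts ? cellA : cellB;
}

// Legitimate write at the current time vs. an attacker-supplied
// timestamp one year in the future.
const now = Date.now();
const legit    = { value: 42.0, ts: now };
const attacker = { value: -1.0, ts: now + 365 * 24 * 3600 * 1000 };

// The attacker's cell wins now...
console.log(reconcile(legit, attacker).value); // -1
// ...and keeps winning against every later legitimate write,
// until real clocks pass the forged timestamp.
const later = { value: 43.0, ts: now + 60000 };
console.log(reconcile(later, attacker).value); // -1
```

Until the wall clock catches up with the forged timestamp, every legitimate write appears to succeed yet is silently discarded at read time.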
Cassandra‑Specific Detection
Detecting integrity failures involves looking for patterns that break Cassandra's consistency guarantees. Manual checks include reviewing repair status with nodetool, examining nodetool tablestats output for tombstone ratios, and monitoring read‑repair metrics. From an API security perspective, middleBrick can help surface the conditions that enable these failures.
- Property Authorization check: middleBrick validates that incoming payloads contain only allowed fields. If an API accepts arbitrary ts (timestamp) or consistency_level parameters, the check will flag them as unauthorized properties, indicating a potential integrity‑failure vector.
- Input Validation check: The scanner tests for injection of malformed values (e.g., extremely large timestamps, negative TTLs) and for missing validation on batch payloads. A finding here often correlates with the ability to drive divergent replicas.
- Rate Limiting check: While not a direct integrity issue, missing rate limits can enable an attacker to flood the system with timestamp‑manipulated writes, increasing the chance of successful skew.
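An application‑side counterpart to the property‑authorization check is a strict allow‑list on write payloads. The sketch below (field names are illustrative, and this is not middleBrick output) rejects any request carrying ts or consistency_level:

```javascript
// Allow-list validation for a sensor-write payload: any field not
// explicitly permitted (e.g. ts, consistency_level) is rejected.
const ALLOWED_FIELDS = new Set(['user_id', 'value']);

function validateWritePayload(payload) {
  const unauthorized = Object.keys(payload).filter(
    (field) => !ALLOWED_FIELDS.has(field)
  );
  if (unauthorized.length > 0) {
    throw new Error(`unauthorized properties: ${unauthorized.join(', ')}`);
  }
  return payload;
}

validateWritePayload({ user_id: 'u1', value: 21.5 });           // accepted
// validateWritePayload({ user_id: 'u1', value: 1, ts: 9e15 }); // throws
```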
Example of scanning an endpoint that writes sensor data:
middlebrick scan https://api.example.com/v1/sensor/write
The resulting report will list any unauthorized properties (like ts or consistency) and will highlight missing validation on batch or lightweight‑transaction fields. Teams can then correlate those findings with the Cassandra‑specific patterns described above.
Cassandra‑Specific Remediation
Fixing integrity failures in Cassandra requires aligning the application’s write path with the database’s consistency model. The following remediations use only Cassandra‑native features.
- Enforce server‑side timestamp generation: Never trust a client‑supplied timestamp. Use the server's clock (or the now() function in CQL) so that all writes receive a monotonic, authoritative timestamp.
- Use appropriate consistency levels: For writes that must not be lost, write at QUORUM (or higher) and read at QUORUM. This guarantees that a majority of replicas acknowledge the write before it is considered successful.
- Apply lightweight transactions correctly: When a conditional update is needed, always include an IF clause that checks the expected state. Example:
// Correct use of a lightweight transaction
INSERT INTO user_profiles (user_id, email, last_login)
VALUES (?, ?, ?)
IF NOT EXISTS;
- Avoid mixed‑type batches: Keep batches homogeneous (all logged or all unlogged) and never mix different consistency levels inside a single batch. If you need atomicity across multiple tables, use a logged batch with a uniform consistency level.
- Validate counters before retrying: Because counters are not idempotent, design the application to be tolerant of duplicate increments (e.g., by using a separate idempotency key stored in a lightweight transaction) or avoid retries on counter updates.
- Enable and monitor repair: Run
nodetool repairon a regular schedule and watch for increasing tombstone ratios; high ratios can indicate that deleted data is being resurrected due to failed anti‑entropy. - Use materialized views with care: Ensure that the view’s primary key includes the columns you need to query, and set a suitable TTL to prevent stale view entries from accumulating.
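The QUORUM recommendation rests on simple arithmetic: with replication factor RF, a read and a write are guaranteed to overlap in at least one up‑to‑date replica whenever R + W > RF. A small helper (the function name is illustrative) makes the check explicit:

```javascript
// Strong-consistency check: a read is guaranteed to see the latest
// acknowledged write when the read and write replica sets must
// intersect, i.e. R + W > RF.
function overlaps(readReplicas, writeReplicas, replicationFactor) {
  return readReplicas + writeReplicas > replicationFactor;
}

const RF = 3;
const QUORUM = Math.floor(RF / 2) + 1; // 2 for RF = 3

console.log(overlaps(QUORUM, QUORUM, RF)); // true  – QUORUM/QUORUM is safe
console.log(overlaps(1, 1, RF));           // false – ONE/ONE can miss updates
```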
Below is a secure version of the earlier write endpoint, showing server‑side timestamp generation, QUORUM consistency, and a lightweight transaction guard for an upsert pattern:
// Secure Cassandra write using server‑side timestamp and LWT
function writeSensorDataSecure(userId, value) {
  // Multi-line CQL statement (template literal); ts is generated server-side
  const query = `
    INSERT INTO sensor_data (user_id, ts, value)
    VALUES (?, toTimestamp(now()), ?)
    IF NOT EXISTS;
  `;
  // Consistency QUORUM ensures the write is persisted on a majority of replicas
  execute(query, [userId, value], { consistency: 'quorum' });
}
By applying these patterns, the API no longer accepts client‑controlled timestamps, writes achieve a quorum‑based guarantee, and conditional updates prevent lost updates—effectively eliminating the integrity‑failure vectors described earlier.