Migrating from 42Crunch to middleBrick for multi-turn manipulation audits

What middleBrick covers

  • Deterministic multi-turn probe library with 18 adversarial tests
  • JSON output for programmatic analysis and CI integration
  • Score tracking and diffing across scheduled scans
  • Policy enforcement via GitHub Action gates
  • Retention controls and data deletion on demand
  • Read-only testing with no destructive payloads

Current limitations with multi-turn manipulation audits

Multi-turn manipulation testing typically depends on interactive sessions that record prompts, model outputs, and token-level traces. These workflows are often manual, scattered across chat logs, and tied to a specific provider account. Without a standardized capture format, it is difficult to replay a chain of turns, share findings with teammates, or integrate results into CI pipelines. The lack of structured artifacts also makes it hard to measure drift over time or to programmatically compare scanner configurations.

How middleBrick structures multi-turn audit evidence

middleBrick runs a deterministic sequence of 18 adversarial probes across three scan tiers: Quick, Standard, and Deep. Each probe is a self-contained turn or turn pair (input, expected failure mode) with a clear pass or fail outcome. Results are stored as a flat list of findings, each including the probe identifier, tier, observed behavior, and a short remediation hint. This structure makes it straightforward to export findings as JSON for downstream analysis or to embed them in compliance artifacts.
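The flat findings list described above lends itself to simple post-processing. The sketch below shows one way to filter an exported report for failing probes; the exact JSON key names (probe_id, tier, status, observed, remediation) are assumptions based on the fields described in this section, not a documented schema.

```python
import json

# Hypothetical findings export. Key names (probe_id, tier, status,
# observed, remediation) are illustrative assumptions.
report = json.loads("""
{
  "findings": [
    {"probe_id": "MT-03", "tier": "Quick", "status": "fail",
     "observed": "model complied after persona pivot",
     "remediation": "reinforce system prompt across turn boundaries"},
    {"probe_id": "MT-07", "tier": "Deep", "status": "pass",
     "observed": "refusal held across four turns", "remediation": ""}
  ]
}
""")

# Collect failing probes for downstream reporting or compliance artifacts.
failures = [f for f in report["findings"] if f["status"] == "fail"]
for f in failures:
    print(f'{f["tier"]}: {f["probe_id"]} -> {f["remediation"]}')
```

Because each finding is self-contained, the same loop works whether the report comes from a Quick, Standard, or Deep scan.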

Migration workflow and artifact mapping

When migrating from a conversational audit to middleBrick, map your existing chat logs to the corresponding probe identifiers listed in the scan report. Discard raw prompt text that is not covered by a defined probe, and retain only the findings that map to recognized risk categories, such as the OWASP Top 10 for LLM Applications (LLM01 through LLM10). You can replay the same URL or endpoint with the CLI to regenerate findings, and you can use the JSON output to compare scores across scan dates. The dashboard tracks score drift and shows which findings were resolved, introduced, or unchanged.

middlebrick scan https://api.example.com/openapi.json --output json
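Comparing two JSON exports by scan date can be done with a short script. This is a minimal sketch under assumed key names ("score", "failing_probes"); check your actual export for the real schema.

```python
# Diff two scan exports: which findings were resolved, introduced, or
# unchanged, and how far the score drifted. Key names are assumptions.
previous = {"score": 72, "failing_probes": ["MT-01", "MT-04"]}
current = {"score": 81, "failing_probes": ["MT-04", "MT-09"]}

prev_set = set(previous["failing_probes"])
curr_set = set(current["failing_probes"])

resolved = sorted(prev_set - curr_set)    # fixed since the last scan
introduced = sorted(curr_set - prev_set)  # new regressions
unchanged = sorted(prev_set & curr_set)   # still failing
drift = current["score"] - previous["score"]

print(f"score drift {drift:+d}; "
      f"resolved={resolved} introduced={introduced} unchanged={unchanged}")
```

This mirrors the resolved/introduced/unchanged breakdown the dashboard shows, but in a form you can run in CI or archive alongside compliance artifacts.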

CI integration and policy enforcement

middleBrick can be integrated into CI pipelines so that a failing build blocks deployment when the score drops below your chosen threshold. The GitHub Action reads the scan output, evaluates the list of findings against a configurable allowlist, and fails the job if violations exceed the policy. Because the scan is read-only and does not modify your API, you can run it on every pull request without risking production state. Use the CLI or the API client to tailor thresholds per service or per environment.
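The gate logic described above can be sketched as a small script: fail the job when the score drops below a threshold or when a finding appears that is not on the allowlist. The threshold value, allowlist contents, and function name here are illustrative assumptions, not the GitHub Action's actual configuration keys.

```python
# Hedged sketch of a CI policy gate. THRESHOLD and ALLOWLIST are
# illustrative; configure real values per service or environment.
THRESHOLD = 75
ALLOWLIST = {"MT-04"}  # known, accepted findings


def gate(score: int, failing_probes: list) -> int:
    """Return 0 if the policy passes, 1 (fail the build) otherwise."""
    violations = [p for p in failing_probes if p not in ALLOWLIST]
    if score < THRESHOLD or violations:
        print(f"policy violation: score={score}, violations={violations}")
        return 1
    print("policy passed")
    return 0


exit_code = gate(score=81, failing_probes=["MT-04"])
```

Wiring `exit_code` into `sys.exit()` in a CI step is enough to block a deployment, since a read-only scan can run safely on every pull request.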

Ongoing monitoring and maintenance

On the Pro tier, you can schedule rescans at six-hour, daily, weekly, or monthly intervals. Each new scan is diffed against the prior run, highlighting new findings, resolved findings, and score drift. Alerts are rate-limited to one per hour per API and can be delivered by email or through HMAC-SHA256 signed webhooks. Data retention is under your control: findings are deletable on demand and purged within 30 days of cancellation, and customer data is never used for model training.
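Verifying an HMAC-SHA256 signed webhook on the receiving end follows the standard pattern below. The header name, hex encoding, and payload shape are assumptions for illustration; consult your webhook settings for the actual signature format.

```python
import hashlib
import hmac

# Assumed shared secret configured in the webhook settings.
SECRET = b"webhook-secret"


def verify_signature(body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in
    constant time to the signature sent with the request."""
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


# Simulate an incoming alert payload and its signature.
payload = b'{"api": "https://api.example.com", "score": 81}'
sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
print(verify_signature(payload, sig))  # True for a correctly signed payload
```

Always compare with `hmac.compare_digest` rather than `==` to avoid timing side channels, and verify against the raw request body before any JSON parsing.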

Frequently Asked Questions

Can I import my existing multi-turn chat logs as findings?
You can map your chat logs to the standardized probe identifiers, but middleBrick does not ingest raw logs directly. Regenerate findings by scanning the endpoint so that results follow the defined schema.
Does the scanner store my API keys or sensitive parameters?
middleBrick only stores deletable scan results that contain findings. It does not retain credentials, and customer data is never used for model training.
How are new probe patterns added in future scans?
The scanner is updated independently; new adversarial patterns appear as additional probe identifiers in future reports. You do not need to change your scan configuration to benefit from them.
Can I set different thresholds for different APIs in the same scan job?
Yes, configure thresholds per API in the dashboard or CLI. The GitHub Action supports per-repository settings to enforce distinct policies across services.