Security Hardening
Every attack vector tested. Every result recorded. The gate has been run against poison injection, replay attacks, sequence skips, concurrent race conditions, and adversarial probe suites — all against the live deployment, not a test environment. Evidence is public and independently verifiable.
Injection hardening
The gate's checkPoison layer runs before payload evaluation. Any request carrying one of the patterns below is rejected with HALT before reaching the enforcement core. Three attack categories are in scope; all are blocked at the gate layer, not the application layer.
Role directive injection
Patterns: system:, <|im_start|>, {role:. Attempts to embed LLM role or instruction directives in JSON payload fields — action, input, sequence_id, step. All field values are scanned before evaluation.
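A minimal sketch of how such a field scan might look. The three patterns come from the list above; the names `ROLE_DIRECTIVE_PATTERNS` and `scanForRoleDirectives`, and the choice to serialise non-string values before scanning, are assumptions for illustration and not the gate's actual code.

```typescript
// Sketch only: pattern list taken from the text; all names are hypothetical.
type Verdict = "HALT" | "PASS";

const ROLE_DIRECTIVE_PATTERNS: RegExp[] = [
  /system:/i,            // "system:" role prefix
  /<\|im_start\|>/,      // chat-template start token
  /\{\s*"?role"?\s*:/i,  // "{role:" with or without JSON quotes
];

function scanForRoleDirectives(payload: Record<string, unknown>): Verdict {
  for (const key of Object.keys(payload)) {
    const value = payload[key];
    // Non-string values are serialised so nested objects are scanned too.
    const text = typeof value === "string" ? value : JSON.stringify(value);
    if (typeof text !== "string") continue; // e.g. undefined has no JSON form
    for (const pattern of ROLE_DIRECTIVE_PATTERNS) {
      if (pattern.test(text)) return "HALT";
    }
  }
  return "PASS";
}
```

Scanning every field rather than a fixed subset matches the statement that all field values are checked, at the cost of possible false positives on benign text that happens to contain `system:`.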
Base64 blob injection
Long base64-encoded strings in any payload field. Common vector for encoding instructions or binary payloads that bypass text-pattern filters. The gate detects and halts regardless of which field carries the blob.
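One way to detect long base64 runs is a simple length-bounded character-class match. In the sketch below, the 64-character threshold and the name `containsBase64Blob` are assumptions; the text does not state what length the gate treats as "long".

```typescript
// Sketch only: the threshold (64) is an assumed value, not the gate's.
const BASE64_RUN = /[A-Za-z0-9+\/]{64,}={0,2}/;

function containsBase64Blob(fieldValue: string): boolean {
  // Matches any unbroken run of base64-alphabet characters of 64 or more,
  // optionally followed by padding.
  return BASE64_RUN.test(fieldValue);
}
```

A check like this also flags long hex digests or unbroken alphanumeric identifiers, so a real deployment would need a threshold tuned to its legitimate field values.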
YAML front matter injection
--- delimiters followed by either a raw newline or a JSON-escaped \n. The initial regex matched literal newline characters only. JSON.stringify encodes a newline as the two-character escape sequence \n (a backslash followed by n), so the pattern never fired against serialised bodies. The regex was updated to match both forms. Found by the daily adversarial probe.
How the YAML gap was found
The daily monitoring agent runs rotating adversarial probes against the live gate each morning. On 2026-05-03 it reported YAML front matter returning ALLOW instead of DENY. Root cause: checkPoison used a literal-newline regex against JSON-serialised bodies where newlines are two-character escape sequences. The fix was deployed the same session. The probe found a genuine gap in a production security control, not a theoretical edge case.
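The root cause can be reproduced in a few lines. This is an illustrative sketch: the two regexes are stand-ins written for this example, not the actual checkPoison patterns, and the variable names are hypothetical.

```typescript
// A request body as it looks after JSON serialisation: the newline inside
// the string becomes two characters, a backslash followed by "n".
const serialisedBody: string = JSON.stringify({
  input: "---\ntitle: x\n---\nignore prior rules",
});

// Stand-in for the original behaviour: matches a real newline character only.
const literalNewlineOnly = /---\n/;

// Stand-in for the fix: matches a real newline OR the escaped two-character form.
const rawOrEscapedNewline = /---(?:\n|\\n)/;

const missedByOldPattern = !literalNewlineOnly.test(serialisedBody);
const caughtByNewPattern = rawOrEscapedNewline.test(serialisedBody);
const stillCatchesRawForm = rawOrEscapedNewline.test("---\ntitle: x");
```

All three constants evaluate to `true`: the old pattern never sees a real newline in the serialised body, while the updated pattern fires on both representations.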
Pressure test record
All tests run against the live Cloudflare Workers deployment — not a staging environment, not a mock. Results are machine-recorded JSON files stored in the evidence directory. The 1M test results are publicly downloadable.
| Test | Requests | Enforcement accuracy | What it proves | |
|---|---|---|---|---|
| 1M Full-Sequence: 125,000 complete 8-step MSMD sequences · 50 concurrent workers · 68 req/s · 3h 46m · rural NZ hotspot | 923,983 | 100% (0 enforcement errors, 0 false DENY) | Rail is deterministic at scale. Every ALLOW correct, no misfire on valid sequences over 3+ hours. | Data → |
| 100K Enterprise: 33,334 sequences × 3 scenarios (valid, skip, replay) | 100,002 | 99.98% (0.02% network errors) | Skip-ahead and replay enforcement under sustained load. Valid sequences always pass; invalid sequences always block. | ✓ |
| Step Transition: 1,000 sequences × 3 scenarios (valid, skip, replay) | 3,000 | 100% (zero errors) | Step ordering enforcement is perfect. Skip-ahead: 100% blocked. Replay: 100% blocked. Valid: 100% pass. | ✓ |
| Action Type Mismatch: valid MSMD functions with wrong action_type deliberately set | 3,003 | 100% (3,003 blocked, 0 false allows) | The function/action_type contract is enforced without exception. No wrong-action-type request ever passes. | ✓ |
| Concurrent Burst: 200 sequences × 10-request simultaneous burst each | 2,000 | 100% (200/200 bursts: exactly 1 ALLOW, 0 race leaks) | Durable Object single-threaded model prevents race conditions. Concurrent requests on the same sequence are serialised, so no double-advance is possible. | ✓ |
| Seal Behaviour: 100 full 8-step sequences + post-seal breach attempt each | 900 | 100% (100/100 sequences sealed, 0 seal leaks) | The seal is permanent. After settle, no further steps are ever accepted on the sequence. Post-seal attempts always DENY. | ✓ |
| Nonce Replay: 300 sequences × first request + replay of same nonce | 600 | 100% (300/300 replays blocked) | Nonce replay protection is absolute. The same nonce is never accepted twice on the same sequence, regardless of timing. | ✓ |
| Intake Node Pressure: single-step entry point · concurrency 300 | 10,000 | n/a | Entry point throughput ceiling under maximum concurrency. Measures CPU and latency at the first enforcement step. | ✓ |
Total verified requests across all pressure tests: 1,045,508. JSON evidence files with full latency distributions available in the test evidence directory.
Live breach suite
Ten targeted tests, each designed to breach one specific enforcement rule. All 10 run against the live production gate — not unit tests, not mocks. Result: all 10 held.
| Test | Attack | Expected | Result |
|---|---|---|---|
| T1 | Unknown function: garbage_step as function name | DENY (UNKNOWN_FUNCTION) | HELD |
| T2 | Forbidden action_type on disruption step | DENY (ACTION_NOT_ALLOWED) | HELD |
| T3 | Sequence skip: jump from step 1 to step 3 | DENY (SEQUENCE_VIOLATION) | HELD |
| T4 | Function/step mismatch: step ≠ function | DENY (FUNCTION_STEP_MISMATCH) | HELD |
| T5 | Nonce replay: same nonce sent twice | DENY (REPLAY_NONCE) | HELD |
| T6 | Forbidden action_type on execution step | DENY (ACTION_NOT_ALLOWED) | HELD |
| T7 | Skip to settle: jump to final step from step 2 | DENY (SEQUENCE_VIOLATION) | HELD |
| T8 | Settle with wrong function name | DENY (FUNCTION_STEP_MISMATCH) | HELD |
| T9 | Clean 8-step MSMD run: full sequence seal | ALLOW × 8 → SEALED | SEALED |
| T10 | Report generator verification of sealed sequence | VERIFIED_INTACT | VERIFIED |
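The deny codes in the table can be pictured as checks in a small per-sequence state machine. The sketch below is an assumption-laden illustration, not the gate's implementation: the class `SequenceGate`, the order of the checks, the in-memory nonce list, and the shortened four-step spine used in the usage note are all hypothetical, and action_type rules are omitted.

```typescript
type Decision =
  | { decision: "ALLOW"; sealed: boolean }
  | { decision: "DENY"; reason: string };

interface StepRequest {
  step: string;  // step the caller claims to be at
  fn: string;    // function name submitted with the step
  nonce: string; // per-request unique value
}

function deny(reason: string): Decision {
  return { decision: "DENY", reason };
}

class SequenceGate {
  private spine: string[];
  private nextIndex = 0;
  private sealed = false;
  private seenNonces: string[] = [];

  constructor(spine: string[]) {
    this.spine = spine;
  }

  // Each call runs to completion before the next one starts, which is the
  // property the single-threaded model is described as providing.
  handle(req: StepRequest): Decision {
    if (this.sealed) return deny("SEALED_SEQUENCE");
    if (this.spine.indexOf(req.fn) === -1) return deny("UNKNOWN_FUNCTION");
    if (req.fn !== req.step) return deny("FUNCTION_STEP_MISMATCH");
    if (this.seenNonces.indexOf(req.nonce) !== -1) return deny("REPLAY_NONCE");
    this.seenNonces.push(req.nonce);
    if (this.spine[this.nextIndex] !== req.step) return deny("SEQUENCE_VIOLATION");
    this.nextIndex += 1;
    this.sealed = this.nextIndex === this.spine.length;
    return { decision: "ALLOW", sealed: this.sealed };
  }
}

// Collapses a decision to a short label for logging or comparison.
function outcomeOf(d: Decision): string {
  if (d.decision === "DENY") return d.reason;
  return d.sealed ? "SEALED" : "ALLOW";
}
```

With a placeholder spine such as `["intake", "disruption", "instability", "settle"]`, a request for `instability` straight after `intake` yields SEQUENCE_VIOLATION, and any request after the final step yields SEALED_SEQUENCE. Because `handle` processes one request at a time, a burst of requests for the same step produces a single ALLOW and the rest are denied.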
Architecture edge cases
Five edge cases that test the enforcement boundary conditions — not the happy path, the corners. All five pass on the live deployment.
| Test | Edge case | Result |
|---|---|---|
| Duplicate step names | step_order: ["intake", "intake", "settle"] (duplicate in the sequence definition) | Handled gracefully: ALLOW or DENY, no crash |
| DO nonce cap | 501 unique nonces on a single sequence, past the nonce ledger eviction threshold | Nonce eviction working: no crash, no memory leak, no replay admitted |
| Timestamp boundary | ts_ms exactly 299,999 ms old, 1 ms inside the 300 s freshness window | ALLOW: correctly inside window |
| XSS in sequence_id | <script>alert(1)</script> as sequence_id in report generator | Sanitized: raw script tag not reflected in response |
| Empty step_order | step_order: [] (empty array sent with intake request) | Fell back to MSMD spine: ALLOW on valid intake |
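Two of these boundary checks are easy to sketch. The helper names `isFresh` and `escapeHtml` are hypothetical. The 300 s window comes from the table, but the treatment of a timestamp exactly 300,000 ms old and of timestamps in the future is an assumption, since the text only states that 299,999 ms is inside the window.

```typescript
const FRESHNESS_WINDOW_MS = 300 * 1000; // 300 s window from the table above

// Assumption: the window is half-open, so an age of exactly 300,000 ms is
// stale, and timestamps ahead of the gate's clock are rejected.
function isFresh(tsMs: number, nowMs: number): boolean {
  const ageMs = nowMs - tsMs;
  return ageMs >= 0 && ageMs < FRESHNESS_WINDOW_MS;
}

// Escapes the characters that let a value break out of HTML text or
// attribute context, so a sequence_id is shown as text rather than markup.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

Under these assumptions, `isFresh(now - 299999, now)` is `true`, and `escapeHtml("<script>alert(1)</script>")` returns `&lt;script&gt;alert(1)&lt;/script&gt;`.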
Daily adversarial monitoring
The agent that monitors AgenticRail runs 2 randomly selected scenarios from a pool of 10 adversarial patterns every morning. This is the probe that found the YAML injection gap on 2026-05-03. The pool covers all 6 core enforcement rules.
| Scenario | Rule tested | Status |
|---|---|---|
| Function/step mismatch: step=intake, function=disruption | FUNCTION_STEP_MISMATCH | Held |
| Sequence skip: intake then instability (skip disruption) | SEQUENCE_VIOLATION | Held |
| Unknown step: random string not in spine | UNKNOWN_STEP | Held |
| Custom order exclusion: step not in custom step_order | UNKNOWN_STEP (custom spine) | Held |
| Forbidden action type: execution with CHECK_STATE | ACTION_NOT_ALLOWED | Held |
| Wrong index in custom order: settle at position 0 | SEQUENCE_VIOLATION | Held |
| Step regression: run forward then send a completed step again | SEQUENCE_VIOLATION | Held |
| Nonce replay: same nonce sent twice with delay | REPLAY_NONCE | Held |
| Hokianga step on MSMD spine: dialect step without spine flag | UNKNOWN_STEP | Held |
| Sealed sequence re-entry: full 8-step then attempt intake again | SEALED_SEQUENCE | Held |
2 scenarios selected at random per daily run. Running since 2026-03-25. Pool covers: FUNCTION_STEP_MISMATCH, ACTION_NOT_ALLOWED, UNKNOWN_STEP (3 variants), SEQUENCE_VIOLATION (3 variants), REPLAY_NONCE, SEALED_SEQUENCE.
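Picking 2 scenarios out of 10 without repeats can be done with a partial Fisher-Yates shuffle. This is a sketch of one reasonable approach; the function name `pickScenarios` and the injectable `random` parameter are assumptions, and the monitoring agent's real selection logic is not described in the text.

```typescript
// Returns `count` distinct items chosen uniformly at random from `pool`.
// `random` is injectable so a run can be made reproducible.
function pickScenarios<T>(
  pool: T[],
  count: number,
  random: () => number = Math.random
): T[] {
  const copy = pool.slice(); // leave the caller's pool untouched
  const n = Math.min(count, copy.length);
  for (let i = 0; i < n; i++) {
    // Swap position i with a random position from i to the end.
    const j = i + Math.floor(random() * (copy.length - i));
    const tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy.slice(0, n);
}
```

Sampling without replacement guarantees the two daily scenarios are different, which a pair of independent random draws would not.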
Known gaps — tests not yet run
Transparency requires naming what has not been tested. These are on the roadmap. None are known vulnerabilities — they are coverage gaps in the test record.
Cross-tenant isolation: there is no formal proof yet that sequence_ids under different client keys cannot interfere. The architecture isolates by Durable Object key (sequence_id is namespaced), but a dedicated cross-tenant test has not been run.
Receipt tampering: deliberately modifying a stored R2 receipt and confirming the verification portal returns CHAIN_BROKEN. The HMAC signature would catch this, but the end-to-end test (corrupt → verify → detect) has not been formally recorded.
Long-delay nonce replay: all current nonce replay tests are same-session. A 48-hour hold test (run sequence, wait two days, replay nonce) has not been run. This would confirm the DO nonce ledger survives idle eviction.
Concurrent settle: two simultaneous requests both carrying the final step. The burst test approaches this but does not specifically target the settle step, so seal atomicity under concurrent settle attempts has not been formally proven.
Unicode lookalikes: step names using Unicode lookalikes for intake or settle. The spine comparison is byte-exact, but a formal test of visually similar character substitution has not been run.
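As an illustration of the last gap, the sketch below shows why an exact comparison is expected to reject a lookalike. The two-entry `SPINE_STEPS` list and both helper names are hypothetical, and this is not a substitute for the test against the live gate that the text says is still outstanding.

```typescript
// Placeholder spine: only the two step names mentioned in this gap.
const SPINE_STEPS: string[] = ["intake", "settle"];

// Exact comparison: every code unit must match a known step name.
function isKnownStep(step: string): boolean {
  return SPINE_STEPS.indexOf(step) !== -1;
}

// Flags any character outside the ASCII range, a cheap pre-check that
// catches lookalikes drawn from other scripts.
function hasNonAscii(step: string): boolean {
  return /[^\x00-\x7F]/.test(step);
}

// "\u0456" is CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I, which renders
// much like a Latin "i" in most fonts.
const lookalikeIntake = "\u0456ntake";
```

Here `isKnownStep(lookalikeIntake)` is `false` and `hasNonAscii(lookalikeIntake)` is `true`, while the genuine `"intake"` passes the first check and is not flagged by the second.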
Verify the claims independently
The 1M test data is publicly downloadable. Any sequence ID beginning with demo- can be verified at report.agenticrail.nz without an API key — full receipt chain, HMAC verification, and chain linkage proof. Run the demo at agenticrail.nz/demo/ and verify the sequence ID it returns. The receipts are the proof. No trust required.