Security Hardening

Every attack vector listed on this page has been tested and every result recorded. The gate has been run against poison injection, replay attacks, sequence skips, concurrent race conditions, and adversarial probe suites, all against the live deployment rather than a test environment. Evidence is public and independently verifiable.

Verified requests: 1,045,508
Enforcement errors: 0
Poison categories blocked: 3
Adversarial test suites: 11

Injection hardening

The gate's checkPoison layer runs before payload evaluation. Any request carrying one of the patterns below is rejected with HALT before it reaches the enforcement core. Three attack categories are in scope; all are blocked at the gate layer, not the application layer.

Role directive injection (HALT, blocked)

Patterns: system:, <|im_start|>, {role:. These are attempts to embed LLM role or instruction directives in JSON payload fields (action, input, sequence_id, step). All field values are scanned before evaluation.
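A minimal sketch of how a field scan for these patterns might look. The function name findRoleDirective, the regular expressions, and the case-insensitive matching are illustrative assumptions based on the description above, not the gate's actual checkPoison code.

```typescript
// Hypothetical patterns derived from the three markers named in the text.
// The optional quote in the last pattern is an assumption so that
// JSON-serialised objects such as {"role": ...} are also caught.
const ROLE_DIRECTIVE_PATTERNS: RegExp[] = [
  /system:/i,
  /<\|im_start\|>/i,
  /\{\s*"?role"?\s*:/i,
];

// Payload fields named in the text as being scanned.
const SCANNED_FIELDS = ["action", "input", "sequence_id", "step"] as const;

type Payload = Partial<Record<(typeof SCANNED_FIELDS)[number], unknown>>;

// Returns the name of the first field carrying a role directive, or null.
function findRoleDirective(payload: Payload): string | null {
  for (const field of SCANNED_FIELDS) {
    const value = payload[field];
    if (value === undefined) continue;
    // Non-string values are serialised so nested content is scanned too.
    const text = typeof value === "string" ? value : JSON.stringify(value);
    if (ROLE_DIRECTIVE_PATTERNS.some((re) => re.test(text))) return field;
  }
  return null;
}
```

A caller would treat any non-null return value as a HALT. Note that a plain substring pattern such as system: will also flag benign text like "filesystem: ok", so a scan of this shape trades some false positives for simplicity.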

Base64 blob injection (HALT, blocked)

Long base64-encoded strings in any payload field. This is a common vector for encoding instructions or binary payloads that bypass text-pattern filters. The gate detects and halts the request regardless of which field carries the blob.
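One way such a check could be written is sketched below. The 64-character threshold, the regular expression, and the name containsBase64Blob are assumptions for illustration; the text does not state what length the gate treats as "long".

```typescript
// Assumption: 64 characters is an illustrative threshold only.
const MIN_BLOB_LENGTH = 64;

// A run of base64 alphabet characters of at least the threshold length,
// optionally followed by up to two "=" padding characters.
const BASE64_RUN = new RegExp(`[A-Za-z0-9+/]{${MIN_BLOB_LENGTH},}={0,2}`);

// Walks strings, arrays, and nested objects so the field carrying the
// blob does not matter.
function containsBase64Blob(value: unknown): boolean {
  if (typeof value === "string") return BASE64_RUN.test(value);
  if (Array.isArray(value)) return value.some((item) => containsBase64Blob(item));
  if (value !== null && typeof value === "object") {
    return Object.values(value).some((item) => containsBase64Blob(item));
  }
  return false;
}
```

A length-based pattern like this also matches any long unbroken alphanumeric run, such as a 64-character hex digest, so a real deployment would need to choose the threshold with its legitimate payloads in mind.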

YAML front matter injection (HALT, fixed May 2026)

Patterns: --- delimiters followed by either a raw newline or a JSON-encoded \n. The initial regex matched literal newline characters only. Because JSON.stringify encodes a newline as the two-character sequence \n (a backslash followed by n), the pattern never fired against serialised bodies. The regex was updated to match both forms. Found by the daily adversarial probe.

How the YAML gap was found

The daily monitoring agent runs rotating adversarial probes against the live gate each morning. On 2026-05-03 it reported that a YAML front matter payload returned ALLOW instead of DENY. Root cause: checkPoison applied a literal-newline regex to JSON-serialised bodies, where each newline is a two-character escape sequence. The fix was deployed in the same session. The probe found a genuine gap in a production security control, not a theoretical edge case.
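The root cause can be reproduced in a few lines. The two regular expressions below are reconstructions for illustration, assuming the check runs against the JSON.stringify output of the request body; the gate's actual patterns are not shown in this document.

```typescript
// Original-style pattern: matches "---" followed by a real newline character.
const LITERAL_ONLY = /---\n/;

// Updated-style pattern: matches "---" followed by a real newline OR by the
// two-character escape sequence (backslash, then "n") found in JSON text.
const BOTH_FORMS = /---(?:\n|\\n)/;

function hasYamlFrontMatter(text: string, pattern: RegExp): boolean {
  return pattern.test(text);
}

// A payload whose input field starts with YAML front matter.
const yamlBody = { input: "---\ntitle: injected\n---\n" };

// After serialisation the newlines are no longer newline characters.
const yamlSerialised = JSON.stringify(yamlBody);

const missedByOldPattern = !hasYamlFrontMatter(yamlSerialised, LITERAL_ONLY);
const caughtByNewPattern = hasYamlFrontMatter(yamlSerialised, BOTH_FORMS);
const rawStillCaught = hasYamlFrontMatter(yamlBody.input, BOTH_FORMS);
```

Here missedByOldPattern, caughtByNewPattern, and rawStillCaught are all true: the literal-newline pattern never fires on the serialised body, while the two-form pattern fires on both the serialised and the raw text.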

Pressure test record

All tests were run against the live Cloudflare Workers deployment, not a staging environment or a mock. Results are machine-recorded JSON files stored in the evidence directory. The 1M test results are publicly downloadable.

| Test | Requests | Enforcement accuracy | What it proves |
| --- | --- | --- | --- |
| 1M Full-Sequence: 125,000 complete 8-step MSMD sequences · 50 concurrent workers · 68 req/s · 3h 46m · rural NZ hotspot | 923,983 | 100%; 0 enforcement errors; 0 false DENY | Rail is deterministic at scale. Every ALLOW correct, no misfire on valid sequences over 3+ hours. |
| 100K Enterprise: 33,334 sequences × 3 scenarios (valid, skip, replay) | 100,002 | 99.98%; 0.02% network errors | Skip-ahead and replay enforcement under sustained load. Valid sequences always pass; invalid sequences always block. |
| Step Transition: 1,000 sequences × 3 scenarios (valid, skip, replay) | 3,000 | 100%; zero errors | Step ordering enforcement is perfect. Skip-ahead: 100% blocked. Replay: 100% blocked. Valid: 100% pass. |
| Action Type Mismatch: valid MSMD functions with wrong action_type deliberately set | 3,003 | 100%; 3,003 blocked; 0 false allows | The function/action_type contract is enforced without exception. No wrong-action-type request ever passes. |
| Concurrent Burst: 200 sequences × 10-request simultaneous burst each | 2,000 | 100%; 200/200 bursts with exactly 1 ALLOW; 0 race leaks | Durable Object single-threaded model prevents race conditions. Concurrent requests on the same sequence are serialised, so no double-advance is possible. |
| Seal Behaviour: 100 full 8-step sequences + post-seal breach attempt each | 900 | 100%; 100/100 sequences sealed; 0 seal leaks | The seal is permanent. After settle, no further steps are ever accepted on the sequence. Post-seal attempts always DENY. |
| Nonce Replay: 300 sequences × first request + replay of same nonce | 600 | 100%; 300/300 replays blocked | Nonce replay protection is absolute. The same nonce is never accepted twice on the same sequence, regardless of timing. |
| Intake Node Pressure: single-step entry point · concurrency 300 | 10,000 | | Entry point throughput ceiling under maximum concurrency. Measures CPU and latency at the first enforcement step. |

Total verified requests across all pressure tests: 1,045,508. JSON evidence files with full latency distributions available in the test evidence directory.
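The step transition, concurrent burst, seal, and nonce replay rows all describe properties of one per-sequence state machine. The sketch below models that behaviour in memory under stated assumptions: the class name SequenceState, the three-step DEMO_SPINE (the real MSMD spine has 8 steps), the order of the checks, and the choice to record a nonce even when the request is denied are all hypothetical. Because advance is synchronous, calls are handled one at a time, which loosely mirrors the serialised handling the table attributes to the Durable Object model.

```typescript
type Decision =
  | { result: "ALLOW"; sealed: boolean }
  | { result: "DENY"; reason: "SEALED_SEQUENCE" | "REPLAY_NONCE" | "SEQUENCE_VIOLATION" };

// Illustrative spine only; not the real 8-step MSMD spine.
const DEMO_SPINE: readonly string[] = ["intake", "disruption", "settle"];

class SequenceState {
  private readonly spine: readonly string[];
  private nextIndex = 0;
  private sealed = false;
  private readonly seenNonces = new Set<string>();

  constructor(spine: readonly string[]) {
    this.spine = spine;
  }

  // Each call is a complete check-and-advance; nothing can interleave.
  advance(step: string, nonce: string): Decision {
    if (this.sealed) return { result: "DENY", reason: "SEALED_SEQUENCE" };
    if (this.seenNonces.has(nonce)) return { result: "DENY", reason: "REPLAY_NONCE" };
    this.seenNonces.add(nonce);
    if (this.spine[this.nextIndex] !== step) {
      return { result: "DENY", reason: "SEQUENCE_VIOLATION" };
    }
    this.nextIndex += 1;
    this.sealed = this.nextIndex === this.spine.length;
    return { result: "ALLOW", sealed: this.sealed };
  }
}

// Sends `count` requests for the same step with distinct nonces and
// returns how many were allowed.
function burst(state: SequenceState, step: string, count: number, noncePrefix: string): number {
  let allows = 0;
  for (let i = 0; i < count; i++) {
    if (state.advance(step, `${noncePrefix}-${i}`).result === "ALLOW") allows += 1;
  }
  return allows;
}
```

In this model a 10-request burst for intake on a fresh sequence yields exactly one ALLOW, because the first call advances the index and the remaining nine no longer match the expected step.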

Live breach suite

Ten targeted tests, all run against the live production gate, not unit tests or mocks. T1 to T8 each attempt to breach one specific enforcement rule; T9 and T10 confirm that a clean run seals and that the sealed sequence verifies. Result: all 10 held.

| Test | Attack | Expected | Result |
| --- | --- | --- | --- |
| T1 | Unknown function: garbage_step as function name | DENY (UNKNOWN_FUNCTION) | HELD |
| T2 | Forbidden action_type on disruption step | DENY (ACTION_NOT_ALLOWED) | HELD |
| T3 | Sequence skip: jump from step 1 to step 3 | DENY (SEQUENCE_VIOLATION) | HELD |
| T4 | Function/step mismatch: step ≠ function | DENY (FUNCTION_STEP_MISMATCH) | HELD |
| T5 | Nonce replay: same nonce sent twice | DENY (REPLAY_NONCE) | HELD |
| T6 | Forbidden action_type on execution step | DENY (ACTION_NOT_ALLOWED) | HELD |
| T7 | Skip to settle: jump to final step from step 2 | DENY (SEQUENCE_VIOLATION) | HELD |
| T8 | Settle with wrong function name | DENY (FUNCTION_STEP_MISMATCH) | HELD |
| T9 | Clean 8-step MSMD run: full sequence seal | ALLOW × 8 → SEALED | SEALED |
| T10 | Report generator verification of sealed sequence | VERIFIED_INTACT | VERIFIED |
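Tests T1, T2, T4, T6, and T8 exercise checks that depend only on the request itself, not on sequence state. A rough sketch of such a stateless contract check follows. The reason codes come from the table above; everything else is assumed for illustration: the name checkContract, the rule that a function name must equal its step name, the order of the checks, and the contents of the ALLOWED_ACTIONS table (only "CHECK_STATE is forbidden on execution" is taken from this page, and PLACEHOLDER_ACTION is a made-up action type).

```typescript
type DenyReason = "UNKNOWN_FUNCTION" | "FUNCTION_STEP_MISMATCH" | "ACTION_NOT_ALLOWED";
type Verdict = { result: "ALLOW" } | { result: "DENY"; reason: DenyReason };

interface GateRequest {
  step: string;
  function: string;
  action_type: string;
}

// Hypothetical contract table: which action types each function accepts.
const ALLOWED_ACTIONS = new Map<string, readonly string[]>([
  ["intake", ["CHECK_STATE"]],
  ["disruption", ["CHECK_STATE"]],
  ["execution", ["PLACEHOLDER_ACTION"]],
  ["settle", ["PLACEHOLDER_ACTION"]],
]);

function checkContract(req: GateRequest): Verdict {
  const allowed = ALLOWED_ACTIONS.get(req.function);
  // Function name is not in the contract table at all.
  if (allowed === undefined) return { result: "DENY", reason: "UNKNOWN_FUNCTION" };
  // Assumed rule: the function must be the one bound to the declared step.
  if (req.function !== req.step) return { result: "DENY", reason: "FUNCTION_STEP_MISMATCH" };
  // Function is known and matches the step, but the action type is not permitted.
  if (!allowed.includes(req.action_type)) return { result: "DENY", reason: "ACTION_NOT_ALLOWED" };
  return { result: "ALLOW" };
}
```

Keeping these checks as a pure function of the request makes them easy to test in isolation, separately from the stateful ordering, nonce, and seal rules.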

Architecture edge cases

Five edge cases that probe the enforcement boundary conditions: the corners, not the happy path. All five pass on the live deployment.

| Test | Edge case | Result |
| --- | --- | --- |
| Duplicate step names | step_order: ["intake", "intake", "settle"] (duplicate in the sequence definition) | Handled gracefully: ALLOW or DENY, no crash |
| DO nonce cap | 501 unique nonces on a single sequence, past the nonce ledger eviction threshold | Nonce eviction working: no crash, no memory leak, no replay admitted |
| Timestamp boundary | ts_ms exactly 299,999ms old, 1ms inside the 300s freshness window | ALLOW: correctly inside window |
| XSS in sequence_id | <script>alert(1)</script> as sequence_id in report generator | Sanitized: raw script tag not reflected in response |
| Empty step_order | step_order: [] (empty array sent with intake request) | Fell back to MSMD spine: ALLOW on valid intake |
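Three of these edge cases reduce to small pure functions, sketched below. The function names are hypothetical, and two behaviours are assumptions rather than facts from this page: that a timestamp exactly 300,000ms old is rejected, and that sanitisation is done by HTML-escaping (the table only says the raw tag is not reflected).

```typescript
// 300 s freshness window, as stated in the table.
const FRESHNESS_WINDOW_MS = 300 * 1000;

// Assumption: age must be strictly less than the window.
function isFresh(tsMs: number, nowMs: number): boolean {
  return nowMs - tsMs < FRESHNESS_WINDOW_MS;
}

// An absent or empty custom step_order falls back to the default spine.
function resolveStepOrder(
  custom: readonly string[] | undefined,
  defaultSpine: readonly string[],
): readonly string[] {
  return custom !== undefined && custom.length > 0 ? custom : defaultSpine;
}

// Escapes the characters that are significant in HTML so that a value such
// as a sequence_id can be placed in a page without being parsed as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

The ampersand is escaped first so that the entities produced by the later replacements are not escaped a second time.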

Daily adversarial monitoring

The agent that monitors AgenticRail runs 2 randomly selected scenarios from a pool of 10 adversarial patterns every morning. This is the probe that found the YAML injection gap on 2026-05-03. The pool covers all 6 core enforcement rules.

| Scenario | Rule tested | Status |
| --- | --- | --- |
| Function/step mismatch: step=intake, function=disruption | FUNCTION_STEP_MISMATCH | Held |
| Sequence skip: intake then instability (skip disruption) | SEQUENCE_VIOLATION | Held |
| Unknown step: random string not in spine | UNKNOWN_STEP | Held |
| Custom order exclusion: step not in custom step_order | UNKNOWN_STEP (custom spine) | Held |
| Forbidden action type: execution with CHECK_STATE | ACTION_NOT_ALLOWED | Held |
| Wrong index in custom order: settle at position 0 | SEQUENCE_VIOLATION | Held |
| Step regression: run forward then send a completed step again | SEQUENCE_VIOLATION | Held |
| Nonce replay: same nonce sent twice with delay | REPLAY_NONCE | Held |
| Hokianga step on MSMD spine: dialect step without spine flag | UNKNOWN_STEP | Held |
| Sealed sequence re-entry: full 8-step then attempt intake again | SEALED_SEQUENCE | Held |

2 scenarios selected at random per daily run. Running since 2026-03-25. Pool covers: FUNCTION_STEP_MISMATCH, ACTION_NOT_ALLOWED, UNKNOWN_STEP (3 variants), SEQUENCE_VIOLATION (3 variants), REPLAY_NONCE, SEALED_SEQUENCE.
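Selecting 2 distinct scenarios from a pool of 10 could be done with a partial Fisher-Yates shuffle, as in the sketch below. The function name pickScenarios and the injectable random source are assumptions for illustration; how the monitoring agent actually makes its selection is not described here.

```typescript
// Short labels for the ten scenarios in the table above.
const SCENARIO_POOL: readonly string[] = [
  "function-step-mismatch",
  "sequence-skip",
  "unknown-step",
  "custom-order-exclusion",
  "forbidden-action-type",
  "wrong-index-in-custom-order",
  "step-regression",
  "nonce-replay",
  "hokianga-step-on-msmd-spine",
  "sealed-sequence-re-entry",
];

// Picks `count` distinct items using a partial Fisher-Yates shuffle.
// `random` must return a number in [0, 1); it is injectable so that a
// run can be made reproducible in tests.
function pickScenarios<T>(
  pool: readonly T[],
  count: number,
  random: () => number = Math.random,
): T[] {
  const copy = [...pool];
  const picked: T[] = [];
  const n = Math.min(count, copy.length);
  for (let i = 0; i < n; i++) {
    // Choose an index from the not-yet-picked tail and swap it into place.
    const j = i + Math.floor(random() * (copy.length - i));
    [copy[i], copy[j]] = [copy[j], copy[i]];
    picked.push(copy[i]);
  }
  return picked;
}
```

With uniform selection of 2 from 10, any given scenario is chosen on a given day with probability 2/10, so each one is expected to run roughly once every five days.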

Known gaps — tests not yet run

Transparency requires naming what has not been tested. These are on the roadmap. None are known vulnerabilities — they are coverage gaps in the test record.

Multi-tenant isolation

Formal proof that sequence_ids under different client keys cannot interfere. The architecture isolates by Durable Object key (sequence_id is namespaced), but a dedicated cross-tenant test has not been run.

Receipt chain tampering

Deliberately modifying a stored R2 receipt and confirming the verification portal returns CHAIN_BROKEN. The HMAC signature would catch this, but the end-to-end test (corrupt → verify → detect) has not been formally recorded.

Long-lived nonce persistence

All current nonce replay tests are same-session. A 48-hour hold test — run sequence, wait two days, replay nonce — has not been run. This would confirm the DO nonce ledger survives idle eviction.

Concurrent settle race

Two simultaneous requests both carrying the final step. The burst test approaches this but does not specifically target the settle step. Formal proof of seal atomicity under concurrent settle attempts is still outstanding.

Unicode / homoglyph step names

Step names using Unicode lookalikes for intake or settle. The spine comparison is byte-exact, but a formal test of visually-similar character substitution has not been run.
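A local illustration of why exact comparison is expected to reject lookalikes is sketched below. It is not a substitute for the live test described above. The helper names and the two-step spine are hypothetical, and JavaScript string equality compares UTF-16 code units, which is used here as a stand-in for the byte-exact comparison the text describes.

```typescript
// Illustrative spine fragment; only these two step names are needed here.
const SPINE_STEPS: readonly string[] = ["intake", "settle"];

// Exact match against the spine: no normalisation, no case folding.
function matchesSpineStep(step: string): boolean {
  return SPINE_STEPS.includes(step);
}

// Flags any character outside the ASCII range, which is one simple way a
// test harness could label a step name as a possible homoglyph attempt.
function hasNonAscii(step: string): boolean {
  return /[^\x00-\x7F]/.test(step);
}

const latinIntake = "intake";
// Cyrillic small letter byelorussian-ukrainian i (U+0456) replaces Latin "i".
const lookalikeIntake = "\u0456ntake";
```

The two strings render almost identically, yet matchesSpineStep(latinIntake) is true and matchesSpineStep(lookalikeIntake) is false, while hasNonAscii flags only the lookalike.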

Verify the claims independently

The 1M test data is publicly downloadable. Any sequence ID beginning with demo- can be verified at report.agenticrail.nz without an API key — full receipt chain, HMAC verification, and chain linkage proof. Run the demo at agenticrail.nz/demo/ and verify the sequence ID it returns. The receipts are the proof. No trust required.
