Security Hardening

Every attack vector tested. Every result recorded. The gate has been run against poison injection, replay attacks, sequence skips, concurrent race conditions, and adversarial probe suites — all against the live deployment, not a test environment. Evidence is public and independently verifiable.

1,045,508

Verified requests

Enforcement errors

Poison categories blocked

Adversarial test suites

Injection hardening

The gate's checkPoison layer runs before payload evaluation. Any request carrying these patterns is rejected with HALT before reaching the enforcement core. Three attack categories are in scope; all are blocked at the gate layer, not the application layer.

HALT — Blocked

Role directive injection

Patterns: system:, <|im_start|>, {role:. Attempts to embed LLM role or instruction directives in JSON payload fields — action, input, sequence_id, step. All field values are scanned before evaluation.

HALT — Blocked

Base64 blob injection

Long base64-encoded strings in any payload field. Common vector for encoding instructions or binary payloads that bypass text-pattern filters. The gate detects and halts regardless of which field carries the blob.

HALT — Fixed May 2026

YAML front matter injection

--- delimiters as both raw newlines and JSON-encoded \n. The initial regex matched literal newline characters only — JSON.stringify encodes newlines as \\n, so the pattern never fired. Regex updated to match both forms. Found by the daily adversarial probe.

How the YAML gap was found

The daily monitoring agent runs rotating adversarial probes against the live gate each morning. On 2026-05-03 it reported YAML front matter returning ALLOW instead of DENY. Root cause: checkPoison used a literal-newline regex against JSON-serialised bodies where newlines are two-character escape sequences. The fix was deployed the same session. The probe found a genuine gap in a production security control — not a theoretical edge case. 1M test data →

Pressure test record

All tests run against the live Cloudflare Workers deployment — not a staging environment, not a mock. Results are machine-recorded JSON files stored in the evidence directory. The 1M test results are publicly downloadable.

Test	Requests	Enforcement accuracy	What it proves
1M Full-Sequence 125,000 complete 8-step MSMD sequences · 50 concurrent workers · 68 req/s · 3h 46m · rural NZ hotspot	923,983	100% 0 enforcement errors 0 false DENY	Rail is deterministic at scale. Every ALLOW correct, no misfire on valid sequences over 3+ hours.	Data →
100K Enterprise 33,334 sequences × 3 scenarios: valid, skip, replay	100,002	99.98% 0.02% network errors	Skip-ahead and replay enforcement under sustained load. Valid sequences always pass; invalid sequences always block.	✓
Step Transition 1,000 sequences × 3 scenarios: valid, skip, replay	3,000	100% zero errors	Step ordering enforcement is perfect. Skip-ahead: 100% blocked. Replay: 100% blocked. Valid: 100% pass.	✓
Action Type Mismatch Valid MSMD functions with wrong action_type deliberately set	3,003	100% 3,003 blocked 0 false allows	The function/action_type contract is enforced without exception. No wrong-action-type request ever passes.	✓
Concurrent Burst 200 sequences × 10-request simultaneous burst each	2,000	100% 200/200 bursts: exactly 1 ALLOW 0 race leaks	Durable Object single-threaded model prevents race conditions. Concurrent requests on the same sequence are serialised — no double-advance possible.	✓
Seal Behaviour 100 full 8-step sequences + post-seal breach attempt each	900	100% 100/100 sequences sealed 0 seal leaks	The seal is permanent. After `settle`, no further steps are accepted on the sequence — ever. Post-seal attempts always DENY.	✓
Nonce Replay 300 sequences × first request + replay of same nonce	600	100% 300/300 replays blocked	Nonce replay protection is absolute. The same nonce is never accepted twice on the same sequence, regardless of timing.	✓
Intake Node Pressure Single-step entry point · concurrency 300	10,000	—	Entry point throughput ceiling under maximum concurrency. Measures CPU and latency at the first enforcement step.	✓

Total verified requests across all pressure tests: 1,045,508. JSON evidence files with full latency distributions available in the test evidence directory.

Live breach suite

Ten targeted tests, each designed to breach one specific enforcement rule. All 10 run against the live production gate — not unit tests, not mocks. Result: all 10 held.

Test	Attack	Expected	Result
T1	Unknown function — `garbage_step` as function name	DENY — UNKNOWN_FUNCTION	HELD
T2	Forbidden action_type on disruption step	DENY — ACTION_NOT_ALLOWED	HELD
T3	Sequence skip — jump from step 1 to step 3	DENY — SEQUENCE_VIOLATION	HELD
T4	Function/step mismatch — `step ≠ function`	DENY — FUNCTION_STEP_MISMATCH	HELD
T5	Nonce replay — same nonce sent twice	DENY — REPLAY_NONCE	HELD
T6	Forbidden action_type on execution step	DENY — ACTION_NOT_ALLOWED	HELD
T7	Skip to settle — jump to final step from step 2	DENY — SEQUENCE_VIOLATION	HELD
T8	Settle with wrong function name	DENY — FUNCTION_STEP_MISMATCH	HELD
T9	Clean 8-step MSMD run — full sequence seal	ALLOW × 8 → SEALED	SEALED
T10	Report generator verification of sealed sequence	VERIFIED_INTACT	VERIFIED

Architecture edge cases

Five edge cases that test the enforcement boundary conditions — not the happy path, the corners. All five pass on the live deployment.

Test	Edge case	Result
Duplicate step names	`step_order: ["intake", "intake", "settle"]` — duplicate in the sequence definition	Handled gracefully — ALLOW or DENY, no crash
DO nonce cap	501 unique nonces on a single sequence — past the nonce ledger eviction threshold	Nonce eviction working — no crash, no memory leak, no replay admitted
Timestamp boundary	`ts_ms` exactly 299,999ms old — 1ms inside the 300s freshness window	ALLOW — correctly inside window
XSS in sequence_id	`<script>alert(1)</script>` as sequence_id in report generator	Sanitized — raw script tag not reflected in response
Empty step_order	`step_order: []` — empty array sent with intake request	Fell back to MSMD spine — ALLOW on valid intake

Daily adversarial monitoring

The agent that monitors AgenticRail runs 2 randomly selected scenarios from a pool of 10 adversarial patterns every morning. This is the probe that found the YAML injection gap on 2026-05-03. The pool covers all 6 core enforcement rules.

Scenario	Rule tested	Status
Function/step mismatch — `step=intake, function=disruption`	FUNCTION_STEP_MISMATCH	Held
Sequence skip — intake then instability (skip disruption)	SEQUENCE_VIOLATION	Held
Unknown step — random string not in spine	UNKNOWN_STEP	Held
Custom order exclusion — step not in custom step_order	UNKNOWN_STEP (custom spine)	Held
Forbidden action type — execution with CHECK_STATE	ACTION_NOT_ALLOWED	Held
Wrong index in custom order — settle at position 0	SEQUENCE_VIOLATION	Held
Step regression — run forward then send a completed step again	SEQUENCE_VIOLATION	Held
Nonce replay — same nonce sent twice with delay	REPLAY_NONCE	Held
Hokianga step on MSMD spine — dialect step without spine flag	UNKNOWN_STEP	Held
Sealed sequence re-entry — full 8-step then attempt intake again	SEALED_SEQUENCE	Held

2 scenarios selected at random per daily run. Running since 2026-03-25. Pool covers: FUNCTION_STEP_MISMATCH, ACTION_NOT_ALLOWED, UNKNOWN_STEP (3 variants), SEQUENCE_VIOLATION (3 variants), REPLAY_NONCE, SEALED_SEQUENCE.

Known gaps — tests not yet run

Transparency requires naming what has not been tested. These are on the roadmap. None are known vulnerabilities — they are coverage gaps in the test record.

Multi-tenant isolation

Formal proof that sequence_ids under different client keys cannot interfere. The architecture isolates by Durable Object key (sequence_id is namespaced), but a dedicated cross-tenant test has not been run.

Receipt chain tampering

Deliberately modifying a stored R2 receipt and confirming the verification portal returns CHAIN_BROKEN. The HMAC signature would catch this, but the end-to-end test (corrupt → verify → detect) has not been formally recorded.

Long-lived nonce persistence

All current nonce replay tests are same-session. A 48-hour hold test — run sequence, wait two days, replay nonce — has not been run. This would confirm the DO nonce ledger survives idle eviction.

Concurrent settle race

Two simultaneous requests both carrying the final step. The burst test approaches this but does not specifically target the settle step. Formal proof of seal atomicity under concurrent settle attempts.

Unicode / homoglyph step names

Step names using Unicode lookalikes for intake or settle. The spine comparison is byte-exact, but a formal test of visually-similar character substitution has not been run.

Verify the claims independently

The 1M test data is publicly downloadable. Any sequence ID beginning with demo- can be verified at report.agenticrail.nz without an API key — full receipt chain, HMAC verification, and chain linkage proof. Run the demo at agenticrail.nz/demo/ and verify the sequence ID it returns. The receipts are the proof. No trust required.

Try the demo → 1M test data → Verify a sequence → Compliance matrix →