AgenticRail
April 27, 2026 · Public test data

913,079 ALLOW.
Zero denials.

We planned 1 million enforcement requests, sent 923,983 of them through the AgenticRail gate, and published every receipt. No errors. No corrupted state. No false negatives.

| Metric | Value |
| --- | --- |
| Enforcement errors across 913,079 decisions | 0 |
| Requests sent | 923,983 |
| ALLOW decisions | 913,079 |
| Sealed sequences | 114,096 |
| DENY (enforcement errors) | 0 |
| Sustained req/s | 68.07 |
| Median latency (p50) | 556ms |
| Test duration | 3.8 hrs (3 hours 46 minutes) |
| Receipts written to R2 | 522 MB |

What we tested

125,000 full 8-step MSMD sequences. Each sequence runs intake through settle: the entire enforcement pipeline from first contact to sealed ledger. That's 1,000,000 planned requests, of which 923,983 were sent (76,017 were lost to connection drops; see the transparency section below).
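The request accounting can be checked with a few lines of arithmetic. This is a minimal sketch using only figures quoted on this page; the variable names are illustrative and not taken from the test script.

```javascript
// Figures copied from this page; names are illustrative only.
const sequencesPlanned = 125_000;
const stepsPerSequence = 8;
const requestsSent = 923_983;
const allowDecisions = 913_079;
const sealedSequences = 114_096;

const requestsPlanned = sequencesPlanned * stepsPerSequence;   // planned total
const neverSent = requestsPlanned - requestsSent;              // planned but not sent
const noReply = requestsSent - allowDecisions;                 // sent, no decision
const unsealedSequences = sequencesPlanned - sealedSequences;  // did not reach a seal

console.log({ requestsPlanned, neverSent, noReply, unsealedSequences });
// { requestsPlanned: 1000000, neverSent: 76017, noReply: 10904, unsealedSequences: 10904 }
```

The number of unsealed sequences equals the number of requests that received no decision, which is consistent with one unanswered request per unsealed sequence. That reading is an inference from the totals, not something the results state directly.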

Every request traveled the full path:

    // Request path: 6 hops, 3 Workers, 3 Durable Objects
    ClientWrapper (auth + normalize)
      → Gate (harden + cache)
      → Core (policy + timestamp + sequence enforcement)
      → DO (Slp8Sequence: state, nonce, seal)
      → R2 (receipt) + KV (stats) + DO (StatsCounter)

Each sealed sequence produces 8 cryptographically chained receipts. Each receipt is HMAC-SHA256 signed at write time. The previous receipt's pack_id is embedded in the next — the chain cannot be altered without invalidating every subsequent link.
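The chaining idea can be sketched in Node.js. This is a minimal illustration, not AgenticRail's implementation: only `pack_id` and HMAC-SHA256 come from the description above. The other field names (`prev_pack_id`, `step`, `decision`, `sig`), the signing key, and the rule that `pack_id` is derived from the signed content are all assumptions made for the example.

```javascript
const { createHash, createHmac, timingSafeEqual } = require("node:crypto");

// Canonical bytes that get signed. Field names other than pack_id are assumptions.
function payloadOf(r) {
  return JSON.stringify([r.prev_pack_id, r.step, r.decision]);
}

function sign(r, key) {
  return createHmac("sha256", key).update(payloadOf(r)).digest("hex");
}

// Assumption: pack_id is derived from the signed content, so editing a receipt changes its id.
function packIdOf(r, sig) {
  return createHash("sha256").update(payloadOf(r) + sig).digest("hex").slice(0, 16);
}

function buildChain(steps, key) {
  const chain = [];
  let prev = null;
  for (const step of steps) {
    const r = { prev_pack_id: prev, step, decision: "ALLOW" };
    r.sig = sign(r, key);
    r.pack_id = packIdOf(r, r.sig);
    chain.push(r);
    prev = r.pack_id;
  }
  return chain;
}

function safeEqualHex(a, b) {
  const x = Buffer.from(a, "hex");
  const y = Buffer.from(b, "hex");
  return x.length === y.length && timingSafeEqual(x, y);
}

// Returns the index of the first broken receipt, or -1 if the whole chain verifies.
function firstBrokenLink(chain, key) {
  let prev = null;
  for (let i = 0; i < chain.length; i++) {
    const r = chain[i];
    const sigOk = safeEqualHex(String(r.sig), sign(r, key));
    const idOk = r.pack_id === packIdOf(r, r.sig);
    const linkOk = r.prev_pack_id === prev;
    if (!sigOk || !idOk || !linkOk) return i;
    prev = r.pack_id;
  }
  return -1;
}

const SPINE = ["intake", "disruption", "instability", "state_read",
  "internal_driver", "execution", "boundary", "settle"];
const chain = buildChain(SPINE, "demo-key"); // "demo-key" is a placeholder
console.log(firstBrokenLink(chain, "demo-key")); // -1: chain verifies

chain[2].decision = "DENY"; // tamper with the third receipt
console.log(firstBrokenLink(chain, "demo-key")); // 2: signature no longer matches
```

In this sketch, even someone holding the key who re-signs the altered receipt ends up with a new `pack_id`, so the next receipt's `prev_pack_id` stops matching and the break moves one link down the chain.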

How we ran it

| Parameter | Value |
| --- | --- |
| Concurrent workers | 50 |
| Step delay | 80ms (DO breathing room) |
| Client timeout | 90 seconds |
| Spine | intake → disruption → instability → state_read → internal_driver → execution → boundary → settle |
| Connection | iPhone hotspot, Hokianga, rural New Zealand |
| Test script | Node.js ESM, native fetch, no external HTTP library |
| Duration | 3 hours 46 minutes (12:55pm – 4:41pm NZT) |

The hotspot detail matters. We ran this test over the same rural iPhone connection the developer uses every day — not a datacenter. The 10,806 DNS failures are an honest signal of that constraint, not a flaw in the rail.
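The run parameters above (50 concurrent workers, 80ms step delay, 90-second client timeout, native fetch) map onto a small worker pool. The sketch below is an assumption about the general shape of such a runner, not the published test script: the endpoint URL, request body, function names, and the choice to abandon a sequence at the first failed step are all hypothetical.

```javascript
// Step names come from the "Spine" parameter; everything else is a hypothetical sketch.
const SPINE = ["intake", "disruption", "instability", "state_read",
  "internal_driver", "execution", "boundary", "settle"];

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical transport: the URL and JSON body are placeholders, not the real API.
async function sendStepOverHttp(sequenceId, step) {
  const res = await fetch("https://example.invalid/enforce", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ sequence_id: sequenceId, step }),
    signal: AbortSignal.timeout(90_000), // 90-second client timeout
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

// Runs one sequence in spine order and stops at the first step that fails (assumed behavior).
async function runSequence(sequenceId, sendStep, stepDelayMs) {
  for (const step of SPINE) {
    try {
      await sendStep(sequenceId, step);
    } catch (err) {
      return { sequenceId, sealed: false, failedAt: step };
    }
    await sleep(stepDelayMs); // pause between steps
  }
  return { sequenceId, sealed: true, failedAt: null };
}

// Fixed-size pool: each worker pulls the next sequence index until none remain.
async function runPool({ total, concurrency = 50, stepDelayMs = 80, sendStep = sendStepOverHttp }) {
  const results = [];
  let next = 0;
  async function worker() {
    while (next < total) {
      const id = `seq-${next++}`;
      results.push(await runSequence(id, sendStep, stepDelayMs));
    }
  }
  await Promise.all(Array.from({ length: Math.min(concurrency, total) }, worker));
  return results;
}
```

Passing `sendStep` as a parameter keeps the pool logic testable without a network: a stub that throws on a chosen step simulates a dropped connection.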

Latency — over a rural hotspot

| Percentile | Latency |
| --- | --- |
| p50 (median) | 556ms |
| p95 | 1,474ms |
| p99 | 1,628ms |
| Minimum | 1.31ms |
| Maximum | 90s (client timeout threshold) |

Half of all enforcement decisions returned in under 600ms over a phone connection. The p99 under 2 seconds includes TLS handshake, Cloudflare edge routing, and the full 6-hop request path. The 1.31ms minimum is a service-binding call between Workers in the same Cloudflare data center — no public network involved.
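For readers who want to recompute percentiles from the raw results, here is a minimal sketch using the nearest-rank method. Whether the published figures were computed this way is not stated, so treat the method as an assumption; the sample values in the usage line are made up.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(sortedMs, p) {
  if (sortedMs.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(rank, 1) - 1];
}

function summarize(latenciesMs) {
  const sorted = [...latenciesMs].sort((a, b) => a - b); // numeric sort, input untouched
  return {
    min: sorted[0],
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    max: sorted[sorted.length - 1],
  };
}

// Made-up latencies in milliseconds, for illustration only.
console.log(summarize([120, 480, 556, 610, 700, 950, 1474, 1628, 2000, 300]));
```

With only a handful of samples, p95 and p99 collapse onto the maximum; the percentiles become meaningful at the scale of the full run.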

NO_REPLY transparency — the 1.2%

10,904 requests (1.2%) never received a decision. Every single one was a connection failure — not an enforcement failure. Here's the breakdown:

| Reason | Count | % of NO_REPLY | Cause |
| --- | --- | --- | --- |
| DNS_FAIL | 10,806 | 99.1% | Hotspot DNS drop |
| CONNECTION_RESET | 86 | 0.8% | Hotspot reconnect |
| SERVER_ERROR (502) | 7 | 0.06% | Gate→core timeout (now fixed) |
| TIMEOUT | 5 | 0.05% | Client 90s exceeded |

The 7 server errors were gate→core timeouts at the 10-second service binding limit during peak concurrency. This threshold was raised to 15 seconds after the test, closing a gap that affected 0.0008% of all requests sent. These sequences are incomplete: the DO advanced past the failed step, and any prior steps in that sequence were correctly receipted and chained.
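The shares in this section depend on which denominator is used: all NO_REPLY requests or all requests sent. A short sketch with the counts from this page makes the difference explicit; the object keys and helper name are illustrative.

```javascript
// Counts copied from this page; key names are illustrative.
const requestsSent = 923_983;
const noReplyByReason = {
  DNS_FAIL: 10_806,
  CONNECTION_RESET: 86,
  SERVER_ERROR_502: 7,
  TIMEOUT: 5,
};

const noReplyTotal = Object.values(noReplyByReason).reduce((a, b) => a + b, 0);

// Percentage of `count` within `total`, rounded to a fixed number of decimals.
const pct = (count, total, digits) => ((100 * count) / total).toFixed(digits);

for (const [reason, count] of Object.entries(noReplyByReason)) {
  console.log(
    reason.padEnd(18),
    String(count).padStart(6),
    `${pct(count, noReplyTotal, 2)}% of NO_REPLY`,
    `${pct(count, requestsSent, 4)}% of sent`,
  );
}
console.log(`NO_REPLY total: ${noReplyTotal} (${pct(noReplyTotal, requestsSent, 2)}% of sent)`);
```

The 7 server errors come to 0.06% of NO_REPLY requests and 0.0008% of all requests sent.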

Verify a receipt chain yourself

Every sealed sequence produces a chain of cryptographic receipts. Here's the two-step verification flow — no API key, no login, no setup.

Step 1 — Run a sequence on the demo

The demo generates a live 8-step MSMD sequence through the full enforcement pipeline. Watch each step get gated. Copy the sequence ID when it completes.

Open interactive demo →

Step 2 — Generate the compliance report

Paste your sequence ID into the report generator. You'll get the full enforcement log with HMAC-signed chain verification, policy audit, and an AI-generated compliance narrative.

Open report generator →

Demo sequences (starting with demo-) require no API key. Paste and verify. Production sequences need an x-slp8-key header.

Reproduce the test

The test script is open source. The raw results JSON is available. Run it yourself against the public demo endpoint.

Test script on GitHub → Raw results JSON →
    # Run the smoke test first (10 sequences, verbose)
    SMOKE=true node tests/pressure/pressure_test_1m.mjs

    # Full 1M test (125,000 sequences, 50 concurrent)
    node tests/pressure/pressure_test_1m.mjs

Why this matters

Most AI safety tooling watches your agent and logs what happened. AgenticRail sits beneath the agent and blocks bad actions before they execute. The difference is between finding out after and preventing before.

This test proves the enforcement layer holds at scale. 913,079 decisions, zero errors — not because we filtered the data, but because the rail is deterministic. The DO is single-threaded. The nonce check is atomic. The seal is final. There is no probabilistic path through the gate.

Claims are marketing. Receipts are evidence. Every sealed sequence in this test has a cryptographic receipt chain that anyone can verify. No login required. No permission needed. The receipts are public.