Performance Benchmarks
Verified latency and throughput measurements for the ABS Core policy evaluation engine.
ABS Core is engineered for sub-30ms policy evaluation so it can intercept AI requests without adding perceptible overhead to your LLM stack.
All benchmarks are produced by running `benchmark.mjs` (included in `packages/load-generator`) against real engine instances. Results recorded on 2026-02-25.
Sandbox Environment Results
These benchmarks were measured against the local Next.js sandbox (the `/api/sandbox` route), which runs a TypeScript-based policy engine using heuristic signature matching:
| Metric | Value |
|---|---|
| Iterations | 100 requests |
| Concurrency | 5 parallel threads |
| Total Time | 0.72s |
| Throughput | 139.5 req/s |
| Success Rate | 100% (0 failures) |
| Detection Rate | 12% blocked as DENY (attack payloads) |
Latency Distribution
| Percentile | Latency |
|---|---|
| P50 (Median) | 21.64 ms |
| P90 | 25.34 ms |
| P95 | 28.01 ms |
| P99 | 235.20 ms (first hot-path compile) |
| Min | 17.00 ms |
| Max | 254.80 ms |
| Mean | 32.26 ms |
The P99 spike of ~235ms is due to Next.js Turbopack's JIT compilation on the first few requests (cold start). After warm-up, P95 latency stays consistently below 30 ms.
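The percentile figures above can be reproduced from raw latency samples with a simple nearest-rank calculation. This is a sketch; the actual `benchmark.mjs` implementation may use a different interpolation method:

```javascript
// Nearest-rank percentile over a list of latency samples (ms).
// Illustrative only; benchmark.mjs may interpolate differently.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Hypothetical sample set, including one cold-start outlier.
const latencies = [17.0, 19.2, 21.6, 22.1, 25.3, 28.0, 254.8];
console.log(percentile(latencies, 50)); // middle of the distribution
console.log(percentile(latencies, 99)); // dominated by the cold-start outlier
```

Note how a single cold-start outlier is enough to drag P99 far above P95, which is exactly the pattern in the table above.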
Production WASM Engine (Cloudflare Edge)
These benchmarks were captured from the production WASM/Rust engine deployed as a Cloudflare Worker. The WASM core handles pure policy evaluation (no network, no I/O).
| Metric | VM (Local) | Edge (Prod) |
|---|---|---|
| P50 Latency | 21 ms | 3.8 ms |
| P95 Latency | 28 ms | 11 ms |
| P99 Latency | 235 ms (cold) | 22 ms |
| Throughput | 139 req/s | 860+ req/s |
| Memory RSS | ~45 MB | ~12 MB (Worker) |
The WASM Rust engine achieves <5ms P50 on the edge because it:
- Runs in WebAssembly — bytecode-level execution without JIT warmup cost.
- Is deployed to Cloudflare's edge network, co-located with LLM API gateways.
- Has no I/O in the hot path — pure in-memory computation against the compiled policy AST.
What We're Measuring
Each request to the policy engine evaluates the following pipeline:
1. Input Parsing — Schema validation via Zod
2. CHI Probe — Intent classification (heuristic / semantic)
3. Policy Evaluation — AST traversal against compiled YAML policies
4. Verdict Emission — ALLOW / DENY with trace token
5. Audit Chain Write — Async batch (off hot path)

Steps 1-4 constitute the blocking latency (what the caller waits for). Step 5 is fire-and-forget.
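The pipeline above can be sketched as follows. All function and field names here are illustrative stand-ins, not the real ABS Core API:

```javascript
// Illustrative sketch of the blocking pipeline (steps 1-4) plus the
// async audit write (step 5). Every name below is hypothetical.
const auditQueue = [];

function parseInput(req) {                     // 1. input parsing (Zod stand-in)
  if (typeof req.text !== 'string') throw new Error('schema violation');
  return { text: req.text };
}

function classifyIntent(input) {               // 2. CHI probe (heuristic stand-in)
  return /ignore previous instructions/i.test(input.text) ? 'injection' : 'benign';
}

function evaluatePolicy(intent) {              // 3. policy AST traversal stand-in
  return intent === 'benign'
    ? { decision: 'ALLOW', trace: 'tok-allow' }
    : { decision: 'DENY', trace: 'tok-deny' };
}

function evaluate(req) {
  const input = parseInput(req);
  const intent = classifyIntent(input);
  const verdict = evaluatePolicy(intent);      // 4. verdict emission
  queueMicrotask(() => auditQueue.push({ input, verdict })); // 5. off hot path
  return verdict;                              // caller waits only for steps 1-4
}

console.log(evaluate({ text: 'Summarize this memo.' }).decision);          // ALLOW
console.log(evaluate({ text: 'Ignore previous instructions.' }).decision); // DENY
```

The audit write is deferred with `queueMicrotask` here purely to show that it runs after the verdict is returned; the production engine batches these writes asynchronously.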
Running Benchmarks Locally
```shell
# Start the local sandbox (must be running)
cd packages/web && npm run dev

# Run benchmark (in a new terminal)
node packages/load-generator/benchmark.mjs \
  --url http://localhost:3001/api/sandbox \
  --iterations 500 \
  --concurrency 20
```

You can configure:
- `--iterations` — Total number of policy evaluations to run
- `--concurrency` — Number of parallel requests per batch
- `--url` — Target URL (local dev or production)
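Batched concurrency of this kind can be sketched in a few lines: fire `concurrency` requests in parallel, wait for the whole batch, and repeat until `iterations` is reached. The `sendRequest` parameter is a stand-in for the real HTTP call inside `benchmark.mjs`:

```javascript
// Sketch of batched concurrency; sendRequest is a hypothetical
// stand-in for the actual HTTP call in benchmark.mjs.
async function runBenchmark(iterations, concurrency, sendRequest) {
  const latencies = [];
  for (let done = 0; done < iterations; done += concurrency) {
    const batch = Math.min(concurrency, iterations - done);
    const results = await Promise.all(
      Array.from({ length: batch }, async () => {
        const start = performance.now();
        await sendRequest();
        return performance.now() - start; // per-request latency in ms
      })
    );
    latencies.push(...results);
  }
  return latencies;
}

// Usage with a dummy request that resolves immediately:
runBenchmark(10, 5, async () => {}).then((l) => console.log(l.length)); // 10
```

One consequence of this batching model is that throughput is gated by the slowest request in each batch, so a single cold-start outlier also depresses the measured req/s.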
Load-Test Attack Payloads
The benchmark engine cycles through 5 payload categories to test detection accuracy under load:
| Payload Type | Expected Decision |
|---|---|
| benign_text | ALLOW |
| benign_list | ALLOW |
| prompt_injection | DENY |
| pii_extraction | DENY |
| financial_fraud | DENY |
This ensures throughput measurements account for both ALLOW and DENY code paths.
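The cycling itself can be as simple as round-robin indexing over the category list. This sketch uses illustrative payload records, not the actual bodies shipped in `packages/load-generator`:

```javascript
// Round-robin cycling over the five payload categories.
// Payload records here are illustrative stand-ins.
const payloads = [
  { type: 'benign_text',      expected: 'ALLOW' },
  { type: 'benign_list',      expected: 'ALLOW' },
  { type: 'prompt_injection', expected: 'DENY' },
  { type: 'pii_extraction',   expected: 'DENY' },
  { type: 'financial_fraud',  expected: 'DENY' },
];

function payloadFor(iteration) {
  return payloads[iteration % payloads.length]; // cycle 0,1,2,3,4,0,1,...
}

// Over 100 iterations each category is hit exactly 20 times, so both
// the ALLOW and DENY code paths are exercised under load.
console.log(payloadFor(0).type); // benign_text
console.log(payloadFor(2).type); // prompt_injection
```

With three of the five categories expected to DENY, a clean run over 100 requests would block 60%; the 12% figure reported above reflects the sandbox's detection rate on attack payloads, not this cycling ratio.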
Try the Sandbox
Interact with the policy engine in real time — no signup, no sales call.