ABS Core v2.0.3

Performance Benchmarks

Empirical latencies for the ABS Core Gateway in edge environments.

ABS Core is designed to run as standard request-path infrastructure. A core requirement for any security gateway sitting in front of AI models is that the security overhead must not materially degrade the application's time-to-first-token (TTFT).

Below are the empirical latency results for the ABS Core Policy Engine. Note that these metrics cover the security interception and policy evaluation path only, representing the added overhead before the payload is forwarded to the LLM or tool.


1. Decision Latency

Measurement of the time between the ABS Gateway receiving the payload and issuing a policy verdict (ALLOW/DENY) using the WASM runtime.

Tests were conducted across a standard geographic distribution of Cloudflare Workers edge nodes (Edge deployment).

Load (Req/Sec)    P50 (ms)    P95 (ms)    P99 (ms)
100               0.8         1.5         2.1
1,000             1.1         2.4         3.8
5,000             2.3         4.2         6.1

Key Insight: The policy engine consistently adds under 5ms of P95 latency up to 5,000 RPS. This speed is achieved because the V8 WASM isolate avoids the typical spin-up penalties associated with heavy interpreter-based sidecars.
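As a rough illustration of how these percentiles can be gathered, the sketch below times a verdict call and computes P50/P95/P99 over the samples. Note that `evaluatePolicy` is a hypothetical stand-in for the gateway's verdict path, not an ABS Core API.

```typescript
type Verdict = "ALLOW" | "DENY";

// Hypothetical stand-in: a real deployment would invoke the WASM policy
// module here instead of a string check.
function evaluatePolicy(payload: string): Verdict {
  return payload.includes("blocked") ? "DENY" : "ALLOW";
}

// Nearest-rank percentile over a sample set.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

// Time each verdict and summarize the distribution.
function benchmark(requests: string[]): { p50: number; p95: number; p99: number } {
  const latencies: number[] = [];
  for (const payload of requests) {
    const start = performance.now();
    evaluatePolicy(payload);
    latencies.push(performance.now() - start);
  }
  return {
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    p99: percentile(latencies, 99),
  };
}
```

Reporting percentiles rather than averages matters here: a handful of slow outliers dominate user-perceived tail latency while leaving the mean nearly unchanged.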

2. Cold Start Overhead

Serverless and edge infrastructures inherently face cold start delays when instances are provisioned.

  • Cold Start Penalty: ~40ms - 85ms (Occurs only on the first request to a newly spawned edge node).
  • Warm Path: ~0.5ms - 2.5ms (Subsequent requests routed to active isolates).
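The cold/warm split typically follows from where initialization happens in isolate-based runtimes: work at module scope runs once per isolate spawn, while per-request handlers stay on the warm path. A minimal sketch of that pattern (with `loadPolicyModule` as a hypothetical stand-in for compiling and instantiating the policy WASM):

```typescript
// Hypothetical stand-in for the expensive one-time setup (WASM compile
// and instantiation) that accounts for the cold-start penalty.
function loadPolicyModule(): (payload: string) => boolean {
  return (payload) => !payload.includes("blocked");
}

// Module scope: executed once per isolate spawn (the cold-start cost).
const allow = loadPolicyModule();

// Per-request handler: only warm-path evaluation runs here.
async function handleRequest(request: Request): Promise<Response> {
  const payload = await request.text();
  return new Response(allow(payload) ? "ALLOW" : "DENY");
}
```

Keeping the instantiated module at isolate scope is what lets every request after the first skip straight to evaluation.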

3. Architecture Comparison Overview

Unlike traditional API Gateways designed primarily for REST routing (which may require external network calls to evaluate complex access policies), ABS Core embeds the policy definitions in the edge routing layer itself.

  • Containerized Sidecars (e.g., OPA in K8s): Typically exhibit 10ms - 40ms of latency per hop, depending on the network overlay inside the cluster.
  • ABS Core (WASM Edge): Bypasses the internal cluster network hop by evaluating the payload at the TLS termination edge node.
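The structural difference between the two topologies can be sketched as follows. The sidecar path must await an in-cluster HTTP round trip before it can return a verdict, while the edge path evaluates inline in the isolate that terminated TLS. The `opaUrl` endpoint and the inline rule set are illustrative assumptions, not ABS Core or OPA specifics.

```typescript
type Verdict = "ALLOW" | "DENY";

// Sidecar topology: the verdict requires a network hop to a separate
// policy service (the 10ms - 40ms per-hop cost cited above).
async function sidecarVerdict(opaUrl: string, payload: unknown): Promise<Verdict> {
  const res = await fetch(opaUrl, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ input: payload }),
  });
  const data = (await res.json()) as { result?: { allow?: boolean } };
  return data.result?.allow ? "ALLOW" : "DENY";
}

// Edge topology: the policy runs in the same isolate that handled the
// request, so no intra-cluster hop is paid. Illustrative deny list only.
function edgeVerdict(payload: { tool?: string }): Verdict {
  const denied = new Set(["shell_exec", "file_delete"]);
  return payload.tool && denied.has(payload.tool) ? "DENY" : "ALLOW";
}
```

The latency gap between the two is dominated not by policy evaluation itself but by the extra serialization and network round trip the sidecar path incurs on every request.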

Note: These benchmarks reflect the core execution path. Total API roundtrip times will remain bounded by the underlying LLM provider's response times (often 500ms - 2,000ms).
