
Latency SLA & Performance Specifications

Formal SLO commitments, consistency model, idempotency guarantees, and circuit breaker behavior for ABS Core v10.1.5+.

Effective: v10.1.5+ · Last updated: 2026-02-26

This document defines the formal service-level objectives (SLOs), consistency model, and failure behavior of the ABS Core governance engine. It is intended for risk committees, enterprise architects, and compliance officers evaluating ABS Core for regulated-environment deployment.


Formal SLO Commitments

Operational Load (up to 1,000 req/s)

| Metric | Commitment | Basis |
|---|---|---|
| P50 latency | < 200ms | Validated: 161ms at 1,000 req/s (60 min test) |
| P99 latency | < 300ms | Validated: 289ms at 1,000 req/s (60 min test) |
| Error rate (5xx) | 0% | Validated: 0 errors across 3.6M requests |
| Monthly availability | 99.95% | ~4.4h allowable downtime/year |
| Audit log completeness | 100% | Every request, passed or rejected, is recorded |
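The downtime figure in the availability row follows directly from the percentage. A minimal sketch of that arithmetic is shown here; the function name and the 30-day month are illustrative assumptions, not part of ABS Core.

```typescript
// Allowable downtime (in minutes) implied by an availability target over a window.
function downtimeBudgetMinutes(availabilityPct: number, windowHours: number): number {
  const unavailableFraction = 1 - availabilityPct / 100;
  return unavailableFraction * windowHours * 60;
}

// 99.95% over a 365-day year: about 262.8 minutes, i.e. roughly 4.4 hours.
const perYear = downtimeBudgetMinutes(99.95, 365 * 24);

// The same target over an assumed 30-day month: about 21.6 minutes.
const perMonth = downtimeBudgetMinutes(99.95, 30 * 24);

console.log(perYear.toFixed(1), perMonth.toFixed(1));
```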

Overload Protection (above 1,000 req/s)

| Behavior | Commitment |
|---|---|
| Rate limit response | HTTP 429 with Retry-After header; never a silent drop |
| 5xx under overload | 0, validated at 5,000 req/s (15 min stress test) |
| Audit integrity under overload | 100%; all requests recorded, including rejected ones |
| Degradation mode | Graceful shed via rate limiter, not crash |
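On the client side, a 429 response is only useful if the caller honors Retry-After. The sketch below shows one way to turn that header into a wait time. It assumes the header may carry either delay-seconds or an HTTP-date, as HTTP permits; the function name and the 1-second fallback are hypothetical choices, not ABS Core behavior.

```typescript
// Convert a Retry-After header value into a wait time in milliseconds.
// Accepts delay-seconds ("2") or an HTTP-date; falls back when absent or unparsable.
function retryAfterMs(headerValue: string | null, nowMs: number, fallbackMs = 1000): number {
  if (headerValue === null) return fallbackMs;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs;
  // A date in the past means the caller may retry immediately.
  return Math.max(0, dateMs - nowMs);
}
```

A caller would typically read the header from the 429 response, wait for the returned duration, and then resubmit the request with the same event_id.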

Latency Budget by Component

| Component | Mode | Added Latency | Notes |
|---|---|---|---|
| WASM Policy Engine | Inline (blocking) | < 2ms | Pure in-memory evaluation; no I/O |
| CHI Semantic Analysis | Async (parallel) | < 150ms | Gemini 2.0 Flash; runs in parallel with LLM call |
| PII Redaction | Inline | < 1ms | Regex + entropy scan; deterministic |
| Audit Hash + L2 Queue | Async (non-blocking) | 0ms added to P50 | Hash computed inline; L2 anchor async |
| Full governance pipeline | Combined | ~153ms P50 | Validated across 9M+ requests |

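LLM context: A standard LLM response (OpenAI GPT-4o, Anthropic Claude) takes 500ms–2,000ms. If the 153ms P50 governance overhead were added serially, it would account for roughly 7% of total agent response time on a 2,000ms LLM call and roughly 23% on a 500ms call. Because CHI semantic analysis runs in parallel with the LLM call, the overhead that end users actually perceive should be lower than these serial figures.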
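The share of total response time attributable to governance depends on how long the LLM call itself takes. The following sketch works through the arithmetic under the pessimistic assumption that the full 153ms is added serially rather than overlapped with the LLM call; the function name is illustrative.

```typescript
// Fraction of total response time spent in governance, assuming the overhead
// is added serially on top of the LLM call (a worst-case simplification).
function overheadShare(governanceMs: number, llmMs: number): number {
  return governanceMs / (governanceMs + llmMs);
}

const fastCall = overheadShare(153, 500);   // about 0.234
const slowCall = overheadShare(153, 2000);  // about 0.071

console.log((fastCall * 100).toFixed(1) + "%", (slowCall * 100).toFixed(1) + "%");
```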


Consistency Model

This section addresses the consistency and transactional guarantees required for regulated financial environments.

Decision Consistency: Strong

Every POST /v1/decide or POST /v1/authorize call is:

  • Synchronous — the caller receives a verdict before any downstream action executes
  • Deterministic — same input + same policy version → same verdict, always
  • Versioned — the active policy version is recorded in every decision envelope

There is no eventual consistency in the decision path. A decision is either ALLOW or DENY — never "pending" or "probably allow."
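To make the determinism and versioning properties concrete, here is a minimal sketch of a pure decision function. The Policy shape, the deniedActions rule, and the envelope field names are hypothetical stand-ins chosen for illustration; they are not the ABS Core policy format.

```typescript
type Verdict = "ALLOW" | "DENY";

interface Policy {
  version: string;
  deniedActions: string[]; // hypothetical rule shape, for illustration only
}

interface DecisionEnvelope {
  verdict: Verdict;
  policy_version: string; // recorded so a verdict can be traced to its policy
}

// Pure function: no clock, randomness, or I/O, so the same action evaluated
// against the same policy version always yields the same verdict.
function decide(action: string, policy: Policy): DecisionEnvelope {
  const verdict: Verdict = policy.deniedActions.includes(action) ? "DENY" : "ALLOW";
  return { verdict, policy_version: policy.version };
}
```

The verdict type has exactly two members, which mirrors the statement that no intermediate state such as "pending" exists in the decision path.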

Audit Log Consistency: Write-once, append-only

| Property | Behavior |
|---|---|
| Write model | Append-only; no update or delete operations on audit records |
| Durability | Written to D1 (Cloudflare) before response is returned to caller |
| Hash chaining | Each record includes SHA-256 of the previous record; tamper-evident |
| L2 anchoring | Batches anchored to Polygon L2 asynchronously; does not block response (Enterprise tier) |
| Consistency on read | Strong consistency within a single region; eventual across regions (<500ms) |
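Hash chaining is what makes an append-only log tamper-evident: changing any earlier record invalidates every hash after it. The sketch below illustrates the general technique with Node's built-in crypto module. The record layout, the all-zero genesis value, and the choice to hash prev_hash concatenated with the payload are assumptions for illustration, not the actual ABS Core record format.

```typescript
import { createHash } from "node:crypto";

interface AuditRecord {
  payload: string;
  prev_hash: string; // hex SHA-256 of the previous record (assumed layout)
  hash: string;      // hex SHA-256 over prev_hash + payload
}

const GENESIS = "0".repeat(64); // assumed placeholder for the first record

function sha256Hex(data: string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Append-only: returns a new log with one more record, never mutates old ones.
function append(log: readonly AuditRecord[], payload: string): AuditRecord[] {
  const prev_hash = log.length === 0 ? GENESIS : log[log.length - 1].hash;
  return [...log, { payload, prev_hash, hash: sha256Hex(prev_hash + payload) }];
}

// Recompute every link; any edited, removed, or reordered record breaks the chain.
function verify(log: readonly AuditRecord[]): boolean {
  let expectedPrev = GENESIS;
  for (const rec of log) {
    if (rec.prev_hash !== expectedPrev) return false;
    if (rec.hash !== sha256Hex(rec.prev_hash + rec.payload)) return false;
    expectedPrev = rec.hash;
  }
  return true;
}
```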

Idempotency

All write operations in ABS Core are idempotent by event_id:

// Submitting the same event_id twice → same result, no duplicate record
const result = await abs.process({
  event_id: "evt_TXN-4421-refund",   // client-generated stable ID
  tenant_id: "my-tenant",
  event_type: "agent.action",
  payload:  { action: "WRITE", target: "accounts/acc_123/refund", amount: 250.00 },
}, { sync: true });

const decision = result.envelope;

If the same event_id is submitted twice (e.g., due to network retry), the second call returns the original decision without creating a duplicate audit record. This guarantee is critical for payment systems where retry-on-failure is standard practice.
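The retry behavior described above can be sketched with an in-memory store keyed by event_id. This is a simplified illustration, not the ABS Core implementation: the class name, the injected evaluate callback, and the in-memory Map are assumptions standing in for the real decision engine and durable storage.

```typescript
type Verdict = "ALLOW" | "DENY";

interface Decision {
  event_id: string;
  verdict: Verdict;
}

class IdempotentProcessor {
  private readonly decisions = new Map<string, Decision>();
  readonly auditLog: Decision[] = [];
  private readonly evaluate: (eventId: string) => Verdict;

  constructor(evaluate: (eventId: string) => Verdict) {
    this.evaluate = evaluate;
  }

  process(eventId: string): Decision {
    const existing = this.decisions.get(eventId);
    // Retry with a known event_id: replay the stored decision, write nothing new.
    if (existing !== undefined) return existing;
    const decision: Decision = { event_id: eventId, verdict: this.evaluate(eventId) };
    this.decisions.set(eventId, decision);
    this.auditLog.push(decision);
    return decision;
  }
}
```

Because the key is supplied by the client, a network-level retry carries the same event_id and therefore maps to the same stored decision.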

v10.1.5 fix: Prior to v10.1.5, the trace_id field was used as the idempotency key. This was incorrect: trace_id is assigned by the server per request and is not stable across retries. v10.1.5 introduced the client-controlled event_id as the correct idempotency key. Callers on versions earlier than v10.1.5 must upgrade before relying on idempotency guarantees.


MTTR and Recovery Commitments

| Scenario | Target | Behavior |
|---|---|---|
| Cloudflare edge node failure | < 30s | Traffic automatically rerouted to adjacent PoP |
| CHI semantic engine timeout | < 200ms | Circuit breaker → Fail-Safe ALLOW + audit flag |
| D1 write latency spike | Transparent | Response returned first; D1 write retried async |
| Polygon L2 congestion | Transparent | L2 anchor queued; local hash written immediately (Enterprise) |
| Full region outage (rare) | < 15 min MTTR | Cloudflare multi-region failover |

RTO / RPO

| Parameter | Value | Notes |
|---|---|---|
| RTO (Recovery Time Objective) | < 15 minutes | Time to restore service after full outage |
| RPO (Recovery Point Objective) | 0 for decisions | No decision data can be lost; it is written before the response |
| RPO for L2 anchoring | < 1 block cycle (~2s) | L2 anchor may be delayed; local hash is never lost |

Circuit Breaker Behavior

The CHI semantic analysis engine has a 200ms hard timeout. If exceeded:

  1. The request proceeds with Fail-Safe ALLOW — the agent's action is not blocked
  2. The event is flagged as chi_timeout: true in the audit record
  3. A Sentry alert is fired for the workspace
  4. The CHI engine is bypassed for subsequent requests until it recovers (exponential backoff, max 30s)

This design ensures the governance layer never becomes a single point of failure that halts production systems — a hard requirement for financial infrastructure.
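The bypass-and-recover behavior in step 4 amounts to backoff bookkeeping. The sketch below models only that state, with the clock passed in so it can be tested deterministically; the 200ms timer itself is omitted. The class name, method names, and the 1-second base backoff are assumptions (the source specifies only the 30s maximum).

```typescript
interface GovernedResult {
  verdict: "ALLOW" | "DENY";
  chi_timeout: boolean;
}

// Outcome used when CHI exceeded its timeout or is currently bypassed.
function failSafeAllow(): GovernedResult {
  return { verdict: "ALLOW", chi_timeout: true };
}

class ChiBreaker {
  private consecutiveTimeouts = 0;
  private bypassUntilMs = 0;
  private readonly baseBackoffMs: number;
  private readonly maxBackoffMs: number;

  // baseBackoffMs is an assumed value; only the 30s cap comes from the text.
  constructor(baseBackoffMs = 1_000, maxBackoffMs = 30_000) {
    this.baseBackoffMs = baseBackoffMs;
    this.maxBackoffMs = maxBackoffMs;
  }

  // True while CHI should be skipped and requests answered with failSafeAllow().
  shouldBypass(nowMs: number): boolean {
    return nowMs < this.bypassUntilMs;
  }

  // Called when CHI times out; doubles the bypass window up to the cap.
  recordTimeout(nowMs: number): number {
    const backoff = Math.min(
      this.baseBackoffMs * 2 ** this.consecutiveTimeouts,
      this.maxBackoffMs,
    );
    this.consecutiveTimeouts += 1;
    this.bypassUntilMs = nowMs + backoff;
    return backoff;
  }

  // Called when CHI answers in time; resets the breaker.
  recordSuccess(): void {
    this.consecutiveTimeouts = 0;
    this.bypassUntilMs = 0;
  }
}
```

A caller would check shouldBypass before invoking CHI, return failSafeAllow() when bypassed or timed out, and call recordSuccess once CHI responds within budget again.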


Shadow Mode (Non-Blocking Governance)

For high-frequency, low-stakes operations where even 153ms P50 overhead is unacceptable:

# policy.yaml
mode: shadow          # analyze but do not block
enforcement: strict   # when promoted, full blocking applies
alert_on_violation: true

In shadow mode:

  • All requests pass through regardless of verdict
  • Violations are recorded in the audit log with shadow: true flag
  • Dashboards show violation rate for that operation class
  • Teams can promote to enforcement mode when confident in policy correctness

Shadow mode is the recommended entry point for new agent integrations and high-frequency read operations.
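The difference between shadow and enforcement mode comes down to whether a DENY verdict actually blocks the request. A minimal sketch of that rule follows; the "enforce" mode name, the function name, and the outcome shape are hypothetical, and only the shadow flag on the audit record comes from the description above.

```typescript
type Mode = "shadow" | "enforce"; // "enforce" is an assumed name for blocking mode
type Verdict = "ALLOW" | "DENY";

interface Outcome {
  blocked: boolean;
  audit: { verdict: Verdict; shadow: boolean };
}

// In shadow mode the verdict is recorded but never blocks the request.
function applyMode(verdict: Verdict, mode: Mode): Outcome {
  const shadow = mode === "shadow";
  return {
    blocked: verdict === "DENY" && !shadow,
    audit: { verdict, shadow },
  };
}
```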


Benchmark Reference

All numbers in this document are derived from the Benchmark Report, which documents three test classes (endurance at 200 req/s for 2h, load at 1,000 req/s for 60 min, stress at 5,000 req/s for 15 min) with results anchored on the Bitcoin blockchain.
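The request counts quoted elsewhere in this document (3.6M for the load test, 9M+ overall) can be cross-checked against the three test classes. The sketch below does that arithmetic, assuming each test sustained its nominal rate for the full duration; the variable names are illustrative.

```typescript
interface TestClass {
  name: string;
  reqPerSec: number;
  durationSec: number;
}

const testClasses: TestClass[] = [
  { name: "endurance", reqPerSec: 200, durationSec: 2 * 60 * 60 },
  { name: "load", reqPerSec: 1_000, durationSec: 60 * 60 },
  { name: "stress", reqPerSec: 5_000, durationSec: 15 * 60 },
];

// Requests per test class at the nominal rate.
const totals = testClasses.map((t) => ({
  name: t.name,
  requests: t.reqPerSec * t.durationSec,
}));

// endurance: 1,440,000; load: 3,600,000; stress: 4,500,000; total: 9,540,000
const grandTotal = totals.reduce((sum, t) => sum + t.requests, 0);

console.log(totals, grandTotal);
```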
