Formal SLO commitments, consistency model, idempotency guarantees, and circuit breaker behavior for ABS Core v10.1.5+.

Latency SLA & Performance Specifications

Effective: v10.1.5+ · Last updated: 2026-02-26

This document defines the formal service-level objectives (SLOs), consistency model, and failure behavior of the ABS Core governance engine. It is intended for risk committees, enterprise architects, and compliance officers evaluating ABS Core for regulated-environment deployment.

Formal SLO Commitments

Operational Load (up to 1,000 req/s)

Metric	Commitment	Basis
P50 latency	< 200ms	Validated: 161ms at 1,000 req/s (60 min test)
P99 latency	< 300ms	Validated: 289ms at 1,000 req/s (60 min test)
Error rate (5xx)	0%	Validated: 0 errors across 3.6M requests
Monthly availability	99.95%	~21.6 min allowable downtime per 30-day month (~4.4h/year)
Audit log completeness	100%	Every request — passed or rejected — is recorded
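
The downtime budget implied by an availability target is simple arithmetic. The sketch below is illustrative only; the function names and the 30-day month / 365-day year period lengths are assumptions, not part of ABS Core.

```typescript
// Convert an availability target into an allowable-downtime budget.
// Assumed period lengths: a 365-day year and a 30-day month.
const HOURS_PER_YEAR = 365 * 24;          // 8,760 h
const MINUTES_PER_MONTH = 30 * 24 * 60;   // 43,200 min

// Fraction of time the service is allowed to be unavailable.
function downtimeFraction(availabilityPercent: number): number {
  return 1 - availabilityPercent / 100;
}

function downtimeHoursPerYear(availabilityPercent: number): number {
  return downtimeFraction(availabilityPercent) * HOURS_PER_YEAR;
}

function downtimeMinutesPerMonth(availabilityPercent: number): number {
  return downtimeFraction(availabilityPercent) * MINUTES_PER_MONTH;
}

console.log(downtimeHoursPerYear(99.95).toFixed(2));    // "4.38"
console.log(downtimeMinutesPerMonth(99.95).toFixed(1)); // "21.6"
```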

Overload Protection (above 1,000 req/s)

Behavior	Commitment
Rate limit response	HTTP 429 with `Retry-After` header — never silent drop
5xx under overload	0 — validated at 5,000 req/s (15 min stress test)
Audit integrity under overload	100% — all requests recorded including rejected ones
Degradation mode	Graceful shed via rate limiter, not crash
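
On the client side, a 429 response can be handled by waiting for the interval given in `Retry-After` before retrying. The following is a minimal sketch, not an ABS Core SDK feature: `retryAfterMs`, `fetchWithRateLimitRetry`, the 1,000ms fallback, and the three-attempt cap are all hypothetical choices. It relies only on the standard `fetch` API and the two `Retry-After` formats defined by HTTP (delay in seconds, or an HTTP date).

```typescript
// Parse a Retry-After header value into a wait time in milliseconds.
// The value is either a whole number of seconds or an HTTP-date.
function retryAfterMs(headerValue: string | null, nowMs: number, fallbackMs = 1000): number {
  if (headerValue === null) return fallbackMs;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs; // unparseable header: use fallback
  return Math.max(0, dateMs - nowMs);
}

// Retry a request while the server answers 429, up to maxAttempts in total.
// Any other status (or the final 429) is returned to the caller unchanged.
async function fetchWithRateLimitRetry(
  url: string,
  init: RequestInit,
  maxAttempts = 3,
): Promise<Response> {
  for (let attempt = 1; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    const waitMs = retryAfterMs(res.headers.get("Retry-After"), Date.now());
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
}
```

Because retried requests reuse the same body, this pattern pairs naturally with a stable client-generated `event_id`.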

Latency Budget by Component

Component	Mode	Added Latency	Notes
WASM Policy Engine	Inline (blocking)	< 2ms	Pure in-memory evaluation — no I/O
CHI Semantic Analysis	Async (parallel)	< 150ms	Gemini 2.0 Flash — runs in parallel with LLM call
PII Redaction	Inline	< 1ms	Regex + entropy scan — deterministic
Audit Hash + L2 Queue	Async (non-blocking)	0ms added to P50	Hash computed inline; L2 anchor async
Full governance pipeline	Combined	~153ms P50	Validated across 9M+ requests

LLM context: A standard LLM response (OpenAI GPT-4o, Anthropic Claude) takes 500ms–2,000ms. At the upper end of that range, the 153ms P50 governance overhead is under 10% of total agent response time (153ms out of roughly 2,150ms); for a 500ms response it is closer to 23%. In both cases the overhead is small relative to the LLM call itself.

Consistency Model

This section addresses the consistency and transactional guarantees required for regulated financial environments.

Decision Consistency: Strong

Every POST /v1/decide or POST /v1/authorize call is:

Synchronous — the caller receives a verdict before any downstream action executes
Deterministic — same input + same policy version → same verdict, always
Versioned — the active policy version is recorded in every decision envelope

There is no eventual consistency in the decision path. A decision is either ALLOW or DENY — never "pending" or "probably allow."

Audit Log Consistency: Write-once, append-only

Property	Behavior
Write model	Append-only — no update or delete operations on audit records
Durability	Written to D1 (Cloudflare) before response is returned to caller
Hash chaining	Each record includes SHA-256 of the previous record — tamper-evident
L2 anchoring	Batches anchored to Polygon L2 asynchronously — does not block response (Enterprise tier)
Consistency on read	Strong consistency within a single region; eventual across regions (<500ms)
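
To illustrate why hash chaining makes the log tamper-evident, here is a minimal sketch using Node's built-in `crypto` module. The record shape (`AuditRecord`), the all-zero genesis value, and the JSON serialization used as hash input are assumptions made for this example; the actual ABS Core audit schema is not specified in this document.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape, for illustration only.
interface AuditRecord {
  eventId: string;
  verdict: "ALLOW" | "DENY";
  prevHash: string; // SHA-256 (hex) of the previous record, or GENESIS for the first
  hash: string;     // SHA-256 (hex) over this record's fields plus prevHash
}

const GENESIS = "0".repeat(64);

function recordHash(eventId: string, verdict: string, prevHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([eventId, verdict, prevHash]))
    .digest("hex");
}

// Append-only: returns a new chain with one more record linked to the last.
function appendRecord(chain: AuditRecord[], eventId: string, verdict: "ALLOW" | "DENY"): AuditRecord[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const record: AuditRecord = { eventId, verdict, prevHash, hash: recordHash(eventId, verdict, prevHash) };
  return [...chain, record];
}

// Recompute every hash and link; any edited record breaks the chain.
function verifyChain(chain: AuditRecord[]): boolean {
  let expectedPrev = GENESIS;
  for (const r of chain) {
    if (r.prevHash !== expectedPrev) return false;
    if (r.hash !== recordHash(r.eventId, r.verdict, r.prevHash)) return false;
    expectedPrev = r.hash;
  }
  return true;
}
```

Changing any field of an earlier record changes its hash, which no longer matches the `prevHash` stored in the next record, so verification fails from that point on.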

Idempotency

All write operations in ABS Core are idempotent by event_id:

// `abs` is an already-initialized ABS Core client (setup not shown here).
// Submitting the same event_id twice → same result, no duplicate record
const result = await abs.process({
  event_id: "evt_TXN-4421-refund",   // client-generated stable ID, reused on every retry
  tenant_id: "my-tenant",
  event_type: "agent.action",
  payload: { action: "WRITE", target: "accounts/acc_123/refund", amount: 250.00 },
}, { sync: true });

// The decision envelope returned for this event_id
const decision = result.envelope;

If the same event_id is submitted twice (e.g., due to network retry), the second call returns the original decision without creating a duplicate audit record. This guarantee is critical for payment systems where retry-on-failure is standard practice.
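
The retry behavior described above can be sketched with an in-memory store keyed by `event_id`. This is an illustration of the general technique, not the ABS Core implementation: `IdempotentDecisionStore`, the `Decision` shape, and the `decide` callback are hypothetical names introduced for this example.

```typescript
// Hypothetical decision shape, for illustration only.
interface Decision {
  eventId: string;
  verdict: "ALLOW" | "DENY";
}

class IdempotentDecisionStore {
  private readonly decisions = new Map<string, Decision>();
  private readonly auditLog: Decision[] = [];

  // `decide` runs only the first time a given eventId is seen.
  process(eventId: string, decide: () => "ALLOW" | "DENY"): Decision {
    const existing = this.decisions.get(eventId);
    if (existing !== undefined) return existing; // retry: replay the original decision

    const decision: Decision = { eventId, verdict: decide() };
    this.decisions.set(eventId, decision);
    this.auditLog.push(decision); // exactly one audit record per eventId
    return decision;
  }

  get auditCount(): number {
    return this.auditLog.length;
  }
}

const store = new IdempotentDecisionStore();
const first = store.process("evt_TXN-4421-refund", () => "ALLOW");
const retry = store.process("evt_TXN-4421-refund", () => "DENY"); // callback is not invoked
console.log(first === retry, store.auditCount); // true 1
```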

v10.1.5 fix: Prior to v10.1.5, the trace_id field was used as the idempotency key. This was incorrect — trace_id is assigned by the server per-request, not stable across retries. v10.1.5 introduced the client-controlled event_id as the correct idempotency key. Callers on <v10.1.5 must upgrade before relying on idempotency guarantees.

MTTR and Recovery Commitments

Scenario	Target	Behavior
Cloudflare edge node failure	< 30s	Traffic automatically rerouted to adjacent PoP
CHI semantic engine timeout	< 200ms	Circuit breaker → Fail-Safe ALLOW + audit flag
D1 write latency spike	Transparent	Response returned first; D1 write retried async
Polygon L2 congestion	Transparent	L2 anchor queued; local hash written immediately (Enterprise)
Full region outage (rare)	< 15 min MTTR	Cloudflare multi-region failover

RTO / RPO

Parameter	Value	Notes
RTO (Recovery Time Objective)	< 15 minutes	Time to restore service after full outage
RPO (Recovery Point Objective)	0 for decisions	No decision data is losable — written before response
RPO for L2 anchoring	< 1 block cycle (~2s)	L2 anchor may be delayed; local hash is never lost

Circuit Breaker Behavior

The CHI semantic analysis engine has a 200ms hard timeout. If exceeded:

The request proceeds with Fail-Safe ALLOW — the agent's action is not blocked
The event is flagged as chi_timeout: true in the audit record
A Sentry alert is fired for the workspace
The CHI engine is bypassed for subsequent requests until it recovers (exponential backoff, max 30s)

This design ensures the governance layer never becomes a single point of failure that halts production systems — a hard requirement for financial infrastructure.
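
The timeout-and-bypass behavior above follows a common circuit breaker pattern, sketched below. The 200ms timeout and 30s backoff cap come from this document; everything else (the `withTimeout` and `ChiBreaker` names, the 1,000ms base interval, and doubling per consecutive timeout) is an assumption made for illustration and may differ from the actual engine.

```typescript
type ChiOutcome<T> = { timedOut: false; value: T } | { timedOut: true };

// Race a CHI call against a hard timeout (200 ms in the text above).
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<ChiOutcome<T>> {
  return new Promise<ChiOutcome<T>>((resolve, reject) => {
    const timer = setTimeout(() => resolve({ timedOut: true }), timeoutMs);
    work.then(
      (value) => { clearTimeout(timer); resolve({ timedOut: false, value }); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Tracks when CHI should be bypassed. Time is passed in explicitly so the
// logic stays deterministic and easy to test.
class ChiBreaker {
  private consecutiveTimeouts = 0;
  private bypassUntilMs = 0;
  private readonly baseMs = 1000;  // assumed base interval
  private readonly maxMs = 30000;  // 30s cap from the text above

  shouldBypass(nowMs: number): boolean {
    return nowMs < this.bypassUntilMs;
  }

  // Returns the backoff applied: base * 2^n, capped at maxMs.
  recordTimeout(nowMs: number): number {
    const backoff = Math.min(this.maxMs, this.baseMs * 2 ** this.consecutiveTimeouts);
    this.consecutiveTimeouts += 1;
    this.bypassUntilMs = nowMs + backoff;
    return backoff;
  }

  // A successful CHI call closes the breaker again.
  recordSuccess(): void {
    this.consecutiveTimeouts = 0;
    this.bypassUntilMs = 0;
  }
}
```

A caller would check `shouldBypass(Date.now())` before invoking CHI, wrap the call in `withTimeout(call, 200)`, and on a timeout proceed with ALLOW while setting `chi_timeout: true` in the audit record.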

Shadow Mode (Non-Blocking Governance)

For high-frequency, low-stakes operations where even 153ms P50 overhead is unacceptable:

# policy.yaml
mode: shadow          # analyze but do not block
enforcement: strict   # when promoted, full blocking applies
alert_on_violation: true

In shadow mode:

All requests pass through regardless of verdict
Violations are recorded in the audit log with shadow: true flag
Dashboards show violation rate for that operation class
Teams can promote to enforcement mode when confident in policy correctness

Shadow mode is the recommended entry point for new agent integrations and high-frequency read operations.
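
The difference between shadow and enforced evaluation can be sketched as a small pure function. The `shadow: true` audit flag comes from this document; the `"enforce"` mode name, the `Outcome` shape, and the `applyMode` function are hypothetical and only illustrate the idea.

```typescript
type Verdict = "ALLOW" | "DENY";
type Mode = "shadow" | "enforce"; // "enforce" is an assumed name for the promoted mode

interface Outcome {
  proceed: boolean; // whether the agent action may execute
  audit: { verdict: Verdict; shadow: boolean; violation: boolean };
}

function applyMode(verdict: Verdict, mode: Mode): Outcome {
  const violation = verdict === "DENY";
  if (mode === "shadow") {
    // Shadow: always let the request through, but record what would have happened.
    return { proceed: true, audit: { verdict, shadow: true, violation } };
  }
  // Enforced: a DENY verdict blocks the action.
  return { proceed: !violation, audit: { verdict, shadow: false, violation } };
}

console.log(applyMode("DENY", "shadow").proceed);  // true
console.log(applyMode("DENY", "enforce").proceed); // false
```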

Benchmark Reference

All numbers in this document are derived from the Benchmark Report, which documents three test classes (endurance at 200 req/s for 2h, load at 1,000 req/s for 60 min, stress at 5,000 req/s for 15 min) with results anchored on the Bitcoin blockchain.
