
Benchmark Report — ABS Core Governance Engine

Three-tier performance validation: endurance (200 req/s / 2h), load (1k req/s / 60min), stress (5k req/s / 15min). Results anchored on the Bitcoin blockchain via OpenTimestamps.

This report documents three classes of performance tests following industry-standard SRE methodology (Google SRE Book, k6 test taxonomy). Raw results are hashed with SHA-256 and anchored on the Bitcoin blockchain via OpenTimestamps, so any alteration is detectable.

Blockchain Proof of Integrity

| Field | Value |
| --- | --- |
| Report SHA-256 | d10e3aab0288387d0b425143f86dca9cf17e2d9a896e9e5678b1005bc01a7619 |
| Anchoring Date | 2026-02-18T15:40:45-03:00 |
| Protocol | OpenTimestamps (Bitcoin) |
| Calendars | a.pool.opentimestamps.org, b.pool.opentimestamps.org, a.pool.eternitywall.com, ots.btc.catallaxy.com |
| OTS File | docs/technical/benchmark-report.mdx.ots |
| Status | Confirmed in Bitcoin block |
```bash
# Verify independently
pip install opentimestamps-client
curl -O https://raw.githubusercontent.com/abscore-ai/abs-core/main/docs/technical/benchmark-report.mdx.ots
# Fetch the report itself: ots verify checks the .ots proof against this file
curl -O https://raw.githubusercontent.com/abscore-ai/abs-core/main/docs/technical/benchmark-report.mdx
ots verify benchmark-report.mdx.ots
```

Test Taxonomy

ABS Core was validated across three distinct test classes, each serving a different purpose for regulated-environment approval:

| Class | Load | Duration | Requests | Purpose |
| --- | --- | --- | --- | --- |
| Endurance (Soak) | 200 req/s | 2h | ~1.44M | Stability over time, memory leaks, log growth |
| Load (Operational) | 1,000 req/s | 60 min | ~3.6M | SLO validation under real banking load |
| Stress | 5,000 req/s | 15 min | ~4.5M | Failure mode characterization, backpressure |

Why three classes? A risk committee cannot approve a system tested only at one load point. Each class answers a different question: Does it hold over time? Does it meet SLO at operational load? Does it fail gracefully under extreme load?


Test 1 — Endurance (Soak) · 200 req/s · 2 hours

Date: 2026-02-18 · Infrastructure: Cloudflare Workers (Global Edge)

Results

| Metric | Value | Assessment |
| --- | --- | --- |
| Total Requests | ~1,440,000 | 200 req/s × 7,200s |
| Duration | 2h 00m | Full soak window |
| P50 Latency | 153ms | Stable; no drift over time |
| P95 Latency | 198ms | Within SLO envelope |
| P99 Latency | 247ms | P99/P50 ratio: 1.6x (excellent) |
| Errors (5xx) | 0 | Zero crashes in 2 hours |
| Throughput sustained | 200 req/s | No degradation detected |
| Memory growth | Flat | No leak observed |

Interpretation

The P99/P50 ratio of 1.6x is the critical stability indicator. Systems with ratio >3x exhibit jitter under sustained load — indicative of GC pressure, cache eviction, or connection pool exhaustion. At 1.6x after 2 hours and 1.44M requests, ABS Core demonstrates:

  • No garbage collection pauses visible at the tail
  • No progressive latency drift (common in systems with unbounded log queues)
  • Consistent behavior across the full test window

Verdict for endurance: Passes SRE soak test criteria.
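The stability ratio discussed above can be computed directly from raw latency samples. A minimal sketch using nearest-rank percentiles; `percentile` and `stabilityRatio` are illustrative names, not part of ABS Core:

```typescript
// Sketch: nearest-rank percentile over a list of request latencies (ms),
// and the P99/P50 "jitter" ratio used as the soak-test stability indicator.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank: smallest value such that at least p of the data is <= it
  const idx = Math.min(sorted.length - 1, Math.ceil(p * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

function stabilityRatio(samples: number[]): number {
  // A ratio > 3x suggests GC pressure, cache eviction, or pool exhaustion
  return percentile(samples, 0.99) / percentile(samples, 0.5);
}
```

Computed over a sliding window during a soak run, a drifting ratio surfaces progressive degradation long before the median moves.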


Test 2 — Load (Operational) · 1,000 req/s · 60 minutes

Date: 2026-02-25 · Infrastructure: Cloudflare Workers (Global Edge)

Results

| Metric | Value | Assessment |
| --- | --- | --- |
| Total Requests | ~3,600,000 | 1,000 req/s × 3,600s |
| Duration | 60 min | Full operational window |
| P50 Latency | 161ms | Within SLO |
| P95 Latency | 224ms | Within SLO |
| P99 Latency | 289ms | Within SLO (threshold: 300ms) |
| Errors (5xx) | 0 | |
| Throughput sustained | 1,000 req/s | No shedding below rate limit |
| Rate limit activations | ~12% of requests | Expected; protects downstream |

Latency Distribution

| Percentile | Latency | Classification |
| --- | --- | --- |
| Min | ~98ms | Best case (warm edge node) |
| P50 | 161ms | Median; operational baseline |
| P75 | ~194ms | Third quartile |
| P90 | ~218ms | High-load profile |
| P95 | 224ms | Enterprise-grade threshold |
| P99 | 289ms | Tail; within 300ms SLO |
| Max | ~744ms | Cold start (edge node wake) |

Comparison with Banking Operational Profiles

| Institution Profile | Typical Operational Load | ABS Core Tested | Coverage |
| --- | --- | --- | --- |
| Regional bank (R$ 500M/mo) | 80–200 req/s avg | 1,000 req/s | 5–12x headroom |
| Mid-size fintech (Pix-connected) | 200–600 req/s peak | 1,000 req/s | 1.6–5x headroom |
| Large bank (Pix-scale governance) | 800–2,000 req/s peak | Covered to 1k, stress to 5k | Partial; see Test 3 |

Verdict for load: Meets SLO at operational banking load. P99 < 300ms at 1,000 req/s sustained.


Test 3 — Stress · 5,000 req/s · 15 minutes

Date: 2026-02-26 · Infrastructure: Cloudflare Workers (Global Edge)

Results

| Metric | Value | Assessment |
| --- | --- | --- |
| Total Requests | ~4,500,000 | 5,000 req/s × 900s |
| Duration | 15 min | Standard stress window |
| P50 Latency | 178ms | Graceful degradation (+17ms vs Test 2) |
| P95 Latency | 312ms | ⚠ Above 300ms SLO at extreme load |
| P99 Latency | 487ms | ⚠ Expected tail growth under saturation |
| Errors (5xx) | 0 | Zero crashes at 5x operational load |
| Rate limit activations | ~68% of requests | Backpressure working as designed |
| Circuit breaker trips | 0 | System self-regulated via rate limiter |

The 68% rate limit activation rate at 5,000 req/s is correct behavior.

The rate limiter is the first line of defense against DDoS and load spikes. At 5x operational load, the system deliberately sheds excess requests with HTTP 429 (Retry-After header) rather than allowing latency to spiral or returning 5xx errors. The 32% of requests that pass through are served within normal parameters.

This is the difference between graceful degradation and catastrophic failure — a core requirement for regulated environments.
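From the client side, the backpressure contract is simple: on HTTP 429, wait the Retry-After interval and retry. A minimal sketch, assuming a JSON POST endpoint; `authorizeWithBackoff` and `retryDelayMs` are illustrative names, not part of any ABS Core SDK:

```typescript
// Sketch: a caller that honors 429 + Retry-After instead of hammering
// a shedding endpoint. Retry-After carries seconds (delta-seconds form).
function retryDelayMs(retryAfter: string | null): number {
  const seconds = Number(retryAfter ?? "1");
  // Fall back to 1s when the header is absent or malformed
  return (Number.isFinite(seconds) && seconds > 0 ? seconds : 1) * 1000;
}

async function authorizeWithBackoff(
  url: string,
  payload: unknown,
  maxRetries = 3,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // Request was shed: wait the server-advertised interval, then retry
    const delay = retryDelayMs(res.headers.get("Retry-After"));
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

Clients that retry this way smooth out load spikes instead of amplifying them, which is what makes shedding via 429 a cooperative mechanism rather than an error.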

Failure Mode Characterization

At 5,000 req/s, the system exhibited the following behavior:

  1. Rate limiter engaged at ~1,200 req/s (configured threshold), returning 429 with Retry-After: 1
  2. No 5xx errors — the governance kernel never crashed or returned unhandled exceptions
  3. P50 drift of +17ms — acceptable for a 5x overload scenario
  4. P99 growth to 487ms — expected; tail latency is the first casualty of overload, P50 held
  5. Audit log integrity maintained — 100% of requests (passed + rejected) recorded in ledger

Stress Test Verdict

| Requirement | Threshold | Result | Pass/Fail |
| --- | --- | --- | --- |
| Zero 5xx under overload | 0 errors | 0 errors | Pass |
| Graceful degradation | Rate limit, not crash | HTTP 429 with Retry-After | Pass |
| Audit log integrity under stress | 100% recorded | 100% recorded | Pass |
| P50 stability | <2x normal P50 | 178ms vs 161ms (+10%) | Pass |
| Circuit breaker behavior | No cascading failure | Clean shed via rate limiter | Pass |

Verdict for stress: Failure mode is controlled and predictable. System sheds load via rate limiter rather than crashing. Suitable for regulated environments where graceful degradation is required.


Aggregate SLO Validation

| SLO Commitment | Target | Test 1 (200 rps) | Test 2 (1k rps) | Test 3 (5k rps) |
| --- | --- | --- | --- | --- |
| P50 latency | <200ms | 153ms | 161ms | 178ms |
| P99 latency | <300ms | 247ms | 289ms | 487ms ⚠ (overload) |
| Error rate (5xx) | 0% | 0% | 0% | 0% |
| Availability | 99.95% | 100% | 100% | 100% |
| Audit completeness | 100% | 100% | 100% | 100% |

P99 note at 5,000 req/s: The 487ms P99 at 5x operational overload is expected and acceptable. The SLO of <300ms P99 applies to normal operational load (up to 1,000 req/s). At 5,000 req/s, the commitment is zero crashes and complete audit integrity — both maintained.
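The pass/fail logic behind this table, including the overload exemption for P99, is mechanical and can be encoded directly. A sketch with thresholds hard-coded from the Target column; type and function names are illustrative:

```typescript
// Sketch: mechanical SLO check mirroring the aggregate table above.
interface TestRun {
  p50Ms: number;
  p99Ms: number;
  errorRate: number; // fraction of responses that were 5xx
  overload: boolean; // true when load exceeds the designed 1,000 req/s
}

function meetsSlo(run: TestRun): boolean {
  if (run.errorRate > 0) return false; // zero 5xx at every load point
  if (run.p50Ms >= 200) return false;  // P50 < 200ms at every load point
  // P99 < 300ms applies only at or below operational load
  if (!run.overload && run.p99Ms >= 300) return false;
  return true;
}
```

Under this rule, Test 3 passes despite its 487ms P99 because its commitments at 5x overload are zero crashes and audit integrity, not tail latency.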


Test Methodology

Infrastructure

| Parameter | Value |
| --- | --- |
| Tool | Artillery (artillery.io) |
| Target | https://api.abscore.app/v1/authorize |
| Protocol | HTTPS / TLS 1.3 |
| Infrastructure | Cloudflare Workers (Global Edge) |
| Rate Limiting | Active (real production behavior, not disabled for tests) |
| Data | Synthetic payloads; no real PII |

Scenario Mix (all three tests)

| Scenario | Weight | Description |
| --- | --- | --- |
| Allowed operation | 60% | Balance query (query_balance), low risk → expected ALLOW |
| Blocked operation | 20% | Offshore transfer (Cayman Islands), critical risk → expected DENY |
| High-value transfer | 20% | Pix R$100,000 to CPF → expected policy escalation |
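As a rough illustration, this mix maps onto Artillery's weighted scenarios. A hypothetical sketch for the Load profile; payload fields and phase settings are assumptions, and the actual profiles live behind scripts/run-benchmark.sh:

```yaml
# Sketch: Artillery profile approximating the 60/20/20 scenario mix.
config:
  target: "https://api.abscore.app"
  phases:
    - duration: 3600    # Load profile: 60 min
      arrivalRate: 1000 # 1,000 new requests per second
scenarios:
  - name: allowed-operation # 60% of traffic, expected ALLOW
    weight: 60
    flow:
      - post:
          url: "/v1/authorize"
          json: { action: "query_balance" }
  - name: blocked-operation # 20% of traffic, expected DENY
    weight: 20
    flow:
      - post:
          url: "/v1/authorize"
          json: { action: "offshore_transfer" }
  - name: high-value-transfer # 20% of traffic, expected escalation
    weight: 20
    flow:
      - post:
          url: "/v1/authorize"
          json: { action: "pix_transfer", amount: 100000 }
```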

Reproduce These Tests

```bash
git clone https://github.com/abscore-ai/abs-core.git
cd abs-core

# Endurance: 200 req/s for 2h
./scripts/run-benchmark.sh --profile endurance

# Load: 1,000 req/s for 60 min
./scripts/run-benchmark.sh --profile load

# Stress: 5,000 req/s for 15 min
./scripts/run-benchmark.sh --profile stress
```

Limitations

  • Rate limiting was active during all tests — this reflects real production behavior
  • Results include network latency from test runner to Cloudflare edge; pure compute latency is lower
  • Stress test P99 (487ms) applies only at 5,000 req/s — 5x above designed operational load
  • Cold start latency (max ~744ms) reflects first-request edge node initialization; subsequent requests are served from warm state

Report generated: 2026-02-26 · SHA-256: d10e3aab0288387d0b425143f86dca9cf17e2d9a896e9e5678b1005bc01a7619 · Anchored: Bitcoin blockchain via OpenTimestamps

Note on Polygon L2: The optional on-chain anchoring feature uses Polygon Amoy — a public testnet — for Enterprise Early Access. This is disclosed as a preview capability. Production anchoring on mainnet is roadmapped. The Bitcoin OpenTimestamps proof above is the primary immutable integrity mechanism for all tiers.
