ABS Core

Policy Studio — Simulate

Test policy rules against real historical decisions before promoting them to production. Zero risk, instant feedback.

Why this exists

Writing governance policies without a way to test them is dangerous. A rule that's too broad blocks legitimate agent actions. A rule that's too narrow misses the threats it was written to catch.

Policy Studio Simulate lets you run a proposed rule against your actual decision history — before it touches any live traffic.

How it works

The simulation endpoint fetches your most recent authorization decisions from decision_logs (as many as sample_size specifies, 50 by default) and evaluates your proposed rule against each one's recorded inputs:

  • You submit a proposed rule (name + pattern + action).
  • The endpoint fetches the most recent historical decisions with their real inputs.
  • The rule is evaluated against each decision in isolation.
  • The result reports how many decisions would change verdict.

No production data is modified. No live agents are affected.
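
The evaluation loop described above can be approximated locally to build intuition. The sketch below is a hypothetical model, not the service's implementation: the Decision record shape, the FLAGGED verdict spelling, the use of re.search on both tool_name and input_data, and the assumption that a matching rule simply overrides the recorded verdict are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumed mapping from a rule's action to the verdict it would produce.
# ALLOWED and DENIED appear in the sample response; FLAGGED is a guess.
ACTION_TO_VERDICT = {"DENY": "DENIED", "ALLOW": "ALLOWED", "FLAG": "FLAGGED"}


@dataclass
class Decision:
    """Hypothetical shape of one historical decision record."""
    decision_id: str
    tool_name: str
    input_data: str
    verdict: str  # the verdict originally recorded


def simulate(pattern: str, action: str, decisions: list) -> dict:
    """Evaluate one rule against each decision in isolation and count changes."""
    rule = re.compile(pattern)
    target = ACTION_TO_VERDICT[action]
    counts = {"matched": 0, "changed_to_deny": 0,
              "changed_to_allow": 0, "unchanged": 0}
    for d in decisions:
        # Assumption: the rule matches if the pattern occurs in either field.
        hit = bool(rule.search(d.tool_name) or rule.search(d.input_data))
        simulated = target if hit else d.verdict
        if hit:
            counts["matched"] += 1
        if simulated == d.verdict:
            counts["unchanged"] += 1
        elif simulated == "DENIED":
            counts["changed_to_deny"] += 1
        elif simulated == "ALLOWED":
            counts["changed_to_allow"] += 1
    counts["decisions_tested"] = len(decisions)
    return counts
```

Nothing in this loop writes back to the decision list, which mirrors the read-only guarantee: the history is only read, never modified.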

Run a simulation

POST /v1/policies/simulate
Authorization: Bearer {your-api-key}

{
  "name": "Block exec_cmd in production",
  "content": "exec_cmd|shell_exec|run_script",
  "action": "DENY",
  "sample_size": 100
}

Field         Required   Description
name                     Human-readable name for this rule
content                  Regex pattern matched against tool_name and input_data
action                   DENY, ALLOW, or FLAG
sample_size              Decisions to test against (default: 50, max: 200)
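
A client can check these field constraints before sending anything. The following is a minimal sketch using only the Python standard library. The base URL, the Content-Type header, the lower bound of 1 on sample_size, and the function names are assumptions; only the path, the Bearer scheme, the field names, the allowed actions, and the default and maximum of 50 and 200 come from this page.

```python
import json
import re
import urllib.request

VALID_ACTIONS = {"DENY", "ALLOW", "FLAG"}
MAX_SAMPLE_SIZE = 200  # documented maximum


def build_simulation_payload(name: str, content: str, action: str,
                             sample_size: int = 50) -> dict:
    """Validate the fields locally and return the JSON body as a dict."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {sorted(VALID_ACTIONS)}")
    # Assumption: at least one decision must be requested.
    if not 1 <= sample_size <= MAX_SAMPLE_SIZE:
        raise ValueError(f"sample_size must be between 1 and {MAX_SAMPLE_SIZE}")
    # Catches obvious pattern typos early. This checks Python's regex
    # dialect, which may differ from the one the service uses.
    re.compile(content)
    return {"name": name, "content": content,
            "action": action, "sample_size": sample_size}


def build_request(base_url: str, api_key: str,
                  payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the simulate endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/v1/policies/simulate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not documented
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the sketch stops short of that so it can be run without network access or a real key.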

Reading the results

{
  "rule": {
    "name": "Block exec_cmd in production",
    "content": "exec_cmd|shell_exec|run_script",
    "action": "DENY"
  },
  "simulation": {
    "sample_size": 100,
    "decisions_tested": 100,
    "matched": 3,
    "changed_to_deny": 3,
    "changed_to_allow": 0,
    "unchanged": 97
  },
  "impact": "LOW",
  "decisions": [
    {
      "decision_id": "dec_01JN8K...",
      "tool_name": "exec_cmd",
      "original_verdict": "ALLOWED",
      "simulated_verdict": "DENIED",
      "changed": true,
      "matched_pattern": true
    }
  ]
}

impact values, based on the share of tested decisions whose verdict would change: NONE (0 changes) | LOW (<5%) | MEDIUM (5-20%) | HIGH (>20%)
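
The thresholds can be expressed as a small function, which is handy for checking a response or for setting a gate in CI. This is a sketch under two assumptions: that the percentage is the number of changed decisions divided by decisions_tested, and that the boundaries at exactly 5% and 20% both fall in MEDIUM, as the ranges above suggest. The function name is hypothetical.

```python
def classify_impact(changed: int, tested: int) -> str:
    """Map a count of changed verdicts to NONE, LOW, MEDIUM, or HIGH."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    if not 0 <= changed <= tested:
        raise ValueError("changed must be between 0 and tested")
    if changed == 0:
        return "NONE"
    # Integer comparisons avoid floating-point surprises at the boundaries.
    if changed * 100 < 5 * tested:    # under 5%
        return "LOW"
    if changed * 100 <= 20 * tested:  # 5% to 20% inclusive
        return "MEDIUM"
    return "HIGH"                     # over 20%
```

For the sample response above, 3 changed decisions out of 100 tested gives 3%, which lands in LOW and agrees with the reported impact.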

Interpreting impact

A HIGH impact result means your rule would change the verdict on more than 20% of the sampled decisions. That is not necessarily bad, since a broad change may be intentional, but it is a signal to review the matched decisions carefully before promoting.

Look for:

  • changed_to_deny on decisions that were legitimately allowed → rule is too broad
  • changed_to_allow on decisions that were correctly denied → rule conflicts with existing policy
  • Expected changes only → rule is ready for production
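
The checklist above can be turned into a quick triage pass over the decisions array. The sketch below relies only on the fields shown in the sample response (decision_id, simulated_verdict, changed). The function name and the two bucket names are hypothetical, and deciding whether a change was expected still requires a human reading each listed decision.

```python
def triage(result: dict) -> dict:
    """Group changed decisions by the direction of the verdict change."""
    to_deny, to_allow = [], []
    for d in result.get("decisions", []):
        if not d.get("changed"):
            continue  # verdict is the same as before, nothing to review
        if d["simulated_verdict"] == "DENIED":
            to_deny.append(d["decision_id"])
        elif d["simulated_verdict"] == "ALLOWED":
            to_allow.append(d["decision_id"])
    return {
        # Newly denied: check that none of these were legitimate actions.
        "review_for_overblocking": to_deny,
        # Newly allowed: check for conflicts with existing policy.
        "review_for_policy_conflict": to_allow,
    }
```

If both lists contain only the decisions you set out to change, the rule meets the third condition in the checklist.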

Promoting to production

Once satisfied with the simulation, create the policy via the standard policy API:

POST /v1/policies
{
  "name": "Block exec_cmd in production",
  "content": "exec_cmd|shell_exec|run_script",
  "action": "DENY",
  "active": true,
  "change_reason": "Simulation passed: 3/100 decisions affected, all expected"
}

The change_reason field is recorded in the policy accountability audit trail.
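
To keep the audit trail consistent with what was actually simulated, the promotion body can be derived from the simulation response rather than typed by hand. This is an illustrative sketch: the helper name and the all_expected flag are hypothetical, and the wording of change_reason simply follows the example above. It assumes that "affected" means changed_to_deny plus changed_to_allow.

```python
def build_promotion_payload(sim_response: dict, all_expected: bool) -> dict:
    """Build the POST /v1/policies body from a reviewed simulation response."""
    if not all_expected:
        # The reviewer has not confirmed the changes, so refuse to proceed.
        raise ValueError("review the changed decisions before promoting")
    rule = sim_response["rule"]
    sim = sim_response["simulation"]
    affected = sim["changed_to_deny"] + sim["changed_to_allow"]
    reason = (f"Simulation passed: {affected}/{sim['decisions_tested']} "
              "decisions affected, all expected")
    return {
        "name": rule["name"],
        "content": rule["content"],
        "action": rule["action"],
        "active": True,
        "change_reason": reason,
    }
```

Copying name, content, and action from the response means the promoted rule is the same rule that was tested.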
