Capability Formalization

This section formalizes the positive side of the optimization problem: what we’re trying to maximize.

$\text{Capability} = \text{Power} \times \text{Agency}$

Pages

Page	Question
Agents, Power, and Authority	What makes something an agent? How do we measure power?
Worked Examples	What do these metrics look like for real systems?
The Strong Tools Hypothesis	Can we get high capability with low agency?

Key Concepts

Agency Score: How well a system’s behavior fits a simple utility function (0 = tool, 1 = optimizer)
Power Score: Ability to achieve diverse goals, measured along the resources, capabilities, influence, and optionality dimensions
RACAP: Risk-Adjusted Capability = Capability / Risk

The Core Insight

We want AI systems that are maximally capable while minimally risky. This may be achievable through “strong tools”—high power with low agency.

See The Strong Tools Hypothesis for analysis.

The Bridge to Delegation Risk

Earlier versions of this site presented Power and Agency alongside the risk accounting without saying how they connect. Here is the bridge — each score moves a specific term of the core formula.

Recall Delegation Risk = Σ P(harm) × Damage, and that Risk Decomposition splits P(harm) into two channels: accidents (the component fails at its task) and defection (the component works against you).

$DR \;=\; \underbrace{P(\text{accident}) \times D_{\text{accident}}}_{\text{any system}} \;+\; \underbrace{P(\text{defection}) \times D_{\text{defection}}}_{\text{requires agency}}$
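As a minimal sketch of the two-channel sum, the Python snippet below evaluates the formula for a single component. The function name `delegation_risk` and every number in it are illustrative assumptions, not values taken from this site.

```python
def delegation_risk(p_accident, d_accident, p_defection, d_defection):
    """Two-channel Delegation Risk: accident term plus defection term."""
    for p in (p_accident, p_defection):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    accident_term = p_accident * d_accident
    defection_term = p_defection * d_defection
    return accident_term + defection_term


# Hypothetical tool-like component: the defection probability is zero,
# so only the accident term contributes.
tool_dr = delegation_risk(0.02, 1000.0, 0.0, 50000.0)

# Hypothetical agent-like component: same accident profile,
# plus a small defection probability against a larger damage figure.
agent_dr = delegation_risk(0.02, 1000.0, 0.001, 50000.0)

print(f"tool DR:  {tool_dr:.2f}")
print(f"agent DR: {agent_dr:.2f}")
```

With these made-up inputs the two components differ only in the defection term, which is the part of the sum that the Agency Score is meant to control.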

Agency gates the defection channel. Defection is goal-directed behavior against the principal’s interests — it requires something that behaves like an optimizer. A system with Agency ≈ 0 (a calculator, a compiler) has no defection term: its entire risk budget is accident risk, which is exactly why we trust powerful tools that we’d never trust as agents. As the Agency Score rises, P(defection) becomes nonzero and grows with it.

Power bounds the damage terms — in both channels. A component’s worst-case Damage is capped by what it can actually reach and affect (the resources, capabilities, influence, and optionality dimensions of the Power Score). High power raises the ceiling on accident damage and defection damage; low power caps both regardless of intent.

Putting the two together, significant defection risk requires both high agency and high power: agency to want it, power to do damage with it. That is the Strong Tools Hypothesis restated in risk-accounting terms: pushing Power up while holding Agency near zero grows capability while keeping the defection term close to zero, leaving accident risk as the dominant term, which verification patterns handle far more reliably than they handle adversaries. RACAP (Capability / Risk) is the ratio this trade optimizes.
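The gating and capping described above can be sketched in a few lines of Python. This is a toy model under loudly stated assumptions: the source says only that P(defection) grows with the Agency Score and that Power caps Damage, so the linear forms, the constants `MAX_P_DEFECTION` and `DAMAGE_PER_UNIT_POWER`, the shared damage cap for both channels, and the `Component` class are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    power: float       # Power Score, assumed normalized to [0, 1]
    agency: float      # Agency Score in [0, 1] (0 = tool, 1 = optimizer)
    p_accident: float  # probability the component fails at its task


# ASSUMPTION: worst-case damage scales linearly with power.
DAMAGE_PER_UNIT_POWER = 1000.0
# ASSUMPTION: P(defection) is linear in agency, reaching this value at agency = 1.
MAX_P_DEFECTION = 0.05


def risk_terms(c: Component) -> tuple[float, float]:
    """Return (accident_term, defection_term) for one component."""
    damage_cap = c.power * DAMAGE_PER_UNIT_POWER  # power bounds damage in both channels
    p_defection = MAX_P_DEFECTION * c.agency      # agency gates the defection channel
    return c.p_accident * damage_cap, p_defection * damage_cap


strong_tool = Component("strong tool", power=0.9, agency=0.0, p_accident=0.01)
strong_agent = Component("strong agent", power=0.9, agency=0.8, p_accident=0.01)
weak_agent = Component("weak agent", power=0.1, agency=0.8, p_accident=0.01)

for c in (strong_tool, strong_agent, weak_agent):
    accident, defection = risk_terms(c)
    print(f"{c.name}: accident={accident:.2f}, defection={defection:.2f}")
```

Under these invented numbers, the strong tool and the strong agent carry the same accident term because they share the same power, but only the agent has a nonzero defection term. The weak agent also has a nonzero defection term, yet its low power caps how large that term can get.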