Extraction is proposal, not fact.
When an AI model produces structured output destined for persistent state — a database row, a knowledge graph node, an API call to an external system — that output is a proposal. Not a fact. Until a human authorizes the transition, the output lives in a staging layer where it cannot affect downstream consumers, cannot be queried by other agents, and cannot leak into reports that present it as truth.
The pattern fails the moment a single bypass route exists. Bypass routes are not "advanced features for trusted contexts." They are governance breaches waiting to happen. The architectural answer is to close every one explicitly — at every layer where state can be modified.
.env cannot stand in for human presence.
For each bypass route, identify the layer where it would land and close it at that layer:
- "Trust this agent for this category." No. Categories with implicit trust become the next exploit vector.
- "Auto-promote when confidence > X." No. Confidence is not authorization.
- "Auto-promote when downstream consumer expects data." No. The consumer's expectation is not the human's authorization.
- "Allow the agent to write to a 'soft production' table that downstream tools also read." No. Soft production is production with a different label.
- "Treat scheduled jobs as pre-authorized." No. A scheduled job is a machine identity, not a human. Tier-2 promotion requires human presence at promotion time.
Each closure is implemented at the layer where bypass would occur. Documentation-only closures decay.
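To make the closures above concrete, here is a minimal sketch of a promotion gate in Python. The names `ActorClass`, `PromotionRequest`, and `may_promote` are hypothetical and do not come from any particular framework. The request deliberately carries the tempting signals (confidence, consumer demand) so that the gate can be seen ignoring them.

```python
from dataclasses import dataclass
from enum import Enum


class ActorClass(Enum):
    HUMAN = "human"
    AGENT = "agent"
    SCHEDULED_JOB = "scheduled_job"


@dataclass(frozen=True)
class PromotionRequest:
    actor_class: ActorClass
    confidence: float       # model-reported confidence; never consulted by the gate
    consumer_waiting: bool  # downstream consumer expects data; never consulted
    human_present: bool     # True only if a human session is live at promotion time


def may_promote(req: PromotionRequest) -> bool:
    """Authorize only a human who is present at promotion time.

    Confidence, consumer expectation, and machine identities (agents,
    scheduled jobs) are not inputs to the decision.
    """
    return req.actor_class is ActorClass.HUMAN and req.human_present
```

A scheduled job with confidence 1.0 is denied, while a present human is allowed even for a low-confidence extraction. The point of the sketch is that there is no code path by which the other fields could change the answer.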
1. Schema design
Create explicit staging tables (e.g. extractions, proposals) with their own primary keys. Production tables reference staging tables only via promotion records, never via a direct foreign key. The promotion record carries: actor identity (the human), source extraction ID, decision (accept / reject / modify), timestamp, and notes.
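One possible shape for this schema, sketched with Python's built-in `sqlite3` module. The table name `facts` (standing in for a production table), the column names, and the helper `open_db` are assumptions for illustration. A real system would likely use its own database and naming.

```python
import sqlite3

SCHEMA = """
-- Staging: raw model output, with its own primary key.
CREATE TABLE extractions (
    id         INTEGER PRIMARY KEY,
    payload    TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Promotion record: the human decision linking staging to production.
CREATE TABLE promotions (
    id            INTEGER PRIMARY KEY,
    extraction_id INTEGER NOT NULL REFERENCES extractions(id),
    actor_id      TEXT NOT NULL,  -- the human who decided
    decision      TEXT NOT NULL CHECK (decision IN ('accept', 'reject', 'modify')),
    notes         TEXT,
    decided_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Production: points at a promotion record, never directly at staging.
CREATE TABLE facts (
    id           INTEGER PRIMARY KEY,
    promotion_id INTEGER NOT NULL UNIQUE REFERENCES promotions(id),
    payload      TEXT NOT NULL
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with foreign keys enforced and the schema created."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys enforced, a row cannot be inserted into `facts` unless a promotion record already exists for it, so the schema itself refuses a direct staging-to-production write.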
2. Promotion API
POST /v1/extractions/:id/promote
Headers:
  Authorization: Bearer <session-jwt>
Body:
  {
    "decision": "accept" | "reject" | "modify",
    "modifications": {...},
    "notes": "..."
  }
The endpoint validates that the JWT is fresh (within the session lifetime), validates that the actor identity is human-class, writes the promotion record, and only then writes to the production schema.
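The ordering described above (check freshness, check actor class, write the promotion record, then write production) might look like the following sketch. It is not a full HTTP handler: `Claims` stands in for JWT claims whose signature has already been verified elsewhere, the 15-minute `SESSION_LIFE_SECONDS` is an assumed value, the `facts` table is a hypothetical production table, and the `modifications` object from the request body is simplified to a single `modified_payload` string.

```python
import sqlite3
import time
from dataclasses import dataclass
from typing import Optional

SESSION_LIFE_SECONDS = 15 * 60  # assumed session lifetime

SCHEMA = """
CREATE TABLE extractions (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
CREATE TABLE promotions (
    id INTEGER PRIMARY KEY,
    extraction_id INTEGER NOT NULL REFERENCES extractions(id),
    actor_id TEXT NOT NULL, decision TEXT NOT NULL, notes TEXT
);
CREATE TABLE facts (
    id INTEGER PRIMARY KEY,
    promotion_id INTEGER NOT NULL REFERENCES promotions(id),
    payload TEXT NOT NULL
);
"""


@dataclass(frozen=True)
class Claims:
    """Stand-in for already-verified JWT claims (signature check omitted)."""
    actor_id: str
    actor_class: str  # "human", or a machine class such as "agent"
    issued_at: float  # Unix seconds


class PromotionDenied(Exception):
    pass


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def promote(conn: sqlite3.Connection, extraction_id: int, claims: Claims,
            decision: str, notes: str = "",
            modified_payload: Optional[str] = None,
            now: Optional[float] = None) -> int:
    """Record a human decision; write to production only on accept/modify."""
    now = time.time() if now is None else now
    age = now - claims.issued_at
    if not 0 <= age <= SESSION_LIFE_SECONDS:
        raise PromotionDenied("session token is not fresh")
    if claims.actor_class != "human":
        raise PromotionDenied("actor is not human-class")
    if decision not in ("accept", "reject", "modify"):
        raise ValueError(f"unknown decision: {decision!r}")
    if decision == "modify" and modified_payload is None:
        raise ValueError("'modify' requires modified_payload")
    row = conn.execute("SELECT payload FROM extractions WHERE id = ?",
                       (extraction_id,)).fetchone()
    if row is None:
        raise LookupError(f"no extraction with id {extraction_id}")
    payload = modified_payload if decision == "modify" else row[0]
    with conn:  # one transaction: promotion record first, production second
        cur = conn.execute(
            "INSERT INTO promotions (extraction_id, actor_id, decision, notes) "
            "VALUES (?, ?, ?, ?)",
            (extraction_id, claims.actor_id, decision, notes))
        if decision != "reject":
            conn.execute("INSERT INTO facts (promotion_id, payload) VALUES (?, ?)",
                         (cur.lastrowid, payload))
    return cur.lastrowid
```

Both writes share one transaction, so a failure on the production write also rolls back the promotion record. A rejection still leaves a promotion row behind, and the staged extraction is never deleted.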
3. Audit on rejection
Rejected proposals are not deleted. They are persisted with their rejection record. Rejection patterns become the dataset that informs extraction-quality work.
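As a small illustration of how persisted rejections can become a dataset, the sketch below tallies rejection notes. The `PromotionRecord` shape and the idea of using the free-text `notes` field as a reason label are assumptions. A real system might prefer a structured reason code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PromotionRecord:
    extraction_id: int
    decision: str  # "accept", "reject", or "modify"
    notes: str     # here assumed to hold a short rejection reason


def rejection_reasons(records: Iterable[PromotionRecord]) -> Counter:
    """Count how often each rejection reason appears; other decisions are ignored."""
    return Counter(r.notes for r in records if r.decision == "reject")


records = [
    PromotionRecord(1, "reject", "wrong entity"),
    PromotionRecord(2, "accept", ""),
    PromotionRecord(3, "reject", "wrong entity"),
    PromotionRecord(4, "reject", "hallucinated date"),
]
reasons = rejection_reasons(records)
```

Because the rejected rows are kept rather than deleted, a tally like this can be recomputed at any time over the full history.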
4. Read paths
Production queries return only promoted data. Staging queries are explicitly separate endpoints with explicit naming. Agents and reports cannot accidentally read staging data and present it as production.
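The separation of read paths can be sketched as two distinct entry points, one of which demands an explicit scope. The class `ReadPaths`, the scope string `staging:read`, and the `status` label on staging results are all hypothetical names chosen for this example.

```python
from typing import Any, Dict, Optional

STAGING_SCOPE = "staging:read"  # assumed scope name


class ReadPaths:
    """Two read paths over two stores; neither can return the other's data."""

    def __init__(self, production: Dict[str, Any], staging: Dict[str, Any]):
        self._production = dict(production)
        self._staging = dict(staging)

    def query_production(self, key: str) -> Optional[Any]:
        # Looks only at promoted data; staged proposals are invisible here.
        return self._production.get(key)

    def query_staging(self, key: str, *, scope: str) -> Optional[Dict[str, Any]]:
        # The caller must opt in by name, and results are labeled as proposals
        # so they cannot be mistaken for production values downstream.
        if scope != STAGING_SCOPE:
            raise PermissionError("staging reads require the explicit staging scope")
        if key not in self._staging:
            return None
        return {"status": "proposal", "value": self._staging[key]}
```

An agent that calls `query_production` for a key that exists only in staging gets `None`, not the unpromoted value. Reading the proposal takes a deliberate, separately named call.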
This pattern applies to any system where:
- An AI model produces structured output
- The output will be referenced by downstream consumers (humans, other agents, external systems)
- The cost of a fabrication reaching production is material
If the cost of fabrication is trivial (e.g. brainstorming session output), this pattern is overhead. If the cost is reputational, legal, or financial, this pattern is the only safe operating mode.
Anti-patterns to avoid:
- Hand-waving "we'll review later": review never happens at the rate extraction happens. The fix is architecture, not discipline.
- Trusting "low-confidence" filters as the gate: the system fails when high-confidence-but-wrong outputs land in production.
- One staging table for everything: staging boundaries should mirror production schema boundaries, otherwise promotion paths become tangled.
- Allowing staging reads without explicit scope: agents pulling staging data into reports re-creates the fabrication risk you tried to architect away.