Human-in-the-Loop Architecture Pattern

THE PRINCIPLE

Extraction is proposal, not fact.

When an AI model produces structured output destined for persistent state — a database row, a knowledge graph node, an API call to an external system — that output is a proposal. Not a fact. Until a human authorizes the transition, the output lives in a staging layer where it cannot affect downstream consumers, cannot be queried by other agents, and cannot leak into reports that present it as truth.

The pattern fails the moment a single bypass route exists. Bypass routes are not "advanced features for trusted contexts." They are governance breaches waiting to happen. The architectural answer is to close every one explicitly — at every layer where state can be modified.

THE THREE ENFORCEMENT LAYERS

LAYERED ENFORCEMENT, NOT LAYERED REDUNDANCY

LAYER 1: Database schema

Production tables and staging tables are separate. Foreign keys on staging tables reference only other staging tables. The schema makes "AI output writing to production" representationally impossible, not merely prohibited by policy.

LAYER 2: API validation

Every write to production state goes through an explicit promotion route, and that route requires a human-authenticated session token. AI-attributed requests that attempt to write directly to the production schema are rejected at the API layer.

LAYER 3: Session authentication

Promotion requires a fresh presence signal: a session JWT or equivalent, not a stored long-lived credential. The token proves the human is present at the moment of approval. A 30-day personal access token (PAT) sitting in a .env file cannot stand in for human presence.
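A minimal sketch of the freshness check, assuming the JWT signature has already been verified upstream and the standard iat and exp claims are available. The SessionClaims type, the function name, and both thresholds are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical policy values; tune per deployment.
MAX_TOKEN_LIFETIME_S = 15 * 60    # reject tokens minted with long lifetimes
MAX_AGE_AT_APPROVAL_S = 5 * 60    # token must have been issued recently


@dataclass(frozen=True)
class SessionClaims:
    """Already-verified JWT claims (signature checking happens upstream)."""
    sub: str   # actor identifier
    iat: int   # issued-at, seconds since epoch
    exp: int   # expiry, seconds since epoch


def is_fresh_presence(claims: SessionClaims, now: int) -> bool:
    """Return True only if the token plausibly reflects a human present now."""
    if claims.exp <= claims.iat:
        return False  # malformed: expires at or before it was issued
    if not (claims.iat <= now < claims.exp):
        return False  # not yet valid, or already expired
    if claims.exp - claims.iat > MAX_TOKEN_LIFETIME_S:
        return False  # long-lived credential, e.g. a 30-day token
    return now - claims.iat <= MAX_AGE_AT_APPROVAL_S
```

Checking the total lifetime as well as the age is the point of the sketch: a long-lived token is rejected even on the day it was issued.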

THE BYPASS ROUTES TO CLOSE

For each bypass route, identify the layer where it would land and close it at that layer:

"Trust this agent for this category." No. Categories with implicit trust become the next exploit vector.
"Auto-promote when confidence > X." No. Confidence is not authorization.
"Auto-promote when downstream consumer expects data." No. The consumer's expectation is not the human's authorization.
"Allow the agent to write to a 'soft production' table that downstream tools also read." No. Soft production is production with a different label.
"Treat scheduled jobs as pre-authorized." No. A scheduled job is a machine identity, not a human. Tier-2 promotion requires human presence at promotion time.

Each closure is implemented at the layer where bypass would occur. Documentation-only closures decay.
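One way to make these closures structural rather than documentary is an authorization check that never consults the tempting inputs. The sketch below is illustrative only: ActorClass, PromotionRequest, and may_promote are hypothetical names, and the request carries confidence and consumer demand purely so the example can show them being ignored.

```python
from dataclasses import dataclass
from enum import Enum


class ActorClass(Enum):
    HUMAN = "human"
    AGENT = "agent"
    SCHEDULED_JOB = "scheduled_job"


@dataclass(frozen=True)
class PromotionRequest:
    actor_class: ActorClass
    has_fresh_session: bool    # result of the Layer 3 presence check
    model_confidence: float    # carried for audit only, never consulted below
    consumer_waiting: bool     # likewise informational only


def may_promote(req: PromotionRequest) -> bool:
    """Authorize promotion on human presence alone.

    Confidence, category, downstream demand, and schedules are deliberately
    absent from the decision, so no configuration can turn them into a gate.
    """
    return req.actor_class is ActorClass.HUMAN and req.has_fresh_session
```

An agent at 0.999 confidence and a scheduled job are both refused; a present human approving a low-confidence proposal is allowed.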

IMPLEMENTATION GUIDANCE

1. Schema design

Create explicit staging tables (e.g. extractions, proposals) with their own primary keys. Production tables reference staging tables only via promotion records, never via a direct foreign key. The promotion record carries: actor identity (the human), source extraction ID, decision (accept / reject / modify), timestamp, and notes.
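The schema shape described above can be sketched with SQLite from the Python standard library. The table and column names (extractions, promotions, facts, payload) are assumptions for illustration. SQLite also has no role system, so a real deployment would additionally need database grants that deny the agent's role any write access to production tables.

```python
import sqlite3

SCHEMA = """
CREATE TABLE extractions (              -- staging: AI proposals land here
    id          INTEGER PRIMARY KEY,
    payload     TEXT NOT NULL,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE promotions (               -- one row per human decision
    id             INTEGER PRIMARY KEY,
    extraction_id  INTEGER NOT NULL REFERENCES extractions(id),
    actor_id       TEXT NOT NULL,       -- the human who decided
    decision       TEXT NOT NULL
                   CHECK (decision IN ('accept', 'reject', 'modify')),
    notes          TEXT,
    decided_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE facts (                    -- production
    id            INTEGER PRIMARY KEY,
    promotion_id  INTEGER NOT NULL UNIQUE REFERENCES promotions(id),
    payload       TEXT NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with staging and production separated."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn
```

Because facts.promotion_id is NOT NULL and references promotions, a production row cannot exist without a recorded human decision behind it.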

2. Promotion API

POST /v1/extractions/:id/promote
Headers:
  Authorization: Bearer <session-jwt>
Body:
  { decision: "accept" | "reject" | "modify",
    modifications: {...},
    notes: "..." }

The endpoint validates that the JWT is fresh (issued within the current session lifetime), validates that the actor identity is human-class, writes the promotion record, and only then writes to the production schema.
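The ordering of those checks can be sketched as a plain function behind the endpoint. This is an illustration under stated assumptions, not a reference implementation: the Session type, the PromotionDenied exception, the table names in DEMO_SCHEMA, and the choice to wrap both writes in a single transaction are all hypothetical, and JWT verification is assumed to have happened before the call.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# Hypothetical minimal tables so the sketch runs on its own.
DEMO_SCHEMA = """
CREATE TABLE extractions (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
CREATE TABLE promotions (
    id INTEGER PRIMARY KEY,
    extraction_id INTEGER NOT NULL REFERENCES extractions(id),
    actor_id TEXT NOT NULL, decision TEXT NOT NULL, notes TEXT);
CREATE TABLE facts (
    id INTEGER PRIMARY KEY,
    promotion_id INTEGER NOT NULL REFERENCES promotions(id),
    payload TEXT NOT NULL);
"""


class PromotionDenied(Exception):
    """Raised when the caller is not a present, human-class actor."""


@dataclass(frozen=True)
class Session:
    actor_id: str
    actor_class: str   # "human", "agent", ... (hypothetical classification)
    is_fresh: bool     # outcome of the JWT freshness check, done upstream


def promote(conn: sqlite3.Connection, session: Session, extraction_id: int,
            decision: str, modified_payload: Optional[str] = None,
            notes: str = "") -> int:
    """Record a human decision; write to production only on accept/modify."""
    if not session.is_fresh:
        raise PromotionDenied("stale or missing session")
    if session.actor_class != "human":
        raise PromotionDenied("actor is not human-class")
    if decision not in ("accept", "reject", "modify"):
        raise ValueError(f"unknown decision: {decision!r}")
    if decision == "modify" and modified_payload is None:
        raise ValueError("modify requires a modified payload")

    with conn:  # one transaction: commit on success, roll back on any error
        row = conn.execute("SELECT payload FROM extractions WHERE id = ?",
                           (extraction_id,)).fetchone()
        if row is None:
            raise LookupError(f"no extraction {extraction_id}")
        cur = conn.execute(
            "INSERT INTO promotions (extraction_id, actor_id, decision, notes)"
            " VALUES (?, ?, ?, ?)",
            (extraction_id, session.actor_id, decision, notes))
        promotion_id = cur.lastrowid
        if decision != "reject":
            # The production write happens only after the promotion record.
            payload = modified_payload if decision == "modify" else row[0]
            conn.execute(
                "INSERT INTO facts (promotion_id, payload) VALUES (?, ?)",
                (promotion_id, payload))
    return promotion_id
```

A rejection still writes a promotion record but never touches facts, so the same code path serves the audit requirement for rejected proposals.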

3. Audit on rejection

Rejected proposals are not deleted. They are persisted with their rejection record. Rejection patterns become the dataset that informs extraction-quality work.
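As a sketch of how persisted rejections might feed extraction-quality work, the function below computes a rejection rate per extracted field. The Decision type and its field attribute are hypothetical; the pattern itself does not prescribe how proposals are categorized.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Decision:
    extraction_id: int
    field: str       # hypothetical: which extracted field the proposal was for
    decision: str    # "accept" | "reject" | "modify"


def rejection_rate_by_field(decisions: Iterable[Decision]) -> Dict[str, float]:
    """Share of proposals rejected, grouped by extracted field."""
    totals = Counter()
    rejected = Counter()
    for d in decisions:
        totals[d.field] += 1
        if d.decision == "reject":
            rejected[d.field] += 1
    return {field: rejected[field] / totals[field] for field in totals}
```

This only works if rejected rows are kept: deleting them would remove the numerator.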

4. Read paths

Production queries return only promoted data. Staging queries are explicitly separate endpoints with explicit naming. Agents and reports cannot accidentally read staging data and present it as production.
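The separation of read paths can be sketched as two functions with deliberately different names and requirements. The table names (extractions, promotions, facts), the scope string, and both function names are assumptions for illustration.

```python
import sqlite3
from typing import FrozenSet, List

STAGING_REVIEW_SCOPE = "staging:review"   # hypothetical scope name


def read_production_facts(conn: sqlite3.Connection) -> List[str]:
    """Default read path: only rows that exist because of a promotion."""
    rows = conn.execute(
        "SELECT f.payload FROM facts f"
        " JOIN promotions p ON p.id = f.promotion_id"
        " ORDER BY f.id")
    return [r[0] for r in rows]


def read_staging_proposals(conn: sqlite3.Connection,
                           scopes: FrozenSet[str]) -> List[str]:
    """Separate, explicitly named path; refuses callers without the scope."""
    if STAGING_REVIEW_SCOPE not in scopes:
        raise PermissionError("staging reads require an explicit review scope")
    rows = conn.execute("SELECT payload FROM extractions ORDER BY id")
    return [r[0] for r in rows]
```

A report generator that only ever receives read_production_facts has no way to surface an unpromoted proposal, whatever its prompt or configuration says.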

WHEN THIS PATTERN APPLIES

Any system where:

An AI model produces structured output
The output will be referenced by downstream consumers (humans, other agents, external systems)
The cost of a fabrication reaching production is material

If the cost of fabrication is trivial (e.g. brainstorming session output), this pattern is overhead. If the cost is reputational, legal, or financial, this pattern is the only safe operating mode.

ANTI-PATTERNS

Hand-waving "we'll review later": review never happens at the rate extraction happens. The gate has to be enforced by architecture, not by discipline.
Trusting confidence filters as the gate: screening out low-confidence outputs does nothing when high-confidence-but-wrong outputs land in production.
One staging table for everything: staging boundaries should mirror production schema boundaries. Otherwise every promotion path needs ad hoc logic to work out which production table a proposal belongs in.
Allowing staging reads without explicit scope: agents pulling staging data into reports re-creates the fabrication risk you tried to architect away.