For teams trying to make agent governance fast enough to stay on by default
The same lesson keeps showing up in different forms. Semantic caching cuts repeated LLM calls. Traditional text classifiers beat LLMs on speed and cost when labels are clear. Breadth-first query execution batches similar work instead of walking one branch at a time. Structured live dataset agents only become trustworthy when every row has source provenance. Streaming output removes dead air. Dynamic harnesses work best when critic, tournament, loop, and fan-out patterns are selected deliberately.
For ThumbGate, these are not separate product bets. They collapse into one control-plane rule: choose the cheapest reliable gate before the action runs.
| Lane | Use when | Why it is high ROI |
|---|---|---|
| Deterministic | Secrets, force-push, destructive SQL, protected files, known repeated commands. | Near-zero latency, no tokens, no provider call. This is the default for exact policy risk. |
| Semantic cache | A prompt or action is semantically equivalent to a prior rejected or approved pattern. | Returns the cached decision without rerunning the judge. This is the AISG-style buyer message applied to pre-action checks. |
| Rubric gate | A critic/rubric loop failed a criterion, hit its cap, or lacks done evidence. | Turns LangChain-style rubric iteration into an enforcement event: block completion claims until the missing proof exists. |
| Local classical classifier | High-volume labels with enough examples and low ambiguity. | Fast and cheap for routine feedback triage, import classification, and known error families. |
| Local semantic recall | Few examples, fuzzy near-misses, or cross-session recurrence. | Keeps private context local while catching cases regex and keyword routing miss. |
| LLM judge | High-risk semantic ambiguity with explicit cloud permission and a budget cap. | Useful for critic/rubric review, multi-document evidence review, and structured provenance checks, but not for every action. |
| Human review | Private, regulated, payment, credential, customer-data, or unbounded external-posting risk. | Prevents automation from laundering a risky decision through a model call. |
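The deterministic lane is the easiest one to picture in code. The sketch below is illustrative only: the rule names, the regular expressions, and the `deterministicGate` function are assumptions made for this example, not ThumbGate's actual rule set. It shows why this lane costs no tokens and no provider call: each check is a local pattern match.

```javascript
// Hypothetical deny rules; a real deployment would maintain a much larger list.
const DENY_RULES = [
  // Matches `git push --force`, `git push -f`, and `--force-with-lease`.
  { name: "force-push", pattern: /\bgit\s+push\b.*(--force\b|\s-f\b)/ },
  // Matches DROP TABLE, DROP DATABASE, and TRUNCATE TABLE, case-insensitively.
  { name: "destructive-sql", pattern: /\b(drop\s+(table|database)|truncate\s+table)\b/i },
  // Matches the common AWS access key ID shape (AKIA + 16 uppercase alphanumerics).
  { name: "secret-like-token", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
];

// Returns the first matching rule, or an allow decision if none match.
function deterministicGate(command) {
  for (const rule of DENY_RULES) {
    if (rule.pattern.test(command)) {
      return { allowed: false, rule: rule.name };
    }
  }
  return { allowed: true, rule: null };
}

console.log(deterministicGate("git push --force origin main"));
console.log(deterministicGate("git push origin main"));
```

Because the rules are plain regular expressions, the gate can run before every action without adding meaningful latency, which is the property the table attributes to this lane.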
ThumbGate now has a small, testable routing primitive that makes this policy explicit:
node scripts/classifier-routing.js --risk=high --ambiguity=0.82 --allow-cloud --latency-ms=5000

That command returns an evidence-requiring LLM judge lane. Add --semantic-cache-hit, and it reuses the prior decision without a provider call. Add --rubric-failed or --structured-dataset --missing-provenance, and it blocks completion through the rubric gate. Change the same high-risk ambiguous input to --privacy-sensitive without --allow-cloud, and it routes to human review instead.
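The routing behavior described above can be approximated with a small pure function. This is a minimal sketch under stated assumptions, not the contents of `scripts/classifier-routing.js`: the function name `routeGate`, the option names, the 0.7 ambiguity cutoff, the lane labels, and the order in which conditions are checked are all hypothetical. The latency budget is left out because the text does not say how it affects the lane.

```javascript
// Assumed cutoff above which an input counts as "ambiguous".
const AMBIGUITY_THRESHOLD = 0.7;

// Hypothetical router: maps the flags described in the text to a lane.
function routeGate(options) {
  const {
    risk = "low",
    ambiguity = 0,
    allowCloud = false,
    privacySensitive = false,
    semanticCacheHit = false,
    rubricFailed = false,
    structuredDataset = false,
    missingProvenance = false,
  } = options;

  // Privacy-sensitive input without cloud permission never reaches a model.
  if (privacySensitive && !allowCloud) {
    return { lane: "human-review", providerCall: false };
  }
  // A failed rubric or a dataset row without provenance blocks completion.
  if (rubricFailed || (structuredDataset && missingProvenance)) {
    return { lane: "rubric-gate", providerCall: false, blocksCompletion: true };
  }
  // A cache hit reuses the prior decision instead of rerunning the judge.
  if (semanticCacheHit) {
    return { lane: "semantic-cache", providerCall: false };
  }
  // High-risk ambiguity goes to the LLM judge only with cloud permission.
  if (risk === "high" && ambiguity >= AMBIGUITY_THRESHOLD) {
    return allowCloud
      ? { lane: "llm-judge", providerCall: true, requiresEvidence: true }
      : { lane: "human-review", providerCall: false };
  }
  // Everything else stays on local, non-provider lanes.
  return { lane: "local", providerCall: false };
}

const base = { risk: "high", ambiguity: 0.82, allowCloud: true };
console.log(routeGate(base).lane);
console.log(routeGate({ ...base, semanticCacheHit: true }).lane);
console.log(routeGate({ ...base, allowCloud: false, privacySensitive: true }).lane);
```

Keeping the router a pure function of its flags is what makes it testable: each lane decision can be asserted without a provider key or a network call.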
Check the gate lane before spending tokens on a risky decision.