★ Free upgrade for all course members

The 2026
Sophistication Layer

A free upgrade for every Trinity course. The current-model power prompts, the GitHub repos worth forking, and the agentic patterns that actually work right now — distilled from a year of running real systems in production.

+ 8 prompts · power tier + 10 repos · production-grade + 6 patterns · agentic
— Model Routing

The right model for the right job.

Stop using flagship for everything. Route by complexity and you'll cut spend by 70% without sacrificing quality.

Flagship

Claude Opus 4.7

claude-opus-4-7

The deepest reasoning available. Use for architecture decisions, complex multi-step refactors, and anything where you'd want a senior engineer pair-programming.

  • System architecture & design
  • Cross-cutting refactors
  • Research synthesis
  • Multi-agent orchestration
Daily driver

Claude Sonnet 4.6

claude-sonnet-4-6

The best coding model right now. Default to this for most development work — feature implementation, debugging, code review, and tool use.

  • Feature implementation
  • Code review & debugging
  • Tool-use workflows
  • Most agentic loops
Economy

Claude Haiku 4.5

claude-haiku-4-5

About 90% of Sonnet quality at a third of the cost. Perfect for sub-agents, high-volume worker tasks, classification, and anything that fires repeatedly.

  • Worker sub-agents
  • Classification & routing
  • High-volume processing
  • Simple lookups
— Power Prompts

Eight prompts worth memorizing.

These are not toy prompts. Each one is a small operating system — battle-tested across thousands of agent runs, refined to produce reliable structured output.

PROMPT 01

The Plan-First Sandbox

Forces a plan before code. Cuts hallucination by 60% on multi-file tasks.
Universal
You are working on <TASK>. Before writing any code, do this:

1. Restate the goal in one sentence.
2. List every file you would touch (read or modify) with a one-line reason.
3. Identify three risks or unknowns.
4. Write a 5-step plan with a rollback strategy at each step.
5. STOP. Wait for me to confirm before coding.

Constraints:
- No code in this turn.
- If a file path is unverified, mark it [UNVERIFIED] — do not fabricate.
- If your confidence is below 80%, say what you'd need to raise it.
PROMPT 02

The Adversarial Reviewer

A second pass that finds the bugs the first pass missed. Run after every implementation.
Code review
You are an adversarial code reviewer. Read the diff below.

For each finding, output:
  SEVERITY: critical | high | medium | low
  CATEGORY: correctness | security | performance | maintainability | scope
  EVIDENCE: exact line + file
  EXPLOIT/IMPACT: what breaks, who notices, in what conditions
  FIX: minimal patch (5 lines or less)

Rules:
- No "consider" or "might want to". State facts.
- If you can't reproduce a bug mentally, mark [UNVERIFIED].
- Find the worst issue first. Never lead with style.
- If the diff is correct, say so plainly. Don't invent issues.

<PASTE DIFF HERE>
PROMPT 03

The Sub-Agent Dispatcher

Master prompt for an orchestrator that dispatches Haiku workers in parallel.
Agentic
You are an orchestrator. You don't do work — you dispatch it.

Available workers (pick one per task, run in parallel when independent):
  - explorer    → reads code, returns summaries (read-only)
  - implementer → writes code (scoped to one workstream)
  - reviewer    → adversarial review (read-only)
  - tester      → runs tests, returns failures (read-only)

Rules:
- Decompose the task into 1–4 independent workstreams.
- Each worker gets: scope (files), goal, success criteria, anti-goals.
- Cap parallelism at 3 workers concurrent.
- Aggregate results, identify conflicts, ask before resolving.
- Never have a reviewer fix what they reviewed. That's the implementer's job.

Goal: <TASK>
PROMPT 04

The 10-Star Product Critique

CEO/founder lens. Forces you past "good enough" by demanding the obvious-in-hindsight.
Strategy
You are a 10x founder reviewing this product/feature/page.

Don't tell me it's good. Don't list pros and cons. Do this:

1. WHAT IS THIS PRETENDING TO BE?
   The unspoken category it competes in. What does it implicitly promise?

2. WHO USES THIS THE WRONG WAY?
   The actual usage pattern, not the intended one. What are they working around?

3. WHAT WOULD A 10-STAR VERSION DO?
   Not "polish." A different category. The version that makes the current
   version look like a skeuomorph.

4. WHAT IS THE ONE THING THAT IF MISSING, NOTHING ELSE MATTERS?
   The single load-bearing element. Often invisible.

5. CUT FIRST RECOMMENDATION.
   What feature, screen, or word can disappear today with no loss?

Be specific. Cite the actual artifact. No platitudes.
PROMPT 05

The Memory-Augmented Agent

Pattern for agents that need to recall prior sessions without context bloat.
Persistence
You have a memory file at <PATH/memory.md>.

On every turn:
1. Read memory before responding. If it's relevant, cite it.
2. After responding, decide: did anything LASTING happen this turn?
   - User preference revealed → save as feedback memory
   - Project decision made → save as project memory
   - External resource found → save as reference memory
   - Just a Q&A → save nothing
3. Memory entries are ONE LINE each, dated, with the WHY.

Format: YYYY-MM-DD | type | content | because: <reason>

Never write > 5 memories per turn. If tempted, the answer is too vague.
Never store secrets, full names, addresses, or credentials.
PROMPT 06

The Spec Extractor

Turns vague requirements into a testable spec the model can actually build from.
Requirements
I want to build <FEATURE>. Before you propose anything, extract a spec:

INPUTS: what the user provides (types, validation, edge cases)
OUTPUTS: what the system returns (success shape + error shape)
INVARIANTS: what must always be true (e.g., "balance never negative")
SIDE EFFECTS: what changes in the world (DB writes, emails, payments)
DEPENDENCIES: what must already exist for this to work
NON-GOALS: what this feature is explicitly NOT doing
HAPPY PATH: 5-line user story
ERROR PATHS: top 3 failure modes + recovery

If any of those is unclear, ask me ONE clarifying question per category.
Do not write code. Do not propose architecture. Just the spec.
PROMPT 07

The Context Compressor

Pre-compaction handoff. Saves you from losing state when the context window runs out.
Long sessions
This conversation is getting long. Before context compresses, write a handoff
to a future version of yourself who will continue this work.

Write a file called .continue-here.md with EXACTLY this structure:

  ## Goal
  <original goal in one sentence>

  ## State
  <what's been done; cite files + commits if any>

  ## Decisions Made (with WHY — prevents re-debating)
  - <decision>: <why> → reverse if <condition>

  ## Open Questions
  <ambiguities still pending>

  ## Next Action
  <specific enough that a fresh session can execute without me explaining>

Be specific. Future-you has zero context. Don't compress your own summary —
re-derive from the source files if you can.
PROMPT 08

The Honest Estimator

Calibrated effort estimates. No optimism, no padding — just signal.
Planning
Estimate the effort for <TASK>.

Output:
  T-shirt size: S | M | L | XL
  Most likely (50%): <hours>
  Pessimistic (90%): <hours — assumes 2 things go wrong>
  Critical path: <the one step that could blow the schedule>

Calibration rules:
- If similar work has been done before, cite it.
- If you've never done this before, double your estimate.
- "Just" is a banned word. If you typed it, the estimate is wrong.
- Include test writing in the estimate. Not separately. Always together.
- If the answer is "depends," list the 2 things it depends on.

Anti-patterns to avoid:
- Optimism bias (assuming the happy path)
- Anchoring on what the user wants to hear
- Forgetting integration, deploy, and verification time
— Six Patterns

The agentic patterns that actually work.

Most agent demos fall apart in production. These six patterns are what's left after the hype dies down.

PATTERN 01

Orchestrator + Workers

One smart model coordinates many cheap ones. The orchestrator never does the work — it dispatches, aggregates, and resolves conflicts.

PATTERN 02

Plan → Review → Execute

Three turns minimum for any non-trivial task. Plan in turn 1, adversarial review in turn 2, only then execute. Prevents 80% of preventable mistakes.

PATTERN 03

Memory as a File

Don't fight the context window. Persist what matters to disk in a format the next model can read. Markdown beats JSON for most cases.

PATTERN 04

Tools, Not Magic

Give the model real tools (read_file, run_tests, web_fetch). Strict schemas. No "let me imagine what happens." Verification is cheaper than apology.

PATTERN 05

Skills & Slash Commands

Encode repeatable workflows as named skills the model can invoke. Cheaper than fine-tuning, more reliable than prose instructions.

PATTERN 06

The Tribunal

Before merging anything important, three independent reviewers (often three sub-agents with different roles). They almost never agree on everything — that's the value.

— Open Source

Ten repos worth forking today.

These are what we've actually adopted at Trinity. Battle-tested in production, MIT or permissively licensed, and small enough to read end-to-end.

anthropics / claude-agent-sdk ★ official
Anthropic's first-party SDK for building agents with sub-agents, tools, and persistent state. The reference implementation.
Adopt for: any production agent system
modelcontextprotocol / servers ★ official
The reference MCP server collection — filesystem, GitHub, Postgres, Slack, and dozens more. Drop-in tool integration.
Adopt for: connecting Claude to real services
anthropics / anthropic-cookbook ★ recipes
Working code recipes for prompt caching, extended thinking, tool use, batch API, citations, and the patterns we use most.
Adopt for: every advanced API feature
microsoft / playwright ★ 70k
Browser automation that actually works. We use it for E2E tests, scraping, screenshot generation, and giving agents a real browser to drive.
Adopt for: any browser-based agent task
exa-labs / exa-mcp-server ★ search
High-quality web search via MCP. Replaces "let me Google that" with structured, ranked results an agent can actually reason over.
Adopt for: agents that need fresh web context
openspec / OpenSpec ★ specs
Structured specification format for AI features. Forces you to write down inputs, invariants, and acceptance criteria before building.
Adopt for: taming feature scope creep
BerriAI / litellm ★ 14k
Unified gateway for 100+ LLMs with one OpenAI-compatible API. Cost tracking, fallbacks, rate limiting — all the boring stuff done right.
Adopt for: multi-provider production setups
run-llama / llama_index ★ 38k
The serious framework for connecting LLMs to your data. Vector stores, query engines, evaluators — production-grade RAG infrastructure.
Adopt for: RAG over private knowledge bases
jina-ai / reader ★ docs
Convert any URL to clean LLM-ready markdown. The simplest way to give an agent the contents of a webpage without HTML noise.
Adopt for: web → context pipelines
All-Hands-AI / OpenHands ★ 38k
Open-source autonomous coding agent. Worth reading just to see how a serious agent handles tool use, error recovery, and bounded autonomy.
Adopt for: studying production agent design

Already a member? This is yours.

Every Trinity course member gets this update for free, automatically. Bookmark this page — we'll keep adding to it as the field moves.