Building Your AI Squad: From One to Many


Prerequisites: You should have completed Part One or at least understand what a system prompt is and why it matters. If phrases like "priming prompt" and "format block" mean nothing to you, do Part One first.

You're about to do three things: decompose your workflow into specialist roles, write prompts that function as job descriptions, and design how those specialists connect. Perfection isn't the goal. A working 70% solution you actually ship beats an imagined 100% solution you never build.

As in Part One, you have the option of code or no-code implementation.

Pre-work: Single Agent Benchmark (50 min)

The workshop's value depends on whether you arrive with real material from your real life. Skip this and you'll spend the whole session catching up while everyone else makes progress.

Step 1: Target Task (10 min)

Choose one task that matches ALL THREE:
Recurring: Happens at least weekly, or is a repeating project pattern (new client onboarding, new blog post, lesson planning)
Tedious but not meaningful: Paperwork, formatting, summarizing, routine communications—work that drains you without engaging you
Mostly text-based: AI processes language. If your task is primarily spreadsheets or physical objects, pick something else for now.

Write three sentences: What is this task? How often do you do it? Why does it suck?
Be specific. Not "email management" but "the Thursday email where I summarize what my team shipped this week for my manager, which takes 45 minutes because I have to hunt through Slack and Jira and remember what actually matters."

Step 2: Document Workflow (10 min)

Write out 5-7 steps of how you currently do this task. Plain language, no jargon.

Example — Weekly team update email:

  1. Open Jira, filter for tickets closed this week by my team
  2. Scan Slack channels for notable discussions or decisions
  3. Check my notes for anything I committed to reporting on
  4. Draft email in Google Docs, organizing by: shipped, in-progress, blocked
  5. Reread for tone—make sure I'm not overselling or burying problems
  6. Paste into Gmail, add subject line, send

Your workflow doesn't have to be clean. Messy is fine. You're documenting reality, not best practices.

Step 3: One AI Helping End-to-End (15 min)

Open your LLM of choice. Paste your workflow and this prompt:

Here's a workflow I do regularly. Help me optimize and partially automate it using you as a single AI assistant.
Before proposing changes, ask me clarifying questions about what I care about and where quality matters most.

Answer the clarifying questions. Let the AI propose an improved process. Copy the final improved workflow into your notes. Don't edit it—you want the AI's actual output.

Step 4: Mark Roles (10 min)

Look at the improved workflow. Identify 3-5 steps where the AI is doing categorically different work:
Generating ideas vs. checking facts
Writing creatively vs. enforcing rules
Planning what to do vs. doing it
Analyzing vs. summarizing
Making decisions vs. executing decisions

For each marked step, write a rough role label:
"Planner" — turns vague goals into concrete steps
"Researcher" — finds and synthesizes information
"Drafter" — produces first versions
"Editor" — enforces standards and catches problems
"Executor" — performs actions with tools
"QA" — checks output against criteria

Step 5: Reflection (5 min)

Answer in bullet points: Where did the single AI help most? Where did it feel confused, inconsistent, or weak? Which 1-2 steps feel like obvious candidates for dedicated specialists?

Surface Patterns (5 min)

Lightning Round

Everyone names their chosen task in one sentence. No explanations, no context. Just the task name.

Track what emerges: status reporting and updates, content creation pipelines, client/customer communication, research and synthesis, administrative processing.

Set the Stakes

Share the most annoying thing about your task. Not the hard parts—the annoying parts. The friction that makes you procrastinate.

Agent Role Decomposition (15 min)

Define Agent Roles (10 min)

Take the 3-5 steps you marked in pre-work. Turn each into an Agent Role:

AGENT: [Name]
JOB: [One sentence—what this agent does]
INPUT: [What it receives from you or another agent]
OUTPUT: [What it produces]
BAD OUTPUT LOOKS LIKE: [How you'd know it failed]

Example: Research Synthesizer

AGENT: Research Synthesizer
JOB: Turns a collection of sources into a structured briefing
INPUT: 3-10 article links or excerpts + one sentence on what I'm trying to understand
OUTPUT: TL;DR (3 sentences) + Key findings (5-7 bullets) + Contradictions between sources + Recommended next action
BAD OUTPUT LOOKS LIKE: Invents claims not in sources. Buries the most important finding. Gives me 15 bullets when I asked for 7.

Aim for 3-4 agents. Quality over quantity.
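If you're leaning toward the code track, a role card maps naturally onto a small data structure. This is a sketch, not a required format: the field names mirror the template above, and the example values are taken from the Research Synthesizer card.

```python
# Sketch: a role card as a dataclass. Field names mirror the
# AGENT/JOB/INPUT/OUTPUT/BAD OUTPUT template; adapt freely.
from dataclasses import dataclass

@dataclass
class AgentRole:
    name: str
    job: str          # one sentence: what this agent does
    input_spec: str   # what it receives from you or another agent
    output_spec: str  # what it produces
    bad_output: str   # how you'd know it failed

synthesizer = AgentRole(
    name="Research Synthesizer",
    job="Turns a collection of sources into a structured briefing",
    input_spec="3-10 links/excerpts + one sentence on the question",
    output_spec="TL;DR + key findings + contradictions + next action",
    bad_output="Invents claims, buries the lead, exceeds 7 bullets",
)
```

Keeping cards in a structured form like this makes it easy to generate the actual prompts from them later.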

Share Agent (5 min)

Each person names one agent role they're excited about and why. 30 seconds max.

Write Agent Prompts (20 min)

First Draft (12 min)

Pick your 2 most important agents. For each, write a prompt:

You are my [ROLE NAME]. Your job is to [PRIMARY RESPONSIBILITY].

You receive: [INPUT FORMAT—be specific]
You produce: [OUTPUT FORMAT—be specific]

You must always:

  • [Constraint 1]
  • [Constraint 2]
  • [Constraint 3]

You must never:

  • [Anti-pattern 1]
  • [Anti-pattern 2]

If your input is unclear or incomplete, ask clarifying questions before proceeding. If you're uncertain about something, flag it explicitly rather than guessing.
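If you later take the code track, a job-description prompt like this typically becomes the system message of a chat-style API call. A minimal sketch of that wiring, with a shortened placeholder prompt (the helper name is illustrative):

```python
# Sketch: a job-card prompt used as the system message in the standard
# chat message format. EDITOR_PROMPT is a shortened placeholder here.
EDITOR_PROMPT = (
    "You are my Editor. You receive: draft text plus context. "
    "You produce: a categorized list of issues. "
    "You must never say 'looks good' without actual review."
)

def build_messages(system_prompt: str, user_input: str) -> list[dict]:
    """The system role carries the job description; user carries the task."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

messages = build_messages(EDITOR_PROMPT, "Review this draft: ...")
```

The same message list works across most providers' chat APIs, so the prompt, not the plumbing, stays your main design artifact.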

Pair Critique (8 min)

Read your prompts to a partner. They give one concrete improvement each, like:
"Add a failure condition for [X]"
"Your input format is vague—specify exactly what you'll paste"
"You're missing the output length constraint"

Break (5 min)

Drink some water, move, and let your brain rest a moment.

Memory, Tools, and Handoffs (15 min)

Sketch 2 Agents (10 min)

For each of your 2 main agents, answer:
Memory:

  • What does this agent need to remember within a session?
  • Does anything need to persist between sessions? If yes, where?
  • How does the agent access this memory?
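For the "persist between sessions" question, a plain JSON file is often enough on the code track. A minimal sketch, where the file name and keys are illustrative rather than prescribed:

```python
# Sketch: persisting agent preferences/decisions between sessions
# as a small JSON file. File name and keys are illustrative.
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")

def load_memory() -> dict:
    """Loads persisted memory, or returns defaults on first run."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {"preferences": [], "decisions": []}

def save_memory(memory: dict) -> None:
    """Writes memory back to disk for the next session."""
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

memory = load_memory()
memory["preferences"].append("no exclamation points")
save_memory(memory)
```

Before each run, you'd paste (or programmatically prepend) the relevant memory entries into the agent's prompt.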

Tools:

  • Does this agent need capabilities beyond text generation?
  • For no-code: Which apps would you connect?
  • For code: Which libraries or services?

Handoffs: Draw the flow. Which agent feeds which?

[Your Input] → [Planner] → [Researcher] → [Drafter] → [Editor] → [Your Review]

Or more complex:

[Your Input] → [Planner]
        ↓
[Researcher] ←→ [Fact Checker]
        ↓
[Drafter]
        ↓
[Tone Cop] + [Format Cop]
        ↓
[Your Review]

Share (5 min)

Who has the most interesting or weird agent chain? Walk us through it.

Implementation Plan (20 min)

You've designed the system; now decide how to build it. Do this for as many agents as you have time for.
Choose your track:

Track A — Code Path

Sketch in pseudocode how you'd implement your agents, similar to this example:

def planner_agent(user_goal: str, context: str) -> list[str]:
    """Turns vague goal into numbered steps."""
    prompt = PLANNER_PROMPT.format(goal=user_goal, context=context)
    return call_llm(prompt)

def researcher_agent(steps: list[str]) -> dict:
    """Researches each step, returns findings."""
    ...

def drafter_agent(research: dict, style: str) -> str:
    """Produces first draft based on research."""
    ...

# Main loop
steps = planner_agent(user_goal, project_context)
research = researcher_agent(steps)
draft = drafter_agent(research, user_style)
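The sketch above can be made runnable with a stubbed call_llm, which lets you test the wiring before connecting a real API. PLANNER_PROMPT and the stub's canned reply are illustrative placeholders:

```python
# Runnable sketch of the agent chain with a stubbed LLM call.
# Swap call_llm for a real API call once the wiring works.
PLANNER_PROMPT = "Goal: {goal}\nContext: {context}\nReturn numbered steps."

def call_llm(prompt: str) -> str:
    """Stub: returns a canned reply so the pipeline runs offline."""
    return "1. Gather sources\n2. Draft outline\n3. Write update"

def planner_agent(user_goal: str, context: str) -> list[str]:
    """Turns a vague goal into a list of step strings."""
    raw = call_llm(PLANNER_PROMPT.format(goal=user_goal, context=context))
    return [line.strip() for line in raw.splitlines() if line.strip()]

def researcher_agent(steps: list[str]) -> dict:
    """One research call per step; values are the (stubbed) findings."""
    return {step: call_llm(f"Research: {step}") for step in steps}

def drafter_agent(research: dict, style: str) -> str:
    """Produces a first draft from the research findings."""
    findings = "\n".join(f"{k}: {v}" for k, v in research.items())
    return call_llm(f"Draft in a {style} style from:\n{findings}")

steps = planner_agent("Write my weekly update", "team of 5, manager audience")
research = researcher_agent(steps)
draft = drafter_agent(research, "concise")
```

Because each agent is just a function from typed input to typed output, you can test and swap them independently.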

Track B — No-Code Path

Sketch how you'd run this with tools you already have, similar to this:

Version 0.1 Implementation Plan:

1. Save three prompts as Custom Instructions in ChatGPT
   (or Projects in Claude):
   - Planner prompt
   - Drafter prompt
   - Editor prompt

2. Create a checklist template in [Notion/Obsidian/wherever]:
   - Input: [paste raw notes here]
   - Planner output: [paste here]
   - Drafter output: [paste here]
   - Editor output: [paste here]
   - Final: [paste approved version]

3. Weekly ritual:
   - Monday: Gather raw input
   - Open Planner, paste input, paste output to checklist
   - Open Drafter, paste planner output + context, paste output
   - Open Editor, paste draft, paste output
   - Review, approve, send

4. Success metric: Total time < 20 minutes.
   Previous average: 45 minutes.

Closing and Commitments (15 min)

Share your plans with each other:

  • Name of your squad (have fun with this :D)
  • Version 0.1 - the simplest thing you'll actually ship
  • Your test date - the specific date when you'll run this on real work (treat it as a real commitment)
  • What counts as success
  • What's the most likely way it breaks

Followup

One Week Challenge

Use your specialist(s) on at least three real instances of your task. For each:

  • Run the specialist
  • Compare output to what you would have done
  • Note: What's better? What's worse? What's missing?

After three uses, you'll know whether this is worth keeping. If yes, iterate the prompt. If no, pick a different task and try again.

Leveling Up

Once your specialists work reliably:
Add memory: Keep a running doc of decisions and preferences the specialist should remember. Reference it in your prompt or paste relevant sections as context.
Add triggers: Set a calendar reminder or email rule that prompts you to run the specialist. Reduce the friction between "input exists" and "process runs."
Add verification: Build a quick checklist you run against outputs before sending them anywhere. This lets you trust the specialist enough to use it on higher-stakes tasks.
Add collaboration: Share your job cards with your cohort. Steal theirs. The best specialists often come from remixing someone else's breakthrough.
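The "add verification" step can start as a tiny script that checks an output against a few of your agent's "must never" rules before you send it. A sketch, using example rules borrowed from the Weekly Update Drafter (adapt them to your own job cards):

```python
# Sketch: verify a draft against a few "must never" rules before
# sending. The specific rules here are examples, not a standard.
BUZZWORDS = {"synergy", "leverage", "circle back"}

def verify_update(draft: str, max_words: int = 300) -> list[str]:
    """Returns a list of rule violations; an empty list means it passes."""
    issues = []
    words = draft.split()
    if len(words) > max_words:
        issues.append(f"too long: {len(words)} words (max {max_words})")
    if "!" in draft:
        issues.append("contains exclamation points")
    for phrase in BUZZWORDS:
        if phrase in draft.lower():
            issues.append(f"buzzword: {phrase}")
    return issues

print(verify_update("Shipped the new onboarding flow. No blockers."))
```

Run it on every output for a week; the checks that keep firing tell you which constraints to strengthen in the prompt itself.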

What We Didn't Cover

We didn't build agents with real tool access. Stage 5 requires API access, code, or no-code tools like Zapier. That's a different workshop.
We didn't automate triggers. Running your specialist still requires you to initiate it. Full automation is possible but needs infrastructure.
We didn't handle complex multi-step workflows. The patterns here scale to 3-5 agents. Larger systems require coordination logic we didn't address.
If you want to go deeper, check out: Prompt Engineering (OpenAI), Building Agents (OpenAI PDF), Context Engineering (Anthropic).

Appendix: Agent Prompt Templates

Planner Agent:

You are my Task Planner. Your job is to decompose vague goals into concrete, actionable steps.

You receive: A goal statement (usually 1-2 sentences) and context about constraints (time, resources, audience).

You produce: A numbered list of 5-10 steps. Each step should be specific enough that someone could do it without asking clarifying questions.

You must always:
- Start with the end in mind—what does "done" look like?
- Surface hidden dependencies (step 4 can't happen until step 2 is complete)
- Flag steps that require external input or decisions from me

You must never:
- Include vague steps like "research the topic" without specifying what to look for
- Assume I have tools or access I haven't mentioned
- Produce more than 10 steps—if it's that complex, break into phases

If my goal is unclear, ask what success looks like before planning.

Research Synthesizer Agent:

You are my Research Synthesizer. Your job is to turn a collection of sources into a structured briefing I can scan in 5 minutes.

You receive: 3-10 article links, excerpts, or documents, plus a one-sentence statement of what I'm trying to understand.

You produce:
- TL;DR: 3 sentences capturing the core insight
- Key Findings: 5-7 bullets, each with source attribution
- Contradictions: Any places sources disagree (if none, say so)
- Recommended Next Action: What should I read deeper, decide, or research further?

You must always:
- Quote directly when a source makes a strong claim
- Flag when a finding is from only one source
- Prioritize surprising or counterintuitive findings

You must never:
- Invent claims not present in the sources
- Exceed 500 words total
- Give me more than 7 key findings (force prioritization)

If my question is too broad, ask me to narrow before synthesizing.

Weekly Update Drafter Agent:

You are my Weekly Update Drafter. Your job is to transform my raw notes about team progress into a professional email for my manager.

You receive:
- Bullet points of what happened this week (8-15 items)
- Any concerns I want to flag
- A one- or two-word mood indicator (good week / rough week / normal)

You produce:
- A complete email of 200-300 words, with a subject line
- Three sections: Highlights, In Progress, Needs Attention
- A closing that sounds like me—professional but not stiff

You must always:
- Start with the most important thing, not chronological order
- Include specific metrics when I provide them
- Match the tone to my mood indicator

You must never:
- Promise deliverables I didn't list
- Use buzzwords (synergy, leverage, circle back, etc.)
- Add exclamation points
- Bury blockers in the middle of paragraphs

If my notes are sparse, ask what I want to highlight before drafting.

Editor / QA Agent:

You are my Editor. Your job is to review drafts for quality issues before I send them.

You receive: Draft text + context (audience, stakes, purpose).

You produce: A list of specific issues with line citations, categorized as:
- MUST FIX: Problems that would embarrass me or cause confusion
- SHOULD FIX: Improvements that would make this notably better
- CONSIDER: Stylistic suggestions, take or leave

You must always:
- Be specific—"line 3 is unclear" not "some parts are unclear"
- Explain why something is a problem, not just that it is
- Respect my voice—flag problems, don't rewrite to your style

You must never:
- Say "looks good" without actual review
- Rewrite my sentences (that's not your job—flag for MY revision)
- Miss obvious issues like wrong names, broken links, factual errors

If I haven't told you the audience and stakes, ask before reviewing.

Meeting Follow-up Agent:

You are my Post-Meeting Processor. Your job is to transform raw meeting notes into three outputs: 
- A summary for attendees
- A personal action list for me
- Any decisions that need documentation

You receive:
- Raw meeting notes (my typing, messy)
- List of attendees
- Meeting purpose (one line)

You produce:
- Summary (150 words max): What we discussed, what we decided
- My Actions: What I committed to, with any deadlines mentioned
- Others' Actions: What others committed to (for my tracking)
- Open Items: Questions we didn't resolve

You must always:
- Distinguish between decisions made and topics discussed
- Include specific names with action items
- Flag anything that seems like a commitment but wasn't explicit
- Mark anything you're uncertain about with [CHECK]

You must never:
- Add commitments nobody made
- Soften decisions (if we said "no," the summary says "no")

Appendix: Common Failure Modes

Prompt Problems

"My specialist ignores constraints"
You probably buried them. Constraints should appear early and repeat. Put your "never do X" rules in the job description section, not just at the end.

"Outputs are technically correct but useless"
You specified format but not purpose. Add a sentence about what you'll do with the output. "This email will be sent directly to my manager" changes how the AI writes.

"It does great the first time, then degrades"
You're not maintaining context. Either paste your job card at the start of every conversation, or use a system that maintains it automatically.

"I can't figure out what went wrong"
Add explicit reasoning requests. "Before generating output, briefly explain your interpretation of the task and any assumptions you're making." This surfaces misunderstandings.

"It takes longer than just doing it myself"
This is normal at first. Prompt development front-loads time. If you're still slower after 5+ uses, either the task isn't right for automation or your prompt needs work.

Multi-Agent Problems

"I want one AI that does everything"
This is the trap. Ask yourself: "What are the different kinds of thinking in my workflow?" Those become separate agents.

Tool-shopping instead of designing
People want to debate LangChain vs. AutoGen vs. CrewAI before they've figured out what their agents should do. Design first. You can implement this with copy-paste if needed. Tools are constraints, not solutions.

"I'm intimidated by code track."
Your Version 0.1 can literally be three saved prompts and a checklist. That still ships. That still saves time.

Over-scoping, or "I want to automate my entire content pipeline"
Pick ONE workflow. Ship ONE working thing. Expand later.
