Prompts are Important but Orchestration is King - Why My DND Agent Failed
I didn't disappear, I just fell down a rabbit hole.

It started in Madrid last year at the Google Cloud Summit. I was watching a demo of a "clerk" agent that to my eye seemed like magic. It wasn’t just chatting. It was using Gemini’s multi-modal eyes to look up inventory, listening to voice commands, and connecting to a backend to place stock orders.
Couldn't find the actual demo I watched, but this one is similar and demonstrates the giant impact agents will have on digital ecommerce.
What I saw opened up a world of possibilities. It's when I realized that the 'Chatbot' era was over. We aren't just building answering machines anymore. We are building systems that can take action.
Determined to learn the stack, I went deep into the wires. I spent weekends obsessed with the Model Context Protocol (MCP) just to understand the architecture of how agents talk to tools.

If you missed it, you can read my technical deep dive on Building a Ghost CMS MCP Server here.
The D&D Experiment
Armed with that knowledge, I tried to play God.
I decided to try and build my childhood dream: a virtual Dungeons & Dragons Game Master (I know, I'm a nerd). I wanted an AI that could tell infinite stories, control the monsters, manage the game economy, and track every single item in a player’s inventory.
In hindsight, I was wildly optimistic. I wanted it to be perfect, but I hit three walls that killed the project:
- Context Rot: I was trying to force a single "Brain" to do everything. I wanted one massive prompt to be a storyteller, a rule lawyer, and a combat expert all at once. The result? The more I played, the dumber it got. It would write beautiful prose about a dragon but forget the steps to initiate combat. As conversations grew longer, performance degraded.
- Economics: Using massive, general-purpose models for every trivial interaction (like rolling a d20) is like grabbing money and lighting it on fire. I didn't see a path to a sustainable business model using expensive reasoning models for simple tasks, and I didn't know enough about the open-source field to optimize costs.
- Inexperience: The project was too ambitious for a beginner. When I saw the incredible work a dedicated team like @FriendsandFables was already doing, I realized I was outgunned.
Eventually, I put the project on pause. But it wasn't in vain. I learned the most important lesson of 2025: You cannot prompt-engineer your way out of a bad architecture.
The Shift: From Prompts to Org Charts
That failure is why I’m writing to you today. Agents may have been 2025's concept, but 2026 is when they will start shining.
The landscape has shifted. We are finally seeing models capable of handling this complexity without choking. Gemini 3 has set a new standard for cloud reasoning, while local models like Gemma 3, Llama 4, and Phi-4 are finally delivering reliable tool execution on consumer hardware. It isn't just raw intelligence anymore; it’s speed, efficiency, and context window size.
We are also seeing incredible research (like this recent paper) discussing how to structure these systems efficiently. The industry is moving away from "One Giant Prompt" to "Many Specialized Agents" working together to achieve a task.
Which brings us to Google's Agent Development Kit (ADK).
Stop Building Gods. Start Hiring Interns.
Here is the hard truth I had to swallow: My D&D agent failed because I was trying to build a God. I wanted one entity to know everything, see everything, and do everything.
The ADK framework forces you to abandon that ego.
Think of ADK not as a library, but as an Org Chart generator. It doesn't want you to write a better prompt. It wants you to hire a staff of specialized workers.
In my failed experiment, I had one model context window filled with 20 pages of rules, tools, backstory, and conversation context. When I asked it to roll a die, it had to read War and Peace just to understand the logic behind using the roll_dice Python function.
ADK solves part of this with Specialized Sub-Agents.
Instead of one agent, ADK encourages you to build a team. You can create a storyteller agent, a combat agent, and an inventory agent. Then, you hire a manager—an Orchestrator agent—whose only job is to route traffic.
Here is what that looks like in practice:
1. The Orchestrator (The Traffic Cop)
This is your entry point. It’s usually a fast, smart model (like Gemini Flash) that simply listens to the user and decides who needs to handle the request. The user says "I enter the dungeon," and it routes the request to the narrative agent.
2. The Specialists (The Workers)
This is where the magic happens.
- The Narrative Agent: In charge of actually telling the story. Prompted to act as a movie writer, giving context and sensory detail, and keeping the narrative moving forward.
- The Combat Agent: This agent knows the D&D 5e rulebook by heart. It runs on a highly logical model. It doesn't write poetry. It calculates damage.
3. The Orchestration Layer
Think of this as the company Slack. It’s the invisible glue: it holds the shared state (memory) so that when the Combat Agent beheads a goblin, the Narrator Agent knows to describe the blood on the floor. The way this logic works can be customized, but ADK ships with some easy-to-configure defaults.
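To make the handoff concrete, here is a plain-Python sketch of that shared-state idea, with no ADK and no LLM calls: each worker writes its result under an output key, and later workers read the earlier keys. The names here are mine for illustration; in real ADK this is handled by session state and each agent's output_key.

```python
# Plain-Python sketch of shared session state between agents.
# Illustrative only: these names are hypothetical, not ADK's API.
session_state: dict[str, str] = {}

def run_agent(output_key: str, produce, state: dict) -> None:
    """Run a fake 'agent' and record its output under output_key."""
    state[output_key] = produce(state)

# The Combat Agent resolves the fight and writes its result...
run_agent("combat_result", lambda s: "the goblin is beheaded", session_state)

# ...and the Narrator Agent reads that result to describe the scene.
run_agent(
    "narration",
    lambda s: f"Seeing that {s['combat_result']}, the narrator describes "
              "the blood on the floor.",
    session_state,
)
```

The point is the decoupling: the narrator never talks to the combat agent directly; it only reads the shared state the orchestration layer maintains.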
The Org Chart through Code
The best part about ADK is that the code looks like boring, standard Python. It's quick to set up: you just define your agents like you define variables.
orchestrator_agent = Agent(
    name="coordinator_agent",
    model="gemini-3-flash-preview",
    description="Routes user input to Narrator or Combat agents.",
    instruction=f"""
    You are the Dungeon Master Coordinator.
    Your job is to act as the interface between the user and the sub-agents.

    GLOBAL CONTEXT:
    {context_str}

    Analyze the user's input:
    1. If the user is exploring, talking to NPCs, or asking about lore, delegate to the narrator agent tool.
    2. If the user triggers a fight, attacks someone, or is in combat, delegate to the combat agent tool.

    Always delegate to the most appropriate agent.
    """,
    tools=[
        AgentTool(narrator),
        AgentTool(combat),
    ],
)
For example, in the code above, we can see how easy it is to set up the orchestrator. We fill out a class with a name, a model name, a description, and some instructions, and we are ready to go. See at the end where we define a list of tools? That's where we can list custom functions, MCP servers, or, in this case, a set of sub-agents.
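The later snippets call a roll_dice tool that is never defined here. In ADK, a custom function tool can be an ordinary Python function: the type hints and docstring become the schema the model sees when deciding whether to call it. A minimal sketch (the exact signature is my assumption, not from the original project):

```python
import random

def roll_dice(sides: int = 20, count: int = 1) -> int:
    """Roll `count` dice with `sides` sides each and return the total.

    When placed in an agent's `tools` list, ADK exposes this plain
    function to the model as a callable tool.
    """
    return sum(random.randint(1, sides) for _ in range(count))
```

A call like roll_dice(20) would cover the d20 "to hit" roll, and roll_dice(6) the 1d6 damage roll used in the sequential example below.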
Advanced Management: Beyond the Basic Boss
The example above is a simple "Router" pattern—the boss points, the worker works. But real companies (and real D&D games) are more complex. ADK introduces three specific agent types that mirror real-world workflows:
1. The Assembly Line (Sequential Agents)
Sometimes, task B cannot start until task A is finished. ADK uses SequentialAgent to enforce this strict order. It passes the output of one agent as the input to the next.
roll_to_hit = Agent(
    name="roll_to_hit",
    model="gemini-3-flash-preview",
    description="Rolls to hit against the player.",
    instruction=f"""
    You are rolling an attack for an enemy.
    Use the roll_dice tool to roll 1d20.
    Write the result to your output. Just state the number rolled.
    CONTEXT: {context_str}
    """,
    tools=[roll_dice],
    output_key="attack_roll",
)

calculate_damage = Agent(
    name="calculate_damage",
    model="gemini-3-flash-preview",
    description="Calculates damage if the attack hits.",
    instruction=f"""
    The attack roll was: {{attack_roll}}.
    If the roll is 10 or higher, the attack HITS. Roll 1d6 for damage using the roll_dice tool.
    If the roll is below 10, the attack MISSES. Output "0" for damage.
    Output only the damage number.
    CONTEXT: {context_str}
    """,
    tools=[roll_dice],
    output_key="damage_dealt",
)

describe_attack = Agent(
    name="describe_attack",
    model="gemini-3-flash-preview",
    description="Describes the attack cinematically.",
    instruction=f"""
    The attack roll was: {{attack_roll}}.
    The damage dealt was: {{damage_dealt}}.
    Describe what happened in vivid, cinematic detail.
    If damage is 0, describe a dramatic miss.
    If damage is high, describe a brutal hit.
    CONTEXT: {context_str}
    """,
)

enemy_turn = SequentialAgent(
    name="enemy_turn_sequence",
    description="Executes an enemy's attack turn in sequence.",
    sub_agents=[roll_to_hit, calculate_damage, describe_attack],
)

In a D&D game, this might happen when your agent is managing an enemy turn during combat. You want to ensure that the agent first rolls a d20 to see if it hits, then calculates damage, and then describes the outcome.
2. The Refiner (Loop Agents)
ADK also lets you run agents in a loop. This is quite useful if you want your agents to iterate over an output. For example, in this case, we have an agent that writes scenes for the game, and a critic that reviews them before they are shared with the user.
class QualityChecker(BaseAgent):
    """Exits the loop when quality is good enough."""

    async def _run_async_impl(self, ctx):
        approved = "APPROVED" in ctx.session.state.get("feedback", "")
        yield Event(author=self.name, actions=EventActions(escalate=approved))

draft = LlmAgent(name="writer", instruction="Write a scene.", output_key="draft")
critic = LlmAgent(
    name="critic",
    instruction="Review {draft}. Score it on engagement, narrative, and whether it moves the story forward. Say APPROVED if good.",
    output_key="feedback",
)

refinement_loop = LoopAgent(
    name="story_refinement",
    max_iterations=3,  # Stop after 3 tries even if not approved
    sub_agents=[draft, critic, QualityChecker(name="quality_checker")],
)

3. The Council (Parallel Agents)
Sometimes you need speed, or you need to simulate a messy, real-time conversation. ParallelAgent fires off multiple sub-agents simultaneously and gathers their results.
Imagine a game where we are the DM preparing a scenario for multiple NPCs. There are times when we don't want to process the events one by one in a polite circle; we want the NPCs to react in parallel.
barbarian = Agent(
    name="grog",
    model="gemini-3-flash-preview",
    instruction="You are Grog. You are angry and loud. React to the DM.",
    output_key="grog_response",
)

wizard = Agent(
    name="elara",
    model="gemini-3-flash-preview",
    instruction="You are Elara. You are wise and cautious. React to the DM.",
    output_key="elara_response",
)

rogue = Agent(
    name="vax",
    model="gemini-3-flash-preview",
    instruction="You are Vax. You are suspicious. React to the DM.",
    output_key="vax_response",
)

npc_party = ParallelAgent(
    name="npc_party_reaction",
    description="Gets reactions from all party members simultaneously.",
    sub_agents=[barbarian, wizard, rogue],
)

Hopefully, you have noticed a pattern. The code is quite simple, and it always works by defining a set of sub-agents that form part of a process (sequential, loop, or parallel). And you aren't limited to these three orchestration patterns: ADK ships with them by default, but you can also create your own.
The New Standard
I used to think that "better prompting" was the answer to everything. If my bot hallucinated, I just needed to yell at it in clearer English or expand the prompt a tad more.
But that wasn't true.
The answer isn't more or better words; it's better boundaries. By breaking your application into these specialized, modular agents, you gain three superpowers that no amount of prompt engineering can give you:
- Sanity: Your "Combat Agent" never hallucinates poetry because it doesn't know poetry exists. Because you manage each agent's context, you can keep it small and focused, vastly reducing context rot.
- Savings: You stop paying for a PhD-level model to say "Hello." You hire the cheap, fast intern (Gemini Flash) for the grunt work and save the genius (Gemini Pro) for the hard stuff.
- Scale: Adding a new feature doesn't mean rewriting your massive prompt. It just means hiring a new agent and integrating it into your orchestration.
We aren't building answering machines anymore. We are building organizations. The difference between a toy project and a startup in 2026 won't be which model you use; it will be who you hire to run it.
So, stop trying to build a God. Go check out the ADK docs, open your editor, and hire your first intern.