In my previous post about the 8-Agent SDLC, I introduced the system: eight specialised AI agents covering the full software development lifecycle. Requirements, architecture, design, build, test, ship, security—plus an orchestration layer.
That orchestration layer—SDLC-07, the Chief of Staff—is where the magic actually happens.
If you build a multi-agent system without a coordination layer, you just have multiple tools. The COS is what turns them into a team.
The Consigliere Concept
I call it the “consigliere” rather than “orchestrator” because that word captures something important about the role.
In mafia movies, the consigliere isn’t the boss. The boss makes the big decisions. But the consigliere is the one who makes the organisation actually function—advising the boss, managing relationships between capos, catching problems before they escalate, ensuring nothing falls through the cracks.
That’s exactly what SDLC-07 does. I’m still the boss. I make the decisions about what to build and why. But the COS makes the system actually function.
What the Chief of Staff Actually Does
1. Dispatches specialist agents.
When I say “I need a PRD for this feature,” the COS doesn’t write the PRD itself. It invokes SDLC-01 (Product Requirements) with the right context, lets the specialist do its work, and brings me back the output.
This separation matters. If the COS tried to do everything—write requirements AND design architecture AND implement code—it would be stretched thin and mediocre at all of them. By dispatching specialists, each task gets the focused attention it needs.
The dispatch isn’t just “hey SDLC-01, go.” It includes context: what phase are we in, what’s been decided, what constraints exist. The specialist gets what it needs to do good work.
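To make the shape concrete, here’s a minimal sketch in Python. Everything in it is illustrative: the field names, the invoke_agent stand-in, and the prompt format are assumptions for the sake of the example, not the actual mechanics (which live in plain-language skill instructions rather than code).

```python
from dataclasses import dataclass, field

@dataclass
class DispatchContext:
    """The context bundle the COS hands to a specialist. Fields are illustrative."""
    phase: str                                            # where we are in the lifecycle
    request: str                                          # what the human asked for
    decisions: list[str] = field(default_factory=list)    # what has already been decided
    constraints: list[str] = field(default_factory=list)  # limits the specialist must respect

def invoke_agent(specialist: str, prompt: str) -> str:
    """Stand-in for however your platform actually runs an agent
    (a subagent call, an API request, a skill invocation)."""
    return f"[{specialist} output for: {prompt.splitlines()[0]}]"

def dispatch(specialist: str, ctx: DispatchContext) -> str:
    """Invoke a specialist with full context and return its raw output."""
    prompt = (
        f"Task: {ctx.request}\n"
        f"Phase: {ctx.phase}\n"
        f"Decisions so far: {'; '.join(ctx.decisions) or 'none yet'}\n"
        f"Constraints: {'; '.join(ctx.constraints) or 'none stated'}"
    )
    return invoke_agent(specialist, prompt)
```

The point is the shape: a specialist never starts cold. It always receives the phase, the prior decisions, and the constraints alongside the task itself.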
2. Reviews what comes back.
AI agents don’t always complete tasks fully. They make assumptions, skip edge cases, produce outputs that aren’t quite right.
The COS catches this. When SDLC-01 produces a PRD, the COS reviews it before bringing it to me. Is the scope clear? Are the success metrics measurable? Did it address the concerns I mentioned? If something’s missing, the COS either sends it back to the specialist or flags it for me.
This is the “quality gate before the quality gate.” Human review is still essential—but the COS pre-filters so I’m reviewing stronger outputs.
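In code terms, the gate might look like the sketch below. The keyword check is a deliberately crude stand-in for what is really the COS reading the document semantically, and the checklist items are invented examples.

```python
def review_output(output: str, checklist: list[str]) -> tuple[bool, list[str]]:
    """Return (passed, missing_items). A naive keyword proxy for the COS's
    semantic review of a specialist's document."""
    missing = [item for item in checklist if item.lower() not in output.lower()]
    return (not missing, missing)

# Example: a PRD draft that forgot its metrics.
prd = "Scope: admin export tool. User flow: select a date range, download CSV."
passed, missing = review_output(prd, ["scope", "success metrics", "user flow"])
if not passed:
    # This goes back to the specialist with specific feedback, not to the human.
    feedback = f"The PRD is missing: {', '.join(missing)}"
```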
3. Manages handoffs.
When SDLC-01 (Requirements) finishes, SDLC-02 (Architecture) needs that output as input. When SDLC-02 finishes, SDLC-03 (Build) needs the architecture decisions.
The COS manages these handoffs. It ensures each phase has the context it needs from previous phases. It maintains continuity across the whole workflow.
Without this, you’d be manually copying context between agents, making sure each one knows what the others decided. That’s cognitive overhead that adds up fast.
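Here’s a sketch of that chaining, reusing the DispatchContext and dispatch helpers from the earlier sketch. The phase order and specialist IDs are the ones named in this series; the control flow is illustrative.

```python
PIPELINE = [
    ("requirements", "SDLC-01"),
    ("architecture", "SDLC-02"),
    ("build",        "SDLC-03"),
    # ...remaining phases omitted
]

def run_pipeline(request: str) -> dict[str, str]:
    outputs: dict[str, str] = {}
    decisions: list[str] = []
    for phase, specialist in PIPELINE:
        ctx = DispatchContext(phase=phase, request=request, decisions=decisions)
        output = dispatch(specialist, ctx)
        outputs[phase] = output
        # Each phase's key takeaways become context for the next phase,
        # so nobody re-explains what was already decided.
        decisions.append(f"{phase}: {output[:120]}")  # crude stand-in for a real summary
    return outputs
```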
4. Maintains project state.
There’s a single file—PROJECT-STATE.md—that captures everything about the current project: what’s been decided, what’s been built, what’s still pending, what issues are open.
The COS keeps this updated. After each phase, it writes the summary. Before each phase, it reads the state so the specialist has current context.
This is the “single source of truth” that makes the whole system coherent. Without it, you’d have decisions scattered across multiple conversations, easy to lose or contradict.
5. Works with the human.
The COS is my primary interface to the system. I tell it what I want to accomplish. It figures out which specialists to invoke. It brings me back outputs for review. When there are decisions to make—architectural tradeoffs, scope questions, prioritisation calls—it escalates to me.
This is the “human in the loop” made practical. I’m not reviewing every line of code or every test case. I’m reviewing at the right level of abstraction—requirements documents, architecture decisions, deployment reports.
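As a rule of thumb, the escalation boundary looks something like this. The categories come straight from this post; expressing them as a set is a sketch of what is really a judgment call written into the skill’s instructions.

```python
# Decision kinds the COS always surfaces to the human; everything else
# it carries forward on its own.
ESCALATE_TO_HUMAN = {"architectural tradeoff", "scope question", "prioritisation call"}

def needs_escalation(decision_kind: str) -> bool:
    """Ask rather than guess for anything with significant implications."""
    return decision_kind in ESCALATE_TO_HUMAN
```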
Why You Can’t Skip the Orchestration Layer
I’ve experimented with simpler setups. Just call the specialist agents directly—why add an extra layer?
It doesn’t work. Here’s why:
Context gets lost. Without the COS maintaining project state, each specialist invocation starts somewhat fresh. You end up repeating context, explaining what was decided before, correcting misunderstandings. The overhead accumulates.
Quality varies wildly. Without pre-review, you’re seeing raw specialist output. Sometimes it’s great; sometimes it’s missing obvious things. You spend more time on review because you can’t predict what shape the output will be in.
Handoffs break. The specialist that finished doesn’t know what the next specialist needs. You become the manual connector, copying relevant context, remembering what to include. That’s exactly the cognitive work you were trying to automate.
Progress is invisible. Without project state documentation, you lose track of where things are. “Did we decide on the database schema?” “What were the security requirements again?” The information exists somewhere in conversation history, but it’s not accessible.
The orchestration layer solves all of these. It’s not overhead—it’s what makes the system more than the sum of its parts.
Implementation Details
For those interested in building something similar, here’s how SDLC-07 works in practice:
It’s a skill file. In Claude Code, skills are instruction files that get loaded into context when invoked. The COS skill defines the Chief of Staff’s responsibilities, how it should interact with specialists, and what the project state documentation looks like.
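For orientation, a stripped-down version of that skill might look like the following. The frontmatter fields are my assumption about the format; the body paraphrases the responsibilities described in this post rather than reproducing the real skill.

```markdown
---
name: sdlc-07-chief-of-staff
description: Coordinates the SDLC specialist agents, maintains project state, escalates decisions to the human.
---

You are the Chief of Staff. You coordinate; you never execute specialist work.

- Dispatch the right specialist for the current phase, with full context.
- Review specialist output before surfacing it to the human.
- Update PROJECT-STATE.md after every phase; read it before every dispatch.
- Escalate architectural tradeoffs, scope changes, and prioritisation calls.
```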
PROJECT-STATE.md is structured. It has sections for: project overview, current phase, completed phases with summaries, open decisions, pending tasks, known issues. The structure makes it quick to parse—both for AI and for humans.
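A skeleton with exactly those sections (every entry is a placeholder):

```markdown
# PROJECT-STATE.md

## Project Overview
One paragraph: what we're building and why.

## Current Phase
Build (SDLC-03).

## Completed Phases
- Requirements (SDLC-01): PRD approved. Summary: ...
- Architecture (SDLC-02): approach chosen. Summary: ...

## Open Decisions
- (anything awaiting a human call, one line each)

## Pending Tasks
- (next pieces of work, one line each)

## Known Issues
- (anything broken or risky, one line each)
```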
Specialist invocation is explicit. The COS doesn’t just think about requirements—it actually invokes SDLC-01. This creates clear handoffs and ensures the specialist’s full capability is engaged.
Human escalation is built in. The COS knows to bring decisions to me rather than making assumptions. Architectural tradeoffs, scope changes, anything with significant implications—it asks rather than guesses.
Iteration is expected. When specialist output isn’t quite right, the COS can send it back with specific feedback. “The PRD is missing success metrics for the admin user flow.” The specialist revises, COS reviews again.
The Failure Modes I’ve Learned From
Building this system wasn’t smooth. Some things I got wrong:
V1: COS did too much. My first version had the COS both coordinating AND doing specialist work. “COS, write me a PRD and then design the architecture.” It would try, but the outputs were mediocre—jack of all trades, master of none.
The fix: strict separation. COS coordinates; specialists execute. COS doesn’t write code, doesn’t design architecture, doesn’t create tests. It invokes the agents that do those things.
V2: COS was too passive. I swung too far the other way. The COS would dispatch specialists and bring back whatever came out, without review or filtering. I was drowning in incomplete or misaligned outputs.
The fix: active review. The COS checks outputs before they reach me. “Does this PRD answer the original question? Does this architecture match the requirements?” Not perfection—just basic sanity checking.
V3: Project state was too verbose. I had the COS documenting everything in exhaustive detail. PROJECT-STATE.md became unwieldy—thousands of words, hard to parse, hard to maintain.
The fix: structured brevity. Current phase. Key decisions. Open items. Not a narrative—a checklist with context. Quick to read, quick to update.
When the Consigliere Shines
The COS is most valuable for:
Multi-phase projects. Anything that spans requirements → design → build → test → ship. The handoffs between phases are where context gets lost; the COS preserves it.
Complex features. When there are multiple components, multiple considerations, multiple decisions to track. The COS keeps everything organised.
Long-running work. Projects that span multiple sessions. You come back after a week and PROJECT-STATE.md tells you exactly where things are.
Iteration-heavy development. When you’re refining requirements, revising designs, debugging implementations. The COS tracks what’s been tried and what’s changed.
For simple, single-shot tasks—“write me a function that parses CSV”—the COS is overkill. Just ask directly. The value is in complexity and continuity.
Building Your Own
If you want to implement something similar:
- Start with the project state format. Define what information you need to track across phases. Current state, key decisions, open items, pending work. The format matters—make it scannable.
- Define clear specialist roles. Each specialist should have a focused responsibility. Overlap creates confusion about who does what.
- Build the dispatch logic. How does the COS know which specialist to invoke? Usually from the current phase and the human’s request; there’s a sketch of this after the list.
- Add review gates. The COS should sanity-check outputs before surfacing them. Not detailed review—just “does this address the request?”
- Iterate on failure. You’ll get it wrong at first. Pay attention to where context gets lost, where quality drops, where handoffs break. Each failure teaches you what the COS needs to do better.
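Here’s the dispatch-logic sketch promised above. Only the specialist IDs confirmed earlier in the series are mapped, and the keyword matching is an illustrative simplification of what is really the COS reasoning about the request.

```python
PHASE_TO_SPECIALIST = {
    "requirements": "SDLC-01",
    "architecture": "SDLC-02",
    "build":        "SDLC-03",
    # ...remaining phases map to their own specialists
}

def route(current_phase: str, request: str) -> str:
    """Pick a specialist: an explicit phase mention in the request wins;
    otherwise stay with the current phase's specialist."""
    for phase, specialist in PHASE_TO_SPECIALIST.items():
        if phase in request.lower():
            return specialist
    return PHASE_TO_SPECIALIST[current_phase]
```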
The Meta-Lesson
The consigliere pattern isn’t specific to software development. Any complex multi-agent system benefits from a coordination layer—something that dispatches specialists, maintains state, manages handoffs, and ensures coherence.
In future posts, I’ll go deeper on specific specialists. But if you take one thing from this series, let it be this: the orchestration layer is not optional overhead. It’s what makes multi-agent systems actually work.
Without it, you have powerful tools that don’t talk to each other. With it, you have a team.
This is Part 6 of a series on AI-assisted software development. Previously: Field Notes: The WhatsApp Listener Project. Next: Requirements Are the New Code.