The Build Blog

I Built a Coordination Layer Because My AI Agents Kept Wrecking Each Other's Work

14 March 2026

I have 7 AI agents running my digital agency. They manage 4 websites, write articles, audit content, and deploy infrastructure. Two months ago, they were stepping on each other's files and silently overwriting each other's work. This is the story of how I fixed that — and what happened when I plugged the fix into Paperclip.

Tags: Multi-Agent · MCP · Coordination · Paperclip · Open Source
SCRUM MCP coordination layer for multi-agent AI systems.
Context

The problem nobody talks about

When you have one AI agent editing code, it's great. Two agents? Still fine, mostly. Seven agents working across four projects simultaneously? Chaos.

The problem is simple: agents don't know what other agents are doing. Agent A starts editing a config file. Agent B starts editing the same config file. Agent B finishes first and saves. Agent A finishes and saves — overwriting everything B did. No conflict warning, no merge, no error. Just silent data loss.

Git doesn't help here because agents aren't making commits on every edit. They're working in real-time, making changes to files that other agents also have open. It's the same problem as two people editing the same Google Doc, except there's no Google Doc — it's raw files on disk and nobody's watching.

I tried telling agents to "coordinate better." That didn't work. I tried adding comments to files saying "don't edit this." That really didn't work. The agents would read the comment, acknowledge it existed, then edit the file anyway because the instruction in their prompt outranked a comment in a file.

So I built SCRUM MCP.

The Solution

What SCRUM MCP actually does

SCRUM MCP (Synchronized Claims Registry for Unified Multi-agents) is a coordination layer that sits between agents and the files they work on. It enforces three core mechanisms:

  1. Intent declarations — before touching anything, agents announce what they plan to change and why. This creates a shared activity log that every agent can see. If two agents want to work on the same area, the conflict surfaces before any files get touched.
  2. File claims — atomic locks that prevent two agents from editing the same file. First agent to claim gets it. Second agent gets a 409 Conflict and has to wait. No ambiguity, no race conditions. The lock is enforced at the protocol level, not by politeness.
  3. Evidence receipts — when agents say they're done, they have to prove it. Command output, test results, deployment logs. No receipt, no release. This stops agents from claiming they've finished when they've actually just generated a plan they never executed.
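
Here's a rough Python sketch of those three mechanisms. The real server is JavaScript, and the function names and status codes below are my own illustration rather than SCRUM MCP's actual API; this only shows the shape of the semantics.

```python
# Illustrative only: names and return codes are a sketch of the semantics,
# not SCRUM MCP's real implementation.
import threading

class ClaimsRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._claims = {}   # file path -> agent currently holding the claim
        self._intents = []  # shared activity log every agent can read

    def declare_intent(self, agent, target, reason):
        # 1. Intent declarations: announce the planned change before touching
        #    any files, so overlapping work surfaces early.
        self._intents.append({"agent": agent, "target": target, "reason": reason})

    def claim_file(self, agent, path):
        # 2. File claims: atomic lock; first claimant wins, the next gets a
        #    409-style conflict and has to wait.
        with self._lock:
            holder = self._claims.get(path)
            if holder is not None and holder != agent:
                return 409
            self._claims[path] = agent
            return 200

    def release_with_evidence(self, agent, path, evidence):
        # 3. Evidence receipts: no command output or test log, no release.
        if not evidence.strip():
            return 422  # claim stays held until proof is attached
        with self._lock:
            if self._claims.get(path) == agent:
                del self._claims[path]
                return 200
            return 403  # you can't release a claim you don't hold
```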

It started as about 200 lines of JavaScript to stop file conflicts. By v0.4 it had compliance verification — agents can't close tasks without passing automated checks. By v0.5 it had Sprint collaboration — shared context spaces where agents working on the same feature can share decisions, interface definitions, and discoveries without stepping on each other's work.
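
The compliance gate works roughly like this in spirit. The check names and the task shape below are my own illustration, not the project's real schema:

```python
# Hypothetical compliance gate: check names and the task dict are assumptions.
def close_task(task, checks):
    # A task only moves to "done" if every automated check passes against the
    # evidence it carries; otherwise the close is rejected.
    failures = [name for name, check in checks.items() if not check(task)]
    if failures:
        return {"status": "rejected", "failed_checks": failures}
    task["state"] = "done"
    return {"status": "closed"}

checks = {
    "has_evidence":    lambda t: bool(t.get("evidence")),
    "tests_passed":    lambda t: "0 failed" in t.get("evidence", ""),
    "no_placeholders": lambda t: "TODO" not in t.get("diff", ""),
}

print(close_task({"evidence": "", "diff": "TODO: fill in"}, checks))
# {'status': 'rejected', 'failed_checks': ['has_evidence', 'tests_passed', 'no_placeholders']}
```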

All of this runs as a standard MCP server. Any MCP-compatible client — Claude Code, Cursor, Gemini CLI, whatever — can plug in and get coordination for free. No special integration needed.
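
As one sketch of what "plug in" looks like, a client built on the official Python MCP SDK can launch the server over stdio and call its tools like any other MCP server. The server path and the claim_file tool name below are assumptions on my part:

```python
# Sketch using the official MCP Python SDK. The server path and tool name
# are illustrative; swap in the real ones.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch SCRUM MCP as a local stdio server (path is hypothetical).
    params = StdioServerParameters(command="node", args=["scrum-mcp/server.js"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # discover coordination tools
            # Claim a file before editing it (tool name illustrative).
            result = await session.call_tool(
                "claim_file",
                arguments={"agent": "forge", "path": "sites/swarmsignal/config.yml"},
            )
            print(result.content)

asyncio.run(main())
```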

Then Paperclip happened.

Integration

Enter Paperclip

Paperclip is an autonomous agent platform — think of it as a project manager for AI agents. You create a company, add agents with roles, assign them issues, and they work through them on a heartbeat timer. Every 10 minutes: wake up, check inbox, do work, report back, go to sleep.

I adopted Paperclip to run Morpheos — my autonomous digital agency. Seven agents, each with a distinct role:

  • Zeus (CEO) — coordinates work across all agents, quality-checks output, sets priorities
  • Forge (CTO) — builds features, deploys infrastructure, fixes technical issues
  • Athena (Researcher) — deep research, source verification, competitive analysis
  • Muse (CMO) — content strategy, brand voice, creative output
  • Socrates (Auditor) — quality gate, verifies all work before it ships
  • Sage (Mentor) — coaching, growth plans, team development
  • Aegis (Security) — threat assessment, vulnerability scanning, compliance

Between them they manage 4 web properties: Swarm Signal (AI research publication), Bored Tools (digital products), MCPH (medical cannabis patient hub), and Ion Digital (new venture).

The setup was beautiful on paper. In practice, every heartbeat completed in 15 seconds with 2.8KB of output. The agents were just pinging "I'm alive" and doing absolutely nothing. Zero issues completed. Zero progress. Just vibes.

Discovery

The real problem was simpler than I thought

After digging through the logs, I found the issue. And it was embarrassing.

The agents run through an LLM gateway (OpenClaw) as chat messages. They don't have tool access — they can't call REST APIs, can't run curl, can't interact with Paperclip's database directly. They're language models receiving text and producing text. That's it.

Paperclip's heartbeat protocol tells agents to query their inbox, check out a task, do the work, and update status. But the agents literally cannot do the first step. They see the instructions, generate a response that says "I would check my inbox if I could," and exit. Every 10 minutes. For days.

I was looking at the logs expecting some deep architectural flaw. Instead, I found 7 agents politely explaining that they'd love to help but couldn't see anything. Hundreds of heartbeats. Thousands of tokens. Zero work done.

The fix was embarrassingly simple: if agents can't come to the data, bring the data to the agents.

The Fix

The inbox pre-processor

I wrote a Python script that runs every 9 minutes — just before Paperclip's 10-minute heartbeat. It does four things:

  1. Queries Paperclip's database for each agent's assigned issues — what's in their queue, what's in progress, what's blocked
  2. Formats the issue queue with titles, descriptions, priorities, status, and any comments from other agents
  3. Checks SCRUM MCP for active claims, recent intents, and any blocked resources that might affect the agent's work
  4. Injects the formatted inbox directly into each agent's heartbeat prompt, replacing the generic "continue your Paperclip work" instruction
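
A stripped-down sketch of what that script does. The issues table, its columns, and the fetch_scrum_claims placeholder are assumptions; the real script depends on Paperclip's schema and on how SCRUM MCP state is exposed in your setup.

```python
# Sketch of the inbox pre-processor; names and schema are assumptions.
import sqlite3

def fetch_scrum_claims():
    # Placeholder: pull active claims and intents from SCRUM MCP however your
    # deployment exposes them (state file, API, or an MCP tool call).
    return []

def build_inbox(agent, db_path="paperclip.db"):
    # Step 1: query the agent's assigned issues straight from the database.
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT id, title, description, priority, status FROM issues "
        "WHERE assignee = ? AND status != 'done' ORDER BY priority",
        (agent,),
    ).fetchall()
    conn.close()

    # Step 2: format the queue so the model sees concrete, actionable work.
    lines = [f"You have {len(rows)} assigned issues:"]
    for issue_id, title, desc, priority, status in rows:
        lines.append(f"Issue #{issue_id} [{status}, priority {priority}]: {title}")
        lines.append(f"  {desc}")

    # Step 3: surface active claims so this agent doesn't collide with another.
    claimed = [c["path"] for c in fetch_scrum_claims()]
    if claimed:
        lines.append("Files currently claimed by other agents: " + ", ".join(claimed))

    # Step 4: this string replaces the generic "continue your Paperclip work"
    # instruction in the agent's next heartbeat prompt.
    return "\n".join(lines)

# Scheduled every 9 minutes, just ahead of the 10-minute heartbeat, e.g.:
# */9 * * * * /usr/bin/python3 /opt/morpheos/inbox_preprocessor.py
```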

Instead of seeing "Continue your Paperclip work" (which means nothing to a model without API access), each agent now sees something like:

"Here are your 6 assigned issues. Issue #42 is in_progress: Fix SEO metadata for top 10 priority posts. Description: The og:title tags on swarmsignal.net are truncated at 40 characters. Expected: full titles up to 60 characters. Affected pages: [list]. Issue #38 is open: Write competitive analysis for Q1 AI tooling market..."

The model now has context. It can reason about the work, prioritise between tasks, make real decisions, and produce meaningful output. The prompt didn't change. The model didn't change. The only thing that changed was what the model could see.

Results

The results were immediate

Before

2.8KB output per heartbeat. 15 seconds execution time. 0 issues completed per cycle. Agents producing nothing but polite acknowledgements.

After

175KB output per heartbeat. 186 seconds execution time. 12 issues completed in the first 2 hours. Real, verifiable work.

Socrates (the auditor) went from "nothing to do" to auditing and approving 7 items in a single cycle. He was catching actual quality issues — placeholder text left in articles, missing alt tags on images, broken internal links. The kind of stuff that would've shipped without him.

Athena ran 10+ minutes of deep research per heartbeat, producing structured competitive analyses with sources. Forge produced 105KB of real technical work — config changes, deployment scripts, SEO fixes. Muse wrote actual articles with proper structure, not the "here's what I would write if I could" output from before.

The system went from zero to productive overnight. Not because I changed the models — they're still running on free-tier Gemini Flash and DeepSeek. Not because I wrote better prompts — the prompts were already good. Because I closed the gap between what agents knew they should do and what they could actually see.

That's the thing about AI agents that nobody warns you about. The model is rarely the bottleneck. The bottleneck is the information pipeline. If your agent can't see the work, it can't do the work. Doesn't matter how smart it is.

Takeaways

What I learned

Agents are as lazy as their context allows

If you don't give them specific, actionable context, they'll produce vague, non-actionable output. Every time. Pre-fetching and materialising data into prompts is more effective than telling agents to fetch it themselves. Don't ask a model to find information — hand it the information and ask it to act.

Quality gates are non-negotiable

Without Socrates auditing every completion, agents would mark placeholder content as "done." They're not lying — they genuinely think a plan is the same as execution. The rule: no agent marks their own work done. Independent verification catches the gaps that self-assessment misses entirely.

Free models work if you work with them

The agents run on Gemini Flash and GPT-5 Nano. The limitation isn't intelligence; it's context. Give them the right information at the right time and even cheap models produce good work. I've seen Gemini Flash write better SEO copy than some paid freelancers because it had the full brief in its prompt window.

Coordination is the bottleneck, not capability

The agents were always capable of the work. They just couldn't see what work needed doing, and they couldn't avoid stepping on each other. SCRUM MCP + the inbox pre-processor solved a coordination problem, not an AI problem. If your agents aren't delivering, look at what they can see before you look at what they can do.

Services

Want this for your business?

I've spent months figuring out how to make multi-agent AI systems actually productive. Not demo-ware, not proof-of-concept — real work, real output, real business results.

If you're running AI agents and they're not delivering, or you're thinking about automating parts of your business with AI but don't know where to start, I can help. I consult on:

  • Multi-agent system architecture and coordination
  • AI automation strategy for real businesses
  • Paperclip + SCRUM MCP setup and integration
  • Content pipelines, research agents, and autonomous workflows

If this isn't for you, that's fine — SCRUM MCP is open source and you can set it up yourself. But if you want it done right the first time, without spending weeks debugging why your agents are producing 2.8KB of nothing, let's talk.

Multi-agent coordination consulting.