AI development workflow: Turning my codebase into an orchestra
Fair warning: Technical mumbo jumbo ahead
This post gets into the weeds of how I manage multiple AI agents in a production codebase. If terms like MCP, PR, and git worktree make your eyes glaze over, you might want to grab coffee first. Or skip to our less technical posts. No judgment.
Still here? Excellent. Let me show you how I’ve turned software development into an AI orchestra.
The setup: read-only safety net
Here’s the foundation of my workflow:
monorepo-collection/
├── CLAUDE.md (How to work with this codebase)
├── TECHNICAL_OVERVIEW.md (Architecture, patterns, conventions)
├── main/ (READ-ONLY, always on main branch)
├── agent-1-feature/ (git worktree)
├── agent-2-bugfix/ (git worktree)
├── agent-3-refactor/ (git worktree)
└── ...
The main folder is sacred. It’s read-only, always on the main branch, never touched by agents. This is my safety net – a clean reference that agents can read but never corrupt. When an agent needs to understand our codebase, they look here. When they need to edit, they work elsewhere.
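If you want to replicate the layout, here’s a minimal sketch, assuming a fresh clone; the repo URL and the chmod-based write lock are illustrative, and in practice the read-only rule is mostly enforced by convention:

# Clone once into main/ as the untouched reference checkout
git clone git@github.com:example/monorepo.git main
cd main

# Give each agent its own branch in its own worktree, created from main
git worktree add -b agent-1-feature ../agent-1-feature main
git worktree add -b agent-2-bugfix ../agent-2-bugfix main

# Optionally strip write permission from the reference files,
# leaving .git alone so the linked worktrees keep working
find . -path ./.git -prune -o -type f -exec chmod a-w {} +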
The orchestra begins
My day starts by spinning up multiple Claude Code agents. Each one gets its assignment in a different form:
Agent 1: Gets a Linear ticket through my MCP (Model Context Protocol) connection. It reads the ticket, understands requirements, checks related issues.
Agent 2: Gets a brain dump: “Hey, that thing we discussed about optimizing the skill execution pipeline...”
Agent 3: Gets context from a Slack conversation: “Users are reporting slow load times on the dashboard”
Every agent has access to our global context document – a Claude-written markdown file that explains our architecture, where features live, coding standards, common patterns. It’s like giving each agent a company handbook on day one.
The plan before the code
Here’s where it gets interesting. Before any agent writes a single line, they create a plan. (Shift+Tab twice in Claude Code to enter plan mode, for those following along.)
The plan includes:
Understanding of the problem
Proposed solution approach
Files they’ll need to modify
Potential risks or dependencies
Questions if anything’s unclear
I review these plans like a technical lead reviewing design docs. “Agent 2, you’re missing the edge case where users have no skills. Agent 3, that optimization will break our caching layer.”
Parallel development on steroids
Once I approve a plan, the magic happens. Each agent automatically:
Creates a new git worktree:
git worktree add ../agent-{timestamp}-{feature-name}
Starts implementing in their isolated environment
Commits with meaningful messages
Pushes to a feature branch
Creates a PR following our template (the whole loop is sketched in shell form right after this list)
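In shell terms, each agent’s loop looks roughly like this; the branch name, commit message, and GitHub CLI usage are illustrative assumptions rather than the exact commands my agents run:

# From inside the read-only main/ checkout
git worktree add -b fix-dashboard-load ../agent-1712345678-fix-dashboard-load main
cd ../agent-1712345678-fix-dashboard-load

# ...the agent edits files here...

git add -A
git commit -m "Cache dashboard queries to cut load time"
git push -u origin fix-dashboard-load
gh pr create --title "Cache dashboard queries" --body-file .github/pull_request_template.md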
I’m running 5-6 of these in parallel. While Agent 1 is implementing authentication changes, Agent 2 is refactoring our notification system, and Agent 3 is adding new API endpoints.
The review loop of madness
Here’s where it gets meta. Once an agent creates a PR, a GitHub Action spins up another Claude instance to review the code. An AI reviewing an AI’s code. We’re through the looking glass here.
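I won’t paste the full workflow file, but the review job boils down to something like this, assuming the Claude Code CLI and GitHub CLI are installed on the runner; the prompt and the PR_NUMBER variable are illustrative:

# Inside the PR checkout on the CI runner
git fetch origin main
git diff origin/main...HEAD | claude -p "Review this diff for style, performance, security, missing tests, and architectural consistency" > review.md
gh pr comment "$PR_NUMBER" --body-file review.md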
The review catches things like:
Style violations the linters missed
Potential performance issues
Security concerns
Missing test cases
Architectural inconsistencies
Then I review the review AND the code. It’s reviews all the way down. I’m checking:
Did the agent understand the requirements?
Did the reviewer catch all issues?
Are there subtle bugs both AIs missed?
Does this actually solve the user’s problem?
The new bottleneck: human verification
This workflow has completely flipped our constraints. Before, I was limited by how fast I could type. Now I’m limited by:
Testing capacity: I can generate 20 PRs a day, but can I properly test 20 features?
Context switching: Jumping between 6 different features taxes my brain like nothing else
Quality assurance: Ensuring each feature actually works in production, handles edge cases, provides good UX
We’ve gone from “we need more developers” to “we need more testers and reviewers.” It’s a fundamental shift in how software teams need to operate.
Evolving my verification flow
I’m constantly improving my guardrails to reduce time in verification:
Automated smoke tests: Every PR now auto-runs through user journey tests (a sketch follows this list). If users can’t complete basic flows, I don’t even look at it.
AI-written test plans: I have agents write comprehensive test plans for their own features. “Here’s what to test, here’s what could break, here are the edge cases.”
Preview environments: Every PR gets its own deployed preview. No more “works on my machine” – I can click a link and see it running.
Structured review templates: Instead of freestyle reviewing, I follow checklists. Does it handle errors? Is the UX consistent? Are there security implications?
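To make the smoke-test gate concrete, here’s roughly what runs before a PR ever reaches me; the Playwright suite and paths are assumptions about our setup, not a prescription:

npm ci
npx playwright install --with-deps chromium
# Run only the user-journey smoke suite; anything failing here blocks human review
npx playwright test tests/smoke --reporter=line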
The goal is to shift from “I need to deeply understand this code” to “I need to verify this works correctly.” Different skill, different mindset.
The mental model shift
The biggest change? I’m no longer thinking in terms of “how do I implement this?” but rather:
How do I decompose this for AI understanding?
What context does the agent need?
How can I verify this was built correctly?
What’s the highest-leverage use of my human judgment?
I’m rebuilding my entire approach to software development every few weeks as I discover new patterns and optimizations. It’s exhilarating and exhausting.
What actually breaks
Let me be real about what doesn’t work:
Agents sometimes create circular dependencies between PRs
They occasionally modify files in unexpected ways
The review agent might miss context from other parallel work
Integration tests become nightmares with 6 parallel feature branches
But the productivity gains are so massive that working through these issues is worth it.
From clean code to great product
Here’s something wild I’m discovering: our codebase needs to evolve for our new colleagues. I’ve been tracking every error Claude Code hits, and patterns emerge. The agent expects files in certain places, looks for documentation that doesn’t exist, assumes conventions we never established.
But here’s the real insight: we’re not moving from “Clean Code” to some other coding philosophy. We’re moving from “Clean Code” to “Great Product”.
None of your customers care about your code quality. They don’t care about your test coverage, your linting rules, or your elegant abstractions. They care about whether your product works, delights them, and solves their problems.
LLMs let us stop obsessing over code craftsmanship and start obsessing over product excellence. The code becomes a means to an end, not the end itself.
What great product development looks like
Documentation everywhere: Markdown files in every directory explaining what lives there. Humans are too lazy to maintain docs, but LLMs never skip documentation. They’ll update that README.md every single time they touch the code.
Explicit over implicit: That clever convention you have where userService automatically connects to the user database? Spell it out. AI agents don’t do tribal knowledge.
Claude Code hooks: .claude/ directories with agent-specific instructions. “When working in this directory, always run these tests.” “This service has these dependencies.”
Aggressive automation: Pre-commit hooks, linters, formatters – everything that can be automated, should be. Not because humans need it (we ignore half of it anyway) but because agents follow rules religiously. A minimal hook sketch follows this list.
Error messages that teach: Instead of Error: Invalid config, we need Error: Invalid config. Expected format: {host: string, port: number}. See docs/configuration.md.
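As one concrete example of that automation, a pre-commit hook along these lines keeps agents honest; the specific tools are assumptions, so swap in whatever your repo actually uses:

#!/usr/bin/env sh
# .git/hooks/pre-commit - block commits that skip the basics
set -e
npx prettier --check .           # formatting
npx eslint . --max-warnings 0    # lint rules agents must not bend
npm test -- --silent             # fast unit tests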
I’m literally restructuring our entire repository not for my human colleagues, but for my AI ones. And you know what? It’s liberating. Instead of agonizing over the perfect abstraction, I’m shipping features. Instead of refactoring for elegance, I’m improving user experience.
The code is messier. The documentation is everywhere. There’s redundancy. And our product has never been better.
The future is already here
This workflow sounds insane because it is. I’m managing a team of AI developers, with AI reviewers, in parallel branches, shipping features at a pace that would have required a team of 10 just a year ago.
It’s not perfect. It’s not even stable – I’m tweaking the process constantly. But it’s so powerful that going back to solo development feels like trying to build a house with a toy hammer.
The bottleneck has shifted from creation to verification. The challenge has moved from “can we build it?” to “can we ensure it works?”
Welcome to the weird, wild future of software development. Bring coffee. Lots of coffee.
Building the future of AI-assisted development at neople.io. Follow our journey as we figure out what it means when AI can code faster than humans can review.