The Brain and the Muscle

A practical guide to spec-driven development — built around one real website, so you can follow along and try it yourself.

If you have used an AI coding agent, you already know the feeling. You type "add a contact form to my page" and look at what comes back. It is a contact form — but it has no email field, it does not validate anything, and it says nothing about what should happen when someone submits it. Close to what you wanted, and wrong in two or three ways that matter. So you point out the mistakes, it tries again, you correct it again, and eventually you arrive at something acceptable. Then you close the window, and the entire dialogue that produced it vanishes, unsaved.

This is vibe coding, and there is nothing wrong with it for a single form. It is fast, conversational, and perfectly adequate for a small throwaway task. The problem is that it does not scale. High-level prompts produce quick results, but they also produce disposable code and mounting technical debt. The moment the project is larger than one form — the moment it must be maintained, extended, or understood by anyone other than you on the afternoon you wrote it — the conversational approach starts to work against you.


There is a more disciplined alternative: spec-driven development, or SDD, the professional response to the chaos of unsupervised AI generation. The idea at its center is simple. Instead of describing what you want in a disposable conversation, you write it down as a specification, a clear, maintained document explaining the what and the why, and leave the how to the code the agent generates. The spec becomes a permanent technical artifact: a contract between you and your collaborators, and now also between you and the agent.

If there is one sentence to carry through this issue, it is this: the agent is the muscle, but the spec is the brain. The agent provides the speed and the execution. You provide the blueprint. Your job, increasingly, is not to write the code but to convert your intentions into specifications clear enough that a fast, capable agent can build the right thing from them.

To keep this concrete, we will build one website across the whole issue, watching a real spec take shape rather than talking about SDD in the abstract.

Our example: TheJukebox

Imagine a members' karaoke club — call it TheJukebox — with a group of regulars and recurring karaoke nights. They want a small website of exactly three pages:

  • A home page that introduces the club to its members.
  • A past-events page with videos and photos from previous nights.
  • A comments page where members leave their thoughts about past events. In other words, the contact form from a moment ago, now with a real job to do.

We will assume a modern stack: a Next.js / React site. Keep it in mind, because every concept that follows is shown against it. By the end, you will not just understand SDD — you will have seen it applied, step by step, to something you could actually build.

How it actually works

The core move in SDD is to give the agent the best possible context before it writes anything — first about the project as a whole, then about each feature. You are not improvising in a conversation; you are supplying decisions, in writing, that the agent builds from.

This raises the single most important skill in the approach: knowing the right level of detail. Too vague and the agent guesses; too prescriptive and you have written the code yourself. The reliable way to find the balance — the analogy that runs through this whole method — is to treat the agent as a highly capable pair programmer. Give it plenty of context about what it cannot know: the goals, the mission, the audience, the constraints. Give it less about the low-level decisions it can make for itself. For TheJukebox, that means telling it who the members are and that the site is for reliving past nights, not selling tickets — and not telling it which React hook to use or how to name a CSS class. Tell it what you want and why; let it work out the how.

Start with a Constitution

The workflow begins at the project level, with what is often called a Constitution — a document that captures the decisions that govern everything else. It answers three questions, and it is easiest to see what each one means by writing it for TheJukebox.

The mission explains the why. For our club, it might read:

A website for the regulars of TheJukebox to relive past karaoke nights and stay connected between events. The audience is existing members, not the general public. In scope: introducing the club, showing photos and videos from past events, and letting members leave comments about those events. Out of scope: ticket sales, bookings, payments, and public sign-ups.

Notice how much that short paragraph rules out. Because you defined the scope, the agent knows not to build a login wall, a reservation system, or a payment flow — the very things an unguided agent loves to add uninvited.

The tech stack is the shared understanding of the technologies and constraints — how the thing is built and deployed. For TheJukebox:

A Next.js / React site of three pages. Videos and photos are embedded from an existing hosting service rather than stored by us. A small backend stores members' comments. Deployed as a static-first site wherever possible.

That single statement keeps humans and agent building on the same ground, and stops the agent reaching for technology the project does not need.

The roadmap is a living document: a sequence of phases, each of which will later be built out through its own feature-level spec. Ours is almost dictated by the three pages:

Phase 1 — the home page introducing the club. Phase 2 — the past-events page with embedded videos and photos. Phase 3 — the comments feature, where members leave thoughts on past events.
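
As a rough illustration, the roadmap above can be pictured as a small data structure that records each phase and its state. The sketch below is TypeScript; the `Phase` type, the status values, and the `nextPhase` helper are assumptions made for illustration, not part of any SDD tool. In practice the roadmap lives in a plain document rather than in code.

```typescript
// Hypothetical model of the TheJukebox roadmap; names and statuses are illustrative.
type PhaseStatus = "planned" | "in-progress" | "done";

interface Phase {
  id: number;
  title: string;
  status: PhaseStatus;
}

const roadmap: Phase[] = [
  { id: 1, title: "Home page introducing the club", status: "planned" },
  { id: 2, title: "Past-events page with embedded videos and photos", status: "planned" },
  { id: 3, title: "Comments feature for past events", status: "planned" },
];

// Returns the first phase that is not done yet, or undefined when every phase is done.
function nextPhase(phases: Phase[]): Phase | undefined {
  for (const phase of phases) {
    if (phase.status !== "done") {
      return phase;
    }
  }
  return undefined;
}
```

Keeping phases this small is what lets each one be handed to its own feature-level spec later.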

The Constitution matters because it is an agreement on two fronts at once — between the humans on the project, and between the humans and the agent. (You may have seen developers use a top-level agents.md file for something similar. A Constitution does the same job but is more structured, and it is agent-agnostic — it is not tied to any one tool.)

You write the Constitution with the agent

Here is the part that surprises people: you do not write the Constitution alone and hand it over. You write it in conversation with the agent, which turns out to be a genuinely useful collaborator. As you describe TheJukebox, it asks questions you had not thought to ask — and answering them is how the spec gets sharp.

In practice: you give the agent your project description ("a private site for our karaoke club's members to relive past nights"), point it at any input your stakeholders have written down (a README, a note from the organizer), and ask it to work with you on the three pieces. Ask it to keep the roadmap in small steps, because SDD works best when a human reviews small changes rather than waving through one enormous plan.

Then the questions start, and this is where the value shows. The agent might ask what tone the site should take; for TheJukebox, warm and playful rather than corporate. It might note that you never said where comments are stored and suggest a lightweight database. It might surface a package that already solves part of the problem, or name a tradeoff you were glossing over. These are exactly the gaps a vibe-coded session leaves silent, flushed out before any code exists.

When the interview is done, the agent writes the result to your project, typically as three files in a specs directory: mission.md, tech-stack.md, and roadmap.md. (Most agents ask permission before writing, which is the point: changes stay under your control.)

Now the genuinely human-in-the-loop step: you review what it wrote. Maybe the mission left out the target audience — reasonable, since the agent cannot know your members. Here is a small but important habit: rather than editing the file by hand, ask the agent to make the change. Edit one document manually, and you will eventually forget to update a related one, and your specs will drift out of sync. Allowing the agent to edit keeps all artifacts consistent.

Do a final review, then commit the Constitution. From here, it is a living document: the shared brain of the project — for the agent, for collaborators, and for you in six months when you have forgotten why you decided what you decided.

Then work feature by feature

Once the Constitution is drafted, each feature follows the same repeatable loop: plan it, implement it, validate it. The temptation is to jump straight to "implement" — but the plan is where the real work happens, so it is worth slowing down to see what a feature spec actually contains.

Take Phase 3 of TheJukebox — the comments feature, the form we started with. Before any code, two small setup habits matter. Start the agent with fresh context rather than continuing a cluttered earlier conversation; it does not need your old chat history because it can get all the authoritative information from the Constitution. And do the work on a separate branch, so the feature can be built and reviewed in isolation. Then you open a conversation with the agent to produce the feature spec, and a good feature spec has three parts.

The plan is the approach: what will be built, in what sequence, and how you will know it worked. For the comments feature, that is the shape of the thing — a form attached to each past event, the comments displayed beneath it, stored somewhere, and shown back to members.

The requirements are the specific decisions and constraints. This is where you answer the questions the vibe-coded version silently skipped. What fields does a comment have (name, which event, the message, perhaps a rating of the night)? What counts as valid input? What happens when a member submits? What stores the comment, and does it appear immediately or only after review? This is the place to pin genuine technical constraints, but not to dictate trivia. State that comments must be tied to a specific event; do not tell the agent what to name its variables. Control the process; do not oversteer it.
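
To make these questions concrete, here is a minimal TypeScript sketch of what the answers might pin down. The field names (`memberName`, `eventId`, `message`, `rating`), the 1 to 5 rating scale, and the 500-character limit are all assumptions made for illustration; the real values are exactly the decisions the requirements document should record.

```typescript
// Hypothetical shape of a comment; every field name and limit here is an assumption.
interface CommentInput {
  memberName: string;
  eventId: string; // a comment must be tied to a specific event
  message: string;
  rating?: number; // optional rating of the night, assumed to run from 1 to 5
}

const MAX_MESSAGE_LENGTH = 500; // assumed limit, not taken from the article

// Returns a list of human-readable problems; an empty list means the input is valid.
function validateComment(input: CommentInput, knownEventIds: string[]): string[] {
  const errors: string[] = [];
  if (input.memberName.trim().length === 0) {
    errors.push("Name is required.");
  }
  if (knownEventIds.indexOf(input.eventId) === -1) {
    errors.push("Comment must refer to an existing past event.");
  }
  const message = input.message.trim();
  if (message.length === 0) {
    errors.push("Message is required.");
  } else if (message.length > MAX_MESSAGE_LENGTH) {
    errors.push("Message is too long.");
  }
  if (input.rating !== undefined) {
    const isWhole = Math.floor(input.rating) === input.rating;
    if (!isWhole || input.rating < 1 || input.rating > 5) {
      errors.push("Rating must be a whole number from 1 to 5.");
    }
  }
  return errors;
}
```

Note what the sketch leaves out: it does not decide whether comments appear immediately or only after review. That is a product decision for the spec, not something the validation code should settle by accident.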

The validation is the scorecard: how the agent (and you) can confirm it got it right. For the comments feature that might be as concrete as "a member can submit a comment on a past event, and it appears in the list for that event." Make sure success is something that can actually be checked.
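
A checkable success criterion like this can often be written down as a small executable check. The TypeScript sketch below uses a hypothetical in-memory `CommentStore` as a stand-in for whatever backend the project actually uses; the class, its methods, and the event identifiers are assumptions for illustration only.

```typescript
// Hypothetical in-memory stand-in for the real comment backend.
interface StoredComment {
  eventId: string;
  memberName: string;
  message: string;
}

class CommentStore {
  private comments: StoredComment[] = [];

  submit(comment: StoredComment): void {
    this.comments.push(comment);
  }

  // Returns only the comments that belong to the given event.
  listForEvent(eventId: string): StoredComment[] {
    return this.comments.filter((c) => c.eventId === eventId);
  }
}

// The scorecard as a check: a submitted comment appears in the list
// for its own event and does not leak into another event's list.
function checkCommentAppears(store: CommentStore): boolean {
  const message = "Best night yet";
  store.submit({ eventId: "spring-night", memberName: "Sam", message });
  const onItsEvent = store.listForEvent("spring-night").some((c) => c.message === message);
  const onOtherEvent = store.listForEvent("autumn-night").some((c) => c.message === message);
  return onItsEvent && !onOtherEvent;
}
```

The value of writing the criterion this way is that both you and the agent can run it, rather than debating whether the feature "looks done".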

The agent will ask you to make the key decisions along the way, and you should pay close attention to conflicts or problems that surface. You do not have to accept the agent's proposed solutions, and you should clarify anything that bothers you. When the spec is drafted, you review all three documents, fix anything wrong by asking the agent (so the plan, requirements, and validation stay in sync with one another), and commit. Only then does the agent implement, after which you validate against the scorecard and accept the result — or send it back.

Why is it worth this much care before a line of code? Because of the leverage. The few sentences you write in a feature spec will expand downstream into hundreds of lines of code. Time spent getting the spec right is the highest-value time in the whole process.

Once the spec is committed, implementation is almost anticlimactic — which is the point. You clear the agent's context again (a clean slate, working only from the spec) and tell it to build. You can let it implement the whole feature at once, or hand it one task group at a time for smaller, safer commits — worth choosing deliberately wherever a small mistake compounds later, like anything touching security or the database. As it works, you watch the changes appear in your commit history, giving you an early start on review. Then you run it: for TheJukebox, you submit a test comment on a past event and watch it appear in the list.

The feature works, but it has not been merged yet. The last step is the review, and what matters most is the altitude from which you review. Focus on the high-level questions — does it work, does it match the spec? Not on which CSS classes the agent chose. You are checking intent, not trivia.

This review often reveals something subtle and valuable: a flaw in the code that is really a flaw in the plan. Suppose the comments page came out far too bare, and you realize you never asked for a proper page layout. The mistake is not the agent's; it faithfully built an underspecified plan. So the fix is not just to patch the code but to correct the spec and the implementation together, keeping the two in step. This generate-then-verify rhythm, with you confirming each step, is the human-in-the-loop that makes the method safe.

There is a reason to keep these review increments small, and developers have a name for it: cognitive debt — the mounting mental load of tracking what fast-generated code is doing and how it has changed. Because an agent produces code far faster than you would by hand, that load accumulates quickly. Manageable, well-scoped changes keep you genuinely in command of your project rather than nodding along to code you no longer understand.

One temptation to resist: when you spot something easy to tidy by hand, the instinct is to just fix it in the editor. Don't. Same discipline as before — a manual edit risks leaving the specs and README out of sync. Ask the agent to make even the easy change. Then, once it confirms nothing broke, you mark the work complete and merge.

This is also where the difference from the opening becomes visible. The vibe-coded form "had no email field and no validation" because no one had decided whether it needed them. The spec-driven version answers those questions before the agent writes a line, so the agent builds the right thing the first time.

Replan between features

After the first feature ships, the urge is to charge into the next one. Resist it. Between features comes a step that is easy to skip, but is where the method compounds: replanning. The principle is worth stating bluntly — you have to run slow to run fast. A little reflection between features is what keeps the next ten from going sideways.

Replanning is where you revise the Constitution in light of what you just learned. Because it is a living document, make these updates on their own branch, so you can track which version of the specs produced which code. Say that building the comments feature taught you that you never recorded your testing preferences. Replanning is when you add that to the tech stack and ask the agent to bring the existing specs and code into line.

It is also where genuine product changes land, and here, a judgment call matters. Suppose word comes that many of TheJukebox's members visit on their phones, so the site needs to be responsive. Because the project is still young, it is small enough to do directly during replanning — tell the agent, and have it correct the product spec, the feature specs, and the code together, so the decision lives in the specs and not just the implementation. But if the work were large, schedule it on the roadmap as its own phase instead. Small fixes inline, big features on the roadmap.

And replanning is not only about features. It can be about the whole project — revisiting the roadmap and noticing that several later items actually hang together and should be one step rather than four. It can even be about improving your workflow itself. This is where skills come in: a skill is a reusable package of instructions that gives the agent a new, repeatable capability tailored to how you work. If non-technical members of TheJukebox want to follow progress, you might build a skill that updates a changelog on every merge — written, fittingly, with the agent's help — living either in this one project or across all of them. Validation steps like linting, formatting, and test writing are natural candidates, too: define the repeatable process once and do less manual work forever.
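
A skill is mostly written instructions, but it can lean on a small helper for the mechanical part. As a hedged TypeScript sketch, a changelog skill might call something like the following; the `MergeInfo` shape, the entry format, and the function name are assumptions, not any agent's actual skill API.

```typescript
// Hypothetical description of one merged feature; all fields are illustrative.
interface MergeInfo {
  date: string;    // date of the merge, written however the project prefers
  feature: string; // plain-language name of the merged feature
  summary: string; // one sentence a non-technical member can follow
}

// Prepends a new entry so the most recent change always sits at the top.
function addChangelogEntry(changelog: string, merge: MergeInfo): string {
  const entry = `${merge.date}: ${merge.feature}\n  ${merge.summary}\n`;
  return changelog.length === 0 ? entry : `${entry}\n${changelog}`;
}
```

The instructions in the skill would then tell the agent when to call the helper (on every merge) and how to phrase the summary for non-technical readers.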

The deeper point sits underneath all of this. Once most of your work is planning and validation rather than implementing, replanning is not overhead — it is the work. The spec is not written once and abandoned; it improves as the project progresses, which is precisely what keeps the project from drifting.

Keeping it sustainable across many features

One feature is easy. The real test is the tenth, and a predictable pain point waits there: AI fatigue. An agent generates an enormous volume of code in very little time, and reviewing all of it, feature after feature, is genuinely exhausting. Left unmanaged, the review becomes the bottleneck — and a tired reviewer waves things through, which defeats the point.

The main defense is a clean break between features. Rather than sliding from one into the next in a blur, start each from a known state. A short pre-flight check does it: Is the previous branch merged? Any unfinished work? Is the next roadmap item still right? Have I cleared the agent's context, so the specs carry the intent rather than a stale snapshot of an earlier chat? That last point matters twice — it keeps the specs authoritative and frees the agent's limited context budget for the work ahead.
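
The pre-flight check above is simple enough to sketch as code. This TypeScript version is illustrative only: the `PreflightState` fields mirror the four questions in the paragraph, but the names and the idea of collecting the answers in one object are assumptions.

```typescript
// Hypothetical snapshot of project state before starting the next feature.
interface PreflightState {
  previousBranchMerged: boolean;
  hasUnfinishedWork: boolean;
  nextRoadmapItemConfirmed: boolean;
  agentContextCleared: boolean;
}

// Returns the reasons not to start yet; an empty list means the break is clean.
function preflightBlockers(state: PreflightState): string[] {
  const blockers: string[] = [];
  if (!state.previousBranchMerged) {
    blockers.push("Merge the previous feature branch.");
  }
  if (state.hasUnfinishedWork) {
    blockers.push("Finish or shelve unfinished work.");
  }
  if (!state.nextRoadmapItemConfirmed) {
    blockers.push("Confirm the next roadmap item is still right.");
  }
  if (!state.agentContextCleared) {
    blockers.push("Clear the agent's context.");
  }
  return blockers;
}
```

Whether the checklist lives in code, in a skill, or on a sticky note matters less than running it every time.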

Two more habits keep fatigue down. First, an attitude: when a review turns up something you never specified — a convention you would have wanted, a detail you forgot — that omission is not a failure. It is the spec doing its job. You found a new detail, you capture it, and every future feature benefits.

Second, a technique for when you need a deep review of the whole project against the latest change: ask the agent to hand the review to separate sub-agents. This gives the review room to think, and it keeps the main agent's context clean rather than polluting it. The sub-agents return issues and recommendations, the agent fixes them and re-runs the tests, and you have caught problems that a single quick look would have missed. A second, deeper look usually finds something worth finding.

Done consistently, these habits turn the loop into something you can run indefinitely without burning out — the difference between a clever trick you try once and a practice you actually adopt.

If you want one image to hold onto, it is this. You are an architect handing detailed drawings to a team of builders. You design, supervise, review, and accept the result, or request changes. What you do not do is stand over the builders explaining how to lay each brick — you give them the context they could not have known and let them apply their skill to the rest.

For TheJukebox, you decide that the site is for members reliving past nights, that it has three pages, and that comments are attached to specific events. The agent turns "a warm, simple home page" into the actual hundreds of lines of React and CSS. You hold the drawings; it lays the bricks.

That is the whole posture of spec-driven development. It takes you from thinking clearly at the start to delivering at the finish and, because of the replanning step, keeps you improving from there. The agent is the muscle. You hold the drawings.

Bring back the engineering

Step back and look at the distance covered. We began with vibe coding, where you describe what you want and hope the agent obliges. We end somewhere more deliberate: a written Constitution, features planned through conversation, work validated by a human in the loop, the process tightened over time with skills and standards. That is the difference between hoping for what you want and engineering what you actually need.

The deeper shift is one of stance. Much of the unease around AI-generated code stems from watching things move too fast: code appearing faster than anyone can understand it, decisions made implicitly and then forgotten. Spec-driven development is the correction. It brings the hard-won lessons of software engineering back to a process that had begun to abandon them, and it keeps you in the driver's seat rather than along for the ride.

And there is a quieter payoff that shows up only later. The specs you write today become the memory of your project tomorrow — the record of what you decided and why, still legible long after the conversations that produced the code have scrolled away. Keep them sharp. Keep improving the process. The best code starts with a great spec.

For TheJukebox, that is a three-page site for a karaoke club. For your own work, it may be something far larger. The method is the same at any size — and now that you have seen it end to end, you have everything you need to try it on something real.


Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.
