Agentic Engineering Is Vibe Coding in a Blazer

Every few months tech gives an old thing a new name and acts surprised when the old problems show up. “Vibe coding” got rebranded to “agentic engineering” sometime in February. It sounds like it has gravitas. Does it really?

Same workflow. Same risks. New label your vendors can say out loud in a boardroom without anyone wincing.

The rename

Andrej Karpathy declared vibe coding done in February. Simon Willison wrote a long, careful field guide called Agentic Engineering Patterns a few days later. It hit the Hacker News front page twice. Within weeks the term was everywhere: Anthropic’s annual coding report, InfoQ, Addy Osmani’s newsletter, an ACM brief, a thousand LinkedIn carousels.

The patterns in Willison’s guide are real. Test-driven development with agents. Human-in-the-loop checkpoints. Tight context management. The kind of discipline that separates someone who ships software from someone who pastes prompts and prays.

But here’s the thing. Those patterns existed before the label. You could have called them “writing software carefully while using AI tools” and they would have been the same patterns. The rename did one job: it gave the workflow a name that sounds like it belongs on a slide.

A CTO will hear “vibe coding” and reach for the door. They hear “agentic engineering” and they pull up a chair. That’s the entire shift.

What’s actually new

Almost nothing, on the discipline side. The tools got better. Claude Code got better. Cursor got better. MCP downloads went from two million to ninety-seven million in about a year, which is a wild graph if you ever want to look at one.

But the practice of “let an agent do a thing, then check what it did” is not new. It’s just being marketed harder.

The real change since February is that the failure mode is getting louder. Amazon Kiro had a bad few days in March where an AI agent skipped a two-person production approval and the storefront lost a chunk of its US sales for the day. A startup called PocketOS lost its production database and every backup in about nine seconds because an over-permissioned agent ran a destructive call with no confirmation step.

Neither of those incidents required new vocabulary to describe. “We let the agent do too much” works fine.

The Willison turn

Here is the part I keep thinking about.

On May 6th, Simon Willison, the guy who wrote the patterns guide, posted that the two categories he carefully separated in February, vibe coding and agentic engineering, are blurring in his own work. He’s no longer reviewing every line. He’s shipping things to production he hasn’t fully read. “Those things have started to blur for me already, which is quite upsetting” is roughly the quote.

I love that he said it. It’s the most useful thing anyone has written about agentic engineering this year, and it took ten weeks for the author of the field guide to admit the wall he drew was already leaking.

The wall is the whole pitch. Vibe coding on one side, where the hobbyists live, fine for prototypes, fine for weekend stuff. Agentic engineering on the other side, where the professionals live, with their TDD and their checkpoints and their adult supervision. Pay your vendor accordingly.

If the guy who built the wall is climbing over it, the wall is decorative.

My own version of this

GitHub did a big push earlier this year on what they called agentic workflows. There were blog posts and demos and a steady drumbeat across their dev channels for a few weeks. I picked it up, wired it in, shipped a few changes that depended on it.

Today I deleted the whole thing. Pulled the feature out of the codebase. I don’t need it. Honestly I’m not sure I ever did.

That’s the part nobody puts in the rebrand announcement. The trend arrives, you try it, you ship a small change because the documentation makes it look essential, and a quarter later you quietly pull it back out. Net effect on the product: zero. Net effect on my time: not zero.

I do not regret trying it. That’s how I figure out whether something is real for our work or just well-marketed. But this is the rhythm I’ve watched repeat for years now. New label, new push, real engineering hours spent integrating, then a slow walk back to whatever was actually working.

The label changes faster than the engineering does.

What this means if you’re hiring an agency

If you’re a founder or a CTO and your dev shop pitches you “agentic engineering” in the next quarter, that’s fine. You’d hear it from any vendor that wants the contract. The term has stuck. They have to use it.

The question is what’s underneath the word.

A few things to actually ask:

Who approves destructive actions? When the agent decides to drop a column, run a migration, or hit a paid third-party API, what human signs off? If the answer is “the agent is configured carefully,” that’s not an answer. Configurations drift. The Amazon Kiro thing was a configuration problem.

What’s the rollback plan? Not the theoretical one. The actual one. If the agent breaks production at 2am, how long does it take to undo, and who’s awake to do it?

What does the agent’s permission scope include? This is the PocketOS lesson. An over-permissioned token plus an agent plus no confirmation step equals nine seconds to total loss. Permissions are the whole game.

None of those questions require you to understand what “agentic engineering” means. They’re the same questions you’d ask any engineering vendor about any deploy pipeline, just pointed at the part where the human used to stand.
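For the approval and permission questions, here is roughly what a good answer looks like when you reduce it to code. This is a minimal sketch, not anyone's real implementation: the action names, the ActionGate class, and both exception types are hypothetical, made up for illustration. The idea is that the agent's scope is an explicit allowlist, and anything destructive also needs a named human sign-off that is used up after one call.

```python
from dataclasses import dataclass, field

# Hypothetical action names, chosen for illustration only.
DESTRUCTIVE_ACTIONS = frozenset({"drop_column", "run_migration", "delete_volume"})


class OutOfScope(Exception):
    """The agent's permission scope does not cover this action at all."""


class ApprovalRequired(Exception):
    """The action is in scope but destructive, and no human has signed off."""


@dataclass
class ActionGate:
    # Explicit allowlist: anything not listed here is refused outright.
    allowed_actions: frozenset
    # Maps (action, target) to the name of the human who approved it.
    approvals: dict = field(default_factory=dict)

    def approve(self, action: str, target: str, approver: str) -> None:
        """Record a human sign-off for one specific action on one target."""
        self.approvals[(action, target)] = approver

    def authorize(self, action: str, target: str) -> str:
        """Return a description of the decision, or raise if the action is blocked."""
        if action not in self.allowed_actions:
            raise OutOfScope(f"{action} is outside the agent's permission scope")
        if action in DESTRUCTIVE_ACTIONS:
            # pop() makes each approval single-use: one sign-off, one action.
            approver = self.approvals.pop((action, target), None)
            if approver is None:
                raise ApprovalRequired(f"{action} on {target} needs a human sign-off")
            return f"{action} on {target} approved by {approver}"
        return f"{action} on {target} allowed without approval"
```

The design choice that matters is that the check lives in code the agent calls through, not in a prompt or a config file the agent can talk its way around. Whether a vendor's version looks anything like this is exactly what the questions above are meant to find out.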

Is this even an LLM problem

This is the question I find myself asking on a third of our projects now. A client describes what they want and someone on a call says “we can solve this with an agent.” Sometimes that’s right.

A lot of the time, what they actually need is a clean algorithm. Some validation. A scheduled job. A query that runs in the background and emails someone when a number crosses a threshold.
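To make that concrete, here is a sketch of the threshold case using only the Python standard library. The orders table, its status column, and the build_alert function are all assumptions invented for the example, not anything from a real client project. It runs one query, compares one number, and builds an email only when the number crosses the line.

```python
import sqlite3
from email.message import EmailMessage


def build_alert(conn, threshold, recipient):
    """Return an alert email if the stuck-order count exceeds threshold, else None."""
    # Hypothetical schema: an `orders` table with a `status` column.
    (stuck,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status = 'stuck'"
    ).fetchone()
    if stuck <= threshold:
        return None  # nothing crossed the line, nobody gets an email
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"{stuck} stuck orders (threshold {threshold})"
    msg.set_content(f"{stuck} orders are stuck; the alert threshold is {threshold}.")
    return msg
```

Sending is left out on purpose. In practice something like a cron job would call this and hand any returned message to smtplib's SMTP.send_message, but the mail server and schedule depend on the setup. No model, no agent, no token scope to worry about.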

The hardest part of being a fractional CTO right now isn’t picking the right LLM. It’s telling someone that the problem they want to throw an agent at would be better solved by twenty lines of code that have been running fine since 2014.

I won’t pretend I always make that call. Sometimes I get caught up in the new thing too. But the discipline of asking “is this a spot for an LLM, or is this a spot for an algorithm” is older than every label we’ve slapped on this stuff, and it’ll outlast the next three.

The label doesn’t matter

Vibe coding became agentic engineering. In another six months it’ll be something else. The patterns underneath, when they’re actually being practiced, are valuable. The label is a coat of paint.

If your vendor’s pitch leans heavily on the new word, push on what’s underneath. If they have good answers about approvals, rollbacks, and permissions, they’re doing the work. If they don’t, the blazer’s just a blazer.

Shameless plug: At Victoria Garland we build Shopify Plus infrastructure with AI tools where they pay off and without them where they don’t. Mostly we just try not to delete prod.