Rob Colantuoni

April 07, 2025

Tags: AI, Infrastructure, and Strategy

From Copilots to Something Weirder

Something shifted in the last year. AI coding tools went from autocomplete-on-steroids to something qualitatively different. The new generation of agents can take a task description, break it into subtasks, write code across multiple files, run tests, interpret failures, iterate — all without a human in the loop for each step.

I’ve been watching this closely, both as someone building infrastructure for these systems and as a practitioner using them daily. The agent paradigm isn’t an incremental improvement over copilot-style assistance. It’s a different model of interaction, with different implications for how we build software.

What’s actually changed

The copilot model — I wrote about this back in 2022 — operates at the line and function level. You write code; the AI suggests completions. You stay in the driver’s seat. The AI is basically a faster keyboard.

The agent model operates at the task level. You describe what you want — “add rate limiting to the API gateway with configurable thresholds per endpoint” — and the agent figures out which files to modify, what tests to write, how to handle edge cases, how to verify the implementation. The difference is autonomy. The copilot has none. The agent has some. And that “some” is growing fast.

Current agents aren’t reliable enough for unsupervised work on complex tasks. They make mistakes — sometimes subtle ones that slip through review. They get stuck in loops. They lack the system-level understanding needed to anticipate how a local change might ripple through the architecture. But they’re remarkably good at a growing category of well-scoped tasks, and they’re improving on a timescale of months, not years.

What this does to engineering roles

If the copilot era shifted value from code production to code judgment, the agent era pushes it further — toward problem definition, system design, and quality assurance.

Problem definition becomes critical. An agent can implement a solution, but it can’t tell you if you’re solving the right problem. Decomposing a business need into well-specified technical tasks — tasks clear enough that an agent can execute them — is becoming a core skill. It’s a form of requirements engineering many of us haven’t practiced explicitly.
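To make "well-specified" concrete, here is a hypothetical shape such a task might take — the `TaskSpec` structure and all its field names are illustrative, not any particular tool's format:

```python
# Hypothetical shape of a well-specified task: explicit scope,
# acceptance criteria, and constraints an agent can execute against.
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    description: str                       # what to build
    in_scope_files: list[str]              # where the agent may make changes
    acceptance_criteria: list[str]         # how success is verified
    constraints: list[str] = field(default_factory=list)  # e.g. "no new deps"


spec = TaskSpec(
    description="Add rate limiting to the API gateway with per-endpoint thresholds",
    in_scope_files=["gateway/middleware.py", "gateway/config.py"],
    acceptance_criteria=[
        "requests over the threshold return HTTP 429",
        "thresholds are configurable per endpoint",
    ],
    constraints=["no new external dependencies"],
)
print(spec.description)
```

The point isn’t the data structure — it’s that every field forces a decision a vague request leaves implicit.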

System architecture gains importance. Agents work within the architecture you give them. Sound architecture tends to produce sound agent output. A mess produces a faithful reproduction of the mess. The humans defining architecture, establishing patterns, setting conventions — they’re laying the rails the agents run on.

Review and verification become primary activities. I spend more time reviewing agent-generated code than writing code directly now. It’s a different skill. Assessing correctness, security, performance, maintainability without having been the author. Closer to auditing than authoring.

And integration challenges are emerging. As agents handle more implementation, the hard part shifts to orchestrating their work. How do you break a project into tasks that are appropriately scoped? How do you ensure consistency across agent-generated code? What do you do when the agent gets stuck or goes the wrong direction?

What infrastructure has to handle now

From the infrastructure side, agents are creating new demands.

Compute patterns. Agent workloads involve long chains of inference calls with tool use — code gen, test execution, error analysis, revision. Long-running, multi-step processes that don’t fit the request-response model most inference infra is optimized for.

Sandboxed execution. Agents that write and run code need sandboxed environments. Fast, secure, isolated execution at scale — each agent task needs its own workspace with the right dependencies, context, permissions. Significant infrastructure challenge.
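At the process level, a per-task workspace can be approximated with a throwaway directory, a stripped environment, and a time limit — a minimal sketch only, since real isolation needs containers, microVMs, or similar:

```python
# Minimal per-task workspace sketch: run generated code in a throwaway
# directory with a timeout and a reduced environment. NOT real isolation --
# a production sandbox needs OS-level containment (containers, microVMs).
import os
import subprocess
import sys
import tempfile


def run_in_workspace(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    with tempfile.TemporaryDirectory() as workdir:   # deleted when done
        script = os.path.join(workdir, "task.py")
        with open(script, "w") as f:
            f.write(code)
        return subprocess.run(
            [sys.executable, script],
            cwd=workdir,                              # isolated working directory
            env={"PATH": os.environ.get("PATH", "")}, # minimal environment
            capture_output=True,
            text=True,
            timeout=timeout_s,                        # kill runaway agent code
        )


result = run_in_workspace("print(2 + 2)")
print(result.stdout.strip())
```

Doing this securely, with the right dependencies preinstalled, at thousands of concurrent tasks, is where the infrastructure challenge lives.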

Context management. Agents need access to the full codebase, docs, and historical context. Managing and serving that efficiently — keeping it current, relevant, within context windows — is a data infrastructure problem.
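One slice of that problem is retrieval under a budget: given a task and a repo, pick the most relevant files that fit the context window. The sketch below uses naive keyword overlap and a crude characters-per-token estimate purely for illustration — real systems use embeddings and proper tokenizers:

```python
# Sketch of budgeted context selection: rank files by (naive) relevance
# to the task, then greedily pack them into a token budget.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. A real system
    # would use the model's actual tokenizer.
    return len(text) // 4


def select_context(task: str, files: dict[str, str], budget: int) -> list[str]:
    task_words = set(task.lower().split())

    def relevance(item: tuple[str, str]) -> int:
        # Keyword overlap stands in for embedding similarity.
        _, content = item
        return len(task_words & set(content.lower().split()))

    picked, used = [], 0
    for path, content in sorted(files.items(), key=relevance, reverse=True):
        cost = estimate_tokens(content)
        if used + cost <= budget:
            picked.append(path)
            used += cost
    return picked


repo = {
    "gateway.py": "def rate_limit(request): ...",
    "billing.py": "def invoice(customer): ...",
    "limits.md": "rate limit thresholds per endpoint",
}
print(select_context("add rate limiting thresholds", repo, budget=20))
```

Keeping that index fresh as agents themselves modify the codebase is the part that makes it a data infrastructure problem rather than a one-time retrieval trick.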

Where we actually are

Let me be clear-eyed: current agents are powerful but not autonomous in any meaningful sense. They need careful task scoping, prompt engineering, and human oversight to produce reliable results. The failure modes are real — hallucinated APIs, subtly wrong logic, security gaps, changes that pass tests but violate unstated assumptions.

The trajectory is compelling, though. Each model generation is better at reasoning, following complex instructions, and maintaining coherence across longer tasks. The gap between “needs constant supervision” and “can be trusted with well-scoped tasks” is closing.

My prediction: within two years, agent-assisted development will be the dominant mode for implementation work. Humans will focus on problem definition, system design, code review, and the integration challenges that arise when you’re orchestrating human and AI contributors. The best engineers won’t be the ones who write the most code. They’ll be the ones who get the most out of their agents while catching the mistakes that matter.

The role is evolving from craftsperson to conductor. Both require deep skill. The skills are different. The engineers who adapt to this shift will thrive. The ones insisting that “real engineering” means writing every line by hand will find themselves increasingly outpaced.