I had a conversation last week with a VP of Engineering who told me his infrastructure team “doesn’t have time” for infrastructure as code. They’re too busy fighting fires, dealing with snowflake servers, and responding to urgent requests from product teams. The irony is painful.
This is a pattern I run into constantly. Teams that are so busy dealing with the consequences of unmanaged infrastructure that they can’t invest in managing it properly. It’s like being too busy bailing water to patch the hull. At some point you have to stop bailing.
I’ll just say it: it’s 2019. Infrastructure as code is not a nice-to-have. If your infra isn’t codified, version-controlled, and reproducible, you’re building on sand and it’s going to bite you.
Every piece of your infrastructure – servers, networks, load balancers, DNS, firewall rules, IAM policies, database config – should be defined in code that lives in a repo, gets reviewed via pull requests, and gets applied through an automated pipeline. Terraform, CloudFormation, Pulumi, whatever. The specific tool matters way less than the principle: your infrastructure should be a deterministic function of the code in your repo. Same code, same inputs, same infrastructure. Every time.
Why does this matter? A few reasons that I care about personally:
You can reproduce any environment quickly. Staging, prod, DR – all from the same code. I watched a team recover from a full region failure in under an hour because their entire infrastructure was codified. They just stood it up somewhere else. Disaster recovery went from “hope and pray” to “run the pipeline.”
You get auditability for free. Every change flows through version control. “Who changed this firewall rule and when?” is always answered by the git log. This is table stakes for compliance but it’s also huge for debugging. When something breaks, the first question is always “what changed?” and with IaC, the answer is always findable.
Infrastructure changes get reviewed like code. I’ve lost count of how many times a PR review caught a bad security group change or a busted DNS record before it hit production. A second pair of eyes on infra changes isn’t optional.
And counterintuitively, it makes you faster. The upfront investment is real, but it pays off fast. Spinning up a new environment goes from a day of clicking around the AWS console to a ten-minute pipeline run. Manual config is fast for the first change and then progressively slower and more error-prone for every change after that.
Most teams I’ve talked to are stuck somewhere between “everything is manual” and “we have some shell scripts.” The manual stage is obvious – everything is configured through web consoles and SSH, knowledge lives in people’s heads, environments drift, recovery is slow. The scripts stage is better but not by much – scripts are fragile, they get stale, they cover 70% of the setup and then you have to do the other 30% by hand.
The real jump happens when you move to declarative config – Terraform or whatever – where you describe the desired state and the tool figures out how to get there. That’s the minimum viable version of IaC, and it’s where the transformation happens.
From there you can mature into reusable modules, self-service platforms, all that good stuff. But honestly, if you can just get to “our infra is in Terraform and changes go through PRs,” you’re ahead of 80% of the industry.
“It’s too slow.” Slow compared to what? Clicking around in a console? Sure, for the first time you set something up. But the second time? The fifth time? The twentieth time? Codified infra gets faster every iteration. Manual infra gets slower.
“Our stuff is too complex to codify.” If it’s too complex to describe in code, it’s too complex to manage by hand. The process of writing it down forces you to understand it and simplify it. That’s a feature.
“We’ll get to it later.” Every week you wait, your infrastructure drifts further from anything you could describe declaratively. The migration cost compounds. Just start. Codify one thing today. One more tomorrow. It adds up.
I think we’re approaching a point where IaC is as uncontroversial as version control for application code. The question won’t be “should we?” but “why haven’t we?” The teams that make this investment now are going to have a structural advantage in reliability and velocity. The ones that don’t are going to wonder why they can’t keep up.