Every engineering org I’ve worked in has this recurring argument: do we focus on quality or do we focus on speed? It comes up in planning meetings, retros, 1:1s. And every time it comes up I want to bang my head on the table because the framing is wrong. The best teams I’ve been part of didn’t choose between the two. They figured out how to make them reinforce each other. That’s not idealism, it’s just good engineering.
But it only works if the culture supports it.
When I say “culture” I don’t mean the ping-pong table or the beer fridge or whatever your recruiter puts on the careers page. I mean the default behaviors that kick in when nobody’s watching. What happens when a deploy breaks at 4:55 on a Friday – does the team rally or does everyone suddenly have somewhere to be? When someone causes an outage, does the team blame them or does it write a blameless post-mortem? When a junior pushes back on a senior’s design, is that seen as insubordination or is it seen as a sign that the culture is working?
Those micro-moments define everything. And they’re set almost entirely by whatever the most senior people do – managers and ICs both. If your staff engineer writes a one-line incident summary and calls it a post-mortem, that’s the standard now. If your EM rewards the person who heroically saved the weekend deploy but ignores the person who fixed the build pipeline so deploys stop breaking in the first place… well, you’re going to get a lot more heroes and a lot more broken deploys.
I’ve been paying attention to this for years, and a few patterns keep showing up on teams that deliver consistently without burning people out:
People need to feel safe enough to say “I screwed up” or “I don’t understand” without it hurting their career. Amy Edmondson’s research on psychological safety is worth reading – the short version is that if people are afraid to be wrong, the team’s intelligence gets capped at whoever is most senior. Which is a terrible way to build software.
Ownership needs to be real but not territorial. Someone should care deeply about every system, every service, every pipeline. But “I own this” can’t mean “only I can touch it.” Ownership means you care about the quality and the direction, not that you’ve walled off a fiefdom.
Crunch is a smell. If the team is consistently working late, something upstream is broken – the planning is wrong, the automation is missing, the scope is out of control. The most productive teams I’ve seen work reasonable hours because they’ve invested in the tooling and testing that multiply what they can do per hour.
And the feedback loop has to be tight. Ship small. Review fast. Deploy continuously. If it takes two weeks from “I had an idea” to “it’s in production and I can see how it’s behaving,” you’re learning way too slowly.
The failure modes are just as recognizable, and hero culture is the worst of them. Some teams celebrate the person who saved the day during the big outage. Sounds nice, feels great in the moment. But it creates an incentive to have big outages. If heroes get rewarded, nobody invests in prevention.
Consensus paralysis is another one. You want people’s input on decisions, but if every decision requires everyone’s buy-in, nothing moves. “Disagree and commit” is an underrated principle. Sometimes you just need someone to make the call.
And then there’s process worship – when following the process becomes more important than achieving the outcome. Process is supposed to encode good judgment so it scales. When it becomes the goal instead of the tool, you’ve lost the thread.
I’ve started thinking about culture the same way I think about a product. You design it deliberately. You iterate on it based on what you observe. You maintain it, because it degrades if you don’t. It’s not soft. It’s not a distraction from the real work. It is the real work, because literally everything else – your architecture, your velocity, your retention, your reliability – is downstream of it.