2005: your infrastructure is automated using a handful of Bash, Perl and Python scripts written by two system administrators. They are custom, sometimes brittle and get rewritten every 5 years.
2022: your infrastructure is automated using 10 extremely complex devops tools. You automated the two system administrators away - but then had to hire 5 DevOps engineers paid 2x more. The total complexity is 10x.
They wrote YAML, TOML, plus Ansible, Pulumi, Terraform scripts. They are custom, sometimes brittle and get rewritten every 3 years.
EDIT: to the people claiming that today's infra does more things... No, I'm comparing stuff with the same levels of availability, same deployment times, same security updates.
2005: your average box served some PHP and static assets, connecting to some generic relational database. Reading logs meant grepping files over SSH.
2022: your architecture runs in the cloud, has multiple flavors of databases, queues, caches, and so on. You have at least two orders of magnitude more complexity because you aren’t just serving a web page anymore - you handle payments, integrate with other services, queue tasks for later, and so on. Your automation may be an order of magnitude more complex than in 2005, but it enables two orders of magnitude more functionality.
My classic C programmer curmudgeon take is that the root of the problem, as with everything else in this industry, is bad software built on top of bad software built on top of bad software... and on it goes.
The systems in your 2022 world are hard to test and maintain because they are bad, and the tools we built to test and maintain them are largely built on the same foundational ideas and technologies, so they are even worse.
We're going to have to rip everything back down to the foundation in order to make progress beyond finger-pointing (X is DevOps but Y is software engineering and Z is IT admin).
> My classic C programmer curmudgeon take is that the root of the problem, as with everything else in this industry, is bad software built on top of bad software built on top of bad software... and on it goes.
No, not really. Nowadays things are way way better than how they were a decade ago. Nowadays you can get pipelines that build and test on multiple target platforms with a dozen or so lines of code, and things run so well that they became reliable infrastructure integrated with your prod system.
Back then you had none of this, and had to pay the salary for a "build engineer" who worked like a wizard stirring a cauldron just to get a single production build out of the door.
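For a sense of scale, here's roughly what "a dozen or so lines" looks like today - a minimal GitHub Actions sketch (the workflow name, matrix, and `make test` target are illustrative, not from the thread):

```yaml
# .github/workflows/ci.yml - illustrative only
name: ci
on: [push, pull_request]
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: make test   # assumes the project exposes a `make test` target
```

Every push gets built and tested on three operating systems in parallel, with no dedicated build engineer in sight.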
I’ve been cultivating this opinion that it’s the other way around. In any other domain, few people use crap tools to make something amazing, and I think it comes down to this: what you surround yourself with informs what you make.
Back in the late 90's I worked on a system that shipped on 4 different chip architectures, 4 different (unix-based) operating systems, dealt with different endianness and was more reliable and easier to understand. And it was more responsive to users with 1990's hardware than stuff is today. :shrug:
Oh sure. Can you try to set up ANY CI pipeline and get it more or less working - don't worry about "pipelines that build and test on multiple target platforms" - and come back to share your experience with us?
How much code did you write to get it barely working all by yourself? Or were you just building on top of someone else's work? I did it a couple of times, and it was never anything like "a dozen or so lines of code".
I don't think we need to muddy the water by saying bad software. Just write the word software. It's multiple layers upon layers. This isn't automatically bad.
There's so much more going on. I don't know if the complexity is all required but we're doing more now. There's just more going on. That's how it is. We need the abstractions. Maybe not all of them. But we need more abstractions and tools running now. There's so much to manage.
I don't see us reducing this complexity that much other than consolidation and some simplification to clean things as they get sorted further. But there is still a need for the layers. It's not 2005. Things are more complex.
The nearest analogy: would you like to code everything in line-numbered BASIC, or would you like to do it using some modern equivalent? Even with the additional layers there has been real progress in the tools - improvements across a multitude of metrics. It's not all complexity for the sake of complexity.
Smile. It's another stage of progress. The old mess will fall away. At the next stage what we see as an improvement now will be the next stage's mess.
There's a whole slew of problems that will need even newer tools.
We haven't even scratched the surface of the automation we'll require in ten years time.
What software is bad precisely? The modern browsers? The tools we use to build modern browser apps? Perhaps your gripe is with something else entirely, like modern databases? Or is it the OS that you don't like?
Some people feel anything not written or created by them is "bad".
It doesn't matter to me who turned the wrench, but in my career I've encountered these types where it's only good enough if it's theirs. And IMHO their output is generally quite atrocious. Root cause: Psychological issues.
We have way too much empathy for bad software. Honestly, it's almost all total shit. If my car were as bad as the median piece of software, it would last 30 days and then sit broken down in my driveway until the scrapper hauled it off.
What part of the stack are you thinking of? In my experience the underlying stuff Just Works^TM. Sure, it has the occasional hiccup, but 999 times out of 1000 it's some higher-level tool that has the bug.
Just to pin down the terminology: by "underlying stuff" I mean the Linux kernel, the JVM, etc., while "higher-level stuff" covers certain build tools (ahem, looking at you, npm) and of course end-user tools, which are unfortunately way too buggy. I have a hard time listing end-user applications where I haven’t noticed a bug yet.
I feel like there's a missed opportunity here: with the transition to clustering, we could have built a small kernel for running simple services and moved things over to run on top of that.
Give me a tight, highly coherent API, with a culture of avoiding feature factory work. At least get "Make it work, make it right, make it fast" to not skip "make it right" anymore. That's sort of the core illness in software today.
Open source doesn't have to be pushed by business concerns. There's a degree of regulatory capture, but the biggest problem is that we just don't know any better. We repeat what we see, and make sure our own pain points are handled. I've seen this play out in API design where an awful API is introduced, and each person who fixes it only fixes 20% of the pain, and so it's 10 years before we get from a mediocre library to one you could actually call 'good', because nobody made 'good', they made better than better than better than ridiculous.
The sad part is when encountering this situation and the person who you wish could develop some empathy is completely unreachable and hostile. Terminal.
Honestly, it's not always a gift to have someone point out the 'movie continuity errors' to you all the time.
I tell my boss you want 2-3 people like me on the project. If everyone was like me we'd make each other miserable. I have worked on trying to frame these things in a positive manner. Unfortunately the guy who would have been the best mentor in that regard went into semi-retirement.
They said that the world is bad software on top of bad software, so it’s not one thing. It’s everything being bad and working around everything else being bad.
> They said that the world is bad software on top of bad software, so it’s not one thing.
I don't know about OP, but I personally feel like software like GCC/clang/MSVC and Debian/Ubuntu and Docker and SSH and Git are pretty great, and were never better.
Heck, the whole dotnet ecosystem is turning some significant problems in developer experience into at most minor nuisances.
Even Firefox+Chrome are stellar, and extremely solid as-is.
Where exactly is this bad software OP is talking about?
In my experience it's the custom scripts in an attempt to separate infra from developers.
If you are using one of the major cloud providers and, instead of just using native IaC scripts, you have decided that it would be better if the developers only had access to your custom Kubernetes operators to manage their infrastructure, then you are the problem.
If you are using a deployment pipeline that developers have zero involvement in, and they just "import your Jenkins scripts" then you are the problem.
If you are using a major cloud provider and, rather than just using their managed Kubernetes, you have decided to deploy your own, then chances are that you are the problem.
Rarely have these approaches ever been stable - in my 20+ years of experience, they have never been stable. And they are the reason "real DevOps" came along: developers could do a better job with less downtime if people actually engaged them.
Yea, but they didn’t explain why it's bad. Software devs complaining about “bad software” is basically an industry trope at this point, but often it just means “it's too complex and I don’t understand why that complexity is probably necessary” or “it's not written the way I would have written it”.
To whom does "we" refer? The classic C programmer curmudgeon? Those commenting on HN frequently use the term "we", but reading HN one can see that, amongst those commenting, there is a divergence of opinions about software. Who is "we"?
Asking prospective "DevOps", "software engineer" and "IT admin" candidates to program in C is an effective way to weed out those who are not truly capable of writing "good" software. It is easy to find mistakes and expose the incompetent. That may offend the incompetent who believe they are "good" programmers. Thus there are new programming languages created every year, more people using them to write "bad" software, and more people who feel emboldened to attack C as the source of problems, instead of "bad" programmers.
C is truth serum. It is difficult for people programming in C to pretend they are "good" programmers. They will be found out. It is funny how people on HN attack the language for exposing so many "bad" programmers. Under this theory, programmers are absolved of all responsibility for their mistakes. The language is at fault. "Good" programmers do not blame languages for their own mistakes.
IMHO, there is sometimes a benefit to languages that make it more difficult, not easier, to create and sustain Rube Goldberg complexity. Not to mention languages that compile to smaller, faster programs. "Programmer productivity" is ambiguous. For example, it could be a positive for a programmer trying to justify the expense their salary presents to an employer or it could be a negative to people who are forced to use an ever-increasing quantity of "bad" software. It could be a positive to a programmer who wants to keep implementing new "features" or it could be a negative to a software user who dislikes "feature creep".
The dichotomy of good versus bad software is such a subjective topic that the term "we" really needs to be defined. Different groups of people have different interests and therefore different opinions.
Hey, I've written more C code than anything else and was generally considered pretty competent.
> The language is at fault. "Good" programmers do not blame languages for their mistakes.
I also know that my time to get something functional in higher level languages is often 10x less, and my probability of having very subtle bugs to hunt is much lower.
There's a super-weird tradeoff here. There are all kinds of modern, high-level techniques that improve programmer productivity and the reliability of bigger systems. They convert people who can't be productive C programmers into doing OK work.
But they're also slow and a bit opaque in how they work. And it's really easy to run out of performance and have to do exotic things to get them to scale, in which case you have to build bigger systems still and cede a lot of those advantages.
Whereas, if you write to the metal, you can get a lot of performance out of a single large computer.
I kinda knew this comment would show up. It's completely irrelevant / orthogonal to what I'm saying. But I knew someone would have to beat their favorite horse :D
Yes, Rust may have somewhat better programmer productivity than C, but it's not really at a massively higher level of abstraction.
That truth serum sure has a long list of memory vulnerabilities causing tremendous amount of damage.
Come on, C is not an exceptional language from any point of view. It just pretends that everything is a fast PDP-11, and thanks to the insane amount of work and mental gymnastics done by the compiler, it will output good code. It has shitty abstractability that hinders performance if anything (just look at the small string optimizations done by C++) and hurts maintainability and productivity. At least it should have a proper macro system, but preprocessor macros are just a disgusting hack.
If it is not outright a “bad language”, it should definitely be left behind outside of maintaining the litany of programs written in it, and one should start new programs in either managed languages (if absolute control over the hardware is not required), or at least in C++ or Rust.
Interpreters for popular languages such as Perl, Python, Ruby, etc. are all written in C. The original Go compiler was written in C. Even the people developing Rust used Flex and Bison when trying to formalise a grammar.
The question is whether C is useful. For example, in building things from the ground up. Or vetting programmers.
Another way to vet programmers is to observe how many negatives they try to cram into their sentences. For example, "Not being... doesn't... non-popular."
If they are unable to make clear, concise, positive statements the same deficiency is likely to carry over to writing programs.
Not being an exceptional language doesn’t make it non-popular. There are plenty of programs written in it, and certain domains are simply 100% C (particularly operating systems).
Personally, I’ve worked through having this mindset myself. I “grew up” with C and its descendant family of languages. I wrote shellcode in C and ASM as an exploit dev. I later learned Python and now Ruby, arguably some of the most abstracted languages.
At each iteration I openly opined about how “C would let me do X without the handcuffs I put myself in by using language Y”. It’s really just senseless complaining. There is a reason why someone chose Ruby over PHP, PHP over Pascal, Pascal over Ada, and so on. The point is, to scoff at the product as a bystander is just immature. Throwing shade on an entire industry based on some weird pet purist outlook is just immature :/.
Of course, the irony here is that this type of mindset is no different from the apprentice carpenter who scoffs at every new house he walks through that he didn’t build. It’s no different from the wine snob who cringes at what his mom puts out when company is over. It’s no different from the tens of submissions HN accumulated in a week with something of the tune “I rewrote X in Rust”.
Your classic C programmer take comes from a time and place where servers were pets and nobody had to, or could, scale very big.
The fact is that businesses either are or are converting to cloud-first because why incur the risks of maintaining the one special-snowflake server that runs your business when you can turn a crank and spin up as many instances as you need -- theoretically very many as you take on more customers and transactions and need to handle the load? And you can just restart them if they go down?
Stop thinking technology and start thinking BUSINESS. The cloud may be more janky and complicated than what you're used to but it serves the business's needs better.
Lots of companies excel at playing big business without actually being a big business (or having a remote chance of becoming one). They simply lug around the kind of infrastructure that might be suitable for a company ten times their size. Ironically, it isn't rare that that extra infrastructure and the cost and effort to maintain it limit them in the marketplace.
> The fact is that businesses either are or are converting to cloud-first because why incur the risks of maintaining the one special-snowflake server that runs your business when you can turn a crank and spin up as many instances as you need -- theoretically very many as you take on more customers and transactions and need to handle the load? And you can just restart them if they go down?
I'm pretty certain you can do the same with software written in C (or any other language) without resorting to AWS/Azure/GCP/etc.
You can do the same cranking on your own hardware, except you keep the hardware after the peak has passed and you need to plan for the peak.
Or, you can simply get a DO droplet, and crank out as many new instances as you need, and turn them off when you don't, with software written in C (or anything else).
There's no need to go all-in on the AWS/Azure/GCP solution, with the ability to orchestrate instances as cattle, when you have, at peak, 3 instances[1].
[1] If you wrote your software in C, or Rust, or Go instead of Python or Ruby, you could potentially get away with fewer instances than you would otherwise need, at which point you're back to treating the system as a handful of pets because there are so few of them.
> We're going to have to rip everything back down to the foundation in order to make progress beyond finger-pointing (X is DevOps but Y is software engineering and Z is IT admin).
1. Never going to happen unless everything collapses, which is extremely unlikely.
2. We are progressing. AR, VR, GPUs with raytracing, complex web applications and global services, global cloud, etc.
Expecting all software in a properly written stack to be "good" is a particularly bad take, and one that leads to brittle software. Best to accept reality, deal with it and move on.
The big question is: Do you need that kind of functionality? I agree that very large and complex infrastructures have their place - the problem is just that they import a ton of complexity and usually cost a lot.
People are always surprised when they see a minimal webserver instance on a ten-year-old Debian [0] handling tens of thousands of requests without issue. It might go down once a decade because the disk failed, but it cost $1,200 to run over that decade. I don't think that's the perfect way, but modern infrastructures love to include a lot of complexity when it's not needed. The problem is that including something is very easy, and the cost is only paid once it breaks. Also, hardware is really cheap (you might not think so, but compare it to western IT salaries).
[0] I know outdated instances with no update strategy are not really a benchmark, but I encourage you to go out and ask people what their container base image upgrade strategy is - the situation really did not change that much.
> The big question is: Do you need that kind of functionality? I agree that very large and complex infrastructures have their place - the problem is just that they import a ton of complexity and usually cost a lot.
I think that's the key. Most of us aren't Google/Facebook/whatever. Yet lots of people want to mimic those. A simple stack with a relational database on a single machine still goes a long way. In fact, it goes a lot longer way today than it did 20 years ago. It's ok to scale when you need to, but I think premature scaling is the wrong way to go, especially since most things never get to FB/Google/etc size.
First, I'd say the key value store is actually older than the relational database, by at least a couple of decades. Of course, not the current ones, but we're also not using the relational database from the 80s so ...
Now to try and answer your question, I think a relational database is a lot more complex than a simple key value store, but if it's a good implementation, it will hide this complexity from the user, and using the database will be rather simple. It can be as simple as a key value store since you can just use a relational database like one. Note that this hiding of complexity is by design, it is one of the main goals of the relational model, it's not accidental.
A lot of relational databases offer a better (or more useful) consistency model than k-v stores, assuming your data isn’t just a bunch of keys and values. Having to worry about consistency can significantly worsen the complexity of your application code.
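A minimal sketch of both points above - using a relational database as a plain key-value store while still getting transactional consistency - with Python's stdlib sqlite3 (the table and key names are made up for this illustration):

```python
import sqlite3

# Hypothetical illustration: a relational database used as a plain
# key-value store. Table and key names are invented for this sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

# put / get, exactly like a KV store
with db:  # the context manager wraps the statements in a committed transaction
    db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("user:42", "alice"))
(value,) = db.execute("SELECT v FROM kv WHERE k = ?", ("user:42",)).fetchone()

# ...but with transactions for free: both writes land, or neither does.
try:
    with db:
        db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("balance:a", "50"))
        db.execute("INSERT INTO kv VALUES (?, ?)", ("user:42", "bob"))  # PK conflict
except sqlite3.IntegrityError:
    pass  # the whole transaction rolled back; "balance:a" was never written
```

The KV-style usage stays as simple as a KV store, but the second block shows what you'd have to hand-roll in application code on top of a plain KV store.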
1. MongoDB is not complex, the driver structure sucks.
Example Go's mgo driver by Gustavo Niemeyer was simple and effective, but abandoned.
The official driver is unnecessarily complicated.
And Go's need to have "context" everywhere adds to this, but MongoDB is not complex.
Idk I'm not some super genius and picking up MongoDB was really easy.
Querying (aka aggregation pipelines), you have to think of that as "pipes in bash". It's in the words aggregation pipelines. find | groupby | sort | filter | select.
Something like that. It's not SQL it's different, but not complex, sorry.
But it does have a way to query; if you only know SQL and didn't bother to learn the MongoDB way to query, then to you, the uninformed outsider, it might seem complex.
But so does ArangoDB or Neo4j or GraphQL.
Like, if you were never exposed to rxjs and are now trying to build things with it doing
  $stream.pipe(
    switchMap(),
    filter(),
    etc(),
  )
It does seem more complex than
  stream.map().filter().etc()
but it's only so because you haven't put in the effort to learn that way.
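The "pipes in bash" framing is easy to sketch in plain code. This is not MongoDB's actual API - just a hypothetical pure-Python analogue in which each stage takes a list of documents and returns a new one, the way $match/$group/$sort chain in an aggregation pipeline:

```python
# Pure-Python analogue of an aggregation pipeline. Stage names loosely
# mirror MongoDB's $match/$group/$sort; none of this is real MongoDB API.
from collections import defaultdict

def match(pred):
    # keep only documents satisfying the predicate
    return lambda docs: [d for d in docs if pred(d)]

def group_by(key, agg):
    # bucket documents by a field, then aggregate each bucket
    def stage(docs):
        buckets = defaultdict(list)
        for d in docs:
            buckets[d[key]].append(d)
        return [{key: k, "value": agg(v)} for k, v in buckets.items()]
    return stage

def sort_by(key):
    return lambda docs: sorted(docs, key=lambda d: d[key])

def pipeline(docs, *stages):
    # thread the documents through each stage, left to right, like `|` in bash
    for stage in stages:
        docs = stage(docs)
    return docs

orders = [
    {"user": "a", "total": 30},
    {"user": "b", "total": 5},
    {"user": "a", "total": 20},
]
result = pipeline(
    orders,
    match(lambda d: d["total"] >= 10),
    group_by("user", lambda ds: sum(d["total"] for d in ds)),
    sort_by("user"),
)
# result == [{"user": "a", "value": 50}]
```

Same shape as `find | groupby | sort`: data flows through a sequence of small transformations, none of which knows about the others.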
Now write DBA scale ISO SQL 2016 / Transact SQL / PL/SQL in that interesting Mongo flavoured language, including database engine debugger integration, and JIT compilation.
In what context? A hosted database might be "simpler" to operate, but BigTable is quite complicated and does a lot of complex, low-level work to be very fast.
> a raspberry pi can pretty much serve 100k req. / second. An average dev laptop should be able to handle 1M req. / second without much issues
Depends on what you do. It's quite easy to hit these numbers with a low number of static pages, but apps that can easily work with static pages usually don't have DevOps people. That's why I went with thousands, to have a reasonable number for an app that actually does some complex processing.
> a raspberry pi can pretty much serve 100k req. / second
I appreciate the sentiment, but my intuition says that number is too high. What would that look like? I can't picture nginx or haproxy serving up a static 'hello world' response at that volume on a Pi3.
And if you do anything with a slow-ish 3rd party API you're probably going to hit the TCP socket limit before you can respond to 100k/s.
Legitimately curious about whether my intuition is wrong. I don't have an unused Pi handy to test it myself.
> This test pitted a native C++ server against a scripted Node.js one, so of course the outcome was given. Well, not really. If I run the same test using µWebSockets.js for Node.js, the numbers are a stable 75k req/sec (for cleartext). That’s still 8.5x that of Node.js with Fastify and just shy of 75% of µWebSockets itself.
Edit 2: Doing a bit more research and finding some other benchmarks online, it seems like 100k with µWebSockets is plausible. I recant my skepticism.
I'm not sure you have much experience working on websites. Multiple deployments are required for performance reasons, which is the main driving force for paying hefty bills for CDNs, and their recent investment in edge computing.
You deploy to multiple regions in good part to drive those 300ms requests down to sub-50ms. Otherwise your user gets fed up and drops you in favor of a competitor.
But if you work on a service that has only a dozen users in the same region, who can take turns making requests, go right ahead with your production Raspberry Pi setup.
> You deploy to multiple regions in good part to drive those 300ms requests down to sub-50ms
one raspberry pi per continent :p
e.g. I'm in a small rural village in southern France right now and I get roughly 2 Gbps down / 1 Gbps up with a 9ms speedtest ping (which is slow - it's often 2ms). Anyone from Europe would get < 50ms ping, notwithstanding the latency between them and their ISP.
By the time my website has, say, > 500k req/second, I can definitely invest in, idk, some used 2015 i7 laptop to give myself a 10x. For reference, 1M requests per second is Instagram in 2015, one of the most used websites in the world - by the time you're at that scale you've been bought by Facebook anyways.
I ultimately agree with you, but in between a Raspberry Pi and “hefty CDN bills” is something like a half rack at a colo, an F5, and a couple of PowerEdges.
And then you serve megabytes of JS libs that cause multi-second load times, log every single mouse movement, autostart a video with sound and pop up 10 different pieces of bullshit, so... you'd be at the exact same spot as serving from a shitty dev laptop.
Heyo, I worked at a CDN for a few years on edge cache stuff. Almost every web site would be well served with a single Pi and a CDN. For almost every web site, the CDN is free.
> a raspberry pi can pretty much serve 100k req. / second. An average dev laptop should be able to handle 1M req. / second without much issues
How big are these requests? If you're returning more than 1 kB including all the headers, you're not beating 100k req/s on a standard consumer-level 1Gbit link.
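Back-of-envelope arithmetic for that claim (decimal units, framing overhead ignored):

```python
# How many 1 kB responses fit through a 1 Gbit/s link each second?
link_bits_per_s = 1_000_000_000      # consumer gigabit link
response_bytes = 1_000               # ~1 kB per response incl. headers
max_req_per_s = link_bits_per_s // (response_bytes * 8)
print(max_req_per_s)  # 125000
```

So ~125k req/s is the theoretical ceiling for 1 kB responses, and Ethernet/IP/TCP framing overhead pushes the practical number down toward, or below, 100k.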
The argument is about the power of a single machine. If your enterprise is somehow limited by 1Gb links, no amount of scaling is going to address that.
That, plus if I open dev tools and look at requests for even this site, the only things that small are three of the images (the orange Y in the corner, the vote arrow, and a 1x1 somewhere).
It wouldn't be possible to get that request rate outside of a microbenchmark, just because of the network speed the pi supports.
You can buy faster network cards for an RPi, but this is missing the point: 100k req/sec is not a huge volume, and a decent server should be able to handle it without issues.
At my current company we're at the cusp where we need to address scalability to be able to grow as a business, and the business is growing. It took several years to get to this point, but here we are.
I prefer simplicity as much as the next guy, but at some moment it becomes simpler overall to use cloud services controlled by k8s than to maintain large amounts of Ansible scripts and Jenkins jobs.
With that, by the way, comes an inevitable untangling of various parts in the software, and simplification of it. To be fast you need the processing path to be simple.
> The big question is: Do you need that kind of functionality?
What functionality? Automated unit tests to ensure you're not shipping code that is so broken that it even breaks basic invariants? Automated integration tests that ensure you're not shipping code that fails to call other components/services? UI tests to ensure that your latest commit didn't make your prod service unusable for the end user? Performance tests that help you determine if your latest change doesn't suddenly exhaust your computational resources in peak demand?
Kind of depends on what you’re building, right? Avoiding building new features in order to limit complexity is probably a good idea in select cases but not in others. Using more complex, purpose-built technology over standard boilerplate is often very cost-efficient, but only if you run at a scale where the additional complexity is worth the savings…
I partially agree with what you're saying, but let's not pretend we weren't handling payments in 2005 - some of us were, anyway. I think what changed is the scale of things: we had a lot fewer people online back then.
I think the increased complexity in our architectures is correlated with the DevOps role coming into the picture, but I'm not sure there's a causal link there.

My recollection from having lived through the initial years of DevOps (I began working professionally in 2000) is that an important goal was to reduce the terrible friction between dev and ops (the latter including sysadmins, DBAs, etc). Whatever the extra complexity we have now, I would not want to go back to the days when I carried a pager and responded to find out the problem was caused by an application I had no insight into, and there was no corresponding on-call person on the side of the dev team.

Another important goal was to manage infrastructure a bit closer to how we develop software, including the big step of having the configuration (or the code that generates it) in SCM. Another thing I don't miss is logging into a server to troubleshoot something and finding /etc/my.cnf, /etc/my.cnf.<some date>, /etc/my.cnf.test, etc. It's much easier to just run tig on either the file or the ansible/chef/whatever that generates it IMHO.
This times one million. Back in 2005 there was so much shit you wouldn't even dream of doing, and a lot of things you did do, you sure as shit wouldn't do now... we used to write our own cache systems. Migrations were all custom scripts. Fucking nightly builds were a thing because we didn't kick off builds on commit.
Unit tests weren't even common practice back then. Yeah, most places had tests but there was no common language to describe them.
And as much as git can be a big complex pain. . . merging was a BIG thing back then too. I seldom deal with long lived branches and the nightmarish merges they often needed.
Also, to all the young folks who "want to work with physical servers" again. Have fun with the next set of device driver patches you need to roll out.
I heart my containerized IaC buzzword laden cloud existence. #FuckThatNoiseOps
It's as much a cultural problem as a 2005-vs-now problem, because I have experience working at a joint that uses AWS yet chooses to roll their own Kubernetes, uses GitHub yet flails around with long-lived branches and 20-minute CI stages, uses containers but still has 20-minute build times, uses ArgoCD yet has tiresome "release events" that generally can't be rolled back and take at least an hour to "fix-forward" after a bad release. Sometimes you can lead a horse to water and even push it into the trough, but it would rather eat dust than let go of its last-decade ideas about operations.
Well if that’s all you’re doing in 2022, there’s nothing wrong with deploying your code and grepping logs over ssh. Bash scripts and offloading some of the complexity of management to AWS (e.g. use autoscaling) goes pretty far
In 2002 I built a website that got about 3m page views a week, with about 50k concurrent users, and a whole bunch of fun things like payments, outbound email, etc. It ran on a single server.
Since then I've worked on things that use several servers, things that use no servers but edge functions instead (which have servers, but still), and things that use autoscaling across lots of servers.
No doubt some businesses need autoscaling but most don't. It definitely shouldn't be something you reach for before you actually anticipate a need for it.
Autoscaling? Do you mean facing so much cloud management complexity that it invites automation, in order to spend more money for a herd of excitingly ephemeral server instances than for a reasonably overprovisioned boring actual computer?
The fun part is that the 2005 architecture is still plenty sufficient for 99% of deployments but everybody architects for 100x the scale they actually need.
No, it's the ops version of premature abstraction - when devs build a whole extensible framework around a simple use case just because "requirements might change later" (which 90% of the time they never do, so on average the work is wasted much more often than not).
I typically see it happen in the other direction myself. Business folks describe the problem/product and engineering team says “oh yea we can do that, we’ll need K8s” when they do not, in fact, need K8s
Yeah, because obviously nobody handled any payment in 2005 … The same can be said for everything in your list.
And more importantly, it's unlikely that your business is more complex than it used to be in 2005. And if you're spending more resources to deliver the same business value, you're wasting them. (That's why PHP is still ubiquitous, btw: it sucks by most standards, but it's good enough to do business, and that's what matters.)
Extremely complicated architectures are a liability, not a feature to be proud of; increasing complication (i.e. costs) doesn't mean increasing functionality (i.e. value).
For example, why do you "queue tasks for later"? Do you have exceptionally punishing latency and throughput requirements that can only be met at a reasonable cost by not answering synchronously, or it's because your database doesn't do transactions?
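To make the rhetorical point concrete: when the work fits inside a single database transaction, there is often no queue to operate at all. A minimal sketch (using sqlite3 purely as a stand-in for "a database that does transactions"; the schema and names are made up for illustration):

```python
import sqlite3

# Illustrative schema: an orders table and an audit log, updated atomically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE audit (order_id INTEGER, event TEXT);
""")

def place_order(conn, order_id):
    # One transaction: either every step happens or none does.
    # No queue, no worker pool, no "eventually consistent" follow-up task.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'placed')",
                     (order_id,))
        conn.execute("INSERT INTO audit (order_id, event) VALUES (?, 'order placed')",
                     (order_id,))

place_order(conn, 1)
status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
print(status)  # placed
```

The queue only becomes necessary once the work genuinely cannot be answered synchronously at acceptable cost, which is the question being asked above.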
Similarly, what do you do with "queues, caches, and so on"? Meet further extreme performance and availability requirements, or attempt to mask with additional complications the poor performance of inefficient complicated components?
In 2005, but also in 2000, web applications had already progressed past "just serving a web page", mostly without complicated architectures and therefore without the accompanying tools.
I think tool improvements made cargo-culting genuinely advanced, state-of-the-art software architectures and processes easy and affordable, creating a long-term feedback loop between unnecessary demand for complexity (amateurs dreaming of "scaling up") and unnecessary development and marketing of advanced (but not necessarily good) complexity-management tools, often by the same amateurs.
I was working on large scale deployments with multiple datacenters, multiple databases, message passing networks and running multiple customer-facing products.
Load balancers and HA setups were already in use. LVS existed. VMs were popular. "CI/CD" existed, without that name.
The future is already here, it’s just unevenly distributed.
By 2004 we were doing perforce and CI. No Continuous Deployment because we were building on-prem software, but by 2008 I was doing svn and Continuous Delivery (similar problem, not hosting).
We would have been if it had been appropriate.
To be fair though, I think a lot of my coworkers suspect I’m talking out of my ass because they’ve been doing this stuff for 5-8 years and think everyone else has too. As tempting as it sometimes is to launch into a lecture, it wouldn’t really help. Everybody knows what they know, and they’re going to know it until they watch it catch on fire; then they will either want to hear about alternatives or they’ll share ideas for fixing it, not realizing they’re repeating something you told them two years ago. Hey, that’s a great idea, wish I’d thought of it.
I was there too, and some things were better, some things were worse.
The problem is that we’re being sold a lie and it’s hard to swallow.
“Cloud will save you on staffing costs” - no, it just means you have specialists working on proprietary solutions, just like IBM mainframes or storage appliances or load-balancers/traffic-shapers of yore.
“This technology will make rollouts easier” - until it breaks down and is not easy to put together again; you’re at the mercy of your upstream, and you had better hope you keep a really solid update cadence and don’t introduce something you can’t consume later.
“Layers on layers means things are composable” - sometimes, but you need more people to know more things to put it together properly. Running everything from QuickStarts is doomed to fail, but I see so much of it.
Our config management tools back in the day were crummy as hell, cfengine being the only notable one (which was awful), our monitoring systems required a lot more handholding and truthfully: people were not happy to talk through requirements; which was probably why “devs run production” was appealing.
these things have gotten better, but nearly everything else has definitely gotten worse.
All you did is describe the increase in complexity of the infrastructure. Which does not rebut the point, because that is the point. The infrastructure people are using now is much more complicated than what was common 15 years ago.
If you want to rebut this you need to demonstrate the kind of capabilities we have now thanks to this complexity that we could not have before.
> you handle payments
No you don't. Most people handle payments by integrating with a 3rd party service, most likely stripe or paypal.
I find this take apologetic. "But we're running on the cloud, we need this complexity", is a wrong take. Yes, the underlying services may need this, but the infrastructure managing these services don't need to be this complex.
Every complex tool can be configured to work in a simpler manner, and all the configuration repositories directing these tools can be organized much better. It's akin to a codebase, after all. Many code-quality processes and best practices apply to these, but sysadmins don't like to think like developers, and things get hairy (I'm a sysadmin who does development; I see both sides of the fence).
The sad part is, these allegedly more robust platforms don't provide transparency to developers, to sysadmins, or to users. In the name of faster development, logging/debugging goes awry, management becomes more complex, and users can't find the features they used to have.
Why? We decoupled systems and decided to add this most-used feature at a later date, because it's not something making money. Now my account page doesn't even show my e-mail address, and I can't change my subscription renewal date or see my past purchases. Why? They're in distant tables, or those features are managed by other picoservices which aren't deemed worth extending or giving more communication channels.
Result? Allegedly mature web applications with weird latencies and awkwardly barren pages, with no user preferences or details.
Adding layers and layers of complexity to patch/hide shortcomings of your architecture doesn't warrant a more complex toolset to manage it.
* 2022: run unit testing, deployment to beta, run integration testing, run UI testing, deployment to preprod, run internationalization testing, run regional/marketplace-specific testing, run consistency checks, run performance tests, deployment to prod.
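Sketched as a generic pipeline config (the syntax here is GitLab-CI-flavoured purely as an illustration; the stage and job names are made up to mirror the list):

```yaml
stages:
  - unit
  - deploy-beta
  - integration-ui
  - deploy-preprod
  - i18n-regional
  - consistency-perf
  - deploy-prod

unit-tests:       { stage: unit,             script: ["make test"] }
beta:             { stage: deploy-beta,      script: ["make deploy ENV=beta"] }
integration-ui:   { stage: integration-ui,   script: ["make integration ui-tests"] }
preprod:          { stage: deploy-preprod,   script: ["make deploy ENV=preprod"] }
i18n-regional:    { stage: i18n-regional,    script: ["make i18n regional"] }
consistency-perf: { stage: consistency-perf, script: ["make consistency perf"] }
prod:             { stage: deploy-prod,      script: ["make deploy ENV=prod"] }
```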
Maybe, but the pipeline required to do all that in 2005 would be considerably more complex and brittle than the 2022 version using tools and infra designed for the job.
2005 was the full-on enterprise WebSphere/WebLogic era, so no, if anything it was much more complex (from the architectural side) than today's Python/Node.js or even Spring Boot solutions. Automation (bash/ant) plus CI like CruiseControl was already there.
Integrating with other services makes your software more brittle, as there are more points of failure: from hardware and networking on your end, to API gateways, networking, and hardware on their end. Periodically those things go sideways, which means whatever your system was trying to do doesn't happen, or worse, it only partially happens. Usually some poor engineer is then tasked with figuring out: is this an ongoing problem or a one-off? Did it affect a single customer, or is it affecting all customers? Is it affecting a single instance, or is the problem happening across all instances? Is it a problem on our end, or is the 3rd-party API misbehaving (again) today?
Generally you need solid APM and logging to support the operations of this.
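The "partially happens" failure mode above is usually tamed with retries plus an idempotency key, so a retry after a timeout can't perform the side effect twice. A hedged sketch (no real payment library here; `flaky_charge` and every name in it are made up to simulate a flaky upstream):

```python
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("payments")

def call_with_retries(fn, idempotency_key, attempts=3, base_delay=0.1):
    """Call a flaky third-party API with exponential backoff.

    The idempotency key lets the remote side deduplicate, guarding
    against the "it only partially happened" case when a retry follows
    a timeout whose request actually went through.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn(idempotency_key)
        except ConnectionError as exc:
            log.warning("attempt %d failed (%s), key=%s", attempt, exc, idempotency_key)
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Fake upstream that fails twice, then succeeds.
calls = {"n": 0}
def flaky_charge(key):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("gateway timeout")
    return {"status": "charged", "key": key}

result = call_with_retries(flaky_charge, "order-42")
print(result["status"])  # charged
```

Even this toy version shows why the logging matters: without the per-attempt warnings carrying the key, the "is it us or them?" triage described above is guesswork.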
Then there's queueing tasks for later and making sure you never lose messages between systems. All sorts of fun happens when you hit a capacity limit in your queueing system due to an unintended/unnoticed design flaw, and all of a sudden system performance starts to tank as things aren't getting processed because there's too much stuff in the queue that isn't draining at a sufficient rate. You need to figure out: is the queue actually draining or is it still increasing in size? Are all the worker nodes still running, or did one (or several) of them hang and not die, but simply stop processing jobs out of the queue while still consuming a spot in your auto-scaling group, blocking a new healthy instance from coming online? Is it a problem in your software, or is it actually a blip in operations from your cloud provider?
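The first of those questions (draining vs. still growing) comes down to comparing periodic depth samples, which a monitoring check can do in a few lines. A minimal sketch, with no particular queueing system assumed:

```python
from collections import deque

def queue_trend(samples, window=3):
    """Given periodic queue-depth samples (oldest first), report whether
    the backlog is draining or growing over the last `window` samples."""
    recent = list(samples)[-window:]
    if len(recent) < 2:
        return "unknown"
    delta = recent[-1] - recent[0]
    if delta < 0:
        return "draining"
    if delta > 0:
        return "growing"
    return "flat"

# Depth sampled once a minute from whatever the queueing system exposes.
depths = deque(maxlen=60)
for d in [100, 140, 180, 220]:   # backlog climbing: workers not keeping up
    depths.append(d)
print(queue_trend(depths))  # growing
```

A "growing" result is what triggers the second question: the workers may be alive by the health check's standards but not actually pulling jobs.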
It's way, way, way more complex than simply "chuck some API keys in a file and she'll be right!". That's hobby project level stuff, not realistic ops of a production system.
I’m literally working in automated finance and we are offloading these cases to client support, and late/next-day status requests for b2b peers. The only devops-level technicality there is an in-process keep-alive system which prevents overwhelming the support repeatedly. What you’re describing tries to deal with these issues at the level where it is hard, and it is, but that’s exactly the road to complexity and infra costs (and infra issues as well). As a result, it trades simple client issues for hard technical ones, and you still need support. It’s cool to have an ideal incident-free system, but only when you can cover its costs - and more importantly its evolution barriers - by exactly that quality. In other words, don’t replace humans with machines until machines learn to wipe their own ass, or until it’s just too much ass to manage efficiently anyway.
I’d have fought at your side a few years ago, but since starting to work in this field and asking around, I realized nobody cares about that ideal way, because it’s so expensive and hard to maintain, and the cost of change is prohibitive. It even left a few scars on my mental health, exactly because I was too anxious to go this way after rotating through environments where devops is praised as something inevitable for survival. If you squint at us at the right angle you can still see devops, but I think we are just expressing two opposite sets of ideas, each applicable in its own type of business env. You may still call that hobby-level, and I’d even agree (because it essentially is), if that hobby didn’t bring in enough to sustain the company and keep it profitable at levels many companies would wish they were at. If it works and brings revenue, who cares how it’s categorized.
If that was true, why has the economy since 2000 never hit the speed of growth that it had in the 1990s? It seems the big economic gains of the Internet came early, and what we've seen since is the Law Of Diminishing Returns: increasing inputs for decreasing outputs. Bigger investments for less economic growth.
> hit the speed of growth that it had in the 1990s? It seems the big economic gains of the Internet came early
It did, just not in the way you're describing it. This is how it's manifested itself:
> Apple, Amazon, Alphabet, Microsoft and Facebook all account for just under 20 percent of the market value for the entire S&P 500. With a collective value of nearly $5 trillion, these top tech companies easily dwarf other entire industries in the index, with companies like Berkshire Hathaway and JPMorgan Chase falling well short. Currently, the total valuation of the S&P 500 is almost $27 trillion.
That is not economic growth. The valuation of some companies on the stock market has nothing to do with economic growth. "Create more than 2 orders of magnitude in terms of economic value" never actually happened.
>> Your automation may be an order of magnitude more complex than 2005, but it enables two orders of magnitude more functionality.
This!
The primary problems with DevOps are...
1. It is still in its infancy therefore it's changing quickly.
2. Bad (or no) documentation is much more painful than before. A single family house without blueprints can, usually, be adequately serviced; whereas, a 27 story office building cannot.
So someone overengineered something in 2022, and therefore, nothing's better?
How about my anecdata:
2010: your infrastructure is automated using a handful of Bash, Perl and Python scripts written by two system administrators. They are custom, brittle in the face of scaling needs, and get rewritten continuously as your market share and resulting traffic grow. Outages happen far too often, you think, but you would, because you're someone who gets paged for this shit, because you know a core system well... ...that got broken by a Perl script you didn't write.
2019: your infrastructure runs on EKS, applications are continuously deployed as soon as they're ready using Jenkins and Flux. You wrote some YAML, but it's far better than that Perl stuff you used to have to do. The IDE support is like night vs day. You have two sysops, or devops, or whatever, who watch over the infra. You've had to attend to a system outage once in the past two years, because an AWS datacentre overheated, and the sysops just wanted to be sure.
You write some YAML, the sysops write some CDK. Your system is far more dynamically scalable, auditable, and reliable.
My anecdote can totally beat up your anecdote. (In other words, this is a silly line of argument.)
One git repo per project, checked out under /var/www/domain_name. Initial setup:

```shell
# first deploy: clone and build
git clone git_url /var/www/domain_name/backend/
cd /var/www/domain_name/backend/
go build
```

Updates:

```shell
# update: pull, rebuild, restart the service
git pull
go build
systemctl restart domain_name.backend.service
```
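That restart line presumes a systemd unit for the built binary; a minimal sketch (unit, binary, and path names are illustrative, following the layout above) could look like:

```
[Unit]
Description=domain_name backend
After=network.target

[Service]
WorkingDirectory=/var/www/domain_name/backend
ExecStart=/var/www/domain_name/backend/backend
Restart=on-failure

[Install]
WantedBy=multi-user.target
```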
I pay 46€/month and I'm looking forward to halve those costs.
Server load is mostly <0.5
I call this the incubation server.
If a project takes off I rent a more expensive, but dedicated, server.
It's very unlikely that I ever need more than 1 single server per project.
I will never write microservices, I can scale fine with a monolith.
Lately I even moved away from JS frontends to render everything with Go on the server.
Yeah it requires more resources but I'll gladly offer those resources for lower response times and a consistent experience.
Sadly companies that are hiring don't see it that way. That's ok. I'll just stay unemployed and try building my own stuff until something succeeds again.
I had a 7-year-long project that brought in 5-7k€/m. The server cost 60€/m.
I can do that again. I know it's not your kind of scale or income level, but it allowed me to have a good life living it my way.
I think it's somewhat disingenuous to compare DevOps requirements of 5-7k/m projects with systems run and operated by companies in the mid market.
That said, something I often wonder about is: if you could minus out 100% of the cruft from systems run by realistically sized companies, exactly how cheaply could you run them, and with what DX? Half the problem is that things built by 100 people with competing and shifting priorities will never result in a clean, tidy, sensible system, and it's mighty difficult to minus out the effects that organizational scale has on the end result.
I'm currently working through building a hobby project that, as far as I know, will only ever have one user. I'm enjoying the total freedom to take my sweet time building it exactly as nice as I wish the systems I wrangle in my day job would be, and I'm 100% looking to run it for free, or as close to free as possible but with as much performance as I can get, because why the hell not? It's a totally different ballgame.
I didn't understand half the things you wrote.
What's a company in the mid market? (nm I looked it up)
What cruft? I'm not a native English speaker; your way of expressing yourself is hard for me to understand.
DX?
Are there really things built by 100 people, if so, why do you need 100 people?
Why can't you do with 1 lead, 1-2 database guys, 1-2 code monkeys?
Why can't this be done in a monorepo and a monolith?
Why does it have to run -in-the-cloud- on other people's computers?
I had a project that was ranked in the top 1000, according to Alexa, on a single server.
But it didn't bring in enough revenue and took up most of my time so I shut it down.
I re-launched it 15 years later but it's dead. No one knows or remembers the domain anymore save for bots who keep hitting those "fkk-sex-nude-porn" etc spam links.
Back then you could make a site popular with 1€/day on Adwords, nowadays ... lol, this won't even lead to your ads being displayed, anywhere.
Go can handle 500 million visitors per month on 10 year old hardware (= 1 server).
Which "mid market" company has 500 million visitors per month?
Writing websites/apps isn't complicated. For me anyway.
You figure out the data models, write the use cases and you're half way. The other half is writing the display layer to make use of that data.
You make it all sound like the requirements or the product is different. It's not. It's all the same. You can have observability without k8s. You can scale without k8s. You don't need a managed database.
Man, this stuff is simple. It's the people who are trying to sell you cloud and microservices and whatnot that make it all sound so hard.
A good software developer is spending his knowledge and lifetime to build something for you that is built to last, because you apparently can't do it yourself or don't want to.
It will last even when he isn't part of your company anymore. He could've built the same for himself and monetized it; instead he bowed down to you (not you personally) and opted for steady income.
I understand how, let's be honest, when we talk DevOps we mean k8s, so I understand how sweet a siren's song k8s sings. But it's ultimately a waste of resources. It's a solution asking for a problem.
Until you reach proportions that require k8s you'll be completely satisfied with a 3 server setup, that is Go and pick a database. I promise, I guarantee.
That 5-7k/m project had 30 people concurrent at most. It had 30TB/m outgoing traffic at most. How much does 30TB egress cost in GCP,AWS etc? It used to be about 3k, I believe it's 1/3 of it now.
Why would the principles that are valid for a "small" project not apply to a "mid market company"? More features? So what? The principle remains the same.
Boring, same old. Data models, wrappers, display layer.
It's the same for all, your beloved FAANG, mid market, small business, single owner.
No one ever said a VPS with a shell script is terrible. You're thinking of scale the wrong way. Scaling is not only about going from 10 requests/second to 1000 requests/second. Scaling is about organizational scale too, i.e. how do you ensure going from 2 to 20 developers increases productivity by close to 10x and not 1.5x?
Tools like Docker, Kubernetes and whatever absolutely help in that regard.
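As a sketch of what that buys: a multi-stage Dockerfile for a Go service (names and versions here are illustrative) keeps the toolchain and every system tweak in version control, so no host accumulates hand-edits nobody dares touch:

```dockerfile
# Build stage: the toolchain lives in the image, not on a hand-tuned host.
FROM golang:1.19 AS build
WORKDIR /src
COPY . .
RUN go build -o /app .

# Runtime stage: a fresh, reproducible filesystem on every deploy.
FROM debian:bullseye-slim
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

Rebuilding the image from scratch on every deploy is exactly what makes the "patched several builds ago, too scary to touch" host impossible.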
I for one do not miss hosts never being patched because of all those slight modifications to system files that were tweaked several builds ago and that everyone is now too scared to touch.
I won't miss the 12 month projects to upgrade some dated software to a slightly less dated version of that same software.
From my perspective in Security, DevOps has made life much better.
The ability to spin up a box, have it run insecure code, and then spin it down; and the ability to do that all day long, is worth it for the security benefits that all this complexity entails.
> The ability to spin up a box, have it run insecure code, and then spin it down; and the ability to do that all day long
What's the best way to do that? I have some insecure code that needs to run about 6x a day, and so far my best thought has been an isolated box outside my network that does the internet based fetches, translates the data and then submits them over the web to another service that verifies/checks the output.
I run 50+ smallish applications on AWS using Bitbucket Pipelines, Fargate, Aurora MySQL, S3, Cloudfront and a few other services. Most of the setup is scripted using very simple Cloudformation scripts. I estimate that I spend maybe 10% of my time on this and the rest of my time on ordinary dev/architecture tasks.
Before Docker and AWS this would have taken me so much more time.
The only drawback is that we have a hard time finding other developers in the company who want, and have the time, to learn the setup. It's not very complicated, but it requires some familiarity with the AWS ecosystem. It can seem daunting to someone who has to learn it from scratch.
Those bash scripts are probably still in place and working; meanwhile, modern ways of managing the servers have been recreated a dozen times over the last 25 years, as every time a new person comes in there's always a better way of doing things.
The difference is now you're a failure if you stay at the same job for more than 2 or 3 years.
It's almost always a solution with an overly complex chain of tools that each only do a small part of the deployment/security tasks, because it's a food chain, where each vendor can eat a part of the company's budget consistently, based on a problem that never really gets consistently solved...
Apps are still no more secure, because there are several points where they can be compromised, rather than just a few involved in a less automated but more easily replicable process. Also, I don't need "flavor of the month" skills to get things done. There is always a revolving door of fly-by-night hype tools and brands that regularly rise and fall in the IT world... I avoid them (new hyped products) like the plague. I'm fine with being the stubborn middle-aged IT guy now. :P
It's all a food chain based on making money. What matters to me most is whether money is being made from the product that is deployed, and if it's simple, reliable, and secure enough to be worth development. I don't do my job to make a bunch of companies money by using their DevOps tools.
Screw impressing other engineers with solution complexity every time. Functional reliability always wins at the end of the day. Leveraging a massive list of Ops tools only creates a huge backlog of update work, designing efficiency and simplicity in most of my solutions is what ultimately pleases most of my clients.
In 2005 your infrastructure provisioning wasn’t automated. The complexity has increased, but so has what we get. Being able to provision new hardware stacks like software is amazing, in 2005 I had to get quotes from hosting providers.
2005: no cloud, had to order 1Us, wait, rack up. Needed a DBA for the database, a network sysadmin for the networking, all to serve a simple website with not the same level of HA. We are doing way more now, which needs some more complexity and yes, in many cases we are overengineering it.
2005: you had 4 people that understand what everything did
2022: you have a team of monkeys clicking buttons
Joking aside, it seems like the developers these days don't have the understanding that they did a while back. Not being involved with the nitty-gritty causes them to just write code willy-nilly.
The author's claim is more about how to make people work together on deployment, rather than a rant on devops tools.
You can do simple things with modern devops tools. You can go off rails with simple scripts. It's not the tooling, it's about engineering maturity and the requirements of what you're building.
IaC wasn't even prevalent or a production-ready thing back in 2005. I'm unsure what magical bash scripting would do any of that; maybe it built the data centres too!
IaC wasn't a thing because you honestly didn't need code to solve the vast majority of deployment problems. It was a configuration issue.
Not 2005, but a year later in 2006 I was using cfengine to deploy code and configuration to servers from an svn repository. The same svn repository had dhcpd configs that described pretty much every device on the network. The dhcp configs also pointed to a tftp service from which new nodes pxe booted to an installer which pulled down the node specific kickstart, and provisioned the machine.
We didn't call it infrastructure as code, but it sure fucking smells the same.
>to the people claiming that today's infra does more things... No, I'm comparing stuff with the same levels of availability, same deployment times, same security updates.
ok, but in my experience more things seem to be getting done in places with devops nowadays versus back then. I mean, I know you say it's the same, but it's hard to believe a statement in a comment over my lying eyes. It seems more likely that your two examples are both fictitious, which makes it easy to say they output exactly the same - or have you been at the same place for 17 years, seen the changes, yet have had no input to stop the madness? Because if the latter, that would also seem... weird.
Were microservices a thing back in 2005? Honest question, I always assumed that SOA was more of a newer philosophy in web software. The scale of what we build has changed a lot over the years, as well as the need to handle the variance of scale through techniques like auto-scaling. All of that adds an incredible amount of complexity in systems that surely didn't exist 18 years ago.
I don't think SOA was in common discussion in 2005; I think about 2007 would sound right, and REST was pretty much the winner by 2009. But perhaps my memories here are warped by my personal career and not having to have the arguments after 2009.
> If you look at a DevOps engineer job description, it looks remarkably similar to a System Administrator role from 2013, but...
> If DevOps was supposed to be about changing the overall culture, it can’t be seen as a successful movement. People on the operations side of the fence will...
As someone who was keenly watching this stuff back 15 years ago, parts of this article connect with my understanding, but the core problem I have is that this article itself is somehow bought into the mistake that led to the failure and so almost can't see the failure for what it is: the entire point of DevOps was that "operations" isn't a job or role anymore and has instead become a task that should be done by the developers.
Ergo, if you even still have operations people to comment on it--or certainly if you are somehow hiring dedicated "DevOps" people--you aren't doing DevOps and have already failed. The way to do DevOps is to fire all of the Ops and then tell all of the Devs that they are now doing DevOps; you simply can't have it both ways, as that's just renaming the same two camps instead of merging them into a single unified group.
I've worked in ops roles since about 2000 after a few years in backend corp IT stuff.
I agree that what you've described was the original intent and goal of "devops" but in light of that failure, the "cross functional team" definition took over and then in light of that failure, the SRE was born and we're basically back where we started but now the ops people use git instead of rcs.
In my experience and opinion, developers are really bad at ops and sysadmins/ops are really bad at development. Anyone who is truly good at both is a unicorn who is probably carrying their team.
because it's not "profitable", it just doesn't make sense for 95+% of teams.
most teams are not building the next facebook. most teams have at least some stability, most teams benefit from delegating specific tasks to specialists (eg at project kickoff they talk to the various leads, they sketch out a design, agree, and get back to their respective turf, and when it comes time to deploy they again talk to whoever and in a few iterations it gets deployed, it goes into testing and then into production, and that's it)
sure, there's always bitching about how it's not agile, but ... like I said, they don't have next-facebook-like money.
maybe Netflix is the best example for this. everyone was in awe of them for how they are going all in with Cassandra on AWS, microservices, flamegraphs, circuit breakers, 30% of the Internet traffic, and many groups/startups started copying them. but forgot a few tiny details like paying half a million dollars a year to new recruits and having a cashflow that can sustain all the aforementioned things.
almost every group would benefit from a more holistic knowledge of whatever they are doing as a whole. but just as there seems to be a natural limit to how many peers one can comfortably have at the same time (eg Dunbar's number) it seems people naturally like to set up softer or harder boundaries for their IT knowledge. ¯\_(ツ)_/¯
... and yes, tooling seems to be the place where this kind of complexity should live, but then maintaining that tool becomes the real challenge :)
I would argue that it is profitable, as it is far less expensive than the current shitshow, in particular for smaller companies. I have seen teams of "unicorns", and you can do the equivalent of a 100-dev organisation with a couple of teams. WhatsApp and Discord come to mind.
Also, that tooling could be open-sourced and shared, you know :D It does not have to be a price paid by everyone building their own.
But it would indeed have to handle the real problem. Not like the current tooling that our ops people force down everyone's throat :cough: K8s :cough:
> the entire point of DevOps was that "operations" shouldn't exist, and that operations is a task that should be done by the developers.
Ergo, if you even have operations people to comment on it, or if you are hiring dedicated "DevOps" people, you aren't doing DevOps and have already failed.
This. My first thought when I was reading the article. Spot on
> the entire point of DevOps was that "operations" isn't a job or role anymore and has instead become a task that should be done by the developers.
This is akin to saying "frontend developer isn't a role anymore - both frontend and backend should be handled by a full-stack developer". This works for small companies/projects, but bigger ones can benefit from specialization and division of labor. The body of knowledge required to be a decent software developer and a decent ops engineer is too big to fit into one head. I've seen ops work being done by developers without ops experience, and more often than not it was ugly - they didn't have enough experience/knowledge (or the time/incentives to gain them) to do ops work well.
To me the best part of DevOps isn't about roles but about team structure - splitting all Dev in one department and all Ops into another usually is a bad idea. And a failure of this split was a motivation to start DevOps movement. Having Ops embedded into Dev teams in my experience works much better.
> but bigger ones can benefit from specialization and division of labor
I think it is central to the DevOps concept that dev vs. ops segregation, at least on the small-team level, perhaps not at the individual level, is a counterproductive division of labor that inherently fosters micro-optimizations on both sides of the divide which get in the way of effective value delivery. On a continuously available software service, the lowest-level product team should own its components soup to nuts, rather than having a dev team throw hopefully-deployable code over the wall to an ops team.
And that's great until you have three-ish "lowest-level" product teams and they're all managing not only their own production systems, but their own deployments and testing pipelines and monitoring stacks and secrets management. At some point, you need to start unifying that stuff to keep things manageable, and if those systems are everyone's responsibility, they're no one's. So you make another team for whom the internal shared infrastructure is the product, and the rest of engineering are the customers - they still write their own Terraform modules and whatnot, but they run them on DevOps-supplied platforms. That seems to be what modern DevOps is becoming.
these are not mutually exclusive. if those product teams are under one big umbrella then nothing stops the one holding the umbrella (C-level or whoever) to set the standard. you can use these languages, these CI tools, these package repositories, etc. if you need something exceptional ask.
the important aspect is that there should be enough working knowledge about these tools/processes/systems in those teams that they can work efficiently, and they can respond as the whole business evolves. (eg. scale up/down, extract and hand over or accept and integrate components, integrate other APIs, etc)
> and if those systems are everyone's responsibility, they're no one's.
again, there's a cut off eventually. for example many companies just use GitLab for CI. at that point it's up to the big umbrella to decide whether they want to be in one or many GL orgs.
> That seems to be what modern DevOps is becoming.
sure, but that cutoff seems to be infrastructure vs product, which seems a bit healthier than cutting the software lifecycle in half.
of course domain experts are real (or what's the term nowadays?), so specialization makes sense. comparative advantage and all. but the idea is to lower the (coordination, communication, conflict due to inevitable misalignment between separate teams) overhead by "onshoring" the basics (eg. writing tests, basic CI stuff, deploying)
the devops manifesto (which allegedly does not exist, but you get the point) basically calls for giving people the tools, permissions and authority to do these basic things - giving teams ownership of their stuff. and of course this doesn't mean fire every sysadmin on sight :D (even if that would definitely help with the process of re-owning some ops tasks to dev people)
> The way to do DevOps is to fire all of the Ops and then tell all of the Devs that they are now doing DevOps
That's like saying "agile is firing your scrum masters and telling your developers you're agile now".
The idea behind DevOps is that applications are provisioned by the people who know the application best: the developers. If everything works out as it should, this adds the load of creating the deployments but removes the overhead of dealing with operations when deploying, updating and debugging - a net zero in workload, but the application is hosted better and bugs are easier to fix. You still need operations, both for providing the underlying platform (getting a server ready is not a developer's core business and it shouldn't be) and for guiding the developers. It should be leaner, but you still need it.
Of course, you can also fire all of infra and tell the developers "that's your job now", but that's like calling biweekly deadlines scrum (and leads to equally bad outcomes).
Yes - and it's also true for many DevOps people. Some would argue it's even true for many managers. It's just not a good idea if you want to get where these people were supposed to get you when you hired them.
I’d say that is absolutely untrue in the case of agile.
Those promoting scrum (and especially those with certifications in it) are going to lead to scrum, not an agile process that values producing useful, working software each iteration.
This can be true, but I would argue not always. Some DevOps teams work in the old mode of “throwing code over to Ops to run” - this isn’t what DevOps intended, but happens.
When they work well, they’re doing things like authoring reusable (by product eng. teams) infrastructure modules, or helping to build “you build it, you run it” tooling like monitoring stacks etc. They’re also helpfully/hopefully subject matter experts on CI/CD, your cloud/hosting of choice, security stuff - things that general developers have mixed levels of interest or competence in.
That is utter BS. DevOps means Ops and Devs working hand in hand in a crossfunctional team. Nothing more, nothing less. The main idea was to tear down silos.
Well, you can't simultaneously tear down silos and continue to have two silos... I would hope that would be obvious? The new cross-functional team is made up of the Dev people who are now doing Ops and the Ops people who are now doing Dev, with everyone else no longer fitting into the new DevOps reality. That reality was itself born from the premise that cloud computing was obsoleting the floor of dedicated systems administrators you previously had building machines and coordinating workloads, replacing it with a new deployment paradigm where a developer can develop their operations as easily as they can develop anywhere else in the product's stack. If you have a special DevOps team you hire people into, that is simply a renaming of the people you previously had doing Ops; you either haven't internalized this future or are actively rejecting it (which, I will emphatically state, is a perfectly fair position to maintain) and of course are going to "fail" at DevOps.
> Well, you can't simultaneously tear down silos and continue to have two silos... I would hope that would be obvious? The new cross-functional team is made up of the Dev people who are now doing Ops and the Ops people who are now doing Dev, with everyone else no longer fitting into the new DevOps reality,
There's a project I'm tangentially involved in, that has app tiers written in different languages where the devs in either tier don't actually know the language of the other tier. They're still on the same team; they talk at the morning meeting, they work together to get things done, they have the same overall set of goals.
Despite not doing each other's work (or even being able to), they're not in silos. There's no throwing stuff over the wall.
Coming from the other direction, I've hired for DevOps roles a couple of times, and as a hiring manager one of my very first questions to a candidate is to ask them to define DevOps, then to define their previous role(s) against that ideal, because DevOps is so broadly used as a title.
It's a great question because it finds misalignment rapidly. Post a job for DevOps and you'll get everything from old-school sysadmins to software developers of all flavors to AWS Certified Somethings applying.
On a more even note, I would prefer you spend some time looking at the origins of devops and what it means; it's a contentious term because it means different things to different people.
The original “Patrick Debois” (coiner of the term) meaning was Systems Administration in an agile fashion.
I suspect that you’re repeating what someone else told you and you’ve just adopted their definition, which is fine, but part of the issue I have with the term myself is that everyone has another meaning than everyone else.
At all the places I worked previously in the last ~20 years there was always a sharp separation of development/testing and production environments. I as a developer never had access to any system in production, apart from one place which had a very sophisticated security system in place which could grant you temporary access during deployments. Just think about customer data, and you'll understand why.
So when I hear that someone thinks devops is developers running their own systems in production I always wonder where this is actually possible, let alone whether it is a good idea at all.
Given that I've been perfectly capable of doing ops work in the previous 5 companies I've worked at, suddenly being unable to do so in my current company because I'm classified as a 'dev', is supremely frustrating.
Especially when you have more experience by yourself than the entire ops team combined.
You're just replacing one set of people with access to customer data with another.
In either case you should be implementing least-privileges. Only the access to data that a person needs to get their job done. More frequently this is developers than operations people.
> The way to do DevOps is to fire all of the Ops and then tell all of the Devs that they are now doing DevOps
That's a way to say you're doing DevOps, but it's not going to work very well.
> merging them into a single unified group
That's the right way to describe it. Just like the old "programmers build it to spec, then throw it over the wall to QA" is out of style, and good teams now have testing specialists in the same room (conceptually) as developers. The goal with DevOps was to stop doing the old "here's a build, deploy it" that caused so much wailing and gnashing of teeth, and instead bring ops skills into the team as a first-class expertise. The CI/CD pipeline can mean that development, testing, and release are all together, rapidly iterating and responding to change.
By the way, pretty soon the AppSec/CyberSec people will be folded in too: instead of the old "it's done/deployed, run your pentests/analysis tools", secure-by-design will require those skills to be integrated as well.
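A minimal sketch of what that "development, testing, and release all together" pipeline can look like - a hypothetical GitLab CI configuration (the job names, scripts, and the choice of a manual production gate are invented for illustration; `CI_REGISTRY_IMAGE` and `CI_COMMIT_SHA` are GitLab's predefined variables):

```yaml
stages: [test, build, deploy]

unit-tests:
  stage: test
  script: ./run_tests.sh            # placeholder test runner

build-image:
  stage: build
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"

deploy-production:
  stage: deploy
  script: ./deploy.sh production    # placeholder deploy script
  environment: production
  when: manual                      # release is still a deliberate step
```

The point is that the same team owns every stage: the tests, the build, and the deploy live next to the application code and go through the same review process.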
> the goal with DevOps was to stop doing the old "here's a build, deploy it" that caused so much wailing and gnashing of teeth
From an ops point of view that might be the main selling point. From a dev point of view, a key selling point was not having to wait a month for every minor configuration change to go through change management processes.
> our role is to enable and facilitate developers in getting features into the hands of customers
The problem here is that this creates the wrong kind of incentives for developers… somehow elevating them to a level where they don't have to care about how their code works in production.
As someone that remembers being a developer back in the days of sysadmins, we were AFRAID of upsetting the operations people. If your code brought a server down, you were at least going to face some very awkward conversations. The cartoon “The Bastard Operator from Hell” immortalized that era.
Meanwhile at one company I worked at years ago - an airline - the development team was responsible for keeping the system running 24/7. Nothing makes you think more carefully about your code in production than meeting a colleague on Monday morning who got woken up at 2am by your code failing.
While I’m not arguing for hostility in the workplace, removing developers’ incentives to care about their code in production seems to me to be one of the things devops implementations got wrong
Well either way there are direct consequences that the author of the code will feel - which is the point here.
And usually pager duty is done in rotation, rather than only being paged for your own code. It's one thing to ruin your own sleep, but if you ruin the sleep of the person who sits next to you, you start to think about consequences in production.
It amazes me, for example, how often I've seen developers leave their application logging, via something like log4j, in a default configuration where eventually it WILL fill up a disk and bring down a server, rather than investing the 10 minutes it takes to switch the configuration to rotation so that only a finite amount of space is used, or just writing a bash script to clean up old files. And this is something that's very hard to pass on as best practice without sounding condescending - like the only way to learn to take stuff like this seriously is by dealing with the consequences in production.
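The bash-script route really is about ten minutes of work. A minimal sketch of the idea - delete log files untouched for 14 days so they can't fill the disk. The file names, the 14-day retention, and the throwaway demo directory are all invented; `touch -d` assumes GNU coreutils. Point the `find` rule at your real log directory instead:

```shell
#!/bin/sh
# Demo: a throwaway directory with a fresh log and one "old" rotated log.
LOG_DIR="$(mktemp -d)"
touch "$LOG_DIR/app.log" "$LOG_DIR/app.log.1"
touch -d '30 days ago' "$LOG_DIR/app.log.1"   # simulate a stale rotated file

# The actual cleanup rule: delete log files not modified in the last 14 days.
find "$LOG_DIR" -name 'app.log*' -type f -mtime +14 -delete

ls "$LOG_DIR"   # only the fresh app.log should remain
```

Drop the rule into cron and the "disk full at 2am" page largely disappears; log4j's own rolling-file configuration achieves the same end without an external script.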
There's a reason people end up doing plain ops and calling it devops: it's often too costly to handle this as “a task that should be done by the developers”.
Specialization makes people much more productive, because they face the same kind of issues over and over and know how to fix them quickly. When you distribute the load in your organization, everybody is going to face problems, struggle, learn and never reuse that knowledge again.
Developers should have some grasp of ops work, be able to deal with some ops-related issues, and take part in the design of the ops side of the software delivery, so as to make sure the infra and deployment workflow they deal with works for them.
But yes it makes sense to me to still have people specialized in ops and infra in teams, collaborating with developers.
Basically, instead of having developers doing everything or just developing and throwing code out to an ops team, we should have developers educated in operations, working in teams with at least one operation specialist (or "DevOps engineer"). That way, you should end up with infra, deployment workflow that really works for the team and is optimized for the needs of the team.
Exactly. It's not about making developers do more work, it's about having a "DevOps engineer" or Ops/Infra person working closely with the Dev team, involved in the day-to-day development and decision making - instead of having to open tickets and wait X amount of time for them to be resolved by the Infra team (who, by the way, a lot of the time don't fully understand the applications, their constraints, tech debt, needs, etc.).
I was just reading the "Building Scalable Websites" book [1] released in 2006. At that time, "DevOps" was called SysAdmins. And there were also DBAs, Network engineers, among others.
> The way to do DevOps is to fire all of the Ops and then tell all of the Devs that they are now doing DevOps; you simply can't have it both ways,
I think this points at what happened: startup scrappy culture started permeating new technology companies, which meant no budget for DBAs, QAs, SysAdmins and other similar roles. So decision-makers cut those roles and asked Programmers to fill the voids. At the same time, "cloud computing" started to mature, so tinkering shifted from hardware and operating systems to software.
One just has to see the decline of "SlashDot" which was a very SysAdmin/Operating-System focused website, in favor of news.ycombinator and similar more software-oriented forums.
You're right, but in reality DevOps teams in 2022 are managing Kubernetes clusters and acting as gatekeepers to all kinds of cloud services to facilitate development.
Yes. I was confused reading this article because author seems to miss an important point. Devops culture is about "you build it, you run it". Not having a dedicated devops team that tries to make developers do things.
I read a book lately on that topic: Team Topologies. It explains this concept pretty well.
> The way to do DevOps is to fire all of the Ops and then tell all of the Devs that they are now doing DevOps
That's slightly hyperbolic, and I'd also argue there's a fundamental error there, since you throw away all your platform operations engineering: now you have dozens of operations engineers working in separate silos with no coordination.
What you really need to do is give all your developer teams pagers and point all their monitoring alerts at those pagers. Whether they set up an on-call rotation of the developers for their own software, or panic and hire a small team of operations engineers and hand them the pager, doesn't really matter. Then you have the problem of coordinating platform ops engineering with the ops team members in the software teams: whoever handles Ops for a software team acts as a kind of PM interfacing with the centralized platform engineering roles, which are responsible for coordinating across development teams to make things look consistent across the enterprise.
The problem with the bad old ways was that software teams would write code and then toss it over the wall for operations to run, and all the shitty disk-full pages and whatever other crashing-software badness fell onto operations. Dev teams would each individually choose to ship software that was shitty to run, all the monitoring alerts fell into a silo under a completely separate VP, and since the dev teams weren't responsible for their ops metrics they would choose to work on features to deliver for their management. Give the dev teams pagers and make them accurately feel the pain of running their software, and then they can make choices about how much they want to abuse their own embedded operations people, whom they have to chat with face to face every morning at standup. If you then fire centralized ops, though, you wind up with dozens of different operational fiefdoms inside one company, with everyone repeating the same mistakes and nobody doing the exact same thing anywhere.
I learned that back in 2006 before "DevOps" was a "word" and I don't know what you call it or if that is DevOps.
And it seems like Kubernetes is trying to be Conway's law applied to that. So you have DevOps embedded in Dev teams shipping containers which run on Kube clusters that provide compute as a service to the enterprise (often another company entirely) and the platform ops teams maintain those clusters. Except now you're entirely missing the communication that I outlined needed to happen between the operations virtual team composed of the platform ops and the embedded ops in every dev team. SREs will claim they don't need that any more and that old school SA operations is a dinosaur except that every now and then I see some SRE begging to know how to ship tcpdump to one of their containers to do some debugging and I know that they're dirty little fucking liars...
It was not hyperbolic, that was what happened to my team. We had two system administrators, one who specialized in Unix and one in Windows. I was the lone programmer, having escaped a previous sysadmin position. I can do it, I don't like it, but I can do it.
Then one day I was told that our two sysadmins had been traded off and we were getting two new programmers, but they wouldn't be programmers and neither would I, we were going to be doing DevOps. I had just escaped that!
I just quit a job partly because we lost our key DevOps guy and no serious effort was made to replace them. As a result I ended up wasting huge amounts of my time dealing with operations-level stuff that made it impossible to focus on the key parts of my role (feature development etc.). I subsequently turned down a job offer from elsewhere that explained their policy was not to have dedicated DevOps resources for their SaaS platform (devs themselves being responsible for all deployment and system maintenance), and would do so again. Good DevOps people are worth their weight in gold, and at least in many verticals (e.g. those involving payments) it's virtually mandated that there is a separation of responsibilities between those writing the code and those responsible for delivering the product to customers. I can't see the need for dedicated DevOps resources going away any time soon.
> I can't see the need for dedicated DevOps resources going away any time soon.
But is what you're describing just... "ops" without any of the "dev"? I'm not saying there is no need for dedicated infrastructure and operations teams at a certain size (and in some industries), but that's not an excuse for devs to feel like they can chuck a new feature over the wall and say "Well, I've done my job. Please run it and make sure it doesn't break".
Agreed. It also goes back to the sysadmin problem, which devops addressed by having the dev team own its code in production.
I've seen success with a dedicated devops member on a team but having a dedicated devops team just introduces delays and latency when fixing pipelines or releasing. The very same problem we had with sys admins.
At a certain point, hiring dedicated Devops frees up x number of developers to continue to develop features depending on the amount of time each developer is spending on performing those devops duties. It’s just another area management can split up job roles to capture more value and allow deeper specialization among professionals.
Sounds like a terminology issue then, I consider it "devops" because they're mostly writing code that goes into our git repo etc. etc., they still have to do PRs and code reviews etc. But the code is to handle deployments, not to implement features.
Yes: the terminology problem is that you are using the terminology wrong.
DevOps is about cross-functional teams, and has been co-opted by vendors to sell products. And delivery of software is the ultimate feature: without it, nothing of value can be produced.
Well then we're just getting into debates about what determines the "correct" meanings of words. I read the original article and most of the comments here using my understanding of the term, and it made sense that way...
(not to beat a dead horse, but I just finished some interviews I was involved in as an advisor and the term DevOps was used a lot in all of them. And in every single case the term was used exactly the way I've understood to mean - i.e., DevOps engineers are those who primarily look after the CI/CD pipeline, generally don't write application-level code or develop features, but do very much spend the majority of their time developing scripts and tools to enable CI/CD).
I keep hearing that "we are all devops" from my PM. The real kicker is that there is a dedicated DevOps team in my organization, they "just have too much to do already".
I get some people use the term that way, but in our case, our guy really did spend ~50-60% of his time developing IaC and other deployment scripts/tools (in various languages), and the rest handling the operational side of things. He occasionally touched the application code as needed (renaming configuration variables etc.), but he didn't do feature-level development, nor was he interested in doing so.
I used to work in an org with about 200 engineers, supported by a DevOps team of ~6 devops engineers.
They spent (very roughly) half of their time doing operations, and half of their time writing software to make devops easier.
Every other month they’d announce some new tool for us to run automated tests or speed up builds. It was awesome, and made our developers more likely to do their own portion of the ops work.
> They spent (very roughly) half of their time doing operations, and half of their time writing software to make devops easier.
This is what happens at my (non-tech-industry, but heavily invested in tech stacks) employer. A previous manager decided we were no longer System Administrators but now DevOps Engineers. But none of us develops code for features and none of us wants to develop code for features. We want to build and maintain the infrastructure on which the code runs. Some of it has to be run in-house because we deal with medical information and all that entails.
Our current manager describes not as Developer Operations but Developing Operations. We develop the infrastructure and tooling that our code writers need and do everything we can to get the moving parts out of their way. If the code written by the code writers fails, it is their responsibility to deal with it but if the infrastructure underpinning that code fails, my team has failed.
In my experience this unravels quickly as team and stack complexity grows and product engineering starts shipping software that's very difficult/expensive to operate. The whole idea behind devops was to avoid this death spiral by giving engineers ownership of, and responsibility for, running their own software. The problem was (and still is) that off-the-shelf tooling is just not there and requires very skilled/experienced devs to use effectively. Hence dedicated teams got created and the whole thing died on the vine. Devops as a movement simply doesn't scale.
I don't see it as massively different to specialisation between front and back-end developers though - we're all devs, we're all comfortable with reading/writing code, but whereas my specialty and focus is on application-level (typically backend) code, the DevOps guys are specialised in writing IaC scripts etc.
Yes, in principle, any dev could do it all, but there comes a point where the mental load from juggling too many technologies outweighs your ability to be productive. And some types of development just take different mindsets - I've done my share of front-end/UI-type development in the past, and can easily pick it up again if needed, but I can't say I find it super satisfying, and it's nearly always going to produce a better result for everyone to hand it off to someone who does. Likewise for CI/CD scripting. And it also happens that if your primary focus is the scripts and tools necessary to get software deployed to a particular hosting environment, you're likely to have the skills necessary to take care of the day-to-day sysadmin side of that environment (indeed, most "manual" changes to the environment are only done as a last resort, and would be wiped out by the next deployment anyway, unless the scripts are modified appropriately too).
'devops' means I'm going to hire you ostensibly to develop software, but in reality thats just going to be your '10%' time if there aren't any operational fires burning too brightly.
Clearly it doesn’t have meaning if it’s so undefined.
FWIW the progenitor of the word defined it as “agile systems administration”: what you’re talking about is the 10+ deploys a day talk from flickr; which doesn’t mention devops at all (despite the conference existing prior).
If you are hired as DevOps, or if your organisation does DevOps without any dedicated personnel, and all the devs are now DevOps, you are damn right that everyone'd be fighting all the fires first.
The idea being that the people building and running the application are incentivized to prevent fires as much as possible.
As opposed to completely separate teams, where the devs have zero incentive to prevent fires. Everyone judges them on features/sprint, so optimizing for that is perfectly logical.
That's not been my experience at all. Developers absolutely do need to be roped in to help put out fires when the application is misbehaving, and most of us quite assuredly want to keep that to a minimum. And having reliable, smooth and well-regulated DevOps processes is a huge enabler for ensuring robustness and minimizing the chance of bad code getting deployed to production.
I think what it is, is that nobody wants to force more responsibilities and complexity on Developers, so they hire DevOps people. Then the DevOps systems are so complex that it takes an ultra-high-quality engineer to run them.
The whole seed of the DevOps movement was that developers needed to do more or the company would fail. Over time, management lost conviction when developers pushed back; not wanting to risk losing devs, they "outsourced" the devops skills.
Not hiring a dedicated DevOps resource and making your developers do it. Guess what? You've just made your developers do operations.
The work doesn't go away just because you've shifted it. I've seen those places too and worked in some, the developers aren't very productive let's just say.
If the dedicated guy helps devs write deployment scripts, write monitoring scripts, set up backup-restore-verify cycles, etc., then it's devops. If the devs proclaim that the devops guy should do it, then it's just the old siloed workflow again.
note that the old flow was not a total shitshow with absolutely zero productivity... it worked for quite a while in many places, but it was bad enough in enough places that a whole "movement" grew out of the recommended solution. it's about keeping the communication/coordination/responsibility-tennis overhead down. sometimes that's best done by saying "you deploy what you wrote in any way you see fit, but here's the SLA", and so on. sometimes it makes sense to create infrastructure teams and let dev teams use internal tools to deploy; sometimes this requires experts at the team level, sometimes not. and... of course this can be implemented in the most employee-hostile way possible, and sometimes in better ways too :)
The reason that you need to do things the ops way is because ops knows how to run applications in production. There's a reason the meme "worked in dev, ops problem now" exists. You need to meet all of the requirements of an app that's running in production from a technical, availability, security, and policy point-of-view. It's not easy and that's why this will never work.
Software is hard, it's just that a lot of developers used to cut their code, run it on their laptop, and let someone else worry about it. It's different these days (although not as much as I'd like).
We don't make you use these tools because we want to, we use these tools because we're required to. No one cared about ISO 27001, SOC 2, or PCI DSS compliance for your crappy PHP app running on cPanel. They didn't care back then that you were using md5 hashes to "secure" passwords. The world is fundamentally different from what it was 10-15 years ago, and the requirements from the business are astronomically different.
Edit: and to people saying "oh you could just run it on a single server", no you can't because certifications like ISO27001 require certain levels of availability and DR. You're not going to be able to guarantee that with a single server running in a rack somewhere.
I'm assuming you mean your comment, not the post itself.
> The reason that you need to do things the ops way is because ops knows how to run applications in production
Stability in production is one metric. Ops overindexing on this metric is exactly what causes the friction with developers.
Developers are trying to ship value to customers.
Uptime is only one part of that equation and for most businesses, it's not even a very important one.
The author points this out near the end. DevOps can't convince devs to use ops techniques if all the reasons for using those techniques are based on the flawed assumption that development velocity isn't important.
> DevOps can’t convince devs to use ops techniques
If “DevOps” is the name of a role, and part of the function of that role is to “convince devs to use ops techniques”, then I feel like the concept of DevOps is lost. Devs need to own ops, including its costs - that is what convinces them to use ops-appropriate techniques, not some outsider jawing at them.
> “Developers are trying to ship value to customers.”
I’ve also seen this be fairly rare. Devs shipping nothing - not even aware if what they're merging will turn on.
They write something they haven’t really tested, merge it, and call it done - a user may never see it and they don’t have any knowledge about how the thing actually gets built and shipped.
Obviously this is worst-case, but in my experience this is a common default. The complaints about friction are because they’re actually forced to reason about how the machine works in order to ship something beyond merge.
> They write something they haven’t really tested, merge it, and call it done
This is a problem of incentives. For all intents and purposes, the dev organization ceases to care the moment something is merged. Nobody is rewarded for making sure everything is fine all the way to production.
Now if you ship two extra Jira tickets this sprint however...
I don't generally make a habit of basing my software development lifecycle methodologies on the lowest common denominator engineering org. Some shops ship value to customers with tests and metrics every day.
Certifications are a very good point because afaik ISO 27001 is now far more achievable for far more companies of smaller sizes with not that many IT staff. Sometimes even 3 good engineers can set up everything needed to pass ISO in a small company in like half a year or something.
Meh…DevOps is just Systems Administration, and Systems Administration is just Sys Ops. They keep changing the title/role but the work remains largely the same. I think it is a bit disingenuous to throw “dev” in the title; as a “DevOps Engineer” myself I don’t consider anything I ever do “dev”. Ansible is not “dev”, terraform is not “dev”, ci/cd pipelines are not “dev”, helm charts aren’t “dev”. But for some reason companies seem to love the term.
> But for some reason companies seem to love the term.
It's possible that I'm just getting very pessimistic, but at this point I'm fairly confident that companies love it because it makes it way easier to attract candidates and describe one set of responsibilities/position in an interview process, and then bait-and-switch it into what is effectively a systems administrator role.
I've certainly had interviews like that. In fact my first full time job out of uni was one of those and I made the error (in hindsight) of sticking it out until I could transfer into another role. Now I'm much more careful to screen for sys admin keywords in job descriptions.
Depends on what you think Developer Operations should be. Our developers instantiate their buckets, databases, cache instances etc. themselves, deploy microservices themselves and update configuration, traffic management and scaling parameters themselves. No 'system' people required. The system people are mostly just keeping the automation running and add features as needed.
The work also really isn't the same. Unless you're stuck in the 90's we aren't building servers, installing operating systems, installing applications and installing patches anymore.
> Depends on what you think Developer Operations should be. Our developers instantiate their buckets, databases, cache instances etc. themselves, deploy microservices themselves and update configuration, traffic management and scaling parameters themselves. No 'system' people required. The system people are mostly just keeping the automation running and add features as needed.
When I read this though, I just think about how much time your developers are not actually developing because they're doing operational-side work.
I have the situation where my developers do this stuff, then things break or need debugging and they don't really know how to dig into any of this stack in any meaningful way, so the problems tend to compound. Meanwhile, they're not writing code. The cadence of development seems massively slower to me (coming from a traditional background where they're writing to a clear Ops-set target environment).
The logical outcome is to hire someone who is an expert in all this infrastructure stuff to help manage it - ostensibly, a "DevOps" person, but really, a classic Operations person, just for cloud.
> When I read this though, I just think about how much time your developers are not actually developing because they're doing operational-side work.
Cool. Cool cool cool.
Now, instead of doing these things (the first time takes a few days, the second a few hours, and after that they're barely noticeable), you dev your thing, then it goes to an Ops queue, which (hopefully the next day) comes back with a big fat "unfortunately, not today" because you did not understand how the queue works or something. You fix your thing, it goes back into the queue, etc. etc.
So your flow is still messed up, and your Time to First Eyeball has increased quite a bit because you have high-latency round-trips with external parties.
Obviously there is a tradeoff somewhere: does the dev know all the opsy things, does she know just enough to only need ops when the paved road isn't enough, or is it the good old "over the wall it goes"?
So instead of having a team of people own something (a product), you are splitting ownership up in so many ways that nobody really owns it, and now you have to make everyone waste their time on KPI tracking, because somehow that's better than someone just owning their small product as part of a larger business?
This is pretty much a waterfall vs. iterations discussion where the reality is that the world doesn't stop moving while you're working.
If a product team needs to deliver something (i.e. "allow users to select a delivery window for an order"), they might update their microsite with the components and API calls to do this, and then update their API microservice to exchange that information based on the customer's identity (so they can only update their own order delivery data). None of that knowledge is going to exist in a classic ops team, nor should it be if such a team were still relevant.
At the same time, that team might need to take into account that there might be order peaks during the day and they have to collect and act on the right metrics to know how well their code works, how many resources they are consuming and what their scaling policy should be. None of that has anything to do with ops either.
Then, at the end of the day, they'll use all of this information and domain specific knowledge to decide:
- is the feature complete
- can users make proper use of it
- do we need to invest time and effort in code optimisations
- do we need to extract this feature into a separate scalable entity to prevent disruptions to neighbouring features
None of this is relevant to ops, or knowledge ops would have, either. Nobody will be able to manage a feature like that better than a product team with a few developers, a few features, a (shared) product owner, and perhaps a (shared) domain expert or tester. The people who wrote the code own the implementation (though they usually don't own the business requirements).
What ops (or devops enablers) would be doing in this case is:
- make sure metrics aggregation is working as expected
- make sure scaling is working as expected
- make sure deployments and configuration updates are working
- construct new deployment options if needed
- construct new metric aggregation systems if needed
- construct new scaling and traffic management systems if needed
- notify the teams/owners if they are using deprecated resources or features for too long
Perhaps a simple but too vague summary of the above would be: small feature-owning teams build and maintain their features, devops people enable many small teams to do their work by providing shared systems.
A developer isn't just a 'writes some code, clocks out' type of value to a company. If it was that simple, robots could do it, and there would not be a shortage of skilled people. There are many more dimensions to work, and unless you work at a very large scale company, or a very restrictive company, it's highly unlikely that you'll be isolated from the world and just write a bit of anonymous source code with no idea of what happens before or after.
> What ops (or devops enablers) would be doing in this case is:
> small feature-owning teams build and maintain their features, devops people enable many small teams to do their work by providing shared systems.
So I don't think I get your point. The stuff you've listed here just looks like stuff straight old Ops would do, set the parameters for Dev, and then things would just roll on. Saying "ops (or devops enablers)" like that is confusing me because the whole point of the original article (I thought) was to note that DevOps has basically become Ops, so what have we actually achieved?
i.e., that as soon as you distinguish between "devs" and "devops" people, you've basically already drawn a line between traditional dev and ops roles.
You are indeed not getting the point. Ops people do not 'set the parameters'.
DevOps doesn't become ops, and devs don't become ops either. The only ops left is perhaps contract management that really doesn't have anything to do with the developers.
DevOps can mean the practise of intersecting tasks that classically would be categorised as either development or operations, but it can also mean the people with enough skills to understand the needs of the developed features as well as the requirements of a useful infrastructure.
Take a very dated example:
- Operations might request new server hardware purchase orders
- Then they rack it, install an OS
- Someone else, say, application management, installs Jenkins on it
- Someone else yet again configures Jenkins so Developers can use it
- A developer now logs in to Jenkins, and clicks build on their product or entire project
- Operations gets a fax from Jenkins about a new build being ready to be installed
- Operations sends the build to a different application management person
- Different application management installs the new build
- Developer can finally see the end result
If at any step something goes wrong dependency-wise, the person involved is not allowed or not able to do anything about it because it is outside of their scope. This is classic operations. On top of that, all those layers are annoying and inefficient.
Instead of all that we have new tools, and new grouping of activities which run in parallel:
DevOps-enabling person:
- Notices a shared CI system would benefit everyone (developers, customers, finance)
- Proposes such a system and perhaps presents a demonstration
- Configures a virtual or container system from any XaaS provider to provide this CI system to anyone who wants it
- Adds integration with SCM
- Tries out a few builds to test SCM lifecycle
Developer:
- Writes code, commits to SCM
- SCM fires a CI pipeline
- Developer can see the end result
- Makes additional modifications, immediately sees the result again because nobody else is required to perform those tasks
- Maybe additional CI options are needed, create commit with CI changes and update CI to have those extra options available
- Instant results yet again
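A minimal sketch of the pipeline this developer flow implies, assuming GitHub Actions as the CI system (workflow name and make targets are illustrative, not from the thread):

```yaml
# Hypothetical CI pipeline: every commit fires a build and test run,
# so the developer sees the end result without waiting on an ops queue.
name: ci
on: [push]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build
      - name: Test
        run: make test
```

The "maybe additional CI options are needed" step then becomes an ordinary commit touching this file, with results on the next push.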
Classical operations only existed because there were no tools and no shared responsibilities. The world has changed and now shared responsibilities and tools to act on those exist. Net result: a whole lot less people waiting on each other because they all operate in their own fiefdom.
Operations is dead, unless you are still playing datacenter at the office. The closest remaining thing is the service desk and perhaps desktop management for end-users that are locked into some Citrix hell. But we were talking about DevOps which is the intersection of Development tasks and Operational tasks, not end-users and office equipment ;-)
> The work also really isn't the same. Unless you're stuck in the 90's we aren't building servers, installing operating systems, installing applications and installing patches anymore.
I guess I am stuck in the 90's then; I absolutely still do all of those.
I agree with most of what you say, and in particular with companies' love for the term, but I disagree that "the work remains largely the same". When I got started on this line of work, we used cvs to track program code and we used backups to 'track' infrastructure, including code used to manage the infra (mostly shell scripts, though not only that).
There's a long path from that to ansible and terraform on SCM.
Another big difference I have experienced: we used to literally celebrate server uptime (I mean as a celebration, I have a distinct memory of gathering around an IBM "fridge" to celebrate the uptime birthday of a particular RS/6000) while now a piece of infra with too much uptime is a red flag about potential vulnerabilities.
What does largely remain the same, I think, are the skills needed to be good at this. Then, and now, we need people who don't mind reading manuals, searching online (this was already a thing when I started; I guess you'd have to go back to the mid 90s for this to not be the case?), who can keep track of where they've been during a debugging/troubleshooting session, that sort of thing.
Another thing that changed is that in the past some people considered it a badge of honor to be assholes to others not in sysadmin, even more to others not in IT (remember those "select * from users where clue > 0" t-shirts, or the BOFH stories?), while now that's typically frowned upon and quite a few companies are explicit about a no assholes policy in their hiring material (or perhaps I've just been lucky with my teammates and smarter when picking where to work at than when I was younger).
>but I disagree that "the work remains largely the same
I meant at a very high level. The basic responsibilities haven't changed.
* Deploy/configure infrastructure
* Deploy applications into infrastructure
* Monitor/Secure/Maintain infrastructure
* Scale infrastructure as needed
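The first and last bullets are the same responsibilities whether the tool is a rack bolt or Terraform; a hedged sketch of what they look like as code today (provider, AMI id, and instance sizes are all made up):

```hcl
# "Deploy/configure infrastructure", expressed as code instead of racking.
resource "aws_instance" "app" {
  ami           = "ami-12345678" # placeholder image id
  instance_type = "t3.small"
  tags = { Role = "app-server" }
}

# "Scale infrastructure as needed" becomes editing a count and re-applying.
resource "aws_instance" "worker" {
  count         = 3
  ami           = "ami-12345678"
  instance_type = "t3.small"
}
```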
Sure in the 90's there was no Terraform, and deploying infrastructure meant getting physical hardware and racking it up. Now, you can use Terraform to deploy infrastructure to the cloud, on hardware you rent. So yeah, of course over the years the tools have changed. And sure, as you pointed out, even mentalities have changed (being proud to have a server with 300 days uptime, vs. being ashamed of that).
You can call it "Sys Admin", "DevOps", "Site Reliability Engineer", or whatever; these are all largely the same: "Make sure the infrastructure works, is secure and scalable, and help deploy to it." Even with "cloud managed" things, you still need to set up, configure, and secure them. You can have "cloud managed" k8s; it isn't going to stop developers from using bad practices, like running containers as root, or from not having a standard deployment process (because each dev is just doing their own thing).
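As a concrete illustration of the containers-as-root point: this is the kind of guardrail someone still has to put in place on "cloud managed" k8s. A sketch of a pod spec that refuses to run as root (names, image, and uid are invented):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:
    runAsNonRoot: true   # kubelet rejects images that would run as uid 0
    runAsUser: 10001
  containers:
    - name: app
      image: registry.example.com/app:1.2.3
```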
I think the main problem is that the "DevOps" role significantly differs company to company. A company that desperately needs a solid administrator might not be able to attract the right talent and as a result, end up classifying the open positions as "DevOps engineer". At the same time, there are companies out there legitimately trying to bridge the divide between the two — software and system administrators — job families.
From my understanding, DevOps was never about technical solutions/processes. It was about giving the same business goals to Dev and Ops. The idea was to eliminate the tension between these departments, because the goal of Dev was to ship features while the goal of Ops was to ensure the system stayed up.
I think devops means something here, sysops would be about running infrastructure not dev infrastructure, devops would focus on producing dev envs, test envs, CI/CD. Not just setting up the runtime hardware / os configuration.
Meh... Dev is just clicking on whatever your IDE autocompletes for you. With copilot you do even less. I think some programmers out there have some big heads that need popping.
Building out and automating cloud infrastructure so your simple code can work is way more complex than most things you do every day. But ya, keep telling yourself how smart you are as you write "connect to database, return a value" for the 1000th time.
> the anecdotal evidence I’ve gathered has been that the conference are heavily attended by operations, and less attended by developers.
Most Ops people - fuck, even most actual DevOps Engineers - have no clue what the fuck DevOps is. They (rightfully) assume it's just a trendy new word for the same old Ops bull, but in the cloud and with Terraform.
DevOps failed because it never educated anybody except a very small handful of people who actively were looking to solve big organizational problems. It was too many things to too few people. It could succeed only if you brought everybody in the entire org through three different training courses. And that's because DevOps tries to make Operations uplift literally the entire technology organization.
DevOps is dead. The ideas are great, but we need to bring the ideas to people outside Ops. Until then it will just be a slightly-more-technologically-advanced Ops.
(disclaimer: I am a DevOps Engineer that hates the fact that this is my title)
My take is different. I think DevOps was wildly successful, most of our infrastructure is now software that can be managed by Software Engineers. The goal posts have shifted, we now have major software challenges where as before we had hardware and operational challenges.
Well written tools and cross-functional teams that do both operations, feature work and security are still the path forward IMO, we just need to refocus on developer experience.
We tried to hire senior devs to do DevOps work, but the ones that can pass the interview and have already been at a DevOps shop are too smart to be fooled a second time.
We still use all the DevOps buzzword stacks, but we stopped doing dev ops. Instead, we are building out a really good ops team. This makes it possible to hire developers again.
Personally, I'm one of the better ops people on the developer side of the fence, but I'd need at least 2x typical principal engineer comp to take another job at a DevOps shop, and I also wouldn't get even a quarter of my normal productivity.
At that point, you may as well just hire a junior fresh out of undergrad and burn the rest of your cash.
As a software engineer I don’t want to touch your infrastructure code. I have been lucky so far and I have been doing pure product development instead of being a devops. I do believe though in the idea of cross functional teams: designer, developer, infra engineers, managers.
And now 50% time software engineers are writing infrastructure. Don’t know what is solution but cloud-native landscape has increased cognitive overload.
The point is obviously to pay fewer people to do more work. What I don't get is when developers themselves are in favor of it, as I constantly see with this devops stuff. There's no way they have any kind of life outside of their job. There's no way they have a wife or children; otherwise I simply don't believe for a second they would be in favor of "developers own all the things, yay!"
> What I don't get is when developers themselves are in favor of it
If you're comfortable with AWS and have built things as a software developer, it becomes clear very quickly. These things are intrinsically linked, and pretending they aren't is just kicking the can down the road until you have to solve some non-trivial problem.
There have been huge innovations and value-adds over the last 10+ years in cloud and serverless, yet everywhere I've worked that silos "DevOps" from devs has already baked in the culture that devs can just avoid knowing anything about AWS, that DevOps will be the gatekeepers, and that devs can just work within the "lowest common denominator" box of tooling that those gatekeepers think is appropriate. Meanwhile, infra costs are skyrocketing but it's all good because we're mostly "cloud agnostic."
I don't want to just be closing Jira tickets. I want to actually solve business problems well. And to do that, I don't want to be constrained to someone else's "box," throwing code over the wall to them, and hoping for the best.
I get paid pretty well fixing bugs and writing code, that stuff you call "just closing Jira tickets" and "throwing code over the wall". It's also enough to fry my brain and leave me exhausted. But it's obviously worth next to nothing in your view. So yeah, I don't care. I can't figure out if you're going to have to find people a lot smarter than me to do what you're looking for, or people a lot dumber than me.
> It's also enough to fry my brain and leave me exhausted. But it's obviously worth next to nothing in your view.
It's fine up until you have to solve hard problems. Once you need autoscaling or more of a datastore than you can get from vertically scaling a relational database, your brain will be REALLY fried trying to solve those things without touching anything at the Kubernetes or AWS layer.
Or it won't really be fried, because it won't really get solved. That's the pattern I've seen more often: just keep scaling up the CPU and RAM for individual containers / instances because devs can't solve it without DevOps, and DevOps can't solve it without devs. Cloud costs keep going up, and the problem's not really solved, but at least nobody had to understand more than they wanted to.
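For reference, the "actually solve it" path the parent alludes to usually looks like horizontal autoscaling rather than ever-bigger instances; a sketch with made-up names and thresholds, assuming a Kubernetes Deployment is the thing being scaled:

```yaml
# Scale replicas out on CPU pressure instead of scaling one pod up forever.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Getting this right needs both sides: devs who know the app is stateless enough to scale out, and ops who know what the metrics mean, which is exactly the gap the comment describes.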
+1 tiny teams run infrastructure that previously was responsibility of whole departments at bigco and things mostly work well. Then as soon as they face some minor, totally solvable issues everyone loses their minds.
I disagree; I move in different circles from the author. In my little world, DevOps is often more dev-heavy (and needs to mature its ops credentials), while SRE has become the new-age Ops.
DevOps is primarily tackling a people problem, there’s certainly plenty of useful tools to help but at its core, it’s about people.
Encouraging people to work in frequent small increments (think XP) rather than quarterly releases. Getting rid of the change management bureaucracy, encouraging developers to consider security as they're designing and writing software. DevOps is a broad church, and it's effective too.
People like to over-focus on the tools of DevOps, but I'd take today's world of version-controlled configuration (vs. "ooh, Brad has the config GUI code on his desktop, speak to him about making a change"), automated CI pipelines (vs. builds at the end of the quarter that require stop-the-world help and assistance from all developers; also, RIP merge guy), and automated deployment into many environments (vs. an Excel list of tasks for the QA team and a different Excel list of tasks for prod ops). I'll take SOA over the the-DB-is-the-network 00's enterprise software architecture, and with automated DB deployment too (albeit still without a real solution to automating stateful DB rollback even today). These are all DevOps factors, from architecture to testing, from security to working and growing together as a team.
The tool-only view of DevOps is just yet another excuse to ignore the hard problem: helping large numbers of people work productively together.
It appears that you and the author both agree on the core definition of "DevOps" as a term.
Mr. Briggs' article has two main points:
* convincing developers to do operations is too hard, and
* what they actually need is another type of engineer to build and manage an operational platform for them
With regards to the second point, that already exists: platform engineering.
The first point is messier. I disagree with the point, but it also misses the point. Like you say, the real goal is to help large numbers of people work productively together. That's a nebulous goal, though, and difficult to guide people towards. I think that's why people are confused and frustrated by "DevOps" even this long after its conception. It's also why people will continue to be confused and frustrated by it long after the term itself has gone away.
Devops is a failure only when you have the wrong people on the job. Which is nearly all the time because only 1 in 20 engineers has the ability to do both operational and development work to a standard high enough not to put your organisation at risk.
If you have several of those 1 in 20 people they will self organise into a methodology which roughly resembles it.
It doesn't help that, at least in my experience, a lot of "pure" software developers absolutely want nothing to do with the ops side, preferring to work on new features and just code, and vice versa.
I happily also work on bugs, not just new features. The thing is, bugs and features completely fry my brain, especially if you want them done right without introducing bugs somewhere else. Forget devops for a second; what I'm saying is that what I described is _already_ enough to fry me. I'm done. You're getting nothing else out of me. When you hire an electrician to wire your house, ask him to also do the plumbing while he's at it. He should really "own" the entire house stack. Let's see how he takes it.
A DevOps position is partially a sales role. Telling devs that their well-known and understood scp workflow is bad, and that instead they should use this tool they don't know to look at logs and this completely different tool to deploy the app, is bound to fail. If you want to change the way developers work, you need to show them why the new way is better and make the switch as painless as possible. And if you can't do this, you should think hard about why you want them to make that change.
Telling developers to [do the ops team's job, the way the ops team would, assuming the ops people were actually doing their jobs and not just swamping the dev with weird high priority but boring bugs] is an extremely hard sell.
I greatly prefer it when the ops team tracks the time they spend on incidents, then stack ranks the ways in which my software sucks (in terms of hours of their time wasted), then asks me to fix it.
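That stack-ranking is trivial to mechanize; a toy sketch, where the data shape (component, hours-wasted pairs) and all names are invented for illustration:

```python
from collections import defaultdict

def stack_rank_incidents(incidents):
    """Aggregate ops time lost per component and rank worst-first.

    `incidents` is an iterable of (component, hours) pairs -- a
    made-up shape standing in for whatever the ops team tracks.
    """
    totals = defaultdict(float)
    for component, hours in incidents:
        totals[component] += hours
    # Worst offenders first: these become the fix requests to the devs.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

incidents = [
    ("billing-api", 6.5),
    ("auth-service", 2.0),
    ("billing-api", 3.5),
    ("search", 1.0),
]
print(stack_rank_incidents(incidents))
# [('billing-api', 10.0), ('auth-service', 2.0), ('search', 1.0)]
```

The same aggregation works for the $/perf discussion: swap hours for dollars and the ranking falls out identically.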
The same thing goes for $/perf discussions.
I don't care if they want to SCP or ftp or send my code over a QUIC tunnel through a bastion host on the moon.
If developers have to know there is a new or old way of doing that, then the CI/CD team has failed.
If you want me (the dev) to fix CI/CD, then I'll happily reimplement it with make and a fault tolerant NFS share. I'll even explain why that has been the best available approach since ~1985.
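The "make and a fault-tolerant NFS share" CI/CD the comment half-jokes about really does fit in a dozen lines; everything here (paths, targets, scripts) is hypothetical:

```make
# Minimal CI/CD: build, test, then copy artifacts to a shared release dir.
SHARE   := /mnt/nfs/ci
VERSION := $(shell git rev-parse --short HEAD)

build:
	$(CC) -o app main.c

test: build
	./run-tests.sh

deploy: test
	mkdir -p $(SHARE)/releases/$(VERSION)
	cp app $(SHARE)/releases/$(VERSION)/
```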
I actually prefer to "just code" by cleaning up old messes than by working on new features, but regardless - I absolutely want nothing to do with the ops side, and I will quit on the spot if anyone tries to hand me a pager. It's a different skill set; it's not what I do, nor want to do.
> It doesn't help that, at least in my experience, a lot of "pure" software developers absolutely want nothing to do with the ops side, preferring to work on new features and just code, and vice versa.
I have almost the exact opposite problem with some of my team, where they want to spend a lot more time on the Operational side. This is a mixed blessing - they'll learn the depths of various AWS things that are painfully boring to me - but I'd rather they were writing code instead of mucking around looking for some magic AWS solution to solve some specific use case.
It's true that, from a business perspective, if there were a way to completely cut out engineers and replace them with low-code / SaaS offerings, we would all be out of a job next week.
Fortunately for us, engineering is quite complicated.
The problem with Docker and friends, from my perspective as a developer, is that it's a whole other skill-set you have to learn, keep up with, and context-switch into and out of. I think that's why it usually ends up being its own job: maybe those technologies bring some order to a huge messy problem-space, but they don't make the problem-space simple. The abstraction is way, way too leaky to use it without knowing and thinking about all the ins and outs of everything that's going on underneath. And you can't even just use the existing knowledge you might have of scripting and systems: you have to learn whole new technologies on top of those. And as a dev, who's already thinking about all the ins and outs of my own whole complex system, that just kneecaps my ability to stay in a headspace where I can get things done.
If you want me to do my own ops, they need to have a Heroku level of simplicity. That's what's required for them to not detract from my other work, and not drive me to burnout. Anything less, and I'm going to just keep tossing things over the fence.
I 100% agree with this. I love Docker, Terraform, etc. because it's much closer to the world I usually inhabit than patching drivers on physical servers, but I find it frustrating that once every month or two I'm asked to solve some issue that really requires the specialized knowledge I'd only develop if I was doing DevOps frequently. Instead, I get to the point of realizing there's some issue deep in the AWS networking stack and wish that I didn't have to spend the next day fiddling with subnets (and then not taking away anything tangible to help me solve the unrelated issue that'll crop up in two months' time).
I think a lot of good ideas in tech get communicated to people who want a cargo cult and don't understand the underlying good ideas until agile becomes a flavor of waterfall, test driven development becomes development with tests, and devops becomes just another wall between developers and their end product.
The actual job title was meant to be “Agile Systems Administrator”.
People got the conference confused with the “10 deploys a day” talk from Flickr.
DevOps is a failure because it's a meaningless term that changes depending on the bearer's own understanding; the issue is that everyone is right. It's a nebulous idea.
I wrote, rather frustratedly, about this. I even did the research and learned something new myself at the time.
Trying to do DevOps without any [sysadmins, network engineers, DBAs, security specialists] is like trying to write an accounting system without any CPAs, or manage a chemical processing plant without any chemical engineers. Reading the books as a substitute for experts with experience only works a little way.
The point of DevOps was to take that experience and multiply its effectiveness through the tools of modern software development.
In practice, modern DevOps is more like having chemical engineers set all the knobs at one plant.
Instead of reading the books or hiring more chemical engineers, you have a computer copy the knob settings from the old plant to the new, unrelated plant.
It's interesting to read the proposed "SoftOps" approach; it's something I've been trying to do in my own little bubble, working on making the lives of developers easier (especially when getting their code from repo to prod), with the developers creating systems.

I volunteer for a convention run entirely within VRChat, with a handful of supporting services (an API for tracking players, instances and registration, frontends for admin and attendees, and so on). While the department is named "DevOps" (too late to change it, and everyone knows what we mean), I landed on the infrastructure subteam. Part of the job was building out our new Kubernetes cluster (mostly for scaling; I'm still working on the blog post about this sort of stuff), but I wanted to make sure that the developers could be completely abstracted away from _where_ their code was running, and as much as possible _how_.

Of course, they're free to mess with whatever tooling I set up for them (mostly Docker images, GitHub Actions, etc.), but I've found it very fulfilling to "exist to serve" (because infrastructure? servers?) the developers pushing out code. If we need some custom internal tooling to support that development cycle, I'm more than happy to get something written and deployed (something for the blog post). It may be barebones, but I wouldn't stop a developer from another team contributing (provided they don't have something more pressing).
All this to say... I really like supporting developers in their work, as a sort of meta-engineer.
edit: worth pointing out I do something sort of similar at my current employer, but it lands a bit more on the type of devops that the author describes as having failed, and I'm actively working to see if I can find a position that better suits my skill set!
Just use the cloud, honestly. "No, but there is this weird thing we do, and we need custom yada yada." No, just stop: conform to the cloud, stop being a snowflake. "It's so expensive!" No, it's still cheaper than on-prem plus paying dedicated teams to manage it. Just use the cloud in a sane way. Teach your devs with the infinite guides and docs freely available online. Your life will be easier. If your devs aren't interested in your infrastructure, they are bad devs.
It's the same as many other things in software - if all you care about is writing software, then yes, you are not a good developer. You need to have an appreciation for the environment it runs in, interfaces between you and other teams, the key business requirements, the constraints it's under, etc. etc.
Critic: Even if Helm and Helm charts make it easy to install a set of apps into a Kubernetes cluster, surely some actions are more complicated than a simple install? What about upgrading a Postgres database?
Advocate: Way ahead of you! We worked out that problem a long time ago! You just use a Helm Operator! It’s all really simple!
Critic: Really? And this solves all the problems of upgrading Postgres in a stable and reliable way?
Advocate: Uh, well, it’s supposed to. It’s, uh, all really simple?
Critic: Supposed to?
Advocate: Sure, so long as you have a working Go environment, you just use the Operator SDK to generate the scaffold for your Operator.
Critic: The scaffold?
Advocate: Sure, the scaffold sets up the basics, figures out the permissions, the dependencies, everything needed to install the Helm Chart. Then you can build the Operator container, and install it in your Kubernetes cluster. It’s all really simple!
Critic: This sounds complicated.
Advocate: Way ahead of you! The good folks at RedHat knew people like you were going to whine about stuff, since you obviously like to whine about stuff, so they created the Operator Lifecycle Manager to make all of this a lot easier.
Critic: Doesn’t it seem like we keep piling new technology on top of new technology, to manage the excessive complications of the previous layer of technologies?
Advocate: Hey, think of the alternatives. You don’t want to go back to the bad old days of the past, do you?
Critic: You mean, the bad old days when stuff mostly worked and I didn’t have to learn 3 new alpha technologies each day?
Advocate: That’s a ridiculous exaggeration! Some of these technologies are beta.
Critic: And you seriously regard these piles of code, heaped upon piles of code, as an improvement on the old situation?
Advocate: Are you kidding? It’s like night and day. I’d rather drink arsenic than get dragged back to the bad old days when I had to write Ansible scripts. We live in the future now. We’ve escaped the old world where every attempt at devops became a painful, confusing, unmaintainable disaster after 2 years.
Critic: How long have you been using Docker/Kubernetes in production?
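For readers curious what the Advocate's "really simple" scaffold step actually involves, here is a rough command sketch, assuming the Operator SDK's Helm plugin (the domain, chart, and image names are illustrative, and exact flags vary by SDK version):

```shell
# Generate a Helm-based Operator project (scaffold, RBAC, dependencies).
operator-sdk init --plugins=helm --domain=example.com
# Wrap an existing chart in a new custom resource type.
operator-sdk create api --group=database --version=v1alpha1 --kind=Postgres \
  --helm-chart=bitnami/postgresql
# Build the Operator container and deploy it into the cluster.
make docker-build docker-push IMG=registry.example.com/postgres-operator:v0.1.0
make deploy IMG=registry.example.com/postgres-operator:v0.1.0
```

Each step assumes yet another working toolchain (generated Makefiles, a container registry, cluster admin rights), which is rather the Critic's point.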
Mechanical engineers had DevOps figured out long ago. They call it DFM - Design For Manufacturing.
It doesn’t mean every designer has his own metal sheet press on his desk. It means he understands his component is used in a bigger picture, should fit into the robot assembling it, use the same type of plastic and standardized screw as all other components, and so on.
Same goes for software, running your own shit sounds romantic and all, but having every developer in your company picking their own monitoring stack or understanding the intricacies of kubernetes networking is really not effective. Some central guidance defining the standard screw head to use is needed here as well.
I agree that DevOps is a failure, but not for the reasons that OP states. Or maybe exactly for the reasons OP states.
People that would traditionally be called Ops were turned into DevOps overnight, and felt like they suddenly had the mandate to try and barge into development, offering true, but ultimately misguided help.
If you haven't asked me what I need yet, don't bring me a platform that is "going to solve all my problems".
I imagine this is why the developers in OP's story do everything themselves. At least they're able to fix something if it breaks, instead of having to rely on the ops team that is off chasing the next shiny thing.
Tired of working on infrastructure stuff while holding the “software engineer” role. Give me product features, bugs, interesting and boring product problems to solve. Keep your yaml, k8s, gitlab ci/cd, terraform, on call rotations. Or better, hire a dedicated infra engineer to handle all that stuff.
Both DevOps and Front-End hysteria are enormous. My conspiracy theory is that FAANG+MS does this to burn down resources and prevent anything interesting created outside of their sphere of influence. In one workplace I was barely keeping my head over the surface in the Angular+NGRX+Observable cesspool then started to talk about corresponding infra and the backend person was all like "let me configure and provision an autoscaling geodistributed self-healing supercluster". KILL. ME.
DevOps is a failure ---- where you work and how you see it.
Just because you play bs bingo for naming things does not change your internal processes, and no, I don't mean the "culture". The need for the term "SoftOps" because they are "for developers" just tells me that your org probably has a huge communication problem, and that is bad to begin with.
Look, we use terms like "DevOps" Engineer to explain that we do all things around the stack, but serving the devs is only a tiny, tiny part of this. First and foremost our duties are to the CTO and the Vision. Now the CTO should and does appreciate and use the input of his team to perfect the vision, but you get the point.
OPS isn't there to serve you or to limit your spending either; OPS should take care of compliance in all areas needed and keep the operation of the org running.
DevOps ___for me___ means that the person I am working with can handle the software in the context it's being brought up in. That includes people operations just as much as the workflows we put in git. It should have been the next level of the Senior Programmer, but since we are truly lacking programmers who actually understand the stack, the code and the surrounding requirements, we forget what the term really means.
"DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality.[1] DevOps is complementary with Agile software development; several DevOps aspects came from the Agile methodology" (wikipedia.com)
It's not a synonym for "dude who sets up your cluster".
From a developer perspective, DevOps is when you fire all the competent operations people, and then force developers (that chose not to be in ops) to operate the code they wrote.
This burns out developers (bait and switch tends to do that), and with silicon valley's standard 4 year vesting cliff, the people that wrote the software just quit and go elsewhere (or get promoted out of the team) about a year after whatever system they built hits prod.
Now (since hiring competent ops people isn't shifting left or right or whatever), you need to hire more systems developers, and pay them hazard pay to operate other people's abandoned garbage. (It is garbage because the previously burnt out group of systems engineers was constantly distracted by things they do not care about. They were constantly distracted with prod issues because they are bad at ops.)
I've worked in devops teams, and with competent operations teams. The latter ends up creating a much higher quality of life for the developers and ops team and also a better product.
In related news, hiring an optometrist to clean your teeth is a bad idea, even though they went to med school.
I've decided "full stack" is just the 21st-century version of what we called "web developer" in the 20th, which usually meant "one person doing the work of two or three". Except now the expectation is to know a lot more.
In the 20th century being a "web developer" was a reasonable thing. One person could pretty much do it all without much of an issue, at least for smaller sites. Today, with the countless frameworks and stacks, mounting complexity, and sprawl of code across dozens or hundreds of files... it no longer seems realistic.
This is a very good point. Companies would love to hire someone doing the work of 3-4 dedicated people for the salary of 2.
As a DevOps Engineer I pretty much am required to know the whole ecosystem of:
- JavaScript
- Java
- Golang
- Python
- C/C++
On top of that, companies often expect you to also be at least an associate architect for:
- AWS
- Azure
- GCP
And of course to be a security expert who can write an OAuth2 implementation in any of the languages mentioned, all while performing security audits for security certifications!
I prefer to blur the lines and work with cross functional teams where everyone owns everything. On my current team our front end engineer is able to fix infra, our backend engineer designs and deploys infra. No dedicate devops needed. We use TypeScript for everything: AWS CDK, TypeScript backend, React (TypeScript) for frontend. These are relatively small systems though.
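For context, a minimal sketch of that "TypeScript for everything" setup, assuming aws-cdk-lib v2 (the stack, handler path, and names are illustrative, not the commenter's actual code):

```typescript
import { App, Stack, StackProps } from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import { Construct } from 'constructs';

// One stack, maintained by the same cross-functional team that
// writes the backend and the React frontend.
class SmallSystemStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Backend: a TypeScript Lambda bundled into dist/backend.
    const handler = new lambda.Function(this, 'Handler', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist/backend'),
    });

    // Front door: API Gateway proxying everything to the Lambda.
    new apigateway.LambdaRestApi(this, 'Api', { handler });
  }
}

new SmallSystemStack(new App(), 'SmallSystem');
```

The appeal is that infra diffs go through the same TypeScript code review as feature diffs.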
That sounds really fun, because it's not enough to absolutely fry my brain every day fixing obscure bugs or developing new features. I'd like to also fix the infrastructure. I might also be able to handle the responsibilities of a DBA. My wife and newborn son will understand why I can't stop working until at least 7 pm every day and why I yell at them, because the stress never stops and only seems to be getting worse.
At a certain point operations becomes development and vice versa.
When you really need to be good at managing computers and servers, you have to understand how those things work. At a certain point you're literally writing shell or even Python scripts.
When you need to be really good at programming computers and servers, you need to understand the underlying OS and hardware, for maximum performance and to solve particularly-annoying bugs. At a certain point in optimizing your development and configuring your code package / repository / etc. you're basically doing ops stuff.
Of course there are people who focus on developing and not really managing their systems, and people who go deep into managing the systems and not really learning to code. But I would imagine any half-decent ops person has at least basic knowledge in software development. And while there are people who can truly be both full-time developers and full-time operations, they probably don't do both in their job.
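The "shell or even Python scripts" the parent mentions are often as small as this hedged sketch (the monitored path and the 90% threshold are arbitrary examples):

```python
import shutil

def nearly_full(path: str, threshold: float = 0.9) -> bool:
    """Return True if the filesystem holding `path` is above `threshold` used."""
    usage = shutil.disk_usage(path)
    return (usage.used / usage.total) >= threshold

if __name__ == "__main__":
    for mount in ("/",):
        if nearly_full(mount):
            print(f"WARNING: {mount} is over 90% full")
```

It is ops work, but it is unmistakably also programming, which is exactly the convergence being described.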
I don't think DevOps can really be a failure anywhere as it isn't old enough or solidified enough to have a universal definition and any group of activities itself can be mis-performed regardless of what you call it. That includes being a bad webmaster and having a system operations department that isn't able to actually deliver anything.
If Development and Operations intersect for some activities, and sharing responsibilities for those activities helps, then it's a win. That includes developers not having to wait for some resource to be provisioned because they are "not allowed to click the button themselves", and operations people not having to do the same "create resource" task all day long.
The worst thing about DevOps engineers having Dev part of their title is that they started to believe they know better than the devs what needs to be done and how.
Oh you need a queue here.. Nah you don’t need Kafka..
The Author here is right.. first listen to your customer - the developer
The point of the DevOps movement was to unify Dev and Ops, not rename SysOps to DevOps. The Devs aren't the customer of DevOps, they are the people who are now doing Ops as Ops and Dev were to be merged into a single DevOps, with developers simply coding the operations in the same way they code everything else in the product.
Every company has a different definition of what DevOps is and the success metrics surrounding it. I think the main principle should be though that you help developers move at their own pace. The way you do it will be different at every company. Some resorted to creating internal tools or using something like Backstage. Others literally forcing developers to learn AWS and Terraform so that the developers know exactly how their applications run. I think nowadays, there is more benefits for developers to have experience on how their application get built, deployed, and ran.
excellence in software engineering isn’t something you can buy. a university can’t bestow it. a title or certification can’t assert it.
it’s something one grows over time because they are passionately curious. it grows because they love to explore, to discover, to understand.
there isn’t a product that can be purchased and consumed that makes you thinner and fit. software engineering is no different. the only way to grow is to exercise, and the best way to exercise is to love doing it because it is fun.
manipulating metal, cloud, laptop, browser, and server are not very different. all can be made dumpster fires. all can be made magnificently simple, or magnificent in some other aspect. all are discoverable, and easy to experiment with. all are at your fingertips, cheaper and more accessible than ever.
DevOps is a product, best to avoid the purchase.
devops is a fantastic idea. one should be good with infrastructure, because that is where code runs. one should be good with code, because that is what runs. one should be able to reason about their systems from top to bottom and inside out. how else could one simply intuit what is wrong, and make it less so?
excellence in software engineering is about seeking out that which is ugly, and making it beautiful. seeking that which is hard, and making it easy. seeking that which is impossible, and making it hard.
why would one accept any arbitrary boundary, and think that this reality applies only on one side of it?
where we’re going, we don’t need roads. we need curiosity, imagination, and fun.
A bad devops team becomes what I call "dev obs" which is short for a phrase I made up, developer obstructions. I use this to describe devops teams that have forgotten (or maybe never knew) their primary purpose and instead do their own thing independent of the developers and business they are supposed to be supporting.
This can include:
- Upgrading, replacing, deprecating, turning off key parts of infrastructure without any notice or warning
- "Scheduled" changes to systems that are scheduled nowhere but their own minds and then become angry when this is pointed out
- Turning off logging because "it's expensive to store logs" (get a better logging system then) totally ignoring the huge costs of production failing with no visibility as to why
- Enforcing their own ideas of unit test coverage and the like without speaking to the developers, as if they know more about it than the developers
- Enforcing package management security in a way that simply isn't compatible with a particular language's tooling leading to the scenario that all deployments, even security hotfixes, cannot be deployed without needing to escalate and get into arguments
- Taking a generally antagonistic and hostile approach to communicating with the developers rather than again supporting them
> Taking a generally antagonistic and hostile approach to communicating with the developers rather than again supporting them
This point is very ironic given your entire comment.
It sounds like many of your complaints may be due to you not understanding the requirements that another team may have. You're complaining about costs and requirements that you are trying to impose on another team ($$, lax security).
As an example, logging can get extremely expensive. A "better system" can't always fix an ungodly amount of unnecessary data.
It might be best if you try to understand why they are working the way that they are. Trust me, they are not doing unnecessary work just trying to fuck with you.
From a developer's point of view, it seems to me like the dominance of cloud and containers forced the change to devops. Hitherto your org would have a CM team and a separate group of admins/Ops for your deployment infrastructure. Much of it was manually done (follow procedure docs rather than scripting) and software released on a slower cadence.
So at the risk of seeing the past with rose-colored glasses, I'd say things were more straight-forward, less reproducible, slower, and (for developers) simpler.
On the subject of bringing developers to the table: At a previous employer the DevOps folks replaced the CI/CD tools we were happily using and broke my team's integration testing setup. The new tool didn't support what we needed and had been using successfully. Right at the time we were building out new services that depended on having solid integration tests.
I was DevOps at my last job but accepted a new job just as a developer. When I was DevOps, I was also system admin, SecOps, etc. I was always on call and had no real backup. Luckily most things I had set up just worked, but honestly it wasn't worth the stress. It wasn't constant stress, just a sudden surge of stress during certain deployments or problems.
Eh, I can make a nice and similar argument against "full-stack" developers. I'm one of those, but I specialise in the front-end. Because that's what I'm really, really good at.
When I see "full-stack" engineers do their thing in the front-end I almost always spend an outrageous amount of time on their pull requests. No semantics, unnecessary CSS, no accessibility features, not using the right tag for the job, not familiar with browser APIs, not familiar with cross-browser concerns, nesting tags that shouldn't be nested, not familiar with paint/composite/layout and other performance tools, using `grid` where they should use a `flex` and using 3 layers of nested `div` tags where zero would suffice.
If they suck this bad at the front-end, as a supposed "full-stack" dev, I can only imagine how little they also know of the back-end, let alone the operations side of things, not to mention testing, CI/CD, etc.
There have been a few clear changes over time. It used to be that you delivered software to the ops team who then took it upon themselves to create servers to run that software, package up the software, testing it, etc. Delivering software usually went in the form of an email suggesting that this or that person might want to put software on a server or use some pre-built artifact.
These days software teams deliver ready-to-run software, usually in the form of some containerized binary produced by a continuous integration environment. No ops team is involved with deploying that container either, because the deployment is fully automated. The CI/CD environment is typically provisioned via SaaS. We use GitHub Actions for example. Our deployment process is "merge to the production branch".
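"Merge to the production branch" as an entire deployment process can be sketched as a GitHub Actions workflow like the following (the job layout and deploy script are hypothetical placeholders, not the commenter's actual setup):

```yaml
name: deploy
on:
  push:
    branches: [production]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build the containerized binary the dev team delivers.
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      # Hand off to whatever automation ops provided; no human involved.
      - name: Deploy
        run: ./scripts/deploy.sh myapp:${{ github.sha }}
```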
The role of ops has been reduced to providing the automation that does several things:
- provisioning the infrastructure, typically via some IaaS platform like AWS, GCP, etc. with some automation (Terraform, CloudFormation, Ansible, etc.)
- automating the deployment of software on that infrastructure. This normally is some kind of one liner in a CI environment that triggers the right scripts to run.
- automating the building of the software
- dealing with backups, monitoring, etc.
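The first bullet, provisioning via IaC, looks roughly like this in Terraform (a hedged fragment; the region, AMI, and names are placeholders):

```hcl
provider "aws" {
  region = "eu-west-1"
}

# One application server, reviewed and versioned like any other code.
resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI id
  instance_type = "t3.micro"

  tags = {
    Name = "app-server"
  }
}
```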
For small teams that work is easily handled by a specialized person for whom this isn't even close to a full time job. I know because I'm that person but I mostly do other things. In fact, I get really grumpy if I have to spend time on this stuff because it is a bit of a time-sink and it blocks me from doing more interesting things.
Ops teams still exist of course, but they are employed by IaaS providers or companies that still run their own hardware. They don't get involved with how software is packaged and deployed anymore. They just keep the infrastructure up and running. Most senior engineers know enough to automate most of the above. And becoming a proper devops engineer just means stepping up and learning how to do all of it.
"DevOps" became a job title because the level of knowledge required is enough to warrant one. DevOps Engineers need to know their stack well enough to respond quickly to production issues. It's not enough to live at a high abstraction layer when you're responding to non happy-path events.
On top of sysadmin and networking knowledge DevOps-ers need to know a bunch of specialist tooling (Ansible, Terraform etc.), need to know the ins-and-outs of their cloud provider AND learn Kubernetes which is an extra abstraction across cloud providers and comes with its own complexity and quirks.
All this does make me wonder how far we've come from ten years ago, when we first thought of Docker as useful. Is it really easier to run a backend now?
> If you look at a DevOps engineer job description, it looks remarkably similar to a System Administrator role from 2013, but with some containers and cloud provider management instead of racking and stacking servers.
This is true of many new methodologies/philosophies in software engineering. Many orgs implement Agile in a way that looks more like waterfall than something described by the Agile Manifesto. Remember microservices? Many large orgs that implemented them ended up building gnarly "Enterprise SOA" messes.
It seems the old guard will always bend their deeply-ingrained habits only enough to stay relevant. So a lot of the time, that means carrying forward the old habits and diluting the philosophy of newer methodologies.
This sounds like people who say cars from the 70s are better because they are only mechanical and people can work on them with their own hands. And modern cars are worse because they're so complex. This, of course, ignores that cars are better in basically every single way now. They're safer, more efficient, have more features, and admit it or not, more reliable.
So yeah, pre-devops, managing infrastructure could've been simpler, but it did simpler things. There's an exponentially growing amount of data that servers need to handle. And 2005 tools can't handle it. The industry didn't scale out of it for fun, but necessity. Modern devops is more complicated, but it's better, so, whatever.
I think the overhead here is not devops itself. Docker, and tools like Ansible that have evolved since then, are really much more useful than the scripts we had back then.
The overhead is paying thousands for AWS/Azure/whatever cloud services for your little internal company server or small SaaS when you could just get some on prem servers and host it yourself. A lot of people have been sold on proprietary, fancy stuff like ec2 and lambdas that lock you in just to suck more money out of you.
Of course, failure redundancy should be also worked into this, but even hosting in 2 locations is much cheaper than feeding the hungry cloud beast.
I already "knew" this, I totally agree with it, but I still open job positions in my company for "DevOps engineer". It's the easiest way to get ops people that are more focused on Kubernetes and CI/CD.
Every time we standardize mediating layers of leaky abstractions we are creating novel territory that accrues side-effects. Once there’s enough people working to manage these side effects, experts, products and services will emerge to promote the legitimacy of this new layer. There have been many attempts to solve this by revisiting first principles, but most of us are oblivious and/or too vested in one of these layers or too averse to disruptive changes. Hence, when we run up against bottlenecks and side effects we deploy more leaky abstractions and the cycle continues.
Same as some ppl waking up every day to run a mile, then drink an energy drink and drive their kids to school -> being a real DevOps takes effort, commitment and a rigorous routine.
After some time you get used to it. But not a lot of ppl can actually get to that point.
Someone asked me how to grow DevOps culture in a company -> it's like asking "How to make ppl run 1 mile every day?".
You cannot "make" ppl do that, you have to find correct ppl. And here is the reason why DevOps as a culture failed in the real world.
Big part of my job is convincing the bosses we don't need X new shiny thing... we still waste years chasing random tech that doesn't benefit the customer at all though
I simply quit a job within 2 months after they pinned devops duties on me in addition to software development. And I've yet to see proof that this can be done by a single person.
I work in a team of 45 devs. We develop, deploy and maintain our code. Maintenance is done on rotation, everyone is responsible for deployment of their code all the way to prod and we maintain the CI/CD pipelines ourselves. K8s is the only managed service we use on aws. On Azure, we manage our own K8s. It's not "a single person" but as a team, we all do it.
The problem with anything like this where one person has multiple “roles” is that it is hard to excel at one or the other and very very easy to not be good at both.
Unless you are given time to spend on both roles independently then you end up cutting corners, it is inevitable as one or the other will be more important at one time and you can’t be in two places at once!
> The problem with anything like this where one person has multiple “roles” is that it is hard to excel at one or the other and very very easy to not be good at both.
In the immortal words of Ron Swanson - never half-ass two things. Whole-ass one thing.
I've worked in places that have done just that: essentially developers who are operations but focused on dev. They hated being on call, anxiety went through the roof, they burned out and eventually left the company. It was rolled out to a subset team, not to all.
These were seasoned devs the company has now lost, and the codebase is dysfunctional and has put the project/roadmap at risk, along with company revenue.
The thought of having to do Ops, write code and be on call gives me anxiety and panic attacks.
Systems administration and operations are tightly scoped. DevOps is, in my opinion, an umbrella term representing all non-core engineering work: servers, build pipelines, on-call response, infrastructure, and more.
The tricky part is that a lot of these areas are relatively new, and the older ones like server admin have changed dramatically over the past two decades.
DevOps hasn't failed. DevOps is in its infancy and is going through technical, strategic, and philosophical growing pains.
DevOps has pivoted I think from the more original goal of having Devs do a lot of Ops things. This isn't really a viable solution as almost every compliance system mandates the people writing code and the people deploying/running systems are different.
What we have now is Ops that use things like source control. Write scripts that are more modular, reusable and composable. A new application can have cloud resources created and allocated in a day rather than a month of tickets to several other teams. This is also granting visibility to things that Ops may have been doing, but it was scripts on a PC or build server that only Ops had access to.
The “culture” and “movement” was marketing by someone selling conferences and books. “What is your devops culture” has to be one of the most asinine questions of the last 10 years.
It all described a way to get to continuous deployment and IaC, all of which suffers from entropy the moment someone decides that is how it’ll be. None of the tools available to do this lofty goal are are really up to the task.
There is good, there is bad. A failure it is not but it is not a success either.
I've always looked at DevOps as Development absorbing Operations, but never through the lens of the other way around.
We develop, we ship, we maintain. Never had any real problem with it, we manage the infrastructure by writing a lot of yml files, some of them compiled by templates. Anyone who wrote software is expected to also write the infrastructure that supports that software. And the team as a whole take care of operational duties.
I believe a large part of the devops culture came from the early clouds. It was difficult for ops to fix anything without involving developers, and for developers to troubleshoot basic things like network issues, because of the poor quality and visibility of the cloud services. The clouds are much better today and require less skill to use. The unfortunate consequence of this seems to be that more ops tasks end up being done by devs.
It's so sad to see hyperbolic articles like this saturate HN.
DevOps was a stepping stone and it had many lessons to teach us just like the traditional IT before it and just like the roles that will come after it.
It was a success some places and a failure others but in general it has been an improvement over most companies processes and to say otherwise is just being salty about how it worked for you at your particular job.
I think the point is that programmers implement automation and operations. Ops cannot fix program bugs. Programmers need to think about monitoring and automate deployments.
Kubernetes and microservices solves some problems but adds complexity. Distributed tracing vs local debugger.
Lots of yaml. Kubernetes Network abstractions overlays where pod network is non reachable from dev machine without proxy etc.
Remembering the ideas from DevOps at its inception, the point was to remove the friction between developers and operations: have operations stop saying “no” and developers stop saying “yolo lets ship it”.
Well, that's pretty much doomed to fail by design. If you are on the engineering team and not writing features that deliver direct value to the company, wtf are you doing? You're damn right as a dev I expect to "Yolo" it if you're getting paid to make operations safeguarded, streamlined, and resilient to downtime. I don't literally mean I intend to be sloppy, and there are some great tools (esp. on CI and alerting frameworks) that the DevOps culture built, but I can buy that as a service. If you are on DevOps and you are not accelerating devs in the only way that matters to the business (velocity of features with fewer bugs), then what the hell are you doing? Scale should be my problem as a dev to be concerned with because I have deep knowledge of the use case, and should know my stack. I'm sorry to sound harsh, but IME, DevOps people have a terribly inflated view of their contribution to a company's bottom line. Devs are an awfully scarce resource, and these guys are often more than capable of producing real value to drive a business, but they seem hell bent on spending it on derivative, second order concerns that just aren't really that important to employer.
as a long-time consultant in this space, i don't fully agree with this take.
the entire point of devops was to make devs more ops-y and ops more dev-like, i.e. everyone is a developer. to some extent, the movement has been successful on this front. devs generally have a better idea of how to run platforms, and ops generally can write enough code to kind-of support them. tooling like terraform and kubernetes have helped a lot here.
there are many companies that have definitely lost the plot (by forming devops teams that are glorified cloud admins or release engineers), but i'd say overall the bar is higher now. a lot of sysadmin jobs require shell experience and Git, which was definitely not the case ten years ago.
the real reason why the culture part of devops hasn't "happened" is complex. in many bigcos, devs and ops are completely separate business units with different corporate agendas and kpi's. dev teams are also restricted from admin'ing their own stuff in production bc of audit risk, so even if they wanted to, it's an uphill battle. however, dev teams at these companies are by large incentivized by what they ship; ops is not as high of a priority.
what i've seen a lot of lately are ops teams becoming platform feature teams with dev teams as their customers. many ops teams have embedded into dev teams like consultants to help them move stuff onto these platforms, and they make admin as hands-off for themselves as possible. devops is to thank for all of that.
before "devops," zero-touch platforms were the hotness for ops teams, but they didn't have the chops to write the code needed to do that, so they became huge exercises in COTS integrations.
i do agree that devops is very infra-biased, /r/devops is /r/sysadmin redux, and that forming a "devops team" requires serious questioning
Devops works for me. I'd rather put time into automation than time into training a crew of people who will fat finger things. Any time I can turn a management or organization problem into a technical problem, its a win.
If I can have machines do it, why would I have people do it? I can test what the machines do, and machines can scale a lot better than humans.
> and having the developers learn and maintain all of the operations practices just isn’t scaleable or feasible.
I was in the same boat until I switched to a company where devs were also doing ops. With AWS, GitLab CI and Terraform, this works surprisingly well.
There are still ops persons for the hard parts, but most things can be done by the devs themselves.
Pre-devops, I had to put in paperwork for servers; they would come back not set up correctly, we would have test and prod being different, it would take a long time, and it would cost us a lot.
Post-devops, I put in the scripts, and we use those for deployments. The ops teams review the scripts.
I don't know what failure looks like for other people, but it is a LONG way from where I am standing.
We put CDK scripts in as part of our development now, and it works pretty damn well.
The different teams who want to review different parts do so. Deployments are fast, and more importantly, they are the same as the dev deployments, so we don't have to go through months of trying to get set up and running.
We have a capped dev AWS account, and that solves a LOT of issues.
It depended on the project: if the project was using Lambda functions etc., then it was in the same repo.
But typically the projects built docker images etc., and how they were to be deployed was not important to the containers themselves, so the deployment scripts ended up separate.
In part because we wanted to opensource the containers themselves.
I've really enjoyed learning about all aspects of creating software. Whatever we decide to call it, I'll always enjoy doing things that don't always fit into a neat little job description.
the word devops reminds me of being in constant development and never being able to reach a stable state. constantly new features, solutions, trials, alternatives and other things...
Everything is good and I like the new term SoftOps, but I need more examples of SoftOps in the real world or to me it's just DevOps with a different name.
Why is DevOps so hard in other companies yet so easy at Netflix? I mean, the entire monitoring stack of Netflix was automated by one person and managed by four. Two people at Netflix used to manage their entire messaging pipeline, including suro, Kafka, Zookeeper, and some Druid stuff. Their Hadoop ecosystem used to have a handful of people, and their end-to-end ingestion pipeline that aggregates and demuxes data into Hive with 15 minutes of maximum delay was written and maintained by a single person. Their key manager, well before Amazon KMS existed, was implemented and maintained by a single person. Their engineers in the cloud platform team were oncall 24x7, yet they only occasionally got paged. Oh, their cloud platform team had fewer than 20 people too. Their entire Cassandra team used to have fewer than 20 people. They predictively autoscaled their clusters with a system created and maintained by merely 2 or 3 people. According to their engineers, they did a few things on top of the excellent foundation of AWS, and they just did them without any fuss:
- API-based full transparency. If there's a function, there is an API. Anything you can do is explicit in API.
- Powerful monitoring support so "instrumentation to death" is not just a slogan.
- Decentralized control. For instance, each team has freedom to decide how to load balance their traffic and how to handle unresponsive nodes.
- Assume everything can fail and build mechanisms to account for that assumption. Chaos engineering. Autoscaling from the get-go (again, thanks to Amazon for such a powerful feature from day one in EC2).
- A shared culture: don't tell me to learn your shit. So, no friction from handling shit like Puppet, like Chef, like Terraform, like HCL, like whatever yaml or DSL that folks on HN passionately advocate. Like learning how to embed a jinja template in a job description with 5000 lines of Ruby code plus some system-specific half-assed DSL just to update an environment variable? Never gonna happen. Don't get me wrong: they are probably nice technology. They are just too damn low level and irrelevant to most engineers.
- Instant gratification. If I make a config change, I want to see it in production in seconds, safely. If I make a change in my code? The change will deploy with all the guardrails in seconds. So, a puppet change that takes on average 15 minutes to materialize? That's just garbage.
- No surprise. So, a Chef script goes behind my back to update my OS and screws up my production service? Never happens. Immutable infrastructure was implemented from day one.
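The "instant gratification" point above can be sketched in a few lines. This is a hypothetical illustration only: a config client that hot-reloads a local JSON file whenever it changes, instead of baking values in at deploy time. Netflix's real fast-property system is served by a config service and is far more sophisticated; the file here just stands in for it.

```python
import json
import os

class DynamicConfig:
    """Hypothetical sketch: re-read config from disk when the file changes,
    so a change is visible on the very next lookup, with no redeploy."""

    def __init__(self, path: str):
        self.path = path
        self._mtime = -1.0
        self._values = {}

    def get(self, key, default=None):
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:  # file changed: reload before answering
            with open(self.path) as f:
                self._values = json.load(f)
            self._mtime = mtime
        return self._values.get(key, default)
```

A real implementation would add the "safely" part: schema validation, staged rollout, and automatic rollback as the guardrails.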
So the question is, why are those things hard in other companies? Do engineers enjoy getting paged at 2:00am? Do they enjoy spending at least 1/3 of their time handling so-called operations? Or do they enjoy writing rants like "DevOps is a failure"?
I don't get how 1 person maintaining key management service at Netflix works, I suppose there is something obvious I don't understand.
In most organisations having 1 person fully manage / own a thing (in the usual sense of that) is a serious problem because that person needs to take holidays and can quit. Depending on type of org it can also be a security issue (e.g. you are likely to have a hard time if the service is in PCI scope).
Usually you'd want at least 4 people to be able to sustain a service reliably through seasonal flus and to cope with natural organisational attrition: at least 2 fully up to speed (so they can review/approve each other's MRs/PRs) and 2 shadows who have access to everything required and can be hands-on.
Yeah, bus factor is real. I'm sure they later hired more people to manage the service. The point was that one person was able to manage the service without worrying about operations. The service does not go down. People provision keys and certs via an API so business continues even if the person went on vacation -- the only impact was that no one would be there to add new features.
Yeah, I think I'm aligned with you. So many processes at my workplace are pointlessly ServiceNow-driven and would be enormously more efficient if they were provided by API.
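The self-service idea being contrasted with a ticket queue can be sketched as a tiny provisioning endpoint. This is a hypothetical illustration: the route, payload shape, and key format are made up, and a real service would add authentication, auditing, and quotas.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from secrets import token_hex

class ProvisionHandler(BaseHTTPRequestHandler):
    """Hypothetical sketch: POST /keys returns a freshly generated credential
    immediately, instead of a ticket that waits days for a human."""

    def do_POST(self):
        if self.path != "/keys":
            self.send_error(404)
            return
        body = json.dumps({"key_id": token_hex(4), "secret": token_hex(16)})
        self.send_response(201)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body.encode())

    def log_message(self, *args):  # keep the sketch quiet
        pass

def serve(port: int = 0) -> HTTPServer:
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ProvisionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The point is the interface, not the implementation: once provisioning is an API call, it can be scripted into pipelines, and "the person is on vacation" stops blocking business as usual.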
I think a lot of legacy organisations are held back by having large departments of "non-coders" in infrastructure roles.
It's a sticky local minimum. You can't automate without enough code-literates around to support things. But the non-coders tend to hire more people like themselves. They are swamped with their existing problems and believe more of the same to be the solution. So things don't get automated...
It’s hard because far from everybody is Netflix scale.
A monitoring stack maintained by five dedicated engineers is a factor of 10-100x more than any small or medium sized company can afford to throw at it. Most places you'd be lucky to find even one engineer. One engineer working full time on only a specific tool is a luxury, and one such engineer will be able to achieve a lot.
My bad for not stating my assumptions explicitly. When I read the article I was, probably unconsciously, thinking of companies that build and operate their own services with sufficient complexity: say, they have databases, computation platforms, analytics platforms, etc. Such assumptions, of course, are unfounded.
DevOps is sysadmins making unreliable continuous integration/delivery/deployment tools out of bash scripts, YAML files, and webhooks, which are in some ways inferior to things Heroku did 10 years ago.
Even though this is downvoted and it somewhat underplays the overall devops role, I understand the sentiment here.
That might be a reason why Vercel, Netlify, Render and others are picking up usage. YC has been investing in several similar companies. Big cloud is going in the same direction with AWS Amplify/App Runner, GCP Firebase/Cloud Run and Azure.