DevOps – SRE – Platform Engineering
A growing number of voices are saying DevOps is dead. At the same time, the search for Platform Engineers seems to be picking up. And I am sure we have all come across organisations harbouring talented Site Reliability Engineers.
So what is going on?
I have been reading articles in my feeds, such as https://thenewstack.io/devops-is-dead-embrace-platform-engineering/, and listening to podcasts that mention devs don’t want to do ops anymore, like this The Cloud Cast episode.
Maybe because of the (claimed) decline of DevOps, SRE (Site Reliability Engineering) has gained a more prominent role in organisations. SRE’s role is to reduce errors and increase the reliability of the delivered solutions and products.
There is also a growing hype around Platform Engineering (PE), and some say it is the next logical evolutionary step after DevOps.
I am somewhat biased towards agreeing with that statement, because I had the privilege of managing a Platform Engineering team not long ago, which in turn had evolved from an infrastructure team.
In all honesty, DevOps enabled organisations to achieve excellence in productivity and efficiency. Remember The Phoenix Project?
DevOps seemed like the only way to go back then, right?
So what is not going so well, then? The “You build it; you run it” concept behind DevOps seemed to accelerate delivery by taking down the wall between developers and operations people.
The reality seems to have gone down a slightly different path. For software developers, the efficiency was (and is) visible, but adding operations responsibilities to their job roles may not have worked as expected, leading to skill-set mismatches inside the organisation and to inefficiency. Some developers ended up overloaded with the operations side of things, and this snowballed into what we are perhaps witnessing in real time today: devs don’t want to do ops.
But there is an excellent lesson to take away from DevOps: giving developers self-service capabilities increases the quality of the delivered product, productivity, efficiency and many other aspects.
So what happens now?
Enter SRE. Thank you, Google 🙂
Site Reliability teams started to drive a change in IT culture because they seemed able to help eradicate the “downsides” of “I don’t want to do ops”.
These teams are responsible for a considerable part of the operation, if I may say so.
Think about their goals in terms of SLOs, and what they have to ensure to achieve them: monitoring and proactively managing resources, from networks to operating systems (even if there are specialised teams to assist with more specific problems), change management, on-call duty and attending to emergencies. Let’s not forget constant capacity planning as well.
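To make the SLO point a little more concrete, here is a minimal sketch in Python of how an availability target translates into an error budget, meaning the amount of downtime a team can "spend" before the SLO is missed. The function names, the 30-day window and the 99.9% target are my own illustrative assumptions, not a standard API or any particular team's numbers.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (assumed example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)
    left = budget_remaining(0.999, downtime_minutes=10.8)
    print(f"Error budget: {budget:.1f} minutes, remaining: {left:.0%}")
```

With a 99.9% target over 30 days, the budget works out to roughly 43.2 minutes of downtime. A team that tracks how much of it is left has a simple signal for when to slow down changes and focus on reliability instead.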
So, SRE seems to be the answer to the fall of DevOps, right?
Not so fast…
Let’s reflect on what can happen to SRE teams in some organisations.
They can become a costly support team. Let’s keep in perspective that, in the present day, an SRE often comes from a more traditional ops world. Perhaps there is a natural inclination to react to problems and mitigate them instead of proactively developing a roadmap that provides the developer communities with a self-service stack of tools.
In a somewhat short-sighted observation, some may say this amounts to going back to a pre-DevOps mentality. I can’t entirely agree with this, but I understand why it may be said.
Then what?
You know the answer by now. Platform Engineering.
By what seems to be the definition of PE, the team is responsible for establishing workflows and creating toolchains that enable the desired self-service end result. This ends up leading to the delivery of all the necessary internal platforms, and to coverage of the complete life cycle of a product.
This seems to translate into a measurable and significant improvement in the performance of an organisation.
Another result of this newer PE mentality is a new momentum in closing the gap between devs and ops (some may say again, but it is different this time).
Some people I chat to regularly say the success of these PE teams may rely on the fact that they operate with a very product-oriented focus. Platform engineers build from a roadmap, working in iterations and ensuring their products are communicated efficiently across the organisation, which shifts the focus from problems to “doing the work”.
I have oversimplified the concept in the previous lines, but it is starting to look like PE will be the successful replacement for DevOps.
I wonder if you agree with any of this?
Drop me a line, and let’s have a chat!