The Top Technical Skills You Need to Be an SRE in 2022
February 14, 2022

Jayne Groll
DevOps Institute

When it comes to IT professions, skill development remains essential to human transformation. The ability to collaborate, solve problems and continuously upskill helps advance the DevOps journey and creates stronger, more effective individuals, teams and organizations.
Site Reliability Engineering (SRE) remains a top practice, with SRE-specific roles at the forefront of many organizations' hiring objectives. As more organizations look for qualified SRE candidates or look to upskill their internal SREs, technical skills remain critical to building top performers in this role.

While an SRE needs many skills, not all skills are created equal. Certain technical skills are essential to have in the digital age. DevOps Institute's Chief Research Officer, Eveline Oehrlich, weighed in on one of the most critical SRE skills in 2022: "One skill we feel is essential is contextual listening. In a discipline whose main purpose is to serve a variety of stakeholders, such as developers, architects, and other stakeholders, such as customers, it is crucial to understanding problems. Contextual means to understand meaning from the context in which the details, data or other factoids are received. This is the foundation of doing engineering correctly. Unfortunately, listening is not often formally taught in engineering curricula across the universities. We discuss more on SRE vision, principles, practices and skills in the SRE SKILbook."

In addition to contextual listening, various skills make a strong SRE in 2022. For further insights, we reached out to DevOps Institute Ambassadors, who identified several other SRE skills.

Here are the top SRE skills as identified by DevOps Institute Ambassadors:

Helen Beal, Chief Ambassador, DevOps Institute

"There are two: firstly, being able to instrument and teach others to instrument observability into digital products and services along with the ability to leverage multiple monitoring streams to discover problems and reduce MTTR quickly. Second, being able to automate onerous and wasteful tasks out of the value stream's processes."
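As a rough illustration of the MTTR metric mentioned in the quote above, the sketch below averages the time between detection and resolution over a handful of incidents. The incident timestamps and the function name `mean_time_to_restore` are hypothetical, chosen only for this example; a real team would pull these records from its own incident tracker.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents):
    """Return the average (resolved - detected) duration.

    `incidents` is a list of (detected, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident records: (detected, resolved)
incidents = [
    (datetime(2022, 1, 3, 9, 0), datetime(2022, 1, 3, 9, 30)),     # 30 minutes
    (datetime(2022, 1, 10, 14, 0), datetime(2022, 1, 10, 15, 30)),  # 90 minutes
]

print(mean_time_to_restore(incidents))  # prints 1:00:00
```

Tracking this number over time is one simple way to see whether better instrumentation is actually shortening recovery.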

Maciek Jarosz, DevOps and Process Expert, Business Practitioner

"It may not be a technical skill per se, as I'd say it's a shift in how we look at software development where we no longer pass our work to the production environment and let somebody else maintain it. The shift is from looking at software development as a one-off, fire-and-forget type of work to continuous work on one service or product, where the people who develop a product also need to think about how THEY will maintain the product or service at hand. It is a different paradigm in my opinion."

Mark Peters, Technical Lead, Novetta

"There is no one technical skill that makes someone an SRE. An SRE understands the entire process, from idea to delivery, and can work at any stage. They also support the culture through learning, and leading teams to find their own problems early. If there was one technical skill, they wouldn't be so hard to find or so expensive to hire. The truth is that the SRE must be a critical thinking expert who excels at collaboration and can implement fixes without stepping on toes within the teams or angering management through impeding a pet project. If you want the best SRE, pick someone from your organization, who understands the process, has been there for a while, and is looking for a chance to excel."

Parveen Kr. Arora, co-founder and director, VVnT Foundation

"A site reliability engineer (SRE) is someone who is constantly analyzing every change for its risk and what its impact could be down the road, not just today. One of the key skills of an SRE is automation, as essentially the role requires replacing human labor with automation, generally by creating self-service tools for developers. This is how an SRE would enhance the availability, performance, efficiency, monitoring, emergency response, and planning of production services and software."

Supratip Banerjee, Solutions Architect, Principal Global Services

"There isn't just one technology/tool that an SRE needs to know to perform their responsibilities properly. They need to be proficient in one or more of the areas mentioned below:
a. Utility development: SREs are responsible for development's utilities. Hence they need to know at least one programming language. Automation testing is also a part of it.
b. Infrastructure: Varied tools in DevOps area, e.g., GitHub, API gateway, CI/CD tools
c. Security: security-related tools.
d. APM: Application performance management process tools."

Stephen Walters, Field CTO, CEM Digital

"I am unsure if you would class this as a technical skill, but for me, the number one skill is understanding how to communicate. The key purpose behind DevOps and SRE is to break down silo walls, and without that, all you have is engineering, which is exactly what we have had before, just with different tools with different names. Much of the issue of communications can be alleviated by the proper use of ChatOps and Digital Operations Platform tools, so understanding how to make use of them correctly would be of huge benefit."

Craig Cook, Principal Engineer, Catapult CX

"First, there are some essential skills such as Infrastructure as Code, cloud, automation and CI/CD, which are all standard practice in software teams, so any Site Reliability Engineer needs these capabilities as a starting point. To start to differentiate themselves, a Site Reliability Engineer should develop skills in observability. Core to SRE is metrics-based decision making, and to have metrics systems they also need to have great monitoring to the point where they can gain visibility over all the moving parts to ensure they are properly 'observable.'"

Samer Akkoub, Senior Alliances/Channels Solutions Architect (APJ), GitLab

"It is a must to understand operations' terms (SLAs, RPO, RTO, thresholds …) plus knowledge in DevOps or automation platforms."
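To make one of these operations terms concrete, the following minimal sketch converts an SLA availability target into the downtime it permits over a period. The function name `allowed_downtime_minutes`, the 30-day period and the 99.9% target are assumptions made for illustration, not figures from the article.

```python
def allowed_downtime_minutes(availability_pct, period_days=30):
    """Minutes of downtime permitted by an availability target over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction of the period is the downtime allowance.
    return total_minutes * (100 - availability_pct) / 100


# Hypothetical target: 99.9% availability over a 30-day month.
budget = allowed_downtime_minutes(99.9)
print(f"Allowed downtime: {budget:.1f} minutes")  # roughly 43.2 minutes
```

Comparing this allowance against measured downtime gives a simple, numeric basis for the thresholds the quote refers to.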

Suresh GP, Managing Director, TaUB Solutions LLC, USA

"While there are a number of technical skills that a site reliability engineer needs to develop, I would insist on learning about containers and microservices, which would be more impactful to organizations.
One of the biggest challenges that organizations surmount is to manage the future of legacy environments. There is a huge push towards application modernization, and SREs play a pivotal role in designing the transition from monolithic applications to containers or microservices. This spearheads the movement towards immutable infrastructure that becomes an important tenet for building reliable and resilient systems. It also creates the jump start for the team to improve productivity, reduce toil and reduce planned and unplanned downtime. Finally, it gives confidence for organizations to see the light at the end of the tunnel to improve deployment frequency and velocity for legacy systems."

Learn more about today's most important SRE skills and practices at SKILup Day: Site Reliability Engineering on February 17, 2022.

Jayne Groll is CEO of DevOps Institute