On Your Path To Site Reliability Engineering

Cristiano Cunha

Solution Architect & Testing Advocate

Challenge Description

Site Reliability Engineering (SRE) is, by definition, an engineering approach to IT operations: SREs are development-focused engineers who solve operational, scale, and reliability problems. Knowing that SREs are vital to supporting the DevOps change, and being an SDET, how can you apply what you already know from your engineering approach to testing to this scenario? Using the context of your company, sit down and think about what it could mean to make such a change.


Define context

Sit down with your teammates (or alone) and describe the context of a company that would benefit from the creation of an SRE team. Use aspects of your own company to bring some realism to this activity, or be creative and include problems you would like to discuss and hear suggested solutions for.

If you prefer, you can use the following example:

“In this company, you have an infrastructure/operations team that is responsible for everything happening in infrastructure and in production. This team is drowning in tickets and resolves issues using manual actions. They use some scripts, but these are spread across different machines with no versioning. They are also on call and support production 24x7.”

Generate plan

Reflect on the situation described and discuss it with your teammates. What different approaches will you take to implement such a change? Define 3 to 5 points that you and your team think are the most important to address, and for each point explain how to implement it and what outcome you expect.

Start using a source control tool

  • How
    • Online training on source control.
    • Sharing sessions on how we can save scripts in source control.
    • Make sure all scripts are now “downloaded” from source control and that contributions are made through it (no more local scripts).
    • Ensemble programming for everyone to see how it should be used.
  • Outcome
    • No more scripts on local machines.
    • Code starts to flow into the source control tool and a process starts to be designed for sharing and contribution.
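
Before moving scripts into source control, it can help to know which copies have drifted apart between machines. The following is a minimal sketch of such an inventory, not a prescribed tool: it assumes the scripts have first been copied into one folder per machine, and the function names, the folder layout, and the list of script extensions are all hypothetical choices made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: which file extensions count as "scripts" in this context.
SCRIPT_EXTENSIONS = {".sh", ".py", ".ps1"}


def inventory_scripts(root, extensions=SCRIPT_EXTENSIONS):
    """Map each script file name to the set of distinct content hashes found under root."""
    versions = defaultdict(set)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in extensions:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            versions[path.name].add(digest)
    return dict(versions)


def divergent_scripts(versions):
    """Return the names of scripts that exist in more than one distinct version."""
    return sorted(name for name, digests in versions.items() if len(digests) > 1)
```

Calling `divergent_scripts(inventory_scripts("collected"))`, where `collected` is the hypothetical folder holding one subfolder per machine, lists the script names whose copies differ. Those are the ones the team would need to reconcile before agreeing on a single version to commit.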


Choose 2-3 volunteers to describe their context and their plan to make the change and open a Q & A to discuss the approach.


Understanding what an SRE is and the set of responsibilities they are accountable for will help you decide whether to consider a role in this area. You will also get an overview of the challenges to expect when making such a change, and of possible solutions to try (or adapt) so you can invest in the change while moving forward.

What you’ll learn
  • Identify approaches for adopting site reliability engineering

