By Wolfgang Gottesheim
In a traditional organization, handoffs between developers and operators cause friction because the two groups pursue different goals. Developers and business drivers want customers to use new features and benefit from other improvements, while operators want to provide a stable environment.
One of the ground-breaking books around DevOps, “The Phoenix Project” by Gene Kim, Kevin Behr and George Spafford, describes the practice through “The Three Ways” of systems thinking, amplifying feedback loops, and providing a culture of continuous experimentation and learning. Systems thinking means focusing on overall value streams and making sure that defects (for example, broken builds) are not passed on to downstream units (like the Ops department). Amplifying the feedback loops translates to providing proper communication channels between Dev and Ops, and to achieving this without creating an overly complex framework of processes. Creating a culture of experimentation and learning encourages everyone to take risks and learn from failures.
A lot of people see DevOps as the extension of agile practices from developers towards operations to overcome cumbersome processes.
DevOps – a Definition
DevOps aligns business requirements with IT performance, and recent studies have shown that organizations adopting DevOps practices have a significant competitive advantage over their peers. They are able to react faster to changing market demands, get out new features faster, and have a higher success rate when it comes to executing changes. The goal of DevOps is to adopt practices that allow a quick flow of changes to a production environment, while maintaining a high level of stability, reliability, and performance in these systems. However, the term nowadays covers a wide range of different topics and consequently means different things to different people.
The Foundation of DevOps
There are a number of definitions and interpretations for DevOps floating around, and one way to look at it is in terms of CAMS: DevOps means to adopt a Culture of blame-free communication and collaboration, to embrace Automation to allow people to focus on important tasks, to introduce continuous Measurements to get feedback on the quality and usage of features and bug fixes, and to encourage Sharing of these measurements. This underlines that DevOps is not about standards or tools; it is about enabling communication and collaboration between departments in an organization.
Plugging Performance into DevOps
When we talk about collaboration, a key aspect is how we prevent finger-pointing between teams when problems occur. We have to handle and prevent failures by continuously ensuring high quality, but while almost every definition of software quality mentions both functional and non-functional requirements, the non-functional aspects like usability, deployability, and performance are only rarely measured automatically. This becomes a problem as performance issues are among the hardest to solve – they are heavily dependent on load, deployment, and user behaviour, and Ops teams need help in identifying these issues and communicating them to Dev in an actionable way.
In order to focus the entire team on performance, you must plug performance into the four pillars of CAMS:
- Culture: Tighten feedback loops between Dev and Ops,
- Automation: Establish automated performance monitoring,
- Measurement: Measure key performance metrics in CI, Test and Ops,
- Sharing: Share the same tools and performance metrics data across Dev, Test and Ops.
Four Milestones that Companies Should Have in Mind
Culture – Tighten the Feedback Loops between Dev and Ops
Culture is the most important aspect because it changes the way in which teams work together and share the responsibility for the end users of their application. It not only encourages the adoption of agile practices in operations work, it also allows developers to learn from real world Ops experience and starts a mutual exchange that breaks down the walls between teams. From a performance perspective, it is important to establish a shared understanding of performance between Dev, Test, and Ops. This enables collaboration based on well-known measurements and metrics, establishes a shared language understood by all teams, and allows all teams to focus on the actual problems. Finger-pointing between teams has to be replaced by a practice that enables them to get to the root cause of performance issues, and working together on current issues enables developers to become aware of performance problems and their solutions.
Automation – Establish a Practice of Automated Performance Monitoring
Operations and test teams usually have a good understanding of performance, and they need to educate developers on its importance in large-scale environments under heavy load. Providing automated mechanisms to monitor performance in all environments, from CI and test environments to the actual production deployment, allows the shared language of performance to be spoken.
Measurement – Measure Key Performance Metrics in CI, Test, and Ops
With performance aspects being covered in earlier testing stages, performance engineers on testing teams have time to focus on large-scale load tests in production-like environments. This helps them to find performance problems that are data-driven, related to scalability, or caused by third parties. Close collaboration with Ops ensures that tests can be executed either in the production environment or in a staged environment that mirrors production, thus increasing confidence when releasing a new version.
Sharing – Share the Same Tools and Performance Metrics Data across Dev, Test, and Ops
The more “traditional” testing teams are used to executing performance and scalability tests in their own environments at the end of a milestone. With less time for extensive testing, their test frameworks and environments have to become available to other teams to make performance tests a part of an automated testing practice in a Continuous Integration environment. The automatic collection and analysis of performance metrics ensures that all performance aspects are covered. This once again entails defining a set of performance metrics that is applied across all phases, as this is beneficial to identifying the root cause of performance issues in production, testing, and development environments.
The first step in adopting a performance culture is to enable a shared understanding of performance through a set of key performance metrics that are accepted, understood, and measured across all teams. These performance metrics allow all teams to talk about performance in the same way, and reduce the guesswork and finger-pointing often associated with troubleshooting performance problems. Once these metrics have been defined, their automated measurement and analysis is the next step that makes performance a part of a DevOps practice.
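To make the idea of a shared metric set more concrete, here is a minimal sketch in Python of how one definition of key performance metrics could be evaluated in the same way in CI, Test, and Ops. The metric names, units, and limits are hypothetical and chosen only for illustration; the article does not prescribe any particular metrics, thresholds, or tooling.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class MetricLimit:
    """One shared metric with the maximum value all teams agreed on."""
    name: str
    unit: str
    limit: float


# Hypothetical shared metric set; real names and limits would be
# agreed on by Dev, Test, and Ops together.
SHARED_LIMITS = [
    MetricLimit("response_time_p95", "ms", 500.0),
    MetricLimit("error_rate", "%", 1.0),
    MetricLimit("db_calls_per_request", "count", 10.0),
]


def p95(samples):
    """Return the 95th percentile of a list of at least two samples."""
    # quantiles(..., n=20) returns 19 cut points; the last one is the
    # 95th percentile.
    return quantiles(samples, n=20)[-1]


def evaluate(environment, measured, limits=SHARED_LIMITS):
    """Compare measured values from one environment with the shared limits.

    Returns a list of human-readable violations; an empty list means
    every shared metric was measured and stayed within its limit.
    """
    violations = []
    for limit in limits:
        value = measured.get(limit.name)
        if value is None:
            violations.append(f"{environment}: {limit.name} was not measured")
        elif value > limit.limit:
            violations.append(
                f"{environment}: {limit.name} = {value} {limit.unit} "
                f"exceeds {limit.limit} {limit.unit}"
            )
    return violations
```

A CI job, a load test, and a production monitor could each call `evaluate` with their own measurements, for example `evaluate("ci", {"response_time_p95": p95(timings), "error_rate": 0.2, "db_calls_per_request": 4})`, and fail the build or raise an alert when the returned list is not empty. Because every environment reports against the same definitions, a violation means the same thing to every team.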