Keeping awareness of load and performance testing at a consistently high level can be difficult. Leadership is willing to invest for several months after severe production issues have occurred, but their motivation declines as time goes by. You might even see cuts in your performance engineering budget if the speed and reliability of your apps have been too good for too long.
Well, you could introduce performance problems and deploy them to production to get the attention back, but that would be playing with fire.
Better options, from my perspective, are initiatives that allow you to build a strong case for performance engineering in your organization.
Let me share a few such forward-thinking performance engineering campaigns.
#1 Chaos Engineering Gamedays
Recent outages at Facebook made it very clear: even the best SREs in the world can make mistakes that result in severe production outages and millions in lost revenue. Performance testing alone can't protect businesses against such risks. Still, together with operational teams, you could execute a load test during which SREs introduce specific issues, such as network problems or stopped processes, and test their SRE toolbox. The term gameday is well known among businesses practicing chaos engineering. Usually, they schedule such gamedays upfront, once a week or once a month, lay out the experiments and the expected outcomes, execute them, and solve the identified issues to achieve their target reliability.
#2 Autonomous Performance Tuning
Are you still executing all your performance tests manually? Sure, it's just a click, but analyzing the test results and comparing them with your performance requirements takes effort too. Wouldn't it be much better to configure the load test execution schedule along with the service level objectives and indicators and let the machines do this work? Thanks to recent developments, automated performance gates and autonomous performance tuning are a reality now. The significant benefits are that your teams are no longer tied up, you have a consistent performance baseline in place, you find issues earlier, and your performance engineering effort goes down.
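To make the idea of an automated performance gate more concrete, here is a minimal Python sketch. It is not taken from any specific tool: the indicator names, the thresholds, and the `evaluate_gate` helper are all assumptions chosen for illustration. It only shows the comparison step, where measured indicators from a load test run are checked against the configured objectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    """One service level objective; all names and values are hypothetical."""
    indicator: str   # name of the measured indicator
    warning: float   # above this value the run is flagged as a warning
    limit: float     # above this value the run fails the gate


def evaluate_gate(objectives, measured):
    """Compare measured indicators with the objectives.

    Returns the overall verdict plus a per-indicator breakdown.
    Lower values are assumed to be better for every indicator.
    """
    details = {}
    for obj in objectives:
        value = measured[obj.indicator]
        if value > obj.limit:
            details[obj.indicator] = "fail"
        elif value > obj.warning:
            details[obj.indicator] = "warning"
        else:
            details[obj.indicator] = "pass"

    if "fail" in details.values():
        verdict = "fail"
    elif "warning" in details.values():
        verdict = "warning"
    else:
        verdict = "pass"
    return verdict, details


# Example objectives (assumed values, adjust to your own requirements).
OBJECTIVES = [
    Objective("p95_response_ms", warning=800, limit=1000),
    Objective("error_rate_pct", warning=0.5, limit=1.0),
]

# Results of one load test run (made-up numbers).
verdict, details = evaluate_gate(
    OBJECTIVES, {"p95_response_ms": 850, "error_rate_pct": 0.2}
)
print(verdict, details)
```

In a pipeline, a scheduled job could run the load test, feed the measured values into a check like this, and block the release only when the verdict is "fail".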
#3 Performance Quality Metrics or SLOs
Not all our applications and IT services are equally important. For instance, a parking slot management application might not be business-critical, so you would not consider it a candidate for load and performance tests. Other services, such as online banking, are customer-facing, and outages or performance degradations are not acceptable. As a good practice, you should introduce a process that allows your IT teams to check whether the change they deploy to production requires performance validation. SLOs and performance quality metrics do not stop here. They allow you to validate and report the state of your application's performance. Usually, you specify the target rate and a warning indicator. An error budget will let you know if the agreed SLOs are at risk.
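The relationship between a target rate, a warning indicator, and an error budget can be sketched in a few lines of Python. The function name, the 99.5% target, and the 75% warning threshold are assumptions for illustration only; your agreed SLOs will differ.

```python
def error_budget_status(total_requests, failed_requests,
                        target_rate=0.995, warn_at=0.75):
    """Report how much of the error budget has been consumed.

    target_rate: agreed share of successful requests (assumed 99.5%).
    warn_at:     share of the budget at which the SLO counts as at risk
                 (assumed 75%).
    Returns (consumed_fraction, status).
    """
    # The error budget is the number of failures the SLO still tolerates.
    allowed_failures = total_requests * (1 - target_rate)
    if allowed_failures == 0:
        consumed = float("inf") if failed_requests else 0.0
    else:
        consumed = failed_requests / allowed_failures

    if consumed > 1:
        status = "breached"
    elif consumed >= warn_at:
        status = "at risk"
    else:
        status = "ok"
    return consumed, status


# 100,000 requests at a 99.5% target leave a budget of about 500 failures.
for failed in (300, 400, 600):
    consumed, status = error_budget_status(100_000, failed)
    print(f"{failed} failures: {consumed:.0%} of budget used, {status}")
```

With these assumed values, 300 failed requests leave the SLO in good shape, 400 cross the warning threshold, and 600 exceed the budget.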
#4 Close the loop
Do your operational teams share feedback about bottlenecks they identify in production or workarounds they implement to keep applications running? Knowledge about technical shortcomings is significant for all engineers in your value stream. Ideally, there is a two-way flow of information, from performance engineers to operational teams and back again. You should update your ops teams regularly about the latest performance findings, unresolved performance problems, and performance trends. Teams taking care of production should share their struggles with current releases and provide performance feedback from end users. Such closed-loop communication avoids technical debt, increases trust, reduces troubleshooting effort, and creates a much more streamlined performance engineering approach.
There is no need to introduce performance bottlenecks on production to remind decision-makers that reliable applications require a consistent investment. Instead, a much better way is to increase maturity by adopting the four performance engineering strategies shared in this blog post.
Our team is here to lift and shift your performance engineering approach to the 21st century.
Use our Performance Engineering Model to evaluate your maturity, compare yourself with peers, and get advice on how to close identified gaps.
Link to our Performance Engineering Maturity platform:
Keep up the great work! Happy Performance Engineering!