Chaos Engineering and Testing


Hello folks, in this blog we will learn about chaos engineering and testing.

Traditional and Agile Testing

Software testing has been around for as long as software development itself. It has evolved alongside development practices and has helped make software more resilient.

First came traditional testing, which was performed incrementally. It followed a top-down approach, and each phase of testing began only after the previous one was complete. However, traditional testing had a lot of drawbacks, and because of that Agile Testing came into the picture. Agile Testing is now used by many software companies and has quickly become an industry standard. It begins at the start of the project and involves continuous integration of development and testing.

Where does Chaos Testing come into the picture?

In 2010, the development team at Netflix was migrating their infrastructure to AWS. They faced a lot of issues during this process, and eventually they came up with the idea of actively looking for errors within a system in order to prevent negative impacts and outages for the user.

Chaos testing is another approach to software development and testing, intended to identify weaknesses in a system that could potentially lead to outages. The idea is to conduct controlled experiments in a distributed environment that help build confidence in the system’s ability to withstand the failures that will unavoidably happen in the future. In other words, we intentionally break the system in order to find out its weaknesses.
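
To make the idea of a controlled experiment a bit more concrete, here is a minimal sketch in Python. It is only an illustration: `ToyCluster` and `run_experiment` are hypothetical names made up for this post, and the "cluster" is an in-memory stand-in, not a real distributed system. The steps it follows are: check that the system is healthy, break one thing on purpose, observe whether users would notice, and then restore the system.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for a replicated service."""
    replicas: dict = field(
        default_factory=lambda: {"a": True, "b": True, "c": True}
    )

    def handle_request(self) -> bool:
        # Assumption: a request succeeds if at least one replica is healthy.
        return any(self.replicas.values())


def run_experiment(cluster: ToyCluster, rng: random.Random, requests: int = 100) -> dict:
    """Run one controlled experiment and report whether the hypothesis held."""
    # 1. Confirm the steady state before breaking anything.
    if not all(cluster.handle_request() for _ in range(requests)):
        return {"ran": False, "reason": "system unhealthy before experiment"}

    # 2. Inject a single failure: take one replica down.
    victim = rng.choice(sorted(cluster.replicas))
    cluster.replicas[victim] = False
    try:
        # 3. Observe: do requests still succeed while the replica is down?
        successes = sum(cluster.handle_request() for _ in range(requests))
    finally:
        # 4. Always restore the system, even if the observation step fails.
        cluster.replicas[victim] = True

    return {
        "ran": True,
        "victim": victim,
        "success_rate": successes / requests,
        "hypothesis_held": successes == requests,
    }


if __name__ == "__main__":
    print(run_experiment(ToyCluster(), random.Random(0)))
```

In this toy setup the hypothesis holds because two healthy replicas remain. In a real system, the interesting experiments are the ones where it does not hold.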

This helps in fixing issues before they turn into real outages. We intentionally test in production because the real-world response gives us the information we need to fix them.

Chaos Testing is a good DevOps practice, but it is not a replacement for traditional or agile testing. It is just another step in testing, done mostly by DevOps engineers, although QA and developers also perform this type of negative testing. The DevOps engineers’ goals are a little different, as they focus on running the experiments and knowing when to shut them down if things start going sideways.
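
Knowing when to shut an experiment down can be expressed as a simple abort condition. The sketch below is an assumption-heavy illustration, not how any particular tool does it: `guarded_experiment`, the 5% error-rate threshold, and the `rollback` callback are all hypothetical. It watches error-rate samples taken during the experiment and rolls back as soon as one crosses the threshold.

```python
from typing import Callable, Iterable


def guarded_experiment(
    error_rate_samples: Iterable[float],
    rollback: Callable[[], None],
    abort_threshold: float = 0.05,  # hypothetical limit: 5% failed requests
) -> dict:
    """Watch error-rate samples and stop the experiment if they get too high."""
    checked = 0
    for rate in error_rate_samples:
        checked += 1
        if rate > abort_threshold:
            # Things are going sideways: undo the injected failure right away.
            rollback()
            return {"aborted": True, "samples_checked": checked}

    # The experiment ran to completion; restore the system anyway.
    rollback()
    return {"aborted": False, "samples_checked": checked}


if __name__ == "__main__":
    outcome = guarded_experiment(
        [0.01, 0.02, 0.20, 0.01],
        rollback=lambda: print("rolling back"),
    )
    print(outcome)
```

The design choice worth noting is that the rollback runs in both cases, so the system is restored whether the experiment was aborted or finished normally.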

Chaos Testing Toolkits

  1. Chaos Monkey: Netflix created this open-source tool. It works by launching continuous, unpredictable attacks on a running system. The core strategy used by Chaos Monkey is to terminate one or more virtual machine instances.
  1. Gremlin: Gremlin is the first hosted Chaos Engineering solution aimed at improving the reliability of web-based systems. Gremlin, which is available as a SaaS (Software-as-a-Service) solution, can assess system robustness using three categories of attack: resource, network, and state.
  1. Litmus: Litmus is a free and open-source Chaos Engineering tool for cloud-native infrastructure and apps. By running controlled chaos tests, it helps teams discover weaknesses and potential outages in their systems. Litmus takes a cloud-native approach to controlling and managing chaos.
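
The core strategy described for Chaos Monkey, randomly terminating instances, can be sketched in a few lines of Python. This is not Chaos Monkey's actual code or API. The function name `pick_victims`, the instance IDs, and the rule of skipping groups with only one instance are assumptions made for this illustration, and the sketch only selects instances without terminating anything.

```python
import random


def pick_victims(instances_by_group: dict, rng: random.Random, probability: float = 1.0) -> dict:
    """Pick at most one instance per group to terminate (selection only)."""
    victims = {}
    for group, instances in sorted(instances_by_group.items()):
        # Assumption: never take down the only instance of a group.
        if len(instances) < 2:
            continue
        # Each group is attacked with the given probability.
        if rng.random() <= probability:
            victims[group] = rng.choice(sorted(instances))
    return victims


if __name__ == "__main__":
    groups = {
        "api": ["i-1", "i-2", "i-3"],  # hypothetical instance IDs
        "db": ["i-9"],
    }
    print(pick_victims(groups, random.Random(42)))
```

A real tool would then call the cloud provider to terminate the chosen instances. Here the output is just the list of candidates, which keeps the example safe to run.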

If you find this helpful, please do share it with your friends.
