Trouble Maker is a tool for development teams that intentionally introduces failure into your software systems. It is an open source project hosted on GitHub that can be found here.
Randomly Takes Down Services
Trouble Maker will randomly take down application services during normal business hours.
Web Console For Stability Tests
Provides a web console to perform stability tests against servers.
Failure As a Use Case
For more reasons than anyone can count, your production software systems are almost guaranteed to fail in some way.
Our answer: engineer failures into your platform’s production environment.
Sound crazy? Maybe. But Netflix has done this using a framework called Chaos Monkey, which can be configured to randomly take down AWS resources (e.g., load balancers) during normal business hours. When this happens, automated or manual procedures should kick in to remediate the problem while the system continues to operate and serve users.
Netflix’s Chaos Monkey is built on the Amazon EC2 API, so we wanted a solution that does not depend on the cloud and can be used within an enterprise environment.
Trouble Maker was implemented for Java-based web and Microservices-based applications. Here’s a diagram of how it works:
Trouble Maker communicates with a servlet registered in each Java-based client Microservice. It also communicates with a Service Registry, which it uses to determine the location of the services to operate against. (By default: Eureka.)
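To make the lookup flow above concrete, here is a minimal sketch in plain Java. It is not Trouble Maker's actual code and does not use the Eureka client: the ServiceRegistry interface, the in-memory registry, the service name "orders", and the servlet path "/hypothetical-trouble-servlet" are all assumptions made up for illustration. It only shows the general idea of asking a registry where a service's instances live and then deriving the URL of a servlet on one of them.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;

public class RegistryLookupSketch {

    /** Hypothetical stand-in for a service registry such as Eureka. */
    interface ServiceRegistry {
        List<URI> instancesOf(String serviceName);
    }

    /** In-memory registry used only for this illustration. */
    static final class InMemoryRegistry implements ServiceRegistry {
        private final Map<String, List<URI>> services;

        InMemoryRegistry(Map<String, List<URI>> services) {
            this.services = services;
        }

        @Override
        public List<URI> instancesOf(String serviceName) {
            // Unknown services simply have no instances.
            return services.getOrDefault(serviceName, List.of());
        }
    }

    /** Builds the URL of an (assumed) servlet path on one service instance. */
    static URI servletUrl(URI instanceBase, String servletPath) {
        return instanceBase.resolve(servletPath);
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new InMemoryRegistry(Map.of(
                "orders", List.of(URI.create("http://10.0.0.5:8080"),
                                  URI.create("http://10.0.0.6:8080"))));

        // Look up every instance of a service and print where its servlet would be.
        for (URI instance : registry.instancesOf("orders")) {
            System.out.println(servletUrl(instance, "/hypothetical-trouble-servlet"));
        }
    }
}
```

In a real deployment the registry lookup would go to Eureka (or whichever registry is configured) rather than an in-memory map, and the servlet path would be whatever the client Microservice actually registers.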
Here’s a screenshot of the Trouble Maker dashboard:
Trouble Maker can be configured using a cron expression to randomly select a Java app server instance and kill it. Doing this in production might seem a little risky, but if you desire a stable Microservices platform, this will test its durability.
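The selection step can be sketched roughly as follows. This is an illustration under stated assumptions, not Trouble Maker's implementation: the class and method names are hypothetical, "normal business hours" is assumed to mean Monday through Friday from 09:00 to 16:59, and the cron scheduling itself is left out (a scheduler would call pickVictim each time the cron expression fires).

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.util.List;
import java.util.Optional;
import java.util.Random;

public class VictimSelectorSketch {

    /** Assumed definition of business hours: Monday to Friday, 09:00 to 16:59. */
    static boolean isBusinessHours(LocalDateTime now) {
        DayOfWeek day = now.getDayOfWeek();
        boolean weekday = day != DayOfWeek.SATURDAY && day != DayOfWeek.SUNDAY;
        int hour = now.getHour();
        return weekday && hour >= 9 && hour < 17;
    }

    /**
     * Picks one instance uniformly at random, or nothing if there are no
     * instances or the current time is outside the assumed business hours.
     */
    static Optional<String> pickVictim(List<String> instances, Random random, LocalDateTime now) {
        if (instances.isEmpty() || !isBusinessHours(now)) {
            return Optional.empty();
        }
        return Optional.of(instances.get(random.nextInt(instances.size())));
    }

    public static void main(String[] args) {
        // Hypothetical instance names; a real run would get these from the registry.
        List<String> instances = List.of("app-server-1", "app-server-2", "app-server-3");
        Optional<String> victim = pickVictim(instances, new Random(), LocalDateTime.now());
        System.out.println(victim.map(v -> "Would kill: " + v).orElse("No instance selected"));
    }
}
```

Passing the clock value and the Random in as parameters keeps the selection logic easy to test without waiting for a schedule to fire.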
Please feel free to make suggestions or submit pull requests. Our goal is to help organizations that are adopting Microservices build stable and durable platforms.