Build resilient applications
Test the resiliency of applications and serverless functions with safe and secure application-layer fault injection—no infrastructure access required.
Hundreds of finance, retail, and technology organizations worldwide trust Gremlin
Take the tour
See how easy it is to test and manage serverless reliability with Failure Flags.
Deploy fault-tolerant applications
Keep your applications running reliably, even when you don’t control the underlying infrastructure. Gremlin’s Failure Flags feature lets you run Chaos Engineering experiments in your application code, so you can improve reliability on any platform.
How does Failure Flags work?
A “serverless application” is any application that can be deployed in an environment where you have no control over the system it’s running on. Serverless architectures enable developers to build and deploy code without having to focus on provisioning hosts, installing operating systems, or managing dependencies. The serverless provider assumes these responsibilities, freeing up developers to focus on creating great software.
However, this doesn’t mean serverless platforms are free of reliability risks. Multi-region redundancy, load balancing, and automatic failover are just some of the techniques needed to keep large-scale serverless deployments running reliably. Even if developers build resiliency into their applications, they need a way to verify that these systems will work when faced with a real-world incident.
With Gremlin, you can run Chaos Engineering and reliability tests on any application or Kubernetes container, no matter where it’s hosted. Validate failover systems by dropping network traffic; tune your health checks by adding latency to incoming requests; test your unhappy path by triggering errors under certain conditions; or bring your own custom tests. You have full control over your Failure Flags, and just like feature flags, you can turn them on and off with the click of a button.
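To make the feature-flag analogy concrete, here is a minimal, self-contained sketch of the idea in Python. It is not Gremlin's SDK or data model: the `Experiment` class, the `ACTIVE_EXPERIMENTS` list, and the `failure_flag` function are hypothetical names invented for illustration, and a real deployment would receive experiments from a service rather than an in-memory list.

```python
import time
from dataclasses import dataclass, field


class InjectedFault(Exception):
    """Raised when an active experiment asks a flag to fail."""


@dataclass
class Experiment:
    """Hypothetical experiment definition (an assumption, not Gremlin's model)."""
    flag_name: str
    selector: dict = field(default_factory=dict)  # labels that must all match
    latency_ms: int = 0
    raise_error: bool = False
    enabled: bool = True


# Stand-in for whatever service delivers experiments to the application.
ACTIVE_EXPERIMENTS = []


def failure_flag(name, labels, sleep=time.sleep):
    """Apply the first enabled experiment matching this flag.

    Returns True if an experiment ran, False if the flag was a no-op.
    """
    for exp in ACTIVE_EXPERIMENTS:
        if not exp.enabled or exp.flag_name != name:
            continue
        if any(labels.get(k) != v for k, v in exp.selector.items()):
            continue
        if exp.latency_ms:
            sleep(exp.latency_ms / 1000)  # add latency before continuing
        if exp.raise_error:
            raise InjectedFault(f"fault injected at flag {name!r}")
        return True
    return False


def get_profile(user_id):
    # The flag marks a point in the code where faults may be injected.
    failure_flag("get-profile", {"user_id": user_id})
    return {"user_id": user_id, "plan": "free"}
```

With no experiment registered, the flag does nothing and `get_profile` behaves normally. Appending `Experiment("get-profile", {"user_id": "u1"}, latency_ms=250)` to the list would delay only calls for that user, and setting `enabled = False` switches the fault off again, which is the on/off behavior described above.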
Test your managed applications
Reliability testing often focuses on infrastructure, but what happens when you don’t have access to the systems your code is running on? Failure Flags removes this hurdle by letting you run experiments directly in your application. Whether it’s a Java function running on AWS Lambda, a Node.js function running on Azure Functions, or a Go function running on Google Cloud Functions, Gremlin helps you verify the resilience of your functions.
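As a rough illustration of testing an unhappy path inside a function, the sketch below uses a handler with the `(event, context)` shape common to Python serverless runtimes. Everything in it is an assumption made for the example: the `FAULTS` switchboard, the `failure_flag` helper, and the `load_cart` dependency are hypothetical and do not represent Gremlin's SDK.

```python
import json


class InjectedFault(Exception):
    """Stands in for a dependency failure triggered by an experiment."""


# Hypothetical in-memory switchboard: flag name -> exception to raise.
FAULTS = {}


def failure_flag(name):
    """Raise the configured fault for this flag, if one is active."""
    exc = FAULTS.get(name)
    if exc is not None:
        raise exc


def load_cart(customer_id):
    # Injection point placed just before the (simulated) dependency call.
    failure_flag("load-cart")
    return ["sku-1", "sku-2"]


def handler(event, context):
    """Serverless-style entry point with an explicit fallback path."""
    try:
        items = load_cart(event["customer_id"])
    except InjectedFault:
        # Unhappy path: degrade gracefully instead of crashing.
        body = {"error": "cart unavailable", "retry": True}
        return {"statusCode": 503, "body": json.dumps(body)}
    return {"statusCode": 200, "body": json.dumps({"items": items})}
```

Setting `FAULTS["load-cart"] = InjectedFault("simulated outage")` exercises the fallback branch without touching the platform the function runs on, and clearing the dictionary returns the handler to normal behavior.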
Test your Docker and Kubernetes containers
Managed Kubernetes providers take responsibility for the reliability of the control plane, but what about the containers themselves? With Failure Flags, you can run tests inside your containers to ensure they’re fault-tolerant, replicable, and scalable. Gremlin makes it easy to embed Failure Flags into your container images: just add the Failure Flags SDK to your code, deploy the Failure Flags sidecar, and start experimenting. Test your containers on Amazon Elastic Container Service (Amazon ECS), Google Kubernetes Engine (formerly Google Container Engine), Azure Containers, and others.
Run your own tests your way
Gremlin gives you complete control over your Failure Flags experiments. As with Gremlin Fault Injection, you can create and manage experiments in the Gremlin web app, view your actively running functions, and halt running experiments at any time. When you halt an experiment, Gremlin immediately stops and rolls back the test, returning your function to normal operation in seconds.
Shift from observing to improving
Gremlin enables teams to proactively improve reliability at every stage of maturity.
Robust, customizable chaos tests to safely replicate any incident scenario.
Pre-built test suite to cover the most common reliability risks. Get started in minutes.
Standardized scoring tools to identify and prioritize risks, and build reliability programs.