WEBINAR

Running Your First 5 Chaos Experiments on Kubernetes

Reliability and high availability are key features of Kubernetes, but even the most resilient systems can fail. Applications crash, hardware breaks, and nodes can go offline. These failures can have damaging and unpredictable consequences for organizations, especially those that are unprepared.

In our upcoming webinar, we’ll be exploring how to improve the availability and reliability of Kubernetes clusters using the discipline of Chaos Engineering.

On-demand



About this webinar

Reliability and high availability are key features of Kubernetes, but even the most resilient systems can fail. Applications crash, hardware breaks, and nodes can go offline. These failures can have damaging and unpredictable consequences for organizations, especially those that are unprepared.

In our upcoming webinar, we’ll be exploring how to improve the availability and reliability of Kubernetes clusters using the discipline of Chaos Engineering.

You will also have an opportunity to ask questions of our experts during our live Q&A segment.

Agenda
  • You will learn how to use Chaos Engineering to safely inject failure into your applications and nodes in order to detect weaknesses
  • Additionally, we’ll walk through specific Chaos Experiments for you to run on Kubernetes to ensure you’ve designed a reliable system
  • By the end of the session, you’ll have specific recommendations for how to harden your infrastructure, improve reliability, and keep your applications running smoothly

About the speakers

Andre Newman

Sr. Reliability Specialist
Gremlin

At Gremlin, Andre promotes the benefits of Chaos Engineering and reliability testing to engineering teams around the world, including at some of the largest enterprise organizations. Prior to Gremlin, he created technical content explaining Kubernetes and containerization, the shift to cloud computing, DevOps, observability, and more. His work has been featured in The New Stack, DZone, Software Engineering Daily, TechBeacon, and StatusCode Weekly.

Avoid downtime. Use Gremlin to turn failure into resilience.

Gremlin empowers you to proactively root out failure before it causes downtime. See how you can harness chaos to build resilient systems by requesting a demo of Gremlin.
