WEBINAR

Recreating 3 Common Outages with Gremlin Scenarios

In this live tutorial, we show you how Gremlin Scenarios can be used to recreate complex failure conditions and proactively prepare your systems to withstand them.

On-demand


About this webinar

In October 2016, large swathes of the internet were knocked offline. The cause, as we would eventually learn, was a distributed denial-of-service (DDoS) attack against the DNS provider Dyn. Many of the internet’s most popular sites that relied on Dyn experienced extended downtime, resulting in a significant impact on revenue, engineering velocity, and brand reputation.

Since the Dyn outage, many major services have built redundancy around their DNS provider, allowing them to fail over gracefully to a backup service should the primary provider become unavailable.

This is just one example of a failure scenario that caused widespread outages. However, other, more common scenarios can also cause problems.

You'll walk away understanding how Gremlin can be used to recreate complex failure conditions and proactively prepare your systems to withstand them. You’ll also have the opportunity to have your questions answered by our experts during our Q&A segment.

Agenda
  • In this live session, we will introduce 3 failure scenarios that can cause downtime, ranging from simple to complex: autoscaling errors, unreliable networks, and DNS outages.
  • You'll see real-world examples of incidents caused by these failure scenarios.
  • We will demonstrate how you can recreate these failure conditions and test your systems for resilience against them using Gremlin’s new Recommended Scenarios.
  • Finally, you will get a framework for building your own Custom Scenarios specific to your use case.

About the speakers

Lorne Kligerman

Director of Product
Gremlin

Lorne currently leads the product team at Gremlin, helping companies improve reliability and avoid outages by running proactive chaos engineering experiments. He last worked at Google Cloud as a Product Manager on App Engine, empowering developers to build applications on a fully managed and resilient platform.

Ana M Medina

Sr. Chaos Engineer
Gremlin

Ana Margarita is a Senior Chaos Engineer at Gremlin and helps companies avoid outages by running proactive chaos engineering experiments. Before Gremlin, she worked at companies of various sizes, including Google, Uber, SFEFCU, and Miami-based startups. Ana is an internationally recognized speaker and has presented at AWS re:Invent, KubeCon, DockerCon, DevOpsDays, AllDayDevOps, Write/Speak/Code, and many others. Catch her tweeting at @Ana_M_Medina about traveling, diversity in tech, and mental health.

Avoid downtime. Use Gremlin to turn failure into resilience.

Gremlin empowers you to proactively root out failure before it causes downtime. See how you can harness chaos to build resilient systems by requesting a demo of Gremlin.