Announcing the Gremlin Chaos Champion Program
Today, we’re thrilled to announce the launch of the global Gremlin Chaos Champion program. As chaos engineering advocates ourselves, we’re always eager to find other practitioners who are advancing the practice with the assistance of Gremlin. This new program is an opportunity for the chaos engineering community to nominate Gremlin practitioners who are advocating for chaos engineering practices and demonstrating technical excellence within their organizations.
Our inaugural class of Gremlin Chaos Champions have gone above and beyond the scope of their role to make reliability a central tenet of their job function. They proved that by practicing chaos engineering they can thoughtfully inject harm to identify failures and weaknesses in their systems before they become, ahem, a chaotic customer experience. They’ve helped train other engineering, site reliability engineering, and quality assurance teams within their company on chaos engineering practices, running their own GameDays and integrating Gremlin within their CI/CD pipelines. They’ve successfully onboarded new teams on Gremlin to empower them to conduct their own experiments. In doing so, these champions have helped improve their applications’ and services’ availability and stability, leading to positive customer experiences and ultimately driving greater business value.
But don’t just take it from us: many of the champions below were nominated by their colleagues and managers for the work they’re doing to improve reliability through chaos engineering.
Meet the Freshman Class of Gremlin Chaos Champions
Chaitanya Krant
Manager, Cloud, DevOps & Chaos Engineering
Chaitanya Krant kick-started the “Enterprise Chaos Engineering @NAB” initiative. He has set the chaos adoption roadmap to reduce incidents, measure service reliability, deliver an automated framework for continuous chaos experimentation, take chaos practices to production, and ensure 100% adoption. The program has trained 30+ teams, completed 2000+ experiments, and run five bootcamps and two enterprise GameDays in the span of just one year.
For Enterprise systems, reliability is key. Never lose a customer to poorly designed systems again. Kick-start your (R) reliability journey with the Chaos-first approach today!
Jenn Riemer
Quality Engineering Manager
Jenn is being recognized for her technical excellence and community participation in the field of Chaos Engineering at SAS. Jenn has been using Gremlin with her team to run GameDays and identify critical failure modes before they impact customers. Jenn has been spreading the value of chaos engineering across SAS and getting more teams to embark on the journey with her.
Cloud providers have made it easier than ever for businesses to orchestrate their critical software infrastructure in highly available and scalable fashions. When downtime can cost our customers millions, responsible software providers focus on building reliable and resilient offerings that recover quickly and minimize blast radius. Rigorously testing your software with intentional disruptions helps to surface potential issues and increases confidence in the robustness of your products.
Adam Margherio
SRE/DevOps Engineer
Adam has been driving the practice of Chaos Engineering.
Chaos Engineering has had a significant impact on my approach to constructing and operating resilient cloud platforms and services. Because of the efforts and actions taken to stress our offerings to their limits and gain valuable understandings about how they fail, I’m proud to say I’d step into an on-call rotation for any of our platform-critical infrastructure without hesitation. Chaos Engineering has changed my approach to demonstrating true platform resiliency and fault tolerance; I’m thrilled at everything we’ve been able to prove so far, and I’ll be a lifelong advocate.
Matthew Simons
Senior Engineering Manager, Ames, IA
Matthew has been driving Chaos Engineering as a practice at Workiva, expanding it to include teams across the company. He has engaged his team to join him on the journey of identifying failure modes with Gremlin and developing deep technical expertise in Chaos Engineering for new technologies such as Kubernetes. Matt is a fan of preparing for Black Swan events with his team and of sharing his knowledge with the community. You can catch Matt speaking about Chaos Engineering at Chaos Conf on October 7.
Can chaos coerce clarity from compounding complexity? Certainly.
A Gremlin Chaos Champion is:
- A Gremlin user who champions the use of the technology within their company
- A fanatic about reliability, and someone who’s constantly thinking about building greater resiliency
- Someone with a proven track record of reliability success who has quantifiably improved reliability metrics (SLOs & SLIs) and results (e.g. reduced downtime and fewer high-severity outages)
- An advocate for Chaos Engineering and DevOps practices
- A team player and coach - someone who’s willing to put in the extra effort to help their team succeed in adopting new practices and technologies
- Someone who openly seeks opportunities to talk about their experiences with Chaos Engineering amongst the broader engineering community
Why become a Gremlin Chaos Champion?
- Meet quarterly with the Gremlin advocacy and product teams - see new product updates first!
- Be recognized as a leader amongst your peers - we’ll promote your success through case studies, blogs, and social media
- Author content on the Gremlin blog and speak at Gremlin events
- Connect with Gremlin Chaos Champions to sharpen your skills and share success stories
Is there someone at your company who deserves to be recognized for the work they’re doing to improve reliability? Or are you that person? It’s okay to nominate yourself; we won’t tell ;). We’ll review all nominations and select new Gremlin Chaos Champions on a quarterly basis.