Manage your reliability work more easily with Gremlin’s newest features
Reliability testing is ongoing work, and tracking that work can be difficult in large organizations. Engineers run one-off experiments, scheduled Scenarios run in the background, and, for more mature teams, CI/CD workflows fire off automated tests on demand. According to our own product metrics, teams run an average of 200 to 500 tests each day! With so much happening, it’s hard to keep track of everything going on in Gremlin—until now.
Today, we’re excited to announce the launch of three new screens for managing activity in Gremlin: Now Running, What’s Scheduled, and What Ran.
Scheduling improvements for the entire Gremlin platform
In the past, tracking activity in Gremlin meant navigating between multiple areas of the application, including experiments, Scenarios, Failure Flags, services, and the notification bell. This made it hard to see which tests had been run on a service, what your teammates were doing, and what you’d worked on recently. With our latest release, keeping tabs on your activity is much easier.
Each page surfaces these Gremlin activities: experiments, Scenarios, reliability tests, and Failure Flags.
Now Running
The Now Running page is your single-pane-of-glass view of everything currently running in your Gremlin team. With it, you can see who (or what) started an activity, what time it started, how long it will run, and which system(s) or service(s) it’s impacting. Users with the correct privileges can also halt running activities directly from this page. This page effectively replaces the notification bell previously shown in the top-right corner of the Gremlin web app.
This page is particularly useful for synchronizing your reliability testing with your coworkers. If anyone in your Gremlin team is running a test or experiment, it will show up here. This gives you a quick and easy way to see which resources are being tested so you don’t accidentally run multiple tests on the same resource at the same time. This isn’t a problem for services, since they only allow one test at a time, but that restriction doesn’t apply to experiments or Scenarios.
What’s Scheduled
Looking to plan your next week of reliability work? The What’s Scheduled page is for you. It shows every activity scheduled for the next week, estimated start times, and impacted services.
What’s Scheduled helps teams plan their reliability work for the week ahead. It’s easy to pull up on a Monday morning (or during standup) and review which services are being tested and when those tests are running.
This feature also helps when running larger-scale reliability initiatives involving multiple teams, such as availability zone and region redundancy testing. If you’re planning a test that involves services used by other teams (availability zone or region outages, auto-scaling tests, etc.), check the What’s Scheduled page first to make sure your testing doesn’t conflict with other planned tests.
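If you keep your own list of planned tests alongside What’s Scheduled, the conflict check described above comes down to asking whether two time windows overlap and whether they touch any of the same services. Here is a minimal sketch of that logic. The `Activity` class, its fields, and the sample data are hypothetical and used only for illustration; they are not Gremlin’s data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Activity:
    """A scheduled activity. Hypothetical fields, not Gremlin's schema."""
    name: str
    start: datetime
    duration: timedelta
    services: frozenset[str]

    @property
    def end(self) -> datetime:
        return self.start + self.duration


def find_conflicts(planned: Activity, scheduled: list[Activity]) -> list[Activity]:
    """Return scheduled activities that overlap the planned one in time
    and impact at least one of the same services."""
    return [
        other
        for other in scheduled
        if planned.start < other.end
        and other.start < planned.end
        and planned.services & other.services
    ]


# Made-up schedule for illustration only.
scheduled = [
    Activity("AZ outage", datetime(2024, 6, 3, 10, 0),
             timedelta(minutes=30), frozenset({"checkout", "cart"})),
    Activity("CPU scaling", datetime(2024, 6, 3, 14, 0),
             timedelta(minutes=15), frozenset({"search"})),
]
planned = Activity("Region failover", datetime(2024, 6, 3, 10, 15),
                   timedelta(minutes=20), frozenset({"cart"}))

print([c.name for c in find_conflicts(planned, scheduled)])  # ['AZ outage']
```

Two windows overlap when each one starts before the other ends, so a single comparison pair covers every case: partial overlap, full containment, and identical windows.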
What Ran
Lastly, check out What Ran if you need a recap of your recent reliability work. This page shows you all the tests run over the past week, who (or what) ran them, and which services were impacted.
Where “What’s Scheduled” is a useful report to review at the start of the week, “What Ran” is useful for end-of-week reviews. By seeing what was done, you can better plan for future reliability work. This includes identifying services that were not tested and tests that were not run. It’s also useful for determining how or why a recent test failed since the result of each activity is shown with a direct link to the test details page.
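As a rough illustration of that end-of-week review, the sketch below takes a list of the week’s activities and reports which services had no tests and which activities failed. It assumes you have recorded the week’s activity yourself as simple records. The field names (`name`, `services`, `result`), the `"failed"` value, and the sample data are all hypothetical and do not reflect a Gremlin export format.

```python
def untested_services(all_services, activities):
    """Return services that no activity touched, sorted by name."""
    tested = {svc for act in activities for svc in act["services"]}
    return sorted(set(all_services) - tested)


def failed_activities(activities):
    """Return the names of activities whose recorded result is 'failed'."""
    return [act["name"] for act in activities if act["result"] == "failed"]


# Made-up records for illustration only.
all_services = ["cart", "checkout", "search", "billing"]
week = [
    {"name": "AZ outage", "services": ["checkout", "cart"], "result": "passed"},
    {"name": "CPU scaling", "services": ["search"], "result": "failed"},
]

print(untested_services(all_services, week))  # ['billing']
print(failed_activities(week))                # ['CPU scaling']
```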
What Ran is also a great accompaniment to Gremlin’s existing reports: the Company Summary, Team Score, and Team Risk reports. The reports show the current state of your reliability testing initiatives, while What Ran shows how your team arrived at that state. This makes it a great way to surface details to management, or for planning future reliability work.
Scheduling improvements are available today
The Now Running, What’s Scheduled, and What Ran screens are available today for all Gremlin users. To get started, simply log in to your Gremlin account. You’ll see the new pages in the left-hand navigation menu. You can also click the following links to go directly to Now Running, What’s Scheduled, and What Ran.
If you’re not yet a Gremlin user, you can see these features in action with our latest product tour:
Gremlin's automated reliability platform empowers you to find and fix availability risks before they impact your users. Start finding hidden risks in your systems with a free 30-day trial.