How to Install and Use Gremlin on Fedora

Rich Burroughs
Community Manager
Last Updated: December 11, 2019

Gremlin is a simple, safe and secure service for performing Chaos Engineering experiments through a SaaS-based platform.

This tutorial will show you how to install the Gremlin agent on Fedora hosts, and how to perform your first Chaos Engineering experiment, a CPU attack.

  • Step 1 - Installing the Gremlin Agent
  • Step 2 - Running your first CPU experiment
  • Step 3 - Halting an attack

Prerequisites

A Fedora host. You need to have sudo or root access on the host. This tutorial was tested with Fedora 30, but should work with other versions.

A Gremlin account (sign up here).

Step 1 - Installing the Gremlin Agent

Connect to your host with ssh:

BASH

ssh username@your_server_ip

Add the Gremlin repo:

BASH

sudo curl https://rpm.gremlin.com/gremlin.repo -o /etc/yum.repos.d/gremlin.repo

Then install the Gremlin agent:

BASH

sudo yum install -y gremlin gremlind
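Before configuring the agent, it can help to confirm that the install put the gremlin command on your PATH. The following is a minimal sketch rather than an official check; the helper names have_cmd and report_gremlin are made up for this example and are not part of Gremlin.

```bash
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# report_gremlin (also hypothetical) prints whether the gremlin CLI was found.
report_gremlin() {
  if have_cmd gremlin; then
    echo "gremlin found at $(command -v gremlin)"
  else
    echo "gremlin not found on PATH"
  fi
}

report_gremlin
```

If the command is not found, re-check the repo file and the install step above. On an RPM-based system, rpm -q gremlin gremlind also lists the installed package versions.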

The next step is to configure the Gremlin agent with your Gremlin Team ID and Gremlin Secret. Log into the Gremlin web UI with your email address and password, and then go to Company Settings and click on Teams.

Team Settings screen

Click on your team in the list. Then click on Configuration.

Team Configuration

To configure the Gremlin agent you’ll need the Team ID and Secret Key. Both are generated automatically when your company is created. The Team ID is displayed on this screen, but the Secret Key is hidden. If you don’t know your Secret Key, you can hit the Reset button to create a new one.

Resetting the key will require you to update the key on any other agents you have running. After hitting Reset, you’ll see a popup screen explaining this and asking for confirmation. Hit Continue.

Reset Secret Key

Next you’ll see a window where you can copy the Secret Key:

Copy Secret Key

Be sure to record your Secret Key now, as this is the only time you will be able to view it. If you lose it, you’ll need to hit the Reset button again to generate a new one.

Now that we have the Gremlin Team ID and Secret Key, we can finish configuring the agent. Go back to your SSH session on the Fedora host and run this command:

BASH

gremlin init

Input your Team ID and Secret Key when you’re prompted for them.

The setup is now complete and you’re ready to begin running Chaos Engineering experiments!

Step 2 - Running your first CPU experiment

On your Fedora host, run the “top” command. This is how we’ll view the CPU usage for this experiment.

Top command output
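If you would rather capture a single number than watch the interactive display, CPU usage can also be estimated from two samples of the first line of /proc/stat. The sketch below is one way to do that, under the assumption that the host exposes the standard Linux /proc/stat layout; cpu_busy_percent is a hypothetical helper name, not something Gremlin provides.

```bash
# cpu_busy_percent is a hypothetical helper, not part of Gremlin.
# It takes two aggregate "cpu ..." lines from /proc/stat (earlier sample first)
# and prints the integer percentage of non-idle CPU time between them.
cpu_busy_percent() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      # Fields 2-9: user nice system idle iowait irq softirq steal
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i
      idle = $5 + $6   # idle plus iowait
    }
    NR == 1 { t1 = total; i1 = idle }
    NR == 2 {
      dt = total - t1
      di = idle - i1
      if (dt <= 0) { print 0 } else { printf "%d\n", 100 * (dt - di) / dt }
    }'
}

# Take two samples one second apart and print the busy percentage.
if [ -r /proc/stat ]; then
  first=$(head -n 1 /proc/stat)
  sleep 1
  second=$(head -n 1 /proc/stat)
  echo "CPU busy: $(cpu_busy_percent "$first" "$second")%"
fi
```

Running this once before the attack and again while it is in the Running state gives a simple before-and-after comparison to set beside the top output.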

In the Gremlin web UI, click the Attacks link in the left navigation bar, and then click the New Attack button.

Click on New Attack

There are several ways to target which hosts or containers you want to attack. The default is Hosts, and we’ll use that. Click the Exact button and select your Fedora host.

Host targeting

Scroll down and click Choose a Gremlin. Select Resource and then CPU.

Select CPU attack

Scroll down again to enter the settings for the attack. For this first attack we’ll set the length to 180 seconds, select All Cores from the pulldown menu, and leave the CPU percentage at the default setting. Then click Unleash Gremlin, which will start the attack.

CPU attack settings

You’ll then see the attack listed as Running.

Go back to your SSH session on the Fedora host and examine your top output. Once the attack changes to a Running state, you should see much more CPU activity than previously.

Top command output

The attack will end after the 180 seconds have passed. You’ll then see it listed on the Attacks page as Completed.

Step 3 - Halting an attack

It’s a recommended practice to define abort conditions before running Chaos Engineering experiments. Abort conditions are signals that should make us halt an experiment immediately because they put the safety of our systems in doubt. Examples include an increase in error rate, an increase in latency, or specific alerts we receive.
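Writing abort conditions down as explicit thresholds makes them easier to act on during an experiment. The following sketch shows the idea for an error rate and a latency limit. The helper names, the thresholds, and the sample values are all illustrative assumptions; in practice the observed numbers would come from your own monitoring, and halting the attack is still done through Gremlin.

```bash
# exceeds is a hypothetical helper: succeeds if the observed integer value
# is greater than the limit.
exceeds() {
  [ "$1" -gt "$2" ]
}

# should_abort (hypothetical) succeeds if either abort condition is met.
# Arguments: error_rate_pct max_error_rate_pct latency_ms max_latency_ms
should_abort() {
  exceeds "$1" "$2" || exceeds "$3" "$4"
}

# Illustrative values only, not real measurements.
error_rate_pct=7
max_error_rate_pct=5
latency_ms=120
max_latency_ms=500

if should_abort "$error_rate_pct" "$max_error_rate_pct" "$latency_ms" "$max_latency_ms"; then
  echo "Abort condition met: halt the attack"
else
  echo "Within limits: experiment may continue"
fi
```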

For abort conditions to be useful, our Chaos Engineering tool needs to allow us to halt experiments immediately. Gremlin allows us to halt individual attacks, or all running attacks.

In the Gremlin UI go to the Attacks page and hover over the three dots on the right of the attack you just ran. Click on Rerun Attack.

Attack listing

This will put you back in the targeting interface. The attack will default to all of the same settings you used last time, so just scroll down to the bottom of the screen and click Unleash Gremlin.

Once the attack is in the Running state, there are two options for halting it. We can either click the Halt button to the right of the attack, or the Halt All Attacks button at the top of the screen.

Halt attack

In this case either would work, as we only have one attack running, but in some situations we might want to halt one attack without impacting others.

The ability to quickly halt all running experiments is an important part of Chaos Engineering, and allows us to experiment in a safe way.

Conclusion

At this point you have a Fedora host running with Gremlin, you’ve run your first Chaos Engineering attack, and you’ve learned how to halt running attacks. Congrats! For next steps you could try running some other types of attacks, like Memory, Latency or DNS.

To learn more about Gremlin you can read the documentation, which explains the other types of Chaos Engineering attacks you can perform. To learn more about Chaos Engineering join our Chaos Engineering Slack, and read more tutorials on our Community page.
