Highlight Chaos Engineering experiments with AppDynamics and Gremlin webhooks
Chaos Engineering with Gremlin is a powerful way to tune your monitoring so that it gathers actionable data, and to train your teams on these tools so that observability expertise isn't siloed in your organization. AppDynamics is an application performance management tool used by companies worldwide to monitor their workloads. Used together, the two tools can help you lower your mean time to detection (MTTD) and increase the availability of your applications. This tutorial walks you through correlating a Gremlin attack with its impact in AppDynamics.
Prerequisites
- An AppDynamics account (get a trial here)
- The AppDynamics App Server Agent installed in an application, and the Machine Agent installed on its server or instance
- A Gremlin account (request a free trial)
Step 1: Make a custom event role in AppDynamics
First, you need to add a role that lets Gremlin post custom events to AppDynamics. Go to Settings -> Administration.
Select Roles -> Create. Provide a Name: <span class="code-class-custom">Gremlin Events</span>. Then select Applications, check “View”, click “Edit”, and check “Create Events”.
Click “Save”.
Step 2: Make a custom event user
Now you need a user that has the Gremlin Events role. Go to Users, display users from “AppDynamics”, and click “Create”.
Enter Username: gremlin, Email: <span class="code-class-custom">{your_email}</span>, Name: <span class="code-class-custom">Gremlin Events</span>, Password: <span class="code-class-custom">{password}</span>.
Under Roles, add “Gremlin Events”.
Click “Save”.
Step 3: Gather your account name and app ID
You’ll need your account name for the login key and your application ID for the webhook URL. In AppDynamics, go to Settings -> License.
Go to Account. Take note of your Account Name next to “Name”; you’ll use it in the next step.
Open the Applications dashboard and select the application you wish to experiment on. In the URL, you’ll find <span class="code-class-custom">application={app_id}</span>. Mine, for example, is <span class="code-class-custom">10691</span>. Note that app_id number for Step 5.
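If you would rather not pick the number out of the address bar by eye, a small helper can extract it. This is a minimal sketch: the function name `extract_app_id` and the example URL are made up for illustration, and it assumes only that the URL contains `application=` followed by digits, as described above.

```python
import re
from typing import Optional


def extract_app_id(controller_url: str) -> Optional[str]:
    """Return the digits following 'application=' in the URL, or None."""
    # The parameter may sit in the fragment (after '#') rather than the query
    # string, so search the whole URL instead of parsing the query.
    match = re.search(r"\bapplication=(\d+)", controller_url)
    return match.group(1) if match else None


# Hypothetical controller URL, for illustration only.
example_url = (
    "https://example.invalid/controller/"
    "#/location=APP_DASHBOARD&application=10691&dashboardMode=force"
)
print(extract_app_id(example_url))
```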
Step 4: Encode your login key
Now that you’ve gathered that information, you’ll need to encode it for the Authorization header in the webhook. In your terminal (Mac or Linux) enter:
or Command Prompt (Windows) enter:
Save that output for the next step.
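As a cross-platform alternative to the shell commands, the encoding can be sketched in Python. The sketch rests on two assumptions: that the login string has the form `username@account_name:password` (AppDynamics' usual convention for REST calls with HTTP Basic authentication), and that the header value is the word `Basic` followed by the Base64 text. The function name and the placeholder values are illustrative only.

```python
import base64


def basic_auth_header(username: str, account_name: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value.

    Assumes the login string has the form username@account_name:password.
    """
    credentials = f"{username}@{account_name}:{password}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# Placeholder values; substitute your own account name and password.
print(basic_auth_header("gremlin", "customer1", "secret"))
```

Treat the output like a password: Base64 is an encoding, not encryption, so anyone who sees the value can recover the credentials.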
Step 5: Create two Gremlin webhooks
The next step is to create two webhooks: one for when the Gremlin attack starts and one for when it finishes. In Gremlin, go to Settings -> Team Settings.
Select Webhooks -> New Webhook. Enter the Name <span class="code-class-custom">AppD Basic Webhook Start</span>, your Description and the following URL with your own controller’s address:
Then add a header key:value with the key <span class="code-class-custom">Authorization</span> and the value using the key generated from the previous step:
Select “Attack Running” and click Save.
Add a second webhook with the Name <span class="code-class-custom">AppD Basic Webhook Finish</span>, your Description and the following URL with your own controller’s address:
Then add a header key:value with the key <span class="code-class-custom">Authorization</span> and the value using the key generated from the previous step:
Select “Attack Finished” and click Save.
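Before relying on the webhooks, it can help to post a custom event yourself and confirm that the role and credentials work. The sketch below only builds the request. It assumes the AppDynamics events REST endpoint (`/controller/rest/applications/{app_id}/events`) with the query parameters `severity`, `summary`, `eventtype`, and `customeventtype`; check these against your controller's documentation. The controller address, the `gremlin` custom event type, and the function name are placeholders.

```python
import urllib.parse
import urllib.request


def build_event_request(controller: str, app_id: str,
                        auth_header: str, summary: str) -> urllib.request.Request:
    """Build (but do not send) a POST that creates a custom event."""
    # Parameter names are assumptions based on the AppDynamics events API.
    params = urllib.parse.urlencode({
        "severity": "INFO",
        "summary": summary,
        "eventtype": "CUSTOM",
        "customeventtype": "gremlin",  # hypothetical label for these events
    })
    url = (f"{controller.rstrip('/')}/controller/rest/applications/"
           f"{app_id}/events?{params}")
    return urllib.request.Request(
        url, method="POST", headers={"Authorization": auth_header}
    )


# Placeholder controller address and header value.
request = build_event_request(
    "https://example.invalid", "10691", "Basic abc", "Gremlin attack started"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

If the manual request is rejected, the same problem would affect the webhooks, so this is a quick way to separate permission issues from webhook configuration issues.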
Step 6: Create a dashboard in AppDynamics
You’ll need a way to visualize the attack. In AppDynamics, go to Dashboard & Reports -> Create Dashboard. Enter the Name <span class="code-class-custom">Gremlin Attack Dashboard</span>.
Click “Add a Widget” -> “Time Series Graph” and click the + sign under Data. Under “Select Data Source” select “Servers” and under “Select a Metric” select “Hardware Resources|CPU|%Busy” and Save.
Under Events, select “Show Events”, and for the Data Source select the application whose app_id you noted in Step 3. Under “Filter Criteria”, unselect all items and select “Custom”. Click Save.
Click “Add Widget” again and select “Health Rules & Events” then “Event List”.
Under Events, select Show As “Timeline” and set the Data Source to the application you chose in Step 3. Under Filter Criteria, unselect all items, then select Custom. Click Save.
Your simple dashboard is all set up.
Step 7: Run a CPU attack
In Gremlin, go to Attacks -> New Attack. Select the host(s) where you have the AppDynamics agent(s) installed. Select “Choose a Gremlin” and Resource -> CPU. Set the length to <span class="code-class-custom">300</span> seconds, CPU Capacity of <span class="code-class-custom">60</span>%, and <span class="code-class-custom">All Cores</span>.
Click “Unleash Gremlin” and head back over to your AppDynamics dashboard. There you can see where the attack started and CPU usage spiked, and where it finished and CPU usage wound down.
Conclusion
The CPU attack is a great first attack, but with Gremlin and AppDynamics together you can run many more experiments, such as tracing how a small amount of backend latency propagates to front-end latency so you can watch for latency that compounds exponentially. You can also use Gremlin to test your thresholds and tune your AppDynamics alerting to prevent noisy alerts. Fire up an attack and make sure your alerts fire at the appropriate time. Target random hosts to make sure you cover your application.
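One way to reason about whether an alert fires "at the appropriate time" is to measure how long after the attack started a metric first crossed your threshold. The following is a minimal sketch using made-up CPU samples. The function name, the data, and the threshold are all hypothetical, and it assumes the samples are sorted by time.

```python
from typing import Optional, Sequence, Tuple


def seconds_to_breach(samples: Sequence[Tuple[float, float]],
                      attack_start: float,
                      threshold: float) -> Optional[float]:
    """Return seconds from attack start to the first sample at or above
    the threshold, or None if the threshold was never reached."""
    for timestamp, cpu_busy in samples:  # samples must be time-ordered
        if timestamp >= attack_start and cpu_busy >= threshold:
            return timestamp - attack_start
    return None


# Hypothetical (seconds, %Busy) samples, for illustration only.
samples = [(0, 12.0), (60, 14.0), (120, 58.0), (180, 61.0), (240, 60.5)]
print(seconds_to_breach(samples, attack_start=100, threshold=55.0))
print(seconds_to_breach(samples, attack_start=100, threshold=70.0))
```

A result of `None` for a 60% CPU attack would suggest the threshold is set too high for that alert to ever fire, while a very short delay on a low threshold may point to an alert that is likely to be noisy.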
We look forward to seeing what you build!