How to load-balance across multiple availability zones for improved redundancy
Load balancers are some of the most important load-bearing (pun intended) components in cloud environments. They perform multiple critical tasks: network switching, packet inspection, and of course, routing. Most cloud-based load balancers focus on load balancing within a single zone, but what if you have resources spread across multiple zones?
In this blog post, we’ll explain how cross-zone load balancing works, why it’s important to reliability, and how you can enable it in your own cloud deployments.
What is cross-zone load balancing, and why is it important?
Cloud environments like AWS provide multiple isolated availability zones (AZs) that you can deploy resources to. Each AZ acts as its own separate data center, ensuring that failures in one AZ don’t spread to other AZs. Even though AZs are separate, you can still network resources together using Virtual Private Clouds (VPCs). For instance, the worker nodes of a Kubernetes cluster can sit in separate zones and still communicate with each other as if they’re located on the same rack.
However, traffic coming inbound from the public Internet (e.g. from your customers) isn’t aware of your VPC topology. For public traffic to route to the right systems, you need a load balancer to receive and redirect traffic. Normally, a load balancer node only routes traffic within a single AZ, since it’s only aware of resources located in that AZ. With cross-zone load balancing enabled, the load balancer nodes in each zone take resources in all enabled zones into consideration when routing traffic.
For example, imagine you have six EC2 instances: two are running in us-east-1a, and four are running in us-east-1b. You also have two load balancer nodes: one for each zone. Assuming incoming traffic is divided evenly between the two nodes, traffic entering us-east-1a will be split 50/50 between the two instances (about 25% of the total traffic each), and traffic entering us-east-1b will be split four ways across the four instances (about 12.5% each). If cross-zone load balancing is enabled, traffic is split six ways across the instances in both zones, leaving each instance with around 17% of the total traffic. Your cloud topology stays the same, but the load distribution changes to use all available resources more evenly.
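The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the zone labels and the function name are hypothetical, and it assumes each zone's load balancer node receives an equal share of incoming traffic.

```python
from fractions import Fraction


def per_instance_share(instances_per_zone, cross_zone):
    """Return each zone's per-instance share of the total traffic.

    Assumption: every zone has one load balancer node, and each node
    receives an equal share of the incoming traffic.
    """
    zone_count = len(instances_per_zone)
    total_instances = sum(instances_per_zone.values())
    shares = {}
    for zone, count in instances_per_zone.items():
        if cross_zone:
            # Every node spreads its traffic over all instances in all zones.
            shares[zone] = Fraction(1, total_instances)
        else:
            # Each node only spreads its traffic over its own zone's instances.
            shares[zone] = Fraction(1, zone_count) / count
    return shares


# Hypothetical zone labels matching the example: two instances and four instances.
zones = {"zone-a": 2, "zone-b": 4}
for enabled in (False, True):
    shares = per_instance_share(zones, cross_zone=enabled)
    print(enabled, {zone: f"{float(share):.1%}" for zone, share in shares.items()})
```

Without cross-zone load balancing, the instances in the smaller zone each carry twice the load of the instances in the larger zone. With it enabled, every instance carries one sixth of the traffic.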
How do I enable cross-zone load balancing?
In AWS, cross-zone load balancing is enabled by default for newly created Application Load Balancers (ALBs). Network Load Balancers and Gateway Load Balancers, by contrast, have it disabled by default. You can confirm the current setting by logging into your AWS account, navigating to EC2 → Load Balancers, selecting your load balancer, and selecting the Attributes tab.
This flag can also be toggled for individual target groups. By default, target groups inherit this setting from the load balancer, but you can explicitly enable or disable it from the target group’s settings. We recommend leaving it to inherit from the load balancer, and leaving cross-zone load balancing enabled on the load balancer.
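If you’d rather check this from a script than from the console, here’s a minimal sketch using boto3. It assumes the attribute key is load_balancing.cross_zone.enabled and that a target group set to inherit reports the value use_load_balancer_configuration, as described in the AWS documentation; the helper names are our own, and the ARNs are values you’d supply yourself.

```python
CROSS_ZONE_KEY = "load_balancing.cross_zone.enabled"
INHERIT = "use_load_balancer_configuration"


def cross_zone_setting(attributes):
    """Return the cross-zone value from a list of {"Key", "Value"} dicts, or None."""
    for attribute in attributes:
        if attribute.get("Key") == CROSS_ZONE_KEY:
            return attribute.get("Value")
    return None


def effective_cross_zone(lb_attributes, tg_attributes):
    """Resolve the setting that applies to a target group.

    The target group's own value wins unless it is missing or set to inherit,
    in which case the load balancer's value is used.
    """
    lb_value = cross_zone_setting(lb_attributes)
    tg_value = cross_zone_setting(tg_attributes)
    if tg_value in (None, INHERIT):
        return lb_value == "true"
    return tg_value == "true"


def fetch_attributes(load_balancer_arn, target_group_arn):
    """Fetch both attribute lists from AWS (requires boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("elbv2")
    lb = client.describe_load_balancer_attributes(LoadBalancerArn=load_balancer_arn)
    tg = client.describe_target_group_attributes(TargetGroupArn=target_group_arn)
    return lb["Attributes"], tg["Attributes"]
```

To use it, pass the two lists returned by fetch_attributes into effective_cross_zone. A False result for a target group you expected to inherit the setting usually means it was explicitly disabled there.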
For detailed information, see the Application Load Balancer section of the AWS documentation.
How do I validate that my services are load-balanced across zones?
There are a few ways you can confirm that traffic is being load-balanced across zones.
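If you’re running EC2 or EKS, one direct method is to have each of your endpoints return its own IP address. For example, you could set up the Nginx web server on each EC2 instance and add a rule that responds with the server’s own address, such as a return directive that uses Nginx’s built-in $server_addr variable.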
When your load balancer receives a request, it will forward the request to one of your instances. The instance will respond with its IP address. As you send additional requests, the IP address in the response should change, indicating that it’s coming from different instances.
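To make this check repeatable, you could script the requests and tally the responses. The sketch below assumes each instance replies with its own IP address as the plain-text response body; the function names and the URL in the usage note are placeholders, not part of any AWS or Gremlin API.

```python
from collections import Counter
from urllib.request import urlopen


def fetch_backend_id(url, timeout=5):
    """Request the URL and return the response body (assumed to be the instance's IP)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def tally_backends(url, requests=50, fetch=fetch_backend_id):
    """Send several requests and count how many responses came from each backend."""
    return Counter(fetch(url) for _ in range(requests))
```

For example, calling tally_backends("http://my-load-balancer.example.com/") returns a count per IP address. If you then map each address to its zone and only one zone ever appears, traffic isn’t being spread across zones.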
Alternatively, Gremlin can automatically detect this for you. In Gremlin, you can define your load balancer as a service. This means that you can measure, test, and track the reliability of the load balancer and all of its downstream instances as a single functional unit. As soon as you add a load balancer as a service, Gremlin automatically detects several potential reliability risks. This includes whether it has cross-zone load balancing enabled. If it doesn’t, Gremlin will flag this as a risk. You’ll also get recommendations on how to fix the issue directly in the web app.
Once you define your load balancer service(s) in Gremlin, Gremlin will keep monitoring them for these and other risks. You’ll also see reports for any other services that you (or someone on your Gremlin team) have added, along with their risks. These risks ultimately feed into the reliability score, which is a measure of how reliable the service is.
A third way to verify that you have cross-zone load balancing enabled is by running one of Gremlin’s pre-built reliability tests. The default Gremlin reliability test suite comes with a zone redundancy test that simulates an AZ outage. It does this by running a blackhole experiment, which drops all network traffic to and from the AZ (excluding traffic from Gremlin) for 5 minutes. During this time, it uses Health Checks (which you can configure yourself, or let Gremlin configure automatically) to track the state of your service. If your load balancer keeps returning successful responses throughout the test, that means cross-zone load balancing is working. If it returns an error, Gremlin automatically stops the test, restores connectivity to the AZ, and records the test as having failed.
What about cross-region load balancing?
Unfortunately, load balancing across regions isn’t quite as easy to set up. You’ll need to use a different service, like Route 53, to connect multiple ELBs together. Similarly to ELBs, you can configure Route 53 to monitor the health of each load balancer and only route traffic to healthy ELBs. You can also specify whether to route traffic based on geoproximity to the user, lowest latency, or a number of other policies.
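As a rough mental model of health-aware, latency-based routing, consider the toy sketch below. It doesn’t call Route 53 and isn’t how Route 53 is implemented; the Endpoint type, the field names, and the latency figures are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str          # hypothetical label for a regional load balancer
    healthy: bool      # result of the most recent health check
    latency_ms: float  # measured latency from the user's location


def pick_endpoint(endpoints):
    """Return the healthy endpoint with the lowest latency, or None if all are down."""
    healthy = [endpoint for endpoint in endpoints if endpoint.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda endpoint: endpoint.latency_ms)


endpoints = [
    Endpoint("elb-region-1", healthy=False, latency_ms=20.0),
    Endpoint("elb-region-2", healthy=True, latency_ms=65.0),
]
print(pick_endpoint(endpoints).name)  # the closer endpoint is skipped while unhealthy
```

The point of the model is that an unhealthy endpoint is excluded before the routing policy is applied, so a regional outage shifts traffic to the next-best region instead of failing requests.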
How do you verify that your cross-region load balancing method works? By using one of Gremlin’s Recommended Scenarios. A Recommended Scenario works the same way as the reliability test described above, only it allows you to select the targets yourself. In this example, you can select an entire AWS region, and Gremlin will only drop network traffic to targets in that region. All you need to do is customize the Scenario, select the region you want to target in the blackhole experiment, and add a Health Check. The Health Check should monitor your Route 53 endpoint, not your ELB endpoint.
When you’ve configured the Scenario, click Run. Gremlin will drop traffic to the region and monitor your Route 53 endpoint to ensure traffic gets rerouted successfully. If it can reach the endpoint for two full minutes, the test is successful, and you know that your cross-region load balancing is configured correctly.
What else can I test using Gremlin?
Gremlin provides a number of other experiment types and pre-built reliability tests. You can recreate AZ and region outages, add latency, check for expiring TLS certificates, create packet loss or jitter, or simulate a DNS outage. These are just the network-based faults—you can also use the other experiment types in Gremlin’s library to recreate other failure modes, including CPU and memory load, disk exhaustion, process exhaustion, and desynchronized system clocks.
To start testing load balancer redundancy, log into your Gremlin account or sign up for a free 30-day trial.
Gremlin's automated reliability platform empowers you to find and fix availability risks before they impact your users. Start finding hidden risks in your systems with a free 30 day trial.