Elastic Load Balancing
Load balancing is a networking concept used to distribute load (in simpler terms, traffic) across different servers of the same application. This is particularly useful for improving the fault tolerance and performance of applications running in AWS.
Elastic Load Balancing (ELB) integrates easily with EC2, EKS and Lambda.
One point to note here is that there are two types of scaling.
ELB also supports Availability Zone balancing. This means that if the servers hosted on EC2/EKS/Lambda for the same application or use case are distributed across different Availability Zones, ELB can serve the user from the nearest one to enhance the user experience.
Load balancers have a listener that receives traffic and forwards it to an internal web or app server, depending on the application architecture.
Usually, the recipient of the load balancer traffic is configured in the form of a target group.
It is also important to note that the load balancer configuration has a setting called target type, which can be either instance or IP. When instance is selected, traffic is forwarded to the instance's private IP address on its primary network interface. When IP is selected, a private IP address has to be specified; it can belong to an interface other than the primary one, provided the instance has more than one IP address or interface.
The members of the target group are also subjected to a health check to determine whether traffic can be passed to them. This health check is usually performed as a GET request, and a successful response indicates that the member has passed it.
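The health check behaviour described above can be sketched roughly as follows. This is only an illustration, not how ELB is implemented: the `HealthTracker` class and both threshold values are assumptions made for the example, and the GET probe is reduced to the status code it returned.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks one target's health from consecutive probe results.

    The thresholds are illustrative assumptions, not values from the text.
    """
    healthy_threshold: int = 3    # consecutive successes needed to become healthy
    unhealthy_threshold: int = 2  # consecutive failures needed to become unhealthy
    healthy: bool = False         # in this sketch a target starts out unhealthy
    _successes: int = 0
    _failures: int = 0

    def record(self, status_code: int) -> bool:
        """Record the status code of one GET probe and return the health state."""
        if status_code == 200:
            self._successes += 1
            self._failures = 0
            if self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy


tracker = HealthTracker()
states = [tracker.record(code) for code in (200, 200, 200, 503, 503)]
print(states)  # [False, False, True, True, False]
```

Only targets whose state is healthy would receive traffic; a single failed probe does not remove a target until the failure threshold is reached.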
There are four types of Load Balancers:
Application Load Balancer (ALB)
For passing traffic based on request attributes.
It uses the round robin algorithm by default to select a target for each request.
It supports AWS WAF, which can be used in conjunction with the load balancer.
Note that it terminates the client connection, inspects the HTTP headers and opens a separate connection to the target. Its listeners can use any port from 1 to 65535 for HTTP and HTTPS.
It supports path-based and host-based routing (particularly useful for microservices-based architectures).
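As a minimal sketch of the round robin selection mentioned above, the snippet below cycles through a fixed list of targets. The target names are placeholders, and the real load balancer also accounts for target health, which this sketch ignores.

```python
from itertools import cycle

# Placeholder target names; a real target group holds instance IDs or IPs.
targets = ["webs-1", "webs-2", "webs-3"]

# cycle() yields the targets in order and starts over after the last one.
next_target = cycle(targets)

picks = [next(next_target) for _ in range(5)]
print(picks)  # ['webs-1', 'webs-2', 'webs-3', 'webs-1', 'webs-2']
```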
Network Load Balancer (NLB)
For passing traffic based on layer 4 protocol and port number.
It supports any TCP connection.
It uses a flow hash algorithm to spread connections evenly across all targets.
It passes traffic through as is and does not terminate HTTP or HTTPS client connections. Hence, the target remains the same until the TCP connection times out.
It does not support host-based or path-based routing.
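The idea behind flow hash selection can be sketched as below. The hash function actually used by the load balancer is internal to AWS, so SHA-256 is only a stand-in here, and the `pick_target` helper, the tuple layout and the addresses are assumptions made for illustration.

```python
import hashlib


def pick_target(flow, targets):
    """Map a flow tuple to one target deterministically.

    flow: (protocol, source IP, source port, destination IP, destination port)
    """
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the targets.
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


targets = ["webs-1", "webs-2", "webs-3"]
flow = ("tcp", "203.0.113.10", 54321, "172.31.1.10", 443)

first = pick_target(flow, targets)
second = pick_target(flow, targets)
print(first == second)  # True: the same flow always maps to the same target
```

Because the choice depends only on the flow tuple, every packet of one TCP connection reaches the same target, while different connections spread across the group.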
Gateway Load Balancer (GWLB)
For passing traffic through security appliances.
Classic Load Balancer (CLB)
For load balancing across individual EC2 instances.
This is usually used for the old EC2-Classic network (applicable before 2014) and hence is not recommended for VPC deployments.
Note: Sticky sessions make sure that once a user connection received by the ELB is forwarded to a server, all subsequent connections from that user go to the same server for a specified period of time. This prevents the connection from bouncing between the load-balanced servers.
Note: Idle timeouts ensure that a client connection that is no longer being used gets closed after a specified number of seconds.
Consider an application with the following architecture:
webs-1, webs-2 and webs-3 are three web servers in three different Availability Zones. They are part of the subnets websnet-1, websnet-2 and websnet-3, which have the private IP ranges 172.31.1.0/24, 172.31.2.0/24 and 172.31.3.0/24 respectively.
To load balance, the first thing needed is a target group, which can be configured as follows:
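For readers who prefer scripting over the console, a target group could also be created with the boto3 `elbv2` client. The sketch below only builds the arguments for its `create_target_group` call; the group name, VPC ID, port and health check path are hypothetical placeholders, not values taken from this walkthrough.

```python
def target_group_params(name, vpc_id, port=80):
    """Build keyword arguments for elbv2 create_target_group (instance targets)."""
    return {
        "Name": name,
        "Protocol": "HTTP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "instance",       # or "ip" to register private IP addresses
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": "/",         # assumed path for the GET health check
    }


# Hypothetical name and VPC ID, for illustration only.
params = target_group_params("webs-tg", "vpc-0123456789abcdef0")
print(params["TargetType"], params["Port"])  # instance 80

# To actually create the group (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("elbv2").create_target_group(**params)
```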
Now, let's see how one can find the creation page for a load balancer on the AWS portal:
Note that if the load balancer were internal, we would select internal instead of internet-facing in the scheme area shown above.
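The scheme choice maps to a single parameter when the load balancer is created through the boto3 `elbv2` client. The sketch below builds the arguments for `create_load_balancer`; the load balancer name, subnet IDs and security group ID are hypothetical placeholders.

```python
def load_balancer_params(name, subnet_ids, security_group_ids, internal=False):
    """Build keyword arguments for elbv2 create_load_balancer (application type)."""
    return {
        "Name": name,
        "Subnets": list(subnet_ids),            # one subnet per Availability Zone
        "SecurityGroups": list(security_group_ids),
        "Scheme": "internal" if internal else "internet-facing",
        "Type": "application",
        "IpAddressType": "ipv4",
    }


# Hypothetical IDs standing in for websnet-1, websnet-2 and websnet-3.
subnets = ["subnet-aaa111", "subnet-bbb222", "subnet-ccc333"]
public_lb = load_balancer_params("webs-alb", subnets, ["sg-0abc123"])
internal_lb = load_balancer_params("webs-alb-int", subnets, ["sg-0abc123"], internal=True)
print(public_lb["Scheme"], internal_lb["Scheme"])  # internet-facing internal
```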
Note that for HTTPS, one can set the listener port to 443. Since HTTPS uses TLS, a certificate must be provided to the load balancer listener, as shown below:
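In script form, an HTTPS listener needs the certificate passed along with the port and protocol. The sketch below builds the arguments for the boto3 `elbv2` `create_listener` call; the three ARN strings are shortened hypothetical placeholders, not real resources.

```python
def https_listener_params(load_balancer_arn, certificate_arn, target_group_arn):
    """Build keyword arguments for an HTTPS listener on port 443."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTPS",
        "Port": 443,
        # The certificate is attached to the listener, which terminates TLS.
        "Certificates": [{"CertificateArn": certificate_arn}],
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn}
        ],
    }


# Placeholder ARNs, for illustration only.
listener = https_listener_params(
    "arn:example:loadbalancer/webs-alb",
    "arn:example:certificate/webs-cert",
    "arn:example:targetgroup/webs-tg",
)
print(listener["Protocol"], listener["Port"])  # HTTPS 443
```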
The remaining common wizard options are shown below:
Note that one can create an alias A record that points to the load balancer in order to route traffic via a domain name.
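As a sketch, the alias record can be expressed as a Route 53 change batch, which the boto3 `route53` client accepts in `change_resource_record_sets`. The domain name, load balancer DNS name and hosted zone ID below are hypothetical; the real DNS name and canonical hosted zone ID come from the load balancer's description.

```python
def alias_change_batch(record_name, lb_dns_name, lb_hosted_zone_id):
    """Build a Route 53 ChangeBatch that aliases record_name to a load balancer."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or update it if present
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": lb_hosted_zone_id,  # the load balancer's zone
                        "DNSName": lb_dns_name,
                        "EvaluateTargetHealth": True,
                    },
                },
            }
        ]
    }


# Hypothetical values, for illustration only.
batch = alias_change_batch(
    "app.example.com.",
    "webs-alb-123456.us-east-1.elb.amazonaws.com.",
    "ZEXAMPLE12345",
)
record = batch["Changes"][0]["ResourceRecordSet"]
print(record["Type"], record["AliasTarget"]["DNSName"])
```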
Now, if one wishes to do rule-based routing, one can set up listener rules. Example: one domain is configured for a load balancer, and the requirement is that a different path after the domain name in the URL should reach a different target group (possibly the same target servers, with the service listening on a different port number) so that the matching resource is loaded. To do this, one can create a rule for each path.
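The path-based example above can be simulated locally as follows. This is a sketch of the matching idea only: the patterns, the target group names and the `route` helper are hypothetical, and Python's `fnmatchcase` stands in for the load balancer's own wildcard matching.

```python
from fnmatch import fnmatchcase

# Ordered rules: (path pattern, target group). The first match wins.
RULES = [
    ("/api/*", "tg-api-8080"),
    ("/images/*", "tg-images-8081"),
]
DEFAULT_TARGET_GROUP = "tg-web-80"  # used when no rule matches


def route(path):
    """Return the target group that should receive a request for this path."""
    for pattern, target_group in RULES:
        if fnmatchcase(path, pattern):
            return target_group
    return DEFAULT_TARGET_GROUP


print(route("/api/users"))        # tg-api-8080
print(route("/images/logo.png"))  # tg-images-8081
print(route("/"))                 # tg-web-80
```

Each target group here could contain the same servers registered on a different port, which matches the scenario described above.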
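Other rules can be based on conditions such as the host header, HTTP headers, the HTTP request method, the query string or the source IP address.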
The process for a Network Load Balancer is almost the same as for an Application Load Balancer. The only difference is that TCP has to be selected instead of HTTP/HTTPS; the rest of the process is exactly the same.
Please note that to serve HTTPS through a Network Load Balancer with a TCP listener, each target must have the certificate installed individually. The certificate cannot be installed on the load balancer in this setup, because the connection is passed through and terminates directly on the target.
Sticky sessions ensure that a particular client's connections are always forwarded to the same target that the client was originally load balanced to. To do this, the load balancer uses a cookie with a unique encrypted value that changes with each request. When different clients communicate with the load balancer, it identifies a particular client by that cookie and, on that basis, forwards every subsequent request to the same target in the target group. The stickiness duration can range from 1 second to 7 days. It can be configured as shown below:
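Outside the console, stickiness is set through target group attributes. The sketch below builds the attribute list accepted by the boto3 `elbv2` `modify_target_group_attributes` call and enforces the 1 second to 7 day range mentioned above; the `stickiness_attributes` helper is a name introduced for this example.

```python
MAX_STICKINESS_SECONDS = 7 * 24 * 60 * 60  # 7 days = 604800 seconds


def stickiness_attributes(duration_seconds):
    """Build target group attributes enabling load balancer cookie stickiness."""
    if not 1 <= duration_seconds <= MAX_STICKINESS_SECONDS:
        raise ValueError("stickiness duration must be between 1 second and 7 days")
    return [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {
            "Key": "stickiness.lb_cookie.duration_seconds",
            "Value": str(duration_seconds),  # attribute values are strings
        },
    ]


attrs = stickiness_attributes(3600)  # keep clients on one target for an hour
print(attrs[2]["Value"])  # 3600
```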
The idle timeout determines how long a client's TCP connection stays open while idle; the HTTP requests and responses travel over this TCP connection. Idle simply means that no data is passing between client and server, and the default timeout is 60 seconds. If a particular target of the target group is too busy or unresponsive and the timeout is reached as a result, the client receives a gateway timeout error with code 504.
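The idle timeout rule can be illustrated with a small simulation. The `Connection` class below is hypothetical and uses plain numbers as timestamps; only the 60 second default comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """A client connection that tracks when data last passed over it."""
    last_activity: float        # timestamp, in seconds, of the last data transfer
    idle_timeout: float = 60.0  # default idle timeout in seconds

    def touch(self, now: float) -> None:
        """Record that data was sent or received at time `now`."""
        self.last_activity = now

    def is_expired(self, now: float) -> bool:
        """Return True once the connection has been idle for the full timeout."""
        return now - self.last_activity >= self.idle_timeout


conn = Connection(last_activity=0.0)
print(conn.is_expired(59.0))  # False: still within the 60 second window
conn.touch(30.0)              # data flows at t=30, which resets the idle clock
print(conn.is_expired(89.0))  # False: only 59 seconds idle
print(conn.is_expired(90.0))  # True: 60 seconds without any data
```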
Just as the idle timeout governs how long the load balancer waits on a client connection, the keep-alive time governs how long the load balancer waits to receive a response from a particular target of the target group before terminating the connection.