Auto Scaling helps you provision and remove instances based on the traffic and demand on your workloads. With Auto Scaling, you can quickly deploy additional servers when certain parameters are met and then release those servers when your load decreases. This can significantly reduce costs, because resources are only provisioned when required. Auto Scaling is an important concept to study for the AWS Certified Solutions Architect, Certified Developer and SysOps Administrator exams. You need to be able to demonstrate how to use Auto Scaling to design highly available and scalable solutions in the cloud, as well as administer and troubleshoot your Auto Scaling configurations. This article, Amazon Auto Scaling Exam Tips, will help you in your revision for the AWS Certification exams.
Typical use cases for Auto Scaling include:
- Websites that focus on sporting events
- eCommerce sites that need to scale out during festivals like Christmas
- Income tax sites that provide additional capacity during certain times of the year to allow users to submit their tax returns
Benefits of Auto Scaling
Adding Auto Scaling to your application architecture is one way to maximise the benefits of the AWS cloud. When you use Auto Scaling, your applications gain the following benefits:
- Fault Tolerance – Auto Scaling can detect when an instance is unhealthy, terminate it, and launch another instance to replace it. You can also configure Auto Scaling to use multiple Availability Zones. If one Availability Zone becomes unavailable, Auto Scaling can launch instances in another Availability Zone to compensate.
- High Availability – Auto Scaling can help you ensure that your application always has the right amount of capacity to handle the current traffic demands.
- Cost Management – Auto Scaling can dynamically increase and decrease capacity as needed. Because you pay for the EC2 instances you use, you save money by launching instances when they are needed and terminating them when they are not.
Auto Scaling Components
- Auto Scaling Group – You launch EC2 instances into auto scaling groups so that they can be managed by the auto scaling policies. The auto scaling group defines configuration options which determine when new instances should be launched and when they should be terminated.
Key points to note:
- You need to specify the group name
- Specify the maximum and minimum number of instances in the group
- Optionally specify the desired number of instances in the group. This will ensure the group maintains a specific number of instances always. If there is no desired number specified, auto-scaling will ensure that the minimum number of instances is available instead.
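- Key Note – You can only specify On-Demand or Spot instances when configuring Auto Scaling. You cannot use Auto Scaling to launch Reserved Instances, because a Reserved Instance is a billing discount applied to matching On-Demand usage rather than a separate type of instance you launch.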
- Auto Scaling Launch Configuration – Your auto scaling group uses a launch configuration, which is essentially a template for EC2 instance configurations. When you create a launch configuration, you can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances.
- Key Note – As part of your launch configuration process, you need to ensure that you do not exceed the limits associated with any services. If your design would exceed the default limits, you will first need to raise a support request to have them increased.
- Scaling Policies – You can associate CloudWatch alarms and scaling policies with your Auto Scaling group. When a CloudWatch metric such as CPU utilisation crosses its threshold, CloudWatch raises an alarm, and Auto Scaling then executes the associated policy to scale your group, for example by increasing the number of instances in your fleet. It is recommended to scale out quickly but scale in slowly, so that you can respond to bursts and spikes without terminating EC2 instances too soon. You can also configure a ‘cooldown’ period, during which further scaling activities for the group are suspended for a short time.
Key Note – A launch configuration cannot be edited once it has been created. To roll out new versions of your EC2 instances, for example after patching them or updating them with new software, create a new launch configuration and update the Auto Scaling group to use it. New instances are launched from the new launch configuration, while running instances are not affected.
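The group settings described above can be sketched as a small helper that checks the size bounds and assembles the keyword arguments for a group. This is only an illustration: the helper name build_group_request and the sample names (web-asg, web-lc-v1, the zone names) are hypothetical, while the dictionary keys follow the parameter names of the Auto Scaling CreateAutoScalingGroup API. When no desired capacity is given, the sketch falls back to the minimum size, as described in the key points.

```python
from typing import Optional


def build_group_request(
    name: str,
    launch_configuration: str,
    min_size: int,
    max_size: int,
    availability_zones: list,
    desired_capacity: Optional[int] = None,
) -> dict:
    """Assemble keyword arguments for an Auto Scaling group (hypothetical helper)."""
    if not 0 <= min_size <= max_size:
        raise ValueError("min_size must be between 0 and max_size")
    # No desired capacity supplied: start the group at its minimum size.
    if desired_capacity is None:
        desired_capacity = min_size
    if not min_size <= desired_capacity <= max_size:
        raise ValueError("desired_capacity must lie between min_size and max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_configuration,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired_capacity,
        "AvailabilityZones": list(availability_zones),
    }


# Sample values are made up for illustration.
request = build_group_request("web-asg", "web-lc-v1", 2, 6, ["eu-west-1a", "eu-west-1b"])
print(request["DesiredCapacity"])  # 2, because no desired capacity was given
```

With boto3 installed and credentials configured, a dictionary like this could be passed as boto3.client("autoscaling").create_auto_scaling_group(**request). Spreading the group across more than one Availability Zone is what allows Auto Scaling to compensate when one zone becomes unavailable.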
Auto Scaling Plans
There are several plans you can use to control how you want to deploy your auto-scaling solution.
- Maintain Current Instance Levels – specify the number of instances to maintain always; if an instance fails, auto-scaling will launch a new instance in the appropriate Availability Zone to compensate and ensure that the required number of instances is available. Auto Scaling does this by performing health checks on running instances within an Auto Scaling Group
Key Note – You can use auto scaling to ensure that your production environments have a consistent number of EC2 instances available and online always.
- Manual Scaling – specify a change to the maximum, minimum or desired capacity of your Auto Scaling group yourself. You can use manual scaling when resources need to be increased in response to infrequent events, such as launching a new product.
- Scheduled Scaling – used to increase or decrease the number of instances in your Auto Scaling group to meet a specific, predictable need. For example, you may need to scale out additional servers to cope with quarterly or yearly processing workloads.
- Dynamic Scaling – control your auto scaling deployments based on dynamic changes to CloudWatch thresholds – e.g. increase the number of web servers when you identify incoming web traffic going above a threshold
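To make dynamic scaling more concrete, here is a minimal simulation of a simple scaling decision. It is a toy model, not how Auto Scaling is implemented: the names Group and evaluate, the CPU thresholds (70 and 30 percent), the step sizes and the 300 second cooldown are all assumptions chosen for illustration. It scales out in larger steps than it scales in, and it ignores alarms that arrive during the cooldown.

```python
from dataclasses import dataclass


@dataclass
class Group:
    min_size: int
    max_size: int
    desired: int
    last_scaled_at: float = float("-inf")  # time of the last scaling activity


def evaluate(group, cpu_percent, now, high=70.0, low=30.0,
             scale_out_step=2, scale_in_step=1, cooldown=300.0):
    """Apply one scaling decision and return the new desired capacity."""
    if now - group.last_scaled_at < cooldown:
        return group.desired  # still cooling down: ignore the alarm
    if cpu_percent > high:
        # Scale out quickly, but never beyond the maximum size.
        target = min(group.max_size, group.desired + scale_out_step)
    elif cpu_percent < low:
        # Scale in slowly, but never below the minimum size.
        target = max(group.min_size, group.desired - scale_in_step)
    else:
        target = group.desired
    if target != group.desired:
        group.desired = target
        group.last_scaled_at = now
    return group.desired


group = Group(min_size=2, max_size=6, desired=2)
print(evaluate(group, cpu_percent=85.0, now=0.0))    # 4: scaled out by two
print(evaluate(group, cpu_percent=90.0, now=60.0))   # 4: inside the cooldown
print(evaluate(group, cpu_percent=10.0, now=400.0))  # 3: scaled in by one
```

In a real deployment the threshold check is done by a CloudWatch alarm and the capacity change by the scaling policy attached to the group; the sketch only shows how the thresholds, the group limits and the cooldown interact.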
Auto Scaling Lifecycle Hooks
You can perform a custom action as Auto Scaling launches and terminates EC2 instances. This is enabled through lifecycle hooks, which let you perform an action such as installing and configuring software on newly launched instances or downloading log files from an instance before it is terminated. Adding lifecycle hooks to your Auto Scaling group gives you greater control over how instances launch and terminate.
Key Steps:
- Auto Scaling will respond to triggers to either launch or terminate an instance
- The instance is then placed into a ‘wait’ state and remains in the state until you tell Auto Scaling to continue or the timeout period ends
- Perform custom action as required
Note – By default, instances remain in a wait state for one hour, after which Auto Scaling continues to launch or terminate the instance. If you need to continue to the next step before the timeout period ends, you can use the ‘CompleteLifecycleAction’ operation. If you need more time, you can restart the timeout period by recording a heartbeat with the ‘RecordLifecycleActionHeartbeat’ operation.
When Auto Scaling launches or terminates an instance due to a simple scaling policy, a cooldown takes effect. The cooldown period helps ensure that the Auto Scaling group does not launch or terminate more instances than needed. When an Auto Scaling group that is configured with a lifecycle hook launches an instance to handle increased load, the instance is put into the Pending:Wait state for the duration of the heartbeat timeout parameter. During this time, it does not accept traffic, and scaling activities are suspended. When the instance then enters the ‘InService’ state, a cooldown period starts, and once the cooldown period expires, additional scaling activities can resume.
In addition, if you add a lifecycle hook to perform actions as your instances launch, the health check grace period does not start until you complete the lifecycle hook and the instance enters the InService state. Important Note – a lifecycle hook does not prevent a spot instance from terminating due to a change in the Spot Price, which can happen at any time.
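The timing of the wait state, the heartbeat and the cooldown can be sketched with a small model. This is an illustrative simplification rather than the service's behaviour: the class LifecycleWait and the function scaling_resumes_at are hypothetical names, the one hour timeout comes from the note above, the 300 second cooldown is an assumed value, and the short transition between the end of the wait and the InService state is ignored. The two methods stand in for the RecordLifecycleActionHeartbeat and CompleteLifecycleAction operations.

```python
class LifecycleWait:
    """Toy model of an instance held in a wait state by a lifecycle hook."""

    def __init__(self, started_at, heartbeat_timeout=3600.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.ends_at = started_at + heartbeat_timeout

    def is_waiting(self, now):
        return now < self.ends_at

    def record_heartbeat(self, now):
        # Stands in for RecordLifecycleActionHeartbeat: restart the timeout.
        if self.is_waiting(now):
            self.ends_at = now + self.heartbeat_timeout

    def complete(self, now):
        # Stands in for CompleteLifecycleAction: end the wait early.
        if self.is_waiting(now):
            self.ends_at = now


def scaling_resumes_at(wait, cooldown=300.0):
    """The cooldown starts when the wait ends; scaling resumes after it."""
    return wait.ends_at + cooldown


wait = LifecycleWait(started_at=0.0)
wait.record_heartbeat(now=3000.0)  # timeout restarted: wait now ends at 6600
wait.complete(now=4000.0)          # custom action finished, wait ends at 4000
print(wait.is_waiting(4000.0), scaling_resumes_at(wait))  # False 4300.0
```

The example shows why a long-running custom action delays further scaling: the cooldown is only counted from the moment the wait ends, so the hook's duration is added on top of it.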
Temporarily Removing Instances from an Auto Scaling Group
You can put an instance that is in the InService state into the Standby state. This lets you run maintenance tasks on your instances as required. While in the Standby state, instances do not accept traffic but are still part of the Auto Scaling group.
- When you put an instance into standby state, it will remain in that state until you exit from it
- Instances that are registered with load balancers are deregistered from the load balancer.
- Auto Scaling will decrease the desired capacity of your Auto Scaling group when you put an instance on standby. This prevents Auto Scaling from launching an additional instance while you have this instance on standby. You can choose not to decrease capacity. This causes Auto Scaling to launch an additional instance to replace the one on standby.
- Auto Scaling increments the desired capacity when you put an instance that was on standby back in service. If you did not decrement the capacity when you put the instance on standby, Auto Scaling detects that you have more instances than you need, and applies the termination policy in effect to reduce the size of your Auto Scaling group.
Termination Policies
There are two types of termination policy: default and custom.
Default Termination Policy
- Auto Scaling identifies the Availability Zone with the most instances and at least one instance that is not protected from scale in
- Auto Scaling then terminates the unprotected instance in that Availability Zone that uses the oldest launch configuration
- If there are multiple unprotected instances with the oldest launch configuration, it terminates the instance that is closest to the next billing hour
- If there is more than one unprotected instance closest to the next billing hour, Auto Scaling selects one of these instances at random.
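The four steps of the default termination policy can be sketched as a selection function. This follows the steps as listed above and is not AWS code: the Instance fields, the function name pick_instance_to_terminate and the sample fleet are hypothetical, and when two Availability Zones are tied on instance count the sketch simply takes the first one alphabetically, which is an assumption made to keep the example deterministic.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    zone: str
    launch_config_created: int    # creation time of its launch configuration
    minutes_to_billing_hour: int
    protected: bool = False


def pick_instance_to_terminate(instances, rng=random):
    """Return the instance the steps above would select, or None."""
    candidates = [i for i in instances if not i.protected]
    if not candidates:
        return None
    # Step 1: the zone with the most instances and an unprotected instance.
    zone_counts = Counter(i.zone for i in instances)
    eligible_zones = sorted({i.zone for i in candidates})
    busiest = max(eligible_zones, key=lambda zone: zone_counts[zone])
    pool = [i for i in candidates if i.zone == busiest]
    # Step 2: keep only instances on the oldest launch configuration.
    oldest = min(i.launch_config_created for i in pool)
    pool = [i for i in pool if i.launch_config_created == oldest]
    # Step 3: keep only instances closest to the next billing hour.
    closest = min(i.minutes_to_billing_hour for i in pool)
    pool = [i for i in pool if i.minutes_to_billing_hour == closest]
    # Step 4: pick at random among whatever is left.
    return rng.choice(pool)


fleet = [
    Instance("i-a1", "az-1", 1, 20),
    Instance("i-a2", "az-1", 2, 5),
    Instance("i-a3", "az-1", 1, 10, protected=True),
    Instance("i-b1", "az-2", 1, 1),
]
print(pick_instance_to_terminate(fleet).instance_id)  # i-a1
```

In the sample fleet, az-1 holds the most instances, i-a3 is skipped because it is protected, and i-a1 wins over i-a2 because it uses the older launch configuration, even though i-a2 is closer to its billing hour.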
Custom Termination Policy
- OldestInstance – Auto Scaling terminates the oldest instance in the group.
- NewestInstance – Auto Scaling terminates the newest instance in the group.
- OldestLaunchConfiguration – Auto Scaling terminates instances that have the oldest launch configuration. This is particularly useful when you want to replace your instances with ones that use a new launch configuration.
- ClosestToNextInstanceHour – Auto Scaling terminates instances that are closest to the next billing hour.
- Default – Auto Scaling uses its default termination policy. This policy is useful when you have more than one scaling policy associated with the group.
Instance Protection
You can use Instance Protection to choose which instances Auto Scaling can terminate and which it cannot. You can enable the instance protection setting on an Auto Scaling group or on an individual instance in the group. Note that instances detached from a group lose their protection and, if attached again, inherit the protection setting from the group.
If all instances in an Auto Scaling group are protected from termination during scale in and a scale in event occurs, Auto Scaling decrements the desired capacity but cannot terminate the required number of instances until their instance protection settings are disabled.
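The effect of instance protection on a scale in event can be illustrated with a short sketch. The function name scale_in and the sample instance IDs are hypothetical, and the choice of which unprotected instance to remove is simplified to dictionary order rather than a real termination policy.

```python
def scale_in(instances, desired, remove_count):
    """Model one scale in event.

    instances maps an instance ID to True when it is protected from scale in.
    Returns the new desired capacity, the terminated IDs and the survivors.
    """
    # The desired capacity is decremented whether or not anything can be removed.
    new_desired = max(0, desired - remove_count)
    unprotected = [iid for iid, protected in instances.items() if not protected]
    terminated = unprotected[:remove_count]
    remaining = {iid: p for iid, p in instances.items() if iid not in terminated}
    return new_desired, terminated, remaining


fleet = {"i-1": True, "i-2": True, "i-3": True}  # every instance is protected
desired, terminated, remaining = scale_in(fleet, desired=3, remove_count=1)
print(desired, terminated, len(remaining))  # 2 [] 3
```

The group ends up with a desired capacity of two but three running instances, which is the situation described above: nothing can be terminated until protection is disabled on at least one instance.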