Using Amazon ECS Spot, Savings Plan and Reserved Instances to Optimize Costs

Published on
Last updated on

June 28, 2024


Summary

  • There are four AWS pricing models: On-Demand, Reserved Instances, Savings Plans, and Spot Instances, each tailored to different usage patterns and budgetary constraints.
  • Spot Instances can save up to 90% compared to On-Demand rates, provided you manage the risk of interruptions.
  • Combine On-Demand and Spot Instances within ECS using Auto Scaling Groups to adjust dynamically to workload demand while balancing cost efficiency and service availability.
  • Reserved Instances and Savings Plans can provide discounts for long-term commitments based on your consistent workload requirements.
  • Implement a comprehensive purchasing strategy that incorporates all available options, optimizing cost efficiency while mitigating risks like Spot interruptions or overcommitments.

AWS Pricing Models 

Compute in AWS is primarily offered with four purchasing options:

  • On-Demand
  • Reserved Instances
  • Savings Plans
  • Spot Instances
Amazon EC2 Purchase Options

On-demand instances follow the standard pay-as-you-go model, where you are billed by the second. This model requires no long-term commitment, frees you from the complexity of planning and purchasing compute, and is best suited to fluctuating workloads that must stay highly available.

Reserved Instances (RIs) and Savings Plans are pricing models in which you commit to long-term compute usage (usually 1 or 3 years, with varying degrees of flexibility) in exchange for significant discounts. RIs and Savings Plans are well suited to workloads with steady usage, or to covering the base load of unpredictable workloads.

Spot instances are spare compute capacity in AWS, available at steep discounts of up to 90% compared to on-demand prices. They are best suited to stateless or fault-tolerant workloads, because AWS can reclaim these instances to serve rising demand.

Spot Instances

All the purchasing options provided by AWS rely on the same underlying EC2 infrastructure, which behaves the same way regardless of how you pay for it. Spot is no exception.

Spot instances are idle EC2 instances that are not being used to fulfill on-demand requests, and are therefore offered at discounts of up to 90% compared to on-demand prices. Spot prices vary with time and demand.

These price benefits come with a catch: spot instances can be interrupted to fulfill rising on-demand requests. The instance is given a 2-minute notice, after which it is terminated. Fault-tolerant or stateless workloads therefore work best on spot instances, because an interruption has little to no impact on them.
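To make the 2-minute window concrete, here is a minimal Python sketch that works out how much time is left once a notice arrives. It assumes the notice is read from the EC2 instance metadata item spot/instance-action, which returns a small JSON document with an action and a UTC time. The sample payload and the function name are illustrative, and the sketch does not perform the metadata request itself.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(notice_json: str, now: datetime) -> float:
    """Return the seconds remaining before the action named in a spot notice."""
    notice = json.loads(notice_json)
    # The notice carries the action time as a UTC timestamp ending in "Z".
    action_time = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (action_time - now).total_seconds()


# Hypothetical sample payload in the shape of the instance-action document.
sample = '{"action": "terminate", "time": "2024-03-01T12:02:00Z"}'
now = datetime(2024, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
print(seconds_until_interruption(sample, now))  # 120.0
```

A workload that polls for this notice can use the remaining time to checkpoint state or stop accepting new requests.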

AWS spot capacity is divided into spot instance pools. All spot instances of a specific instance type running in an Availability Zone (AZ) constitute a spot instance pool. For example, all c5.xlarge spot instances running in us-east-1a form one spot instance pool, while all c5.2xlarge spot instances in the same AZ form another. Likewise, if you use the same instance type in three different AZs, you are consuming capacity from three different spot pools.

The prices and capacities of these pools fluctuate independently of each other. Tying this back to spot interruptions: the more instance pools you use, the more diversified you are, and the less downtime you face when demand increases. The "don't put all your eggs in one basket" principle applies to spot.
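The pool definition above can be sketched in a few lines of Python: every (instance type, AZ) pair is its own pool, so adding either an instance type or an AZ increases diversification. The function name and the extra AZ names are illustrative assumptions.

```python
from itertools import product


def spot_pools(instance_types, availability_zones):
    """Each (instance type, AZ) pair is a separate spot instance pool."""
    return list(product(instance_types, availability_zones))


pools = spot_pools(
    ["c5.xlarge", "c5.2xlarge"],
    ["us-east-1a", "us-east-1b", "us-east-1c"],
)
print(len(pools))  # 6
```

Two instance types across three AZs already give six independent pools to draw from.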

As of March 2024, with the newer capabilities now available, data shows that spot interruptions have become fairly infrequent: only 5% of spot instances were interrupted in the preceding three months. The more closely we follow best practices and diversify properly, the better spot as a whole will function.

Spot on ECS

ECS comes with built-in support for spot instances. From diversifying compute across instance pools to automatically replacing interrupted spot instances, ECS takes care of it all.

A task in ECS can be launched in two ways:

  • You can run it on an EC2 instance, where you have full control over the underlying instance type and operating system.
  • Or you can run it on Fargate, where AWS relieves you of the operational burden of maintaining servers.

With both approaches, you can use spot capacity to bring in significant cost reductions, with EC2 spot generally providing higher discounts than Fargate. EC2 spot also requires you to choose the backing instance pools, whereas with Fargate, AWS makes that decision.

ECS automates spot lifecycle management by integrating with AWS Auto Scaling Groups. When there is an interruption, the ASG will try to provide you with a replacement instance, from another spot instance pool, depending on your configuration. For most fault-tolerant workloads, replacing an interrupted instance is more than enough to ensure availability.

ECS also supports automated spot instance draining. You enable it by setting the ECS container agent parameter ECS_ENABLE_SPOT_INSTANCE_DRAINING to true, typically through the user data of your container instance.
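As a rough illustration of how draining could be switched on, the Python sketch below builds a user data script that appends the agent settings to /etc/ecs/ecs.config, the container agent's configuration file on ECS-optimized AMIs. The cluster name and the helper function are hypothetical, and the sketch only produces the script text. It does not launch anything.

```python
def ecs_user_data(cluster_name: str) -> str:
    """Build a user data script that enables spot instance draining."""
    lines = [
        "#!/bin/bash",
        # Register the instance with the target cluster.
        f"echo ECS_CLUSTER={cluster_name} >> /etc/ecs/ecs.config",
        # Ask the agent to drain the instance when a spot notice arrives.
        "echo ECS_ENABLE_SPOT_INSTANCE_DRAINING=true >> /etc/ecs/ecs.config",
    ]
    return "\n".join(lines) + "\n"


# "my-spot-cluster" is a placeholder cluster name.
print(ecs_user_data("my-spot-cluster"))
```

The resulting text would go into the launch template used by the Auto Scaling Group that backs your spot capacity.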

Once enabled, ECS places an instance in a draining state when it receives the 2-minute spot interruption notice. All tasks running on that instance are first sent a SIGTERM signal, followed by a SIGKILL signal 30 seconds later by default. This lets you stop your application gracefully, or even do some last-minute log collection. ECS also deregisters these tasks from the load balancer target group and tries to reschedule them on the remaining available instances.
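On the application side, a graceful stop usually means reacting to SIGTERM before SIGKILL arrives. The following is a minimal Python sketch of that pattern, not a prescribed implementation: the handler only sets a flag, and a hypothetical work loop checks the flag between items so it can finish cleanly.

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record that a stop was requested; the work loop checks this flag."""
    shutdown_requested.set()


def run(work_items, process):
    """Process items until done or until a stop is requested; return the count."""
    handled = 0
    for item in work_items:
        if shutdown_requested.is_set():
            break  # stop taking new work so the task can exit before SIGKILL
        process(item)
        handled += 1
    return handled


if __name__ == "__main__":
    # Install the handler from the main thread, then start working.
    signal.signal(signal.SIGTERM, handle_sigterm)
    print(run(range(3), lambda item: None))
```

Keeping the handler this small avoids doing real work inside a signal handler; flushing logs and closing connections happens in normal code after the loop exits.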

Mixing On-Demand and Spot

A mix of on-demand and spot capacity can bring in considerable savings while ensuring availability. For example, suppose you have a fleet running entirely on on-demand capacity, which you want to overprovision by 50%. By using spot capacity for the overprovisioned part, you pay only about 5-10% more than your original cost, because the extra capacity is billed at the deeply discounted spot rate.
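The arithmetic behind that 5-10% figure can be checked with a short Python sketch. The 90% discount is the maximum quoted earlier; the 80% discount is an assumption added here to show where the upper end of the range could come from.

```python
def overprovision_extra_cost(overprovision_fraction: float, spot_discount: float) -> float:
    """Extra cost, as a fraction of the original on-demand bill, of spot headroom."""
    return overprovision_fraction * (1.0 - spot_discount)


# 50% extra capacity, priced at two assumed spot discount levels.
for discount in (0.90, 0.80):
    extra = overprovision_extra_cost(0.50, discount)
    print(f"Spot discount {discount:.0%}: extra cost {extra:.0%}")
```

At a 90% discount the extra 50% of capacity adds about 5% to the bill, and at 80% it adds about 10%.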

With ECS, it is easy to achieve dynamic capacity type splits. You specify the weight of each capacity type and the backing ASG handles the rest. For example, if you give on-demand a weight of two and spot a weight of three, and the ASG provisions 10 instances, then four of those instances will be on-demand and six will be spot.
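The weighted split in this example can be reproduced with a small Python sketch. It is only an illustration of proportional allocation with largest-remainder rounding, not the placement logic ECS or the ASG actually uses, and the function and key names are made up for the example.

```python
def split_by_weight(total: int, weights: dict) -> dict:
    """Split a whole number of instances across capacity types by weight."""
    weight_sum = sum(weights.values())
    # Start with the integer part of each proportional share.
    shares = {name: total * w // weight_sum for name, w in weights.items()}
    # Hand any leftover instances to the types with the largest remainders.
    leftover = total - sum(shares.values())
    by_remainder = sorted(
        weights, key=lambda name: (total * weights[name]) % weight_sum, reverse=True
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


print(split_by_weight(10, {"on_demand": 2, "spot": 3}))  # {'on_demand': 4, 'spot': 6}
```

With weights of two and three, on-demand receives two fifths of the fleet and spot three fifths, which gives the four and six instances described above.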

Savings Plans and Reserved Instances

Savings Plans and Reserved Instances can provide further savings. Savings Plans accommodate more flexible usage patterns than Reserved Instances. Below is a comparison of effective discounts relative to on-demand prices for a range of purchase options.

Type            | Compute SP for EC2 | EC2 Convertible RI | EC2 SP     | EC2 Standard RI
                | 1yr   3yr          | 1yr   3yr          | 1yr   3yr  | 1yr   3yr
No upfront      | 21%   45%          | 21%   45%          | 31%   52%  | 31%   52%
Partial upfront | 24%   49%          | 25%   49%          | 34%   55%  | 34%   55%
All upfront     | 26%   50%          | 26%   50%          | 36%   58%  | 36%   58%

Discounts from On Demand Prices

Source: Mark Butcher via LinkedIn

Overall Impacts of Purchasing Strategy

Your final implementation should take all these options into consideration while keeping their potential downsides in mind. You can run a dynamic mix of on-demand and spot capacity in which spot interruptions have minimal impact, alongside well-sized multi-year commitments backed by a proper understanding of your workload requirements.

In a very rough scenario with equal shares of on-demand, spot, Reserved Instances, and Savings Plans, each at its maximum discount level, you could achieve a reduction of about 50% in overall cost.
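A quick Python sketch shows how such an estimate could be derived. The discount levels are assumptions taken from figures earlier in the article: 0% for on-demand, the 90% maximum for spot, and the 58% 3-year all-upfront figure from the table for both Reserved Instances and Savings Plans. Real mixes and discounts will differ.

```python
def blended_discount(mix: dict) -> float:
    """Weighted average discount for a mix of (share, discount) pairs."""
    pairs = list(mix.values())
    total_share = sum(share for share, _ in pairs)
    return sum(share * discount for share, discount in pairs) / total_share


# Equal shares, each at an assumed maximum discount level.
mix = {
    "on_demand": (0.25, 0.00),
    "spot": (0.25, 0.90),
    "reserved_instances": (0.25, 0.58),  # 3yr all-upfront EC2 Standard RI
    "savings_plans": (0.25, 0.58),       # 3yr all-upfront EC2 SP
}
print(f"Blended discount: {blended_discount(mix):.1%}")
```

Under these assumptions the blended discount comes out slightly above 50%, which is consistent with the rough estimate.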
