© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ec2-spot-india@amazon.com
Sunday, 25 Aug 2019
Amazon EC2 Spot Instances
Compute at up to 90% off. Scale more. Get faster results. Build resilient services.
Chakravarthy Nagarajan
Specialist Solution Architect, EC2 Spot
Sridhar Bharadwaj
Business Development Manager, EC2 Spot
Agenda
• EC2 Spot Instances overview
• Pricing model
• Major features and functionality
• Interruption details
• Spot orchestration options (Auto Scaling Groups, Fleet)
• Console Demo
• Monitoring price and usage
• Use Cases – where to use Spot
• Main takeaways for success with Spot
EC2 Spot – cool Cost Savings
Amazon EC2 purchase options
• On-Demand: pay for compute capacity with no long-term commitments. For spiky workloads, or to define needs.
• Reserved Instances: make a 1- or 3-year commitment and receive a discount off On-Demand prices. For committed, steady-state usage.
• Spot Instances: spare EC2 capacity at up to 90% off On-Demand prices. For fault-tolerant, flexible, stateless workloads.
To optimize Amazon EC2, combine purchase options
• Use Spot for fault-tolerant, flexible, stateless workloads
Spare capacity at scale
Clemson University – 1.1 million cores
https://tinyurl.com/clemson-spot
Spot Instances - basics
• Price changes infrequently, based on long-term supply and demand of spare capacity in each pool independently
• Just request capacity and pay the current rate. No bidding.
• Interruptions happen only when On-Demand needs the capacity. No outbidding.
Large customer base
Amazon EC2 Spot integrations
Flexibility is key to successful adoption
• Instance flexible
• Time flexible
• AZ flexible
EC2 Spot pools - instance type flexibility
C4 Spot prices by Availability Zone, with the On-Demand price
Size   1b      1c      1a      On-Demand
8XL    $0.27   $0.29   $0.50   $1.76
4XL    $0.30   $0.16   $0.21   $0.88
2XL    $0.07   $0.08   $0.08   $0.44
XL     $0.05   $0.04   $0.04   $0.22
L      $0.01   $0.04   $0.01   $0.11
Each instance family, each instance size, in each Availability Zone (60), in every region (20), is a separate Spot pool.
Instance families shown: R5, M4, C5, I3, M5d, R4, D2, C4
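To make the pool idea concrete, here is a minimal sketch that takes the C4 prices shown on this slide and computes, for each size, the discount of the cheapest pool against On-Demand. The three Spot prices per size are listed without tying them to a particular Availability Zone, and the function name is only illustrative.

```python
# Spot prices (USD) for each C4 size in the three Availability Zones on the slide,
# listed in no particular zone order, and the matching On-Demand price.
SPOT_PRICES = {
    "8XL": [0.27, 0.29, 0.50],
    "4XL": [0.30, 0.16, 0.21],
    "2XL": [0.07, 0.08, 0.08],
    "XL": [0.05, 0.04, 0.04],
    "L": [0.01, 0.04, 0.01],
}
ON_DEMAND = {"8XL": 1.76, "4XL": 0.88, "2XL": 0.44, "XL": 0.22, "L": 0.11}


def best_discount_percent(size: str) -> int:
    """Discount of the cheapest Spot pool for this size versus On-Demand, in percent."""
    cheapest = min(SPOT_PRICES[size])
    return round((1 - cheapest / ON_DEMAND[size]) * 100)


if __name__ == "__main__":
    for size in SPOT_PRICES:
        print(f"C4 {size}: cheapest pool is {best_discount_percent(size)}% below On-Demand")
```

Being flexible across sizes and zones is what lets a request land in whichever of these pools is cheapest or has capacity at the moment.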
Time flexibility
Time-insensitive workloads, examples:
• Model training
• Genomics
• Development
• Testing
• One-time queries
Time-sensitive workloads, examples:
• Web services
• APIs
• Analytics
• Grid computing
• Containers
Spot Blocks
• Defined duration workload without interruptions
• Commit for 1-6 hours. Instances terminate after duration.
• Lower discounts compared to Spot
• No Auto-Scaling and No Instance Diversification
What about interruptions?
Interruption behaviors
• Terminate
• Stop
• Hibernate
Spot Instance Advisor
https://aws.amazon.com/ec2/spot/instance-advisor/
EC2 Spot pricing history
EC2 Spot pricing history – API access
aws ec2 describe-spot-price-history \
    --start-time 2018-05-06T07:08:09 --end-time 2018-05-06T08:08:09 \
    --instance-types c4.2xlarge --availability-zone eu-west-1a \
    --product-description "Linux/UNIX (Amazon VPC)"
{
    "SpotPriceHistory": [
        {
            "Timestamp": "2018-05-06T06:30:30.000Z",
            "AvailabilityZone": "eu-west-1a",
            "InstanceType": "c4.2xlarge",
            "ProductDescription": "Linux/UNIX (Amazon VPC)",
            "SpotPrice": "0.122300"
        }
    ]
}
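The response above is plain JSON, so it is easy to post-process. The following is a small sketch, not an official client: it parses the sample output shown on the slide and keeps the most recent price per pool. The helper name `latest_prices` is hypothetical, and it assumes all timestamps share the format shown, so that they sort correctly as strings.

```python
import json
from decimal import Decimal

# Sample output copied from the slide above.
SAMPLE_RESPONSE = """
{
    "SpotPriceHistory": [
        {
            "Timestamp": "2018-05-06T06:30:30.000Z",
            "AvailabilityZone": "eu-west-1a",
            "InstanceType": "c4.2xlarge",
            "ProductDescription": "Linux/UNIX (Amazon VPC)",
            "SpotPrice": "0.122300"
        }
    ]
}
"""


def latest_prices(response_text: str) -> dict:
    """Map (availability zone, instance type) to the most recent Spot price."""
    records = json.loads(response_text)["SpotPriceHistory"]
    latest = {}
    # Assumption: timestamps use one ISO 8601 format, so string order is time order.
    for record in sorted(records, key=lambda r: r["Timestamp"]):
        pool = (record["AvailabilityZone"], record["InstanceType"])
        latest[pool] = Decimal(record["SpotPrice"])  # later records overwrite earlier ones
    return latest


if __name__ == "__main__":
    for pool, price in latest_prices(SAMPLE_RESPONSE).items():
        print(pool, price)
```

Prices are kept as `Decimal` because the API returns them as strings, which avoids floating-point rounding when comparing pools.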
Spot orchestration options comparison: Auto Scaling Groups vs. Spot Fleet
Capabilities compared:
• Maintains target capacity (upon interruptions or failures)
• Instance type diversification
• Availability Zone diversification
• Allocation strategy (N Lowest Pools, Lowest Price)
• Autoscaling (target tracking, stepped with custom metrics)
• ELB integration (*detach on interruption notification requires automation)
• On-Demand capacity mixed with Spot
• Lifecycle hooks, termination policies, protection, detach, processes
• Weights (marked as roadmap)
Monitoring Spot usage – Cost Explorer
Year by month – long-term trends
Monitoring Spot usage – Cost Explorer
Month by day – short-term changes and anomaly detection
Monitoring Spot usage
• Filter and group by: account, instance type, region, tags
• Data is available via API
• If you need:
- Deeper insights
- Hour-level data resolution or one-hour data freshness
- Resource IDs
Use Cost and Usage Reports or the Spot Instance Data Feed (easy to query with Athena or visualize with QuickSight / Tableau / Looker /…)
Monitoring Spot usage – Savings Summary
Is my workload Spot Ready?
• Stateless
• Fault-tolerant
• Flexible: multi-AZ and instance flexibility
• Loosely coupled
Looks familiar?
Main takeaways for success with Spot
• Be instance type agnostic and let ASG/Fleet provide the required capacity at the lowest price
• Adopt Launch Templates to benefit from new ASG and Fleet features
• New instance families generally have higher interruption rates
• Architect for fault-tolerance to be Spot compatible and increase your availability
Thank you
https://aws.amazon.com/ec2/spot
ec2-spot-india@amazon.com
Usage of EC2 Spot Instances with specific Workloads
Stateless web application or API frontend
• Spot diversification before all else
• To mitigate risk, launch On-Demand in a different pool if Spot capacity is insufficient
• Availability and performance should not be impacted: ensure low bootstrap time and some over-provisioning
• Mostly identical for queue workers
https://tinyurl.com/SpotAppnextBlog
Amazon EC2 Auto Scaling
What is Amazon EC2 Auto Scaling?
Auto Scaling Group (ASG) introduction
• Logical group of instances for your service
• Minimum and maximum bound for the number of instances that can be in the ASG
• Launch or terminate instances to meet the desired capacity
[Diagram: Min, Desired and Max capacity of the group]
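The relationship between the three numbers can be sketched in a few lines. This is an illustration only, assuming a requested capacity is simply kept within the group's bounds; the function names are hypothetical and not part of any AWS API.

```python
def effective_desired(requested: int, minimum: int, maximum: int) -> int:
    """Keep a requested capacity within the group's minimum and maximum bounds."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(requested, maximum))


def capacity_change(current: int, requested: int, minimum: int, maximum: int) -> int:
    """Positive result: instances to launch. Negative result: instances to terminate."""
    return effective_desired(requested, minimum, maximum) - current


if __name__ == "__main__":
    # A group bounded to 2..10 instances that currently runs 4.
    print(capacity_change(current=4, requested=7, minimum=2, maximum=10))   # launch
    print(capacity_change(current=4, requested=15, minimum=2, maximum=10))  # capped at max
    print(capacity_change(current=4, requested=0, minimum=2, maximum=10))   # floored at min
```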
Amazon EC2 Auto Scaling
• Scheduled scaling
• Dynamic scaling
• Predictive scaling
• Manual scaling
Predictive scaling in Amazon EC2 Auto Scaling
Machine learning technology behind the scenes
• Machine learning model takes the load metric and forecasts the next two days based on the pre-trained model
• Performs regression analysis between load metric and scaling metric
• Schedules scaling actions for the next two days, hourly
• Repeats every day
[Charts: load/capacity over time, provisioned capacity vs. actual capacity demand, for three cases: capacity provisioning on-premises; capacity provisioning with dynamic scaling; capacity provisioning with predictive scaling and dynamic scaling]
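The last step of that pipeline, turning an hourly forecast into scheduled capacities, can be sketched as follows. This is a simplification under stated assumptions: the forecast is taken as given (the machine learning model and the regression are not shown), and `load_per_instance` is a hypothetical constant standing in for the relationship between load metric and scaling metric.

```python
import math


def schedule_capacity(hourly_load_forecast, load_per_instance, minimum, maximum):
    """Return (hour offset, capacity) pairs, one per forecast hour.

    hourly_load_forecast: forecast load for each upcoming hour (e.g. requests/min).
    load_per_instance: load one instance is assumed to handle (hypothetical).
    """
    schedule = []
    for hour, load in enumerate(hourly_load_forecast):
        needed = math.ceil(load / load_per_instance)
        # Scheduled capacity still respects the group's bounds.
        schedule.append((hour, max(minimum, min(needed, maximum))))
    return schedule


if __name__ == "__main__":
    forecast = [100, 450, 1200]  # made-up values for three hours
    for hour, capacity in schedule_capacity(forecast, 100, minimum=2, maximum=10):
        print(f"hour +{hour}: {capacity} instances")
```

A two-day, hourly schedule as described on the slide would simply pass 48 forecast values.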
Auto Scaling Groups with Multiple Purchase Options and Instance Types
Before: Multiple ASGs to use Spot, On-Demand, and RIs together
• m4.large Spot ASG
• m5.large Spot ASG
• c4.large On-Demand ASG
One ASG for each purchase option and instance type, across Availability Zones 1, 2 and 3
After: Include Spot, On-Demand, and RIs in a single ASG
• m4.large Spot
• m5.large Spot
• c4.large On-Demand
A single ASG combines purchase options and instance types, across Availability Zones 1, 2 and 3
Save up to 90% using EC2 Auto Scaling and EC2 Fleet
• Reduce operational overhead: automatically provision and scale instances across instance families and purchase models in a single Auto Scaling group
• Lowest cost: specify what percentage of your Auto Scaling group capacity should be fulfilled by On-Demand Instances and Spot Instances to optimize cost
• Prioritized list: use a prioritized list for On-Demand Instance types to scale capacity during an urgent, unpredictable event to optimize performance
Amazon EC2 Fleet and Allocation strategies
Amazon EC2 Fleet
• Provisions capacity across multiple instance types according to allocation strategies
Allocation strategies
• On-Demand: prioritized list of instance types
• Spot Instances across the N lowest-priced instance pools
• Capacity Optimized
• Allocation strategies determine which instance types are launched and terminated
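As an illustration of the "N lowest-priced pools" strategy, the sketch below picks the N cheapest pools and spreads the target capacity evenly across them. It is a simplified model, not the service's implementation: pool names and prices are made up, and handing the remainder to the cheapest pools first is an assumption.

```python
def allocate_lowest_price(pool_prices: dict, target_capacity: int, n_pools: int) -> dict:
    """Spread target_capacity evenly over the n_pools cheapest Spot pools."""
    if not pool_prices or n_pools < 1:
        raise ValueError("need at least one pool and n_pools >= 1")
    # Sort by price, breaking ties by pool name so the result is deterministic.
    cheapest = sorted(pool_prices, key=lambda pool: (pool_prices[pool], pool))[:n_pools]
    base, extra = divmod(target_capacity, len(cheapest))
    # Assumption: any remainder goes to the cheapest pools first.
    return {pool: base + (1 if i < extra else 0) for i, pool in enumerate(cheapest)}


if __name__ == "__main__":
    pools = {  # hypothetical pools and prices (USD per hour)
        "c4.large/az1": 0.01,
        "c4.large/az2": 0.04,
        "c5.large/az1": 0.02,
        "m4.large/az1": 0.03,
    }
    print(allocate_lowest_price(pools, target_capacity=10, n_pools=2))
```

A larger N trades a slightly higher average price for less exposure to an interruption in any single pool.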
API Parameters
"MixedInstancesPolicy": {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "MyLaunchTemplate"
        },
        "Overrides": [
            { "InstanceType": "c5.large" },
            { "InstanceType": "c4.large" }
        ]
    },
    "InstancesDistribution": {
        "OnDemandAllocationStrategy": "prioritized",
        "OnDemandBaseCapacity": 10,
        "OnDemandPercentageAboveBaseCapacity": 50,
        "SpotAllocationStrategy": "capacity-optimized"
    }
}
[Diagram: across AZ1 and AZ2, the group's capacity (Min, Desired, Max) starts with an On-Demand base (minimum On-Demand: 10); capacity above the base is split 50% On-Demand and 50% Spot]
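To see what these parameters mean for a concrete group size, here is a minimal sketch of the arithmetic. The function is illustrative only; in particular, rounding the On-Demand share upward when the split is not even is an assumption, not documented behavior taken from the slides.

```python
import math


def split_capacity(desired: int, on_demand_base: int, on_demand_percentage_above_base: int) -> dict:
    """Split a desired capacity into On-Demand and Spot instance counts."""
    base = min(desired, on_demand_base)          # the base is filled with On-Demand first
    above_base = desired - base
    # Assumption: a fractional On-Demand share is rounded up.
    on_demand_above = math.ceil(above_base * on_demand_percentage_above_base / 100)
    return {"on_demand": base + on_demand_above, "spot": above_base - on_demand_above}


if __name__ == "__main__":
    # Values from the slide: base of 10, then 50% On-Demand / 50% Spot above it.
    print(split_capacity(desired=30, on_demand_base=10, on_demand_percentage_above_base=50))
```

With a desired capacity of 30 this gives 20 On-Demand and 10 Spot; below the base of 10, everything is On-Demand.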
Instance type overrides and allocation strategies
• ASG adjusts to a new configuration as it scales up and down
• As ASG scales up
- Launch instances according to the new configuration
• As ASG scales down
- Prioritize terminating instances not matching the new configuration
• New termination policy: AllocationStrategy
[Diagram, three stages:
1. Instance type overrides: m4.large, m5.large. Instances: m4.large, m5.large
2. Instance type overrides: m5.large, c5.large. Instances: m4.large, m5.large, c5.large
3. Instance type overrides: m5.large, c5.large. Instances: m4.large, m5.large, c5.large]
Recommendations on Mixed Instances Policy
Choose at least 2 instance type overrides
• Improves availability for On-Demand and Spot Instances
Diversify across at least N = 2 Spot Instance pools
• Reduces risk from fluctuations in Spot capacity and prices
Choose instance types of the same size across families
• Maintains stability as you dynamically scale up and down
Use the default Spot max price
• Leverages Spot cost savings while defaulting to the On-Demand price as the maximum price to pay
EC2 Spot with Amazon ECS
Spot and containerized workloads
• Best practices overlap: cattle, not pets
• Cluster instances (worker nodes) are conceptually redundant and ephemeral in containerized workloads
• Instance type flexibility is easy
ECS - Provisioning and scaling
• Integrated directly into the ECS console, or use CloudFormation or Terraform
- ECS creates a Spot Fleet in your account, or
- Use an Auto Scaling Group to bootstrap instances
• Autoscaling
- ECS service: CPU/MEM, HTTP requests
- Spot Fleet / ASG: on reserved metric (CPU, MEM)
• Interruptions are handled automatically via scripts installed in User Data
- ECS: automatically drain the instance
- Spot Fleet / ASG: automatically replace the instance
EC2 Spot with Amazon EKS
Kubernetes and EKS considerations
• EKS is a managed control plane; Spot is only relevant for the worker nodes
• CA (cluster-autoscaler) & ASG: Use one ASG with a diversified set of instance types
• Run a DaemonSet on every worker to catch the Spot interruption and drain the node
• Use labels to identify Spot nodes (for the DaemonSet, and other purposes – schedule non-prod?)
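What such an interruption handler checks can be sketched as below. On a Spot Instance, the instance metadata path `/latest/meta-data/spot/instance-action` returns HTTP 404 until an interruption is scheduled, and then a small JSON document with the action and its time. The code is an illustrative sketch, not the DaemonSet referred to above: function names are hypothetical, and the actual drain step is left out.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

IMDS = "http://169.254.169.254"


def fetch_instance_action(timeout: float = 1.0) -> Optional[str]:
    """Return the raw interruption notice, or None if no interruption is scheduled."""
    token_request = urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()
    request = urllib.request.Request(
        f"{IMDS}/latest/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode()
    except urllib.error.HTTPError as error:
        if error.code == 404:  # no interruption scheduled
            return None
        raise


def parse_instance_action(body: str) -> dict:
    """Parse a notice such as {"action": "terminate", "time": "2019-08-25T10:02:00Z"}."""
    data = json.loads(body)
    when = datetime.strptime(data["time"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {"action": data["action"], "time": when}


def seconds_remaining(notice: dict, now: datetime) -> float:
    """Seconds left before the action takes effect (never negative)."""
    return max(0.0, (notice["time"] - now).total_seconds())
```

A handler would poll `fetch_instance_action()` every few seconds on each Spot node and start draining as soon as it returns a notice; `fetch_instance_action` only works on an EC2 instance.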
Kubernetes and EKS scaling
• HPA (horizontal pod autoscaler)
- Autoscales the number of pods in a Deployment/ReplicaSet
• CA (cluster-autoscaler)
- Autoscales the number of worker nodes in the cluster when:
o Pods cannot be scheduled due to lack of compute resources
o Nodes are underutilized and important pods can be rescheduled elsewhere
EKS – Master & etcd vs Worker nodes
• Managed master & etcd: one master and etcd in each of Availability Zones 1, 2 and 3; not visible in your AWS account
• Worker nodes: run in your AWS account
EC2 Spot with Amazon EMR
EMR – provisioning and instance types
• Use Spot in EMR, unless heavily time-constrained
• Spot best practice: diversify and be instance type agnostic
- Use Instance Fleets: up to 5 instance types
- Improved chances of getting capacity, decreased impact from interruptions
- Spark will automatically recover from instance failures/interruptions
- Enable dynamic allocation of executors (default in EMR)
• Decouple storage from compute (from HDFS to S3 EMRFS)
• Defined duration (Spot blocks) and fallback to On-Demand
• If you need auto-scaling (e.g. for Presto), use uniform groups
Instance types shown: r3.4xlarge, i3.4xlarge, r4.4xlarge, m4.10xlarge
Parallelization with Spot Instances
[Charts: number of parallelized nodes over time. With fewer nodes, job running time: 10 hours. With more nodes, job running time: 1 hour.]
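The point of the two charts can be reduced to simple arithmetic: for a job that parallelizes well, the total node-hours stay roughly constant, so more nodes finish sooner at a similar cost. The sketch below assumes perfect parallelization; the node counts are made-up values chosen to match the 10-hour and 1-hour running times on the slide.

```python
def running_time_hours(total_node_hours: float, nodes: int) -> float:
    """Wall-clock time for a perfectly parallel job (an idealized assumption)."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return total_node_hours / nodes


if __name__ == "__main__":
    work = 100  # hypothetical job size in node-hours
    for nodes in (10, 100):
        print(f"{nodes} nodes: {running_time_hours(work, nodes)} hours, {work} node-hours billed")
```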
EMR Nodes – Long running clusters
• Master node must keep running
• Core nodes can be added and removed gracefully
• Cluster can tolerate loss of task nodes
[Diagram: EMR cluster with an On-Demand master node, core nodes (HDFS) and task nodes using On-Demand and Spot]
EMR Nodes – Data driven workloads
• On-Demand for Master
• On-Demand for Core, so no loss of data from these nodes
• Spot for task nodes
[Diagram: EMR cluster with an On-Demand master node, On-Demand core nodes (HDFS) and Spot task nodes]
EMR Nodes – Cost-driven & Transient workloads
• All node types use EC2 Spot
• Can use any EC2 Spot instance choices
[Diagram: EMR cluster with master, core (HDFS) and task nodes all on Spot]
EMR Choices for EC2 Spot - Compute flexibility
• Instance groups
• Instance fleets
EMR Instance groups
EMR Instance fleets
Instance fleets – Target capacity
• Target capacity is a mix of On-Demand and EC2 Spot
• Can specify the number of units each of these contributes to the target capacity
• If no Spot capacity is available:
- Can specify a provisioning timeout, and
- Switch to On-Demand
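The unit arithmetic and the fallback can be sketched as follows. This is an illustration under stated assumptions, not EMR's implementation: a single hypothetical instance type with a fixed number of units is used, and the `spot_available` flag stands in for "Spot capacity arrived before the provisioning timeout".

```python
import math


def instances_for_units(target_units: int, units_per_instance: int) -> int:
    """Number of instances needed to reach at least target_units."""
    return math.ceil(target_units / units_per_instance)


def plan_fleet(on_demand_units: int, spot_units: int,
               units_per_instance: int, spot_available: bool) -> dict:
    """Instance counts per purchase option, switching Spot units to On-Demand if needed."""
    if not spot_available:
        # Provisioning timeout reached without Spot capacity: fall back to On-Demand.
        on_demand_units, spot_units = on_demand_units + spot_units, 0
    return {
        "on_demand": instances_for_units(on_demand_units, units_per_instance),
        "spot": instances_for_units(spot_units, units_per_instance),
    }


if __name__ == "__main__":
    # Hypothetical target: 8 On-Demand units and 24 Spot units, 8 units per instance.
    print(plan_fleet(8, 24, 8, spot_available=True))
    print(plan_fleet(8, 24, 8, spot_available=False))
```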
When to choose instance fleets
• Widest variety of provisioning options
• Choose a mix of instance types
• Choose mix of on-demand/spot
• Specify Target capacity as mix of on-demand or spot
• Auto-scaling not available
Thank you
https://aws.amazon.com/ec2/spot
ec2-spot-india@amazon.com

Amazon EC2 Spot Instances Workshop

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. [email protected] Sunday, 25 Aug 2019 Amazon EC2 Spot Instances Compute at up to 90% off. Scale more. Get faster results. Build resilient services. Chakravarthy Nagarajan Specialist Solution Architect, EC2 Spot Sridhar Bharadwaj Business Development Manager, EC2 Spot
  • 2. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda • EC2 Spot Instances overview • Pricing model • Major features and functionality • Interruption details • Spot orchestration options (Auto Scaling Groups, Fleet) • Console Demo • Monitoring price and usage • Use Cases – where to use Spot • Main takeaways for success with Spot
  • 3. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EC2 Spot – cool Cost Savings
  • 4. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon EC2 purchase options Spot Instances Spare EC2 capacity at off On-Demand prices Fault-tolerant, flexible, stateless workloads Reserved Instances Make a 1 or 3-year commitment and receive a off On-Demand prices Committed & steady-state usage On-Demand Pay for compute capacity with no long-term commitments Spiky workloads, to define needs
  • 5. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. To optimize Amazon EC2, combine purchase options for fault- tolerant, flexible, stateless workloads
  • 6. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spare capacity at scale Clemson university – 1.1 Million cores https://tinyurl.com/clemson-spot
  • 7. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot Instances - basics Price changes infrequently based on long term supply and demand of spare capacity in each pool independently Just request capacity and pay the current rate. No Bidding Interruptions only happen when OD needs capacity. No outbidding
  • 8. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Large customer base
  • 9. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon EC2 Spot integrations
  • 10. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Flexibility is key to successful adoption Instance flexible Time flexible AZ flexible
  • 11. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. $0.27 $0.29$0.50 1b 1c1a 8XL $0.30 $0.16$0.214XL $0.07 $0.08$0.082XL $0.05 $0.04$0.04XL $0.01 $0.04$0.01L C4 $1.76 On Demand $0.88 $0.44 $0.22 $0.11 EC2 Spot pools- instance type flexibility Each instance family Each instance size Each Availability Zone (60) In every region (20) Is a separate Spot pool R5 M4 C5 I3 M5d R4 D2 C4
  • 12. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Time flexibility examples • Model training • Genomics • Development • Testing • One-time queries Time sensitive workloadsTime insensitive workloads examples • Web services • APIs • Analytics • Grid computing • Containers
  • 13. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot Blocks • Defined duration workload without interruptions • Commit for 1-6 hours. Instances terminate after duration. • Lower discounts compared to Spot • No Auto-Scaling and No Instance Diversification
  • 14. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. What about interruptions?
  • 15. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Interruption behaviors Terminate HibernateStop
  • 16. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot Instance Advisor https://aws.amazon.com/ec2/spot/instance-advisor/
  • 17. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EC2 Spot pricing history
  • 18. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. aws ec2 describe-spot-price-history --start-time 2018-05-06T07:08:09 --end-time 2018-05-06T08:08:09 -- instance-types c4.2xlarge --availability-zone eu-west-1a --product-description "Linux/UNIX (Amazon VPC)“ { "SpotPriceHistory": [ { "Timestamp": "2018-05-06T06:30:30.000Z", "AvailabilityZone": "eu-west-1a", "InstanceType": "c4.2xlarge", "ProductDescription": "Linux/UNIX (Amazon VPC)", "SpotPrice": "0.122300" } ] } EC2 Spot pricing history – API access
  • 19. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot orchestration options comparison Auto Scaling Groups Spot Fleet Maintains target capacity (upon interruptions or failures) Instance type diversification Availability zone diversification Allocation strategy (N Lowest Pools, Lowest Price) Autoscaling (target tracking, stepped with custom metrics) ELB integration *Detach on interruption notification requires automation On-demand capacity mixed with Spot Lifecycle hooks, termination policies, protection, detach, processes Weights Roadmap
  • 20. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Monitoring Spot usage – Cost Explorer Y e a r b y m o n t h – l o n g t e r m t r e n d s
  • 21. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Monitoring Spot usage – Cost Explorer M o n t h b y d a y – s h o r t t e r m c h a n g e s a n d a n o m a l y d e t e c t i o n
  • 22. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Monitoring Spot usage • Filter and group by: account, instance type, region, tags • Data is available via API • If you need: - Deeper insights - Hour-level data resolution or one hour data freshness - Resource ids Use Cost and Usage Reports or Spot Instance Data Feed (easily query with Athena or visualize with Quicksight / Tableau / Looker /…)
  • 23. Monitoring Spot usage – Savings Summary
  • 24. Is my workload Spot Ready? Stateless; Fault-Tolerant; Flexible (Multi-AZ and Instance Flexibility); Loosely Coupled. Looks familiar?
  • 25. Main takeaways for success with Spot
      • Be instance type agnostic and let ASG/Fleet provide the required capacity at the lowest price
      • Adopt Launch Templates to benefit from new ASG and Fleet features
      • New instance families generally have higher interruption rates
      • Architect for fault-tolerance to be Spot compatible and increase your availability
  • 26. Thank you https://aws.amazon.com/ec2/spot [email protected]
  • 27. Usage of EC2 Spot Instances with specific Workloads
  • 28. Stateless web application or API frontend (mostly identical for queue workers)
      • Spot diversification before all else
      • To mitigate risk, launch On-Demand in a different pool if Spot capacity is insufficient
      • Availability and performance should not be impacted: ensure low bootstrap time and some over-provisioning
      https://tinyurl.com/SpotAppnextBlog
  • 29. Amazon EC2 Auto Scaling
  • 30. What is Amazon EC2 Auto Scaling?
  • 31. Auto Scaling Group (ASG) introduction
      • Logical group of instances for your service
      • Minimum and maximum bound for the number of instances that can be in the ASG
      • Launch or terminate instances to meet the desired capacity
      Diagram labels: Desired, Min, Max
  • 32. Amazon EC2 Auto Scaling: Scheduled scaling, Dynamic scaling, Predictive scaling, Manual scaling
  • 33. Predictive scaling in Amazon EC2 Auto Scaling: machine learning technology behind the scenes
      • Machine learning model: takes the load metric and forecasts the next two days based on the pre-trained model
      • Performs regression analysis between load metric and scaling metric
      • Schedules scaling actions for the next two days, hourly
      • Repeats every day
      Figure: three Load/Capacity over Time charts comparing provisioned capacity with actual capacity demand: capacity provisioning on-premises; capacity provisioning with dynamic scaling; capacity provisioning with predictive scaling and dynamic scaling.
  • 34. Auto Scaling Groups with Multiple Purchase Options and Instance Types
  • 35. Before: Multiple ASGs to use Spot, On-Demand, and RIs together. One ASG for each purchase option and instance type. Diagram: an m4.large Spot ASG, an m5.large Spot ASG and a c4.large On-Demand ASG, spanning Availability Zone 1, Availability Zone 2 and Availability Zone 3.
  • 36. After: Include Spot, On-Demand, and RIs in a single ASG. A single ASG combines purchase options and instance types. Diagram: m4.large Spot, m5.large Spot and c4.large On-Demand in one group, spanning Availability Zone 1, Availability Zone 2 and Availability Zone 3.
  • 37. Save up to 90% using EC2 Auto Scaling and EC2 Fleet
      • Reduce operational overhead: automatically provision and scale instances across instance families and purchase models in a single Auto Scaling group
      • Lowest cost: specify what percentage of your Auto Scaling group capacity should be fulfilled by On-Demand Instances and Spot Instances to optimize cost
      • Prioritized list: use a prioritized list for On-Demand Instance types to scale capacity during an urgent, unpredictable event to optimize performance
  • 38. Amazon EC2 Fleet and Allocation strategies
      • Amazon EC2 Fleet: provisions capacity across multiple instance types according to allocation strategies
      • Allocation strategies determine which instance types are launched and terminated:
          o On-Demand prioritized list of instance types
          o Spot instances across the N lowest priced instance pools
          o Capacity Optimized
  • 39. API Parameters
      "MixedInstancesPolicy": { "LaunchTemplate": { "LaunchTemplateSpecification": { "LaunchTemplateName": "MyLaunchTemplate" }, "Overrides": [ { "InstanceType": "c5.large" }, { "InstanceType": "c4.large" } ] }, "InstancesDistribution": { "OnDemandAllocationStrategy": "prioritized", "OnDemandBaseCapacity": 10, "OnDemandPercentageAboveBaseCapacity": 50, "SpotAllocationStrategy": "capacity-optimized" } }
      Diagram (AZ1 and AZ2; Desired, Min, Max): On-Demand Base is the minimum On-Demand capacity (10); capacity above the base is split 50% On-Demand and 50% Spot.
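The arithmetic behind `OnDemandBaseCapacity` and `OnDemandPercentageAboveBaseCapacity` can be sketched as follows. This is an illustration only: the function name `split_capacity` is hypothetical, and rounding fractional On-Demand counts up is an assumption made here, not something the slide states.

```python
def split_capacity(desired, on_demand_base, on_demand_pct_above_base):
    """Return (on_demand, spot) instance counts for a desired capacity."""
    # The base is filled with On-Demand first, capped by the desired capacity.
    base = min(desired, on_demand_base)
    above_base = desired - base
    # Assumption: a fractional On-Demand share is rounded up (integer ceiling).
    on_demand_above = (above_base * on_demand_pct_above_base + 99) // 100
    on_demand = base + on_demand_above
    return on_demand, desired - on_demand

# Values from the slide: base of 10, then a 50/50 split above the base.
example = split_capacity(30, 10, 50)
```

With a desired capacity of 30, the first 10 instances are On-Demand and the remaining 20 are split evenly, giving 20 On-Demand and 10 Spot.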
  • 40. Instance type overrides and allocation strategies
      • The ASG adjusts to a new configuration as it scales up and down
      • As the ASG scales up: launch instances according to the new configuration
      • As the ASG scales down: prioritize terminating instances not matching the new configuration
      • New termination policy: AllocationStrategy
      Diagram, three panels: instance type overrides m4.large, m5.large (instances: m4.large, m5.large); instance type overrides m5.large, c5.large (instances: m4.large, m5.large, c5.large); instance type overrides m5.large, c5.large (instances: m4.large, m5.large, c5.large).
  • 41. Recommendations on Mixed Instances Policy
      • Choose at least 2 instance type overrides: improves availability for On-Demand and Spot Instances
      • Diversify across at least N = 2 Spot Instance pools: reduces risk from fluctuations in Spot capacity and prices
      • Choose instance types of the same size across families: maintains stability as you dynamically scale up and down
      • Use the default Spot max price: leverages Spot cost savings while defaulting to the On-Demand price as the maximum price to pay
  • 42. EC2 Spot with Amazon ECS
  • 43. Spot and Containerized workloads
      • Best practices overlap: cattle, not pets
      • Cluster instances (worker nodes) are conceptually redundant and ephemeral in containerized workloads
      • Instance type flexibility is easy
  • 44. ECS - Provisioning and scaling
      • Integrated directly into the ECS console, or use CloudFormation or Terraform
      • ECS creates a Spot Fleet in your account, or use an Auto Scaling Group to bootstrap instances
      • Autoscaling: the ECS service on CPU/MEM or HTTP requests; the Spot Fleet / ASG on reserved metric (CPU, MEM)
      • Interruptions are handled automatically via scripts installed in User Data: ECS automatically drains the instance; Spot Fleet / ASG automatically replaces the instance
  • 45. EC2 Spot with Amazon EKS
  • 46. Kubernetes and EKS considerations
      • EKS is a managed control plane, so Spot is only relevant for the worker nodes
      • CA (cluster-autoscaler) & ASG: use one ASG with a diversified set of instance types
      • Run a DaemonSet on every worker to catch the Spot interruption and drain the node
      • Use labels to identify Spot nodes (for the DaemonSet, and other purposes, e.g. scheduling non-prod workloads?)
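What "catching the Spot interruption" involves can be sketched in a few lines. The instance metadata path below is the documented location of the Spot interruption notice (it answers HTTP 404 until an interruption is scheduled); the helper names are hypothetical, and both fetching the document and draining the node are deliberately left out of this sketch.

```python
import json
from datetime import datetime, timezone

# Documented metadata path for the Spot interruption notice. Instances that
# enforce IMDSv2 additionally need a session token header on the request.
INSTANCE_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def parse_instance_action(body):
    """Parse the notice body, e.g. {"action": "terminate", "time": "...Z"}."""
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return notice["action"], when.replace(tzinfo=timezone.utc)

def seconds_until(when, now=None):
    """Seconds left before the scheduled action, never negative."""
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())

# Illustrative notice body, not captured from a real instance.
action, when = parse_instance_action(
    '{"action": "terminate", "time": "2019-08-25T08:22:00Z"}'
)
```

A node-level agent would poll the URL every few seconds and, once the notice appears, use the remaining time to cordon and drain the node.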
  • 47. Kubernetes and EKS scaling
      • HPA (horizontal pod autoscaler): autoscales the number of pods in a Deployment/ReplicaSet
      • CA (cluster-autoscaler): autoscales the number of worker nodes in the cluster when:
          o Pods cannot be scheduled due to lack of compute resources
          o Nodes are underutilized and important pods can be rescheduled elsewhere
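The HPA decision can be illustrated with the scaling formula from the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric). The sketch below is simplified: it ignores the tolerance band and stabilization windows, and the function name and default bounds are hypothetical.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA formula, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))

# 4 pods averaging 90% CPU against a 60% target scale out to 6 pods.
scaled_out = desired_replicas(4, 90, 60)
```

When the extra pods no longer fit on the existing worker nodes, they stay pending, which is the condition that triggers the cluster-autoscaler.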
  • 48. EKS – Master & etcd vs Worker nodes. Diagram: a managed master and etcd in each of Availability Zone 1, Availability Zone 2 and Availability Zone 3, not visible in your AWS account; worker nodes run in your AWS account.
  • 49. EC2 Spot with Amazon EMR
  • 50. EMR – provisioning and instance types
      • Use Spot in EMR, unless heavily time-constrained
      • Spot best practice: diversify and be instance type agnostic
          o Use Instance Fleets, with up to 5 instance types
          o Improved chances of getting capacity, decreased impact from interruptions
          o Spark will automatically recover from instance failures/interruptions
          o Enable dynamic allocation of executors (default in EMR)
      • Decouple storage from compute (from HDFS to S3 EMRFS)
      • Defined duration (Spot blocks) and fallback to On-Demand
      • If you need auto-scaling (e.g. for Presto), use uniform groups
      Instance types shown: r3.4xlarge, i3.4xlarge, r4.4xlarge, m4.10xlarge
  • 51. Parallelization with Spot Instances. Figure: two charts of # parallelized nodes over time, one labeled "Job running time: 1 hour" and one labeled "Job running time: 10 hours".
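The trade-off in this figure can be checked with simple arithmetic. The sketch assumes a perfectly parallel job (total work measured in node-hours) and a flat hourly price per node; the node counts are illustrative, and the price is borrowed from the c4.2xlarge Spot price on the pricing history slide purely as an example.

```python
def job_cost(node_hours_of_work, nodes, price_per_node_hour):
    """Return (running_time_hours, total_cost) for a perfectly parallel job."""
    hours = node_hours_of_work / nodes
    return hours, hours * nodes * price_per_node_hour

# 100 node-hours of work on 10 nodes versus 100 nodes (illustrative numbers).
slow_hours, slow_cost = job_cost(100, 10, 0.1223)
fast_hours, fast_cost = job_cost(100, 100, 0.1223)
```

Under these assumptions the ten-fold wider run finishes in one hour instead of ten for the same total cost, which is the point of scaling out on spare capacity.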
  • 52. EMR Nodes – Long running clusters
      • Master node must keep running
      • Core nodes can be added and removed gracefully
      • Cluster can tolerate loss of task nodes
      Diagram: EMR cluster with an On-Demand master node, core nodes (HDFS) and task nodes; core and task node labels: On-Demand, Spot, Spot, Spot, Spot.
  • 53. EMR Nodes – Data driven workloads
      • On-Demand for Master
      • On-Demand for Core, so no loss of data from these nodes
      • Spot for task nodes
      Diagram: EMR cluster with an On-Demand master node, core nodes (HDFS) and task nodes; core and task node labels: On-Demand, On-Demand, Spot, Spot, Spot, Spot.
  • 54. EMR Nodes – Cost-driven & Transient workloads
      • All node types use EC2 Spot
      • Can use any EC2 Spot instance choices
      Diagram: EMR cluster with a Spot master node, core nodes (HDFS) and task nodes, all labeled Spot.
  • 55. EMR Choices for EC2 Spot - Compute flexibility: Instance groups; Instance fleets
  • 56. EMR Instance groups
  • 57. EMR Instance fleets
  • 58. Instance fleets – Target capacity
      • Target capacity is a mix of On-Demand and EC2 Spot
      • Can specify the number of units each of these contributes to the target capacity
      • If no Spot capacity is available: can specify a provisioning timeout and switch to On-Demand
  • 59. When to choose instance fleets
      • Widest variety of provisioning options
      • Choose a mix of instance types
      • Choose a mix of On-Demand/Spot
      • Specify target capacity as a mix of On-Demand or Spot
      • Auto-scaling not available
  • 60. Thank you https://aws.amazon.com/ec2/spot [email protected]