© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Julien Simon
Principal Technical Evangelist, AWS
@julsimon
Advanced Task Scheduling
with Amazon ECS
(CON307 revisited)
Docker on Amazon Web Services
Amazon EC2 Container Service (ECS)
•  https://aws.amazon.com/ecs/
•  Launched in 04/2015
•  No additional charge

Amazon EC2 Container Registry (ECR)
•  https://aws.amazon.com/ecr/
•  Launched in 12/2015
•  Free tier: 500MB / month for a year
•  $0.10 / GB / month + outgoing traffic
ECS & ECR are available in 11 regions (US, EU, APAC)
Container Partners
https://aws.amazon.com/containers/partners/
Companies
Selected ECS customers
ECS Scheduling
The problem

Given a certain amount of
computing power and memory, 

how can we best manage
an arbitrary number of apps
running in Docker containers?
http://tidalseven.com
Amazon ECS: Under the Hood
[Diagram labels: ALB, ALB; AZ 1, AZ 2; user / scheduler]
https://github.com/aws/amazon-ecs-agent
http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html
Case study: Coursera
https://www.youtube.com/watch?v=a45J6xAGUvA
Coursera deliver Massive Open Online Courses (14 million students, 1000+
courses). Their platform runs a large number of batch jobs, notably to grade
programming assignments. Grading jobs need to run in near-real time while
preventing execution of untrusted code inside the Coursera platform.

After trying out some other Docker solutions, Coursera have picked Amazon
ECS and have even written their own scheduler.

“Amazon ECS enabled Coursera to focus on releasing new software
rather than spending time managing clusters” - Frank Chen, Software
Engineer
Scheduling on ECS: two options so far
1.  Let ECS handle scheduling through Services
•  Task Definition
•  ECS equivalent of the Docker Compose file
•  Versioned
•  cpu_shares, mem_limit
•  Number of containers
2.  Implement a custom scheduler with the ECS API
•  Describe cluster state
•  Select a specific ECS instance according to custom logic
•  Run task on this instance
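The three steps of option 2 can be sketched as follows. This is a minimal simulation, not AWS code: the `Instance` class, `pick_instance` function, and sample values are hypothetical. In a real scheduler the snapshot would likely come from the ECS API (for example boto3's `list_container_instances` and `describe_container_instances`), and the chosen instance would be passed to `start_task`.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical snapshot of one container instance's free resources."""
    arn: str
    free_cpu: int      # CPU units not yet reserved
    free_memory: int   # MiB not yet reserved


def pick_instance(instances, cpu, memory):
    """Custom logic: among instances that fit the task, pick the fullest one."""
    candidates = [i for i in instances
                  if i.free_cpu >= cpu and i.free_memory >= memory]
    if not candidates:
        return None  # nothing fits; a real scheduler might retry or scale out
    return min(candidates, key=lambda i: i.free_memory)


# Step 1: describe cluster state (hard-coded here instead of an API call).
cluster = [
    Instance("instance-a", free_cpu=512, free_memory=900),
    Instance("instance-b", free_cpu=1024, free_memory=300),
    Instance("instance-c", free_cpu=128, free_memory=2000),
]

# Step 2: select an instance according to custom logic.
chosen = pick_instance(cluster, cpu=256, memory=256)

# Step 3: this is where the task would be started on the chosen instance.
print(chosen.arn)
```

Picking the fullest instance that still fits is only one possible policy; the point of option 2 is that this function can encode whatever rule the application needs.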
ECS Placement Engine
Placement Engine: giving developers more control
[Diagram labels: ALB, ALB; AZ 1, AZ 2; user / scheduler; Placement Engine; Placement Constraints; Placement Strategies]
Placement Constraints
Name                 Example
AMI ID               attribute:ecs.ami-id == ami-eca289fb
Availability Zone    attribute:ecs.availability-zone == us-east-1a
Instance Type        attribute:ecs.instance-type == t2.small
Distinct Instances   type="distinctInstance"
Custom               attribute:stack == prod
Example: Constraint on Instance Family/Type
Example: Constraint on Availability Zone
Example: Combining Multiple Constraints
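As a rough illustration of combining constraints, the sketch below builds a constraint list in the shape the ECS API accepts (`memberOf` with an expression, plus `distinctInstance`) and applies a toy equality filter to a made-up cluster. The helper names, instance names, and attribute values are hypothetical, and the filter handles only simple equality, not the full expression language.

```python
def build_constraints(expression=None, distinct=False):
    """Assemble a placementConstraints-style list (the helper itself is hypothetical)."""
    constraints = []
    if expression:
        constraints.append({"type": "memberOf", "expression": expression})
    if distinct:
        constraints.append({"type": "distinctInstance"})
    return constraints


def matches_all(attributes, required):
    """Toy evaluator: every required attribute must equal the given value."""
    return all(attributes.get(key) == value for key, value in required.items())


# Two conditions joined with "and" in a single expression.
expression = ("attribute:ecs.instance-type == t2.small"
              " and attribute:ecs.availability-zone == us-east-1a")
constraints = build_constraints(expression, distinct=True)

# Made-up instances and their attributes.
instances = {
    "i-1": {"ecs.instance-type": "t2.small", "ecs.availability-zone": "us-east-1a"},
    "i-2": {"ecs.instance-type": "t2.small", "ecs.availability-zone": "us-east-1d"},
    "i-3": {"ecs.instance-type": "g2.2xlarge", "ecs.availability-zone": "us-east-1a"},
}
required = {"ecs.instance-type": "t2.small", "ecs.availability-zone": "us-east-1a"}
eligible = [name for name, attrs in instances.items() if matches_all(attrs, required)]

print(constraints)
print(eligible)
```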
Placement Strategies
Binpacking, Spread, Affinity, Distinct Instance
Placement Strategy Chaining
Spread tasks across Zones
and Binpack within each Zone
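A chained strategy is expressed as an ordered list: spread on the Availability Zone attribute first, then binpack on memory. The simulation below is an assumption-laden sketch of how such a chain might behave, not the ECS implementation; the `place` function, instance names, and memory figures are invented for illustration.

```python
from collections import Counter

# Shape of a chained strategy as passed to the ECS API: order matters.
strategy = [
    {"type": "spread", "field": "attribute:ecs.availability-zone"},
    {"type": "binpack", "field": "memory"},
]


def place(instances, task_memory, count):
    """Simulate spread across zones, then binpack on memory within a zone."""
    placements = []
    tasks_per_zone = Counter({inst["zone"]: 0 for inst in instances})
    for _ in range(count):
        fitting = [i for i in instances if i["free_memory"] >= task_memory]
        if not fitting:
            break
        # Spread: prefer the zone with the fewest tasks so far.
        # Binpack: within that zone, prefer the instance with the least free memory.
        best = min(fitting, key=lambda i: (tasks_per_zone[i["zone"]],
                                           i["free_memory"], i["name"]))
        best["free_memory"] -= task_memory
        tasks_per_zone[best["zone"]] += 1
        placements.append(best["name"])
    return placements


cluster = [
    {"name": "a1", "zone": "us-east-1a", "free_memory": 1000},
    {"name": "a2", "zone": "us-east-1a", "free_memory": 2000},
    {"name": "d1", "zone": "us-east-1d", "free_memory": 1000},
    {"name": "d2", "zone": "us-east-1d", "free_memory": 4000},
]
placements = place(cluster, task_memory=500, count=4)
print(placements)
```

In this run the four tasks split evenly between the two zones, and within each zone they land on the smaller instance, leaving the larger ones free.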
Placing Tasks
Anatomy of Task Placement
1.  Cluster Constraints: satisfy CPU, memory, and port requirements
2.  Custom Constraints: filter for location, instance-type, AMI, or custom attribute constraints
3.  Placement Strategies: identify instances that meet spread or binpack placement strategy
4.  Apply Filter: select final container instances for placement
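The stages above can be read as a filter pipeline. The following is a simplified sketch under stated assumptions: the data layout, function names, and sample instances are hypothetical, the custom constraint is reduced to attribute equality, and the strategy stage is a plain binpack on memory.

```python
def satisfies_resources(inst, task):
    """Stage 1: enough CPU and memory, and no host port conflict."""
    ports_free = not (set(task["ports"]) & set(inst["used_ports"]))
    return (inst["free_cpu"] >= task["cpu"]
            and inst["free_memory"] >= task["memory"]
            and ports_free)


def satisfies_constraints(inst, required_attrs):
    """Stage 2: custom constraints, reduced here to attribute equality."""
    return all(inst["attributes"].get(k) == v for k, v in required_attrs.items())


def select_instances(instances, task, required_attrs, count):
    stage1 = [i for i in instances if satisfies_resources(i, task)]
    stage2 = [i for i in stage1 if satisfies_constraints(i, required_attrs)]
    # Stage 3: order candidates by strategy (binpack on memory in this sketch).
    stage3 = sorted(stage2, key=lambda i: i["free_memory"])
    # Stage 4: keep as many instances as there are tasks to place.
    return [i["name"] for i in stage3[:count]]


def inst(name, cpu, mem, ports, itype):
    return {"name": name, "free_cpu": cpu, "free_memory": mem,
            "used_ports": ports, "attributes": {"ecs.instance-type": itype}}


instances = [
    inst("n1", 1024, 2048, [80], "t2.small"),   # port 80 already taken
    inst("n2", 1024, 1024, [], "t2.small"),
    inst("n3", 1024, 512, [], "t2.micro"),      # wrong instance type
    inst("n4", 1024, 4096, [], "t2.small"),
    inst("n5", 100, 4096, [], "t2.small"),      # not enough CPU
]
task = {"cpu": 256, "memory": 512, "ports": [80]}
selected = select_instances(instances, task, {"ecs.instance-type": "t2.small"}, count=1)
print(selected)
```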
Placement: Targeting Instance Type
Placement: Targeting Instance Type & Zone
Placement: Availability Zone Spread
Placement: Spread across Zone and Binpack
Placement: Affinity and Anti-Affinity
Running a Service
Placement: Multiple Services on a Cluster
Placement: Services – Distinct Instances
[Diagrams: clusters of g2.2xlarge, t2.medium, t2.small, and t2.micro instances in us-east-1a, us-east-1c, and us-east-1d]
Event Stream & Blox
Amazon ECS: Under the Hood
[Diagram labels: ALB, ALB; AZ 1, AZ 2; user / scheduler; Placement Engine; Event Stream]
Consuming Real-time Events
Handling ECS events with Blox
scheduler cluster state service
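To make the idea of a cluster state service concrete, here is a small sketch that folds task state change events into a local view of the cluster. It is not Blox's code: the `apply_event` and `task_event` helpers and the sample events are hypothetical, and the event fields used (`detail-type`, `taskArn`, `lastStatus`, `containerInstanceArn`, `version`) are assumed to follow the ECS event format delivered through CloudWatch Events.

```python
def apply_event(state, event):
    """Fold one event into a local view of task state, skipping stale updates."""
    if event.get("detail-type") != "ECS Task State Change":
        return state  # this sketch tracks tasks only
    detail = event["detail"]
    known = state.get(detail["taskArn"])
    # Events can arrive late or twice; keep only the highest version seen.
    if known is not None and known["version"] >= detail["version"]:
        return state
    state[detail["taskArn"]] = {
        "status": detail["lastStatus"],
        "instance": detail["containerInstanceArn"],
        "version": detail["version"],
    }
    return state


def task_event(task, status, version, instance="instance-1"):
    """Build a made-up event with only the fields this sketch reads."""
    return {
        "detail-type": "ECS Task State Change",
        "detail": {"taskArn": task, "lastStatus": status,
                   "version": version, "containerInstanceArn": instance},
    }


events = [
    task_event("task-1", "PENDING", 1),
    task_event("task-1", "RUNNING", 2),
    task_event("task-1", "PENDING", 1),  # late duplicate, ignored
    task_event("task-2", "RUNNING", 1, instance="instance-2"),
    {"detail-type": "ECS Container Instance State Change", "detail": {}},
]

state = {}
for event in events:
    apply_event(state, event)
print({arn: info["status"] for arn, info in state.items()})
```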
Amazon ECS: Under the Hood
[Diagram labels: ALB, ALB; AZ 1, AZ 2; user / scheduler; Scheduler; Cluster State Service; Placement Engine; Event Stream]
Blox: Daemon Scheduler
[Diagram: scheduler and cluster state service with three t2.small instances]
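A daemon scheduler aims to keep one copy of a task on every instance in the cluster. The core of such a reconciliation step might look like the sketch below; the function name and data layout are hypothetical and do not come from the Blox code base.

```python
def instances_missing_task(instances, running_tasks, task_definition):
    """Return the instances that do not yet run the given task definition."""
    covered = {task["instance"] for task in running_tasks
               if task["task_definition"] == task_definition}
    return sorted(i for i in instances if i not in covered)


# Made-up cluster: four instances, with nginx:2 running on three of them.
instances = ["i-1", "i-2", "i-3", "i-4"]
running_tasks = [
    {"instance": "i-1", "task_definition": "nginx:2"},
    {"instance": "i-2", "task_definition": "nginx:2"},
    {"instance": "i-3", "task_definition": "nginx:2"},
    {"instance": "i-4", "task_definition": "BloxFramework:2"},
]

# The scheduler would start one task on each instance returned here.
to_start = instances_missing_task(instances, running_tasks, "nginx:2")
print(to_start)
```

Running this check whenever the cluster state changes means a newly added instance receives its copy of the task without manual intervention.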
Demo: Deploying Blox on AWS
Creating Clusters
Create an ECS cluster for Blox
CF template: https://github.com/blox/blox/blob/dev/deploy/aws/conf/cloudformation_template.json
→  CloudWatch Event Rule + SQS queue
→  Daemon Scheduler + Cluster State Service + etcd
→  REST API exposing the Daemon Scheduler API
Create another ECS cluster managed by Blox
$ ecs-cli configure --cluster WebCluster --region ap-southeast-1
$ ecs-cli up --keypair admin --capability-iam --size 3 --instance-type t2.micro
Invoke the scheduler API
‘demo-cli’ tool: https://github.com/blox/blox/tree/dev/deploy/demo-cli
Listing Task Definitions
Grab the ARN for an nginx Task Definition, which the
Daemon Scheduler will manage on ‘WebCluster’.
$ ./list-task-definitions.py --region ap-southeast-1
== Blox Demo CLI - List Task Definitions ==
{
"taskDefinitionArns": [
"arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/BloxFramework:2",
"arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:1",
"arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2"
]
}
Creating an Environment
$ ./blox-create-environment.py --environment WebEnvironment --cluster WebCluster \
    --task-definition "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2" \
    --stack Blox --apigateway --region ap-southeast-1
== Blox Demo CLI - Create Blox Environment ==
HTTP Response Code: 200
{
"taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2",
"deploymentToken": "17248257-08ec-4438-888f-e0ac28397653",
"health": "healthy",
"name": "WebEnvironment",
"instanceGroup": {
"cluster": "arn:aws:ecs:ap-southeast-1:ACCOUNT:cluster/WebCluster"
}
}
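The `deploymentToken` in this response is what the create-deployment step later in the demo expects. As a small sketch, assuming the response body has been captured as a string, the token can be extracted and the next command assembled like this (the variable names are illustrative; the script name and flags are the ones used in the demo):

```python
import json

# Response body copied from the create-environment output above.
response_body = """
{
  "taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2",
  "deploymentToken": "17248257-08ec-4438-888f-e0ac28397653",
  "health": "healthy",
  "name": "WebEnvironment",
  "instanceGroup": {
    "cluster": "arn:aws:ecs:ap-southeast-1:ACCOUNT:cluster/WebCluster"
  }
}
"""

environment = json.loads(response_body)
token = environment["deploymentToken"]
# The cluster name is the last path segment of the cluster ARN.
cluster_name = environment["instanceGroup"]["cluster"].rsplit("/", 1)[-1]

# Arguments for the next demo-cli call, built from the parsed response.
command = [
    "./blox-create-deployment.py",
    "--environment", environment["name"],
    "--deployment-token", token,
    "--stack", "Blox", "--apigateway",
    "--region", "ap-southeast-1",
]
print(" ".join(command))
```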
Listing Environments
$ ./blox-list-environments.py --stack Blox --apigateway --region ap-southeast-1
== Blox Demo CLI - List Blox Environments ==
HTTP Response Code: 200
{
"items": [
{
"taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2",
"deploymentToken": "17248257-08ec-4438-888f-e0ac28397653",
"health": "healthy",
"name": "WebEnvironment",
"instanceGroup": {
"cluster": "arn:aws:ecs:ap-southeast-1:ACCOUNT:cluster/WebCluster"
}
}
]
}
Creating a Deployment
$ ./blox-create-deployment.py --environment WebEnvironment \
    --deployment-token "17248257-08ec-4438-888f-e0ac28397653" \
    --stack Blox --apigateway --region ap-southeast-1
== Blox Demo CLI - Create Blox Deployment ==
HTTP Response Code: 200
{
"status": "pending",
"environmentName": "WebEnvironment",
"id": "7a05ea99-27a9-4339-a7a6-f4120065aea3",
"failedInstances": [],
"taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2"
}
Listing Deployments
$ ./blox-list-deployments.py --environment WebEnvironment --stack Blox \
    --apigateway --region ap-southeast-1
== Blox Demo CLI - List Blox Deployments ==
HTTP Response Code: 200
{
"items": [
{
"status": "completed",
"environmentName": "WebEnvironment",
"id": "7a05ea99-27a9-4339-a7a6-f4120065aea3",
"failedInstances": [],
"taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2"
}
]
}
Scaling a Deployment
$ ecs-cli ps
Name State Ports TaskDefinition
26313cbe-d929-49de-9cc3-873bf5f32a91/nginx RUNNING nginx:2
98442432-fd5c-434d-b93c-0737bd06aaab/nginx RUNNING nginx:2
ce9bf217-4b34-4f31-9c7b-a8c3402f1ffd/nginx RUNNING nginx:2
$ ecs-cli scale --size 4 --capability-iam
$ ecs-cli ps
Name State Ports TaskDefinition
26313cbe-d929-49de-9cc3-873bf5f32a91/nginx RUNNING nginx:2
98442432-fd5c-434d-b93c-0737bd06aaab/nginx RUNNING nginx:2
c404ac9a-0948-4cc8-b5b0-2238ccdf4035/nginx RUNNING nginx:2
ce9bf217-4b34-4f31-9c7b-a8c3402f1ffd/nginx RUNNING nginx:2
Additional resources
Tech articles by Werner Vogels, CTO, Amazon.com
http://www.allthingsdistributed.com/2014/11/amazon-ec2-container-service.html
http://www.allthingsdistributed.com/2015/04/state-management-and-scheduling-with-ecs.html
http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Blox 
https://blox.github.io/

Amazon ECS videos @ AWS re:Invent 2016
https://aws.amazon.com/blogs/compute/amazon-ec2-container-service-at-aws-reinvent-2016-wrap-up/
Thank you!
Julien Simon
Principal Technical Evangelist, AWS
@julsimon
Advanced Task Scheduling with Amazon ECS

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Julien Simon, Principal Technical Evangelist, AWS, @julsimon. Advanced Task Scheduling with Amazon ECS (CON307 revisited)
  • 2. Docker on Amazon Web Services
      Amazon EC2 Container Service (ECS)
        • https://aws.amazon.com/ecs/
        • Launched in 04/2015
        • No additional charge
      Amazon EC2 Container Registry (ECR)
        • https://aws.amazon.com/ecr/
        • Launched in 12/2015
        • Free tier: 500MB / month for a year
        • $0.10 / GB / month + outgoing traffic
      ECS & ECR are available in 11 regions (US, EU, APAC)
  • 6. The problem: Given a certain amount of computing power and memory, how can we best manage an arbitrary number of apps running in Docker containers? http://tidalseven.com
  • 7. Amazon ECS: Under the Hood [diagram: ALBs in AZ 1 and AZ 2, user / scheduler] https://github.com/aws/amazon-ecs-agent http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html
  • 8. Case study: Coursera https://www.youtube.com/watch?v=a45J6xAGUvA
      Coursera deliver Massive Open Online Courses (14 million students, 1000+ courses). Their platform runs a large number of batch jobs, notably to grade programming assignments. Grading jobs need to run in near-real time while preventing execution of untrusted code inside the Coursera platform.
      After trying out some other Docker solutions, Coursera picked Amazon ECS and even wrote their own scheduler.
      “Amazon ECS enabled Coursera to focus on releasing new software rather than spending time managing clusters” - Frank Chen, Software Engineer
  • 9. Scheduling on ECS: two options so far 1.  Let ECS handle scheduling through Services •  Task Definition •  ECS equivalent of the Docker Compose file •  Versioned •  cpu_shares, mem_limit •  Number of containers 2.  Implement a custom scheduler with the ECS API •  Describe cluster state •  Select a specific ECS instance according to custom logic •  Run task on this instance
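The second option (describe cluster state, select an instance with custom logic, run the task there) can be sketched as follows. This is a minimal illustration, not the approach of any particular scheduler: the `instances` snapshot and the "most remaining memory" rule are assumptions made up for the example, and the actual calls to the ECS API (describing container instances, then starting the task on the chosen instance) are left out so the sketch runs without AWS credentials.

```python
from typing import Optional

# Hypothetical snapshot of cluster state: remaining CPU units and memory (MiB)
# per container instance. A real scheduler would build this from the ECS API.
instances = [
    {"arn": "instance-a", "cpu": 512, "memory": 900},
    {"arn": "instance-b", "cpu": 1024, "memory": 300},
    {"arn": "instance-c", "cpu": 256, "memory": 1800},
]

def pick_instance(instances: list, cpu: int, memory: int) -> Optional[str]:
    """Custom logic (an assumption for this example): among the instances
    that can fit the task, pick the one with the most remaining memory.
    Returns None when no instance has enough resources."""
    candidates = [i for i in instances if i["cpu"] >= cpu and i["memory"] >= memory]
    if not candidates:
        return None
    return max(candidates, key=lambda i: i["memory"])["arn"]

# The selected ARN is what a custom scheduler would then pass to the ECS API
# when starting the task on a specific container instance.
target = pick_instance(instances, cpu=256, memory=512)
print(target)
```

Swapping the `max(...)` rule for another key function is enough to change the scheduling policy, which is the point of writing a custom scheduler.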
  • 11. Placement Engine: giving developers more control [diagram: ALBs in AZ 1 and AZ 2, user / scheduler, Placement Engine with Placement Constraints and Placement Strategies]
  • 12. Placement Constraints
        Name                 Example
        AMI ID               attribute:ecs.ami-id == ami-eca289fb
        Availability Zone    attribute:ecs.availability-zone == us-east-1a
        Instance Type        attribute:ecs.instance-type == t2.small
        Distinct Instances   type="distinctInstance"
        Custom               attribute:stack == prod
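As a rough sketch, the constraints in the table above could be assembled into the list-of-dicts shape that the ECS API accepts in its `placementConstraints` parameter, where attribute expressions use the `memberOf` type. The helper names `member_of` and `distinct_instance` are hypothetical, introduced only for this example.

```python
import json

def member_of(expression: str) -> dict:
    """Constraint restricting placement to instances matching an expression."""
    return {"type": "memberOf", "expression": expression}

def distinct_instance() -> dict:
    """Constraint placing each task on a different container instance."""
    return {"type": "distinctInstance"}

# Combine an instance-type constraint, an Availability Zone constraint,
# and the distinct-instance constraint from the table.
placement_constraints = [
    member_of("attribute:ecs.instance-type == t2.small"),
    member_of("attribute:ecs.availability-zone == us-east-1a"),
    distinct_instance(),
]

print(json.dumps(placement_constraints, indent=2))
```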
  • 13. Example: Constraint on Instance Family/Type
  • 14. Example: Constraint on Availability Zone
  • 16. Placement Strategies: Binpacking, Spread, Affinity, Distinct Instance
  • 17. Placement Strategy Chaining: spread tasks across Zones and binpack within each Zone
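A chained strategy such as "spread across Zones, then binpack within each Zone" is expressed in the ECS API as an ordered `placementStrategy` list. The sketch below only builds that list; the `strategy` helper is hypothetical, and the choice of `memory` as the binpack field is an assumption (the slide does not say which resource is packed).

```python
# Strategy types accepted by the ECS API for placementStrategy entries.
VALID_TYPES = {"random", "spread", "binpack"}

def strategy(kind: str, field: str = "") -> dict:
    """Build one placementStrategy entry, rejecting unknown strategy types."""
    if kind not in VALID_TYPES:
        raise ValueError(f"unknown placement strategy type: {kind}")
    entry = {"type": kind}
    if field:
        entry["field"] = field
    return entry

# Order matters: entries are applied first to last.
placement_strategy = [
    strategy("spread", "attribute:ecs.availability-zone"),
    strategy("binpack", "memory"),
]

print(placement_strategy)
```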
  • 19. Anatomy of Task Placement
      Cluster Constraints: satisfy CPU, memory, and port requirements
      Custom Constraints: filter for location, instance-type, AMI, or custom attribute constraints
      Placement Strategies: identify instances that meet spread or binpack placement strategy
      Apply Filter: select final container instances for placement
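The four stages above can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the placement engine's actual implementation: the `cluster` snapshot is invented, port requirements are ignored, constraints are plain Python predicates, spread counts running tasks per Availability Zone, and binpack prefers the instance with the least remaining memory.

```python
from collections import Counter

# Hypothetical cluster snapshot: remaining CPU units / memory (MiB) and the
# number of tasks already running on each instance.
cluster = [
    {"id": "i-1", "az": "us-east-1a", "type": "t2.small", "cpu": 1024, "mem": 1500, "tasks": 1},
    {"id": "i-2", "az": "us-east-1a", "type": "t2.medium", "cpu": 2048, "mem": 3000, "tasks": 0},
    {"id": "i-3", "az": "us-east-1d", "type": "t2.small", "cpu": 1024, "mem": 700, "tasks": 0},
    {"id": "i-4", "az": "us-east-1d", "type": "t2.small", "cpu": 1024, "mem": 1900, "tasks": 0},
    {"id": "i-5", "az": "us-east-1d", "type": "g2.2xlarge", "cpu": 8192, "mem": 15000, "tasks": 0},
]

def place(cluster, cpu, mem, constraints):
    """Return the id of the instance chosen for one task, or None."""
    # 1. Cluster constraints: keep instances with enough CPU and memory.
    fits = [i for i in cluster if i["cpu"] >= cpu and i["mem"] >= mem]
    # 2. Custom constraints: every predicate must accept the instance.
    allowed = [i for i in fits if all(check(i) for check in constraints)]
    if not allowed:
        return None
    # 3a. Spread: keep instances in the zone(s) running the fewest tasks.
    per_az = Counter()
    for i in cluster:
        per_az[i["az"]] += i["tasks"]
    fewest = min(per_az[i["az"]] for i in allowed)
    spread = [i for i in allowed if per_az[i["az"]] == fewest]
    # 3b + 4. Binpack on memory, then select the final instance.
    return min(spread, key=lambda i: i["mem"])["id"]

def not_gpu(instance):
    """Example custom constraint: avoid g2 (GPU) instances."""
    return not instance["type"].startswith("g2")

print(place(cluster, cpu=256, mem=512, constraints=[not_gpu]))
```

Each stage only narrows the candidate list produced by the previous one, so a constraint that rejects every instance ends placement before any strategy is considered.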
  • 20. Placement: Targeting Instance Type [diagram: g2.2xlarge and t2.small instances]
  • 21. Placement: Targeting Instance Type & Zone [diagram: t2.micro, t2.small, t2.medium and g2.2xlarge instances in us-east-1a and us-east-1d]
  • 22. Placement: Availability Zone Spread [diagram: instances in us-east-1a, us-east-1c and us-east-1d]
  • 23. Placement: Spread across Zone and Binpack [diagram: instances in us-east-1a, us-east-1c and us-east-1d]
  • 24. Placement: Affinity and Anti-Affinity [diagram: instances in us-east-1a, us-east-1c and us-east-1d]
  • 26. Placement: Multiple Services on a Cluster [diagram: t2.micro, t2.small and t2.medium instances in us-east-1a, us-east-1c and us-east-1d]
  • 27. Placement: Services – Distinct Instances [diagram: t2.micro, t2.small, t2.medium and g2.2xlarge instances]
  • 29. Amazon ECS: Under the Hood [diagram: ALBs in AZ 1 and AZ 2, user / scheduler, Placement Engine, Event Stream]
  • 31. Handling ECS events with Blox [diagram: scheduler, cluster state service]
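To give a feel for what consuming the event stream involves, here is a minimal sketch of folding task state change events into a local view of the cluster. It is not the Blox cluster state service's code: the event dicts are trimmed to a few fields (`taskArn`, `lastStatus`, `version`) of the ECS task state change event, the ARNs are placeholders, and the in-memory `state` dict stands in for a real store.

```python
def apply_event(state: dict, event: dict) -> bool:
    """Record a task state change; return True if the local view changed.
    Events can arrive out of order or twice, so an event is ignored unless
    its version is newer than the one already stored for that task."""
    if event.get("detail-type") != "ECS Task State Change":
        return False
    detail = event["detail"]
    known = state.get(detail["taskArn"])
    if known is not None and known["version"] >= detail["version"]:
        return False
    state[detail["taskArn"]] = detail
    return True

# Hypothetical events, delivered out of order on purpose.
events = [
    {"detail-type": "ECS Task State Change",
     "detail": {"taskArn": "task-1", "lastStatus": "RUNNING", "version": 2}},
    {"detail-type": "ECS Task State Change",
     "detail": {"taskArn": "task-1", "lastStatus": "PENDING", "version": 1}},
    {"detail-type": "ECS Task State Change",
     "detail": {"taskArn": "task-2", "lastStatus": "PENDING", "version": 1}},
]

state = {}
applied = [apply_event(state, e) for e in events]
print(applied, {arn: d["lastStatus"] for arn, d in state.items()})
```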
  • 32. Amazon ECS: Under the Hood [diagram: ALBs in AZ 1 and AZ 2, user / scheduler, Scheduler, Cluster State Service, Placement Engine, Event Stream]
  • 33. Blox: Daemon Scheduler [diagram: scheduler and cluster state service managing nine t2.small instances]
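A daemon scheduler aims to keep one copy of a task on every instance in the cluster. The core of that idea is a reconciliation step, sketched below under simplifying assumptions: the function name `instances_missing_task` and the placeholder instance names are hypothetical, and the inputs would in practice come from a cluster state view rather than literals.

```python
def instances_missing_task(instance_arns, running_tasks, task_definition):
    """Return the instances that do not yet run the given task definition.
    running_tasks is a list of (instance_arn, task_definition) pairs."""
    covered = {arn for arn, td in running_tasks if td == task_definition}
    return sorted(set(instance_arns) - covered)

# Three instances already run nginx:2; a fourth has just joined the cluster.
instance_arns = ["inst-1", "inst-2", "inst-3", "inst-4"]
running_tasks = [
    ("inst-1", "nginx:2"),
    ("inst-2", "nginx:2"),
    ("inst-3", "nginx:2"),
]

# The scheduler would start one task on each instance returned here.
print(instances_missing_task(instance_arns, running_tasks, "nginx:2"))
```

Running this check whenever the set of instances changes is what lets a newly added instance pick up its task without any explicit deployment step.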
  • 35. Creating Clusters
      Create an ECS cluster for Blox
        CF template: https://github.com/blox/blox/blob/dev/deploy/aws/conf/cloudformation_template.json
        → CloudWatch Event Rule + SQS queue
        → Daemon Scheduler + Cluster State Service + etcd
        → REST API exposing the Daemon Scheduler API
      Create another ECS cluster managed by Blox
        $ ecs-cli configure --cluster WebCluster --region ap-southeast-1
        $ ecs-cli up --keypair admin --capability-iam --size 3 --instance-type t2.micro
      Invoke the scheduler API
        ‘demo-cli’ tool: https://github.com/blox/blox/tree/dev/deploy/demo-cli
  • 36. Listing Task Definitions
      Grab the ARN for an nginx Task Definition, which the Daemon Scheduler will manage on ‘WebCluster’.
        $ ./list-task-definitions.py --region ap-southeast-1
        == Blox Demo CLI - List Task Definitions ==
        {
          "taskDefinitionArns": [
            "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/BloxFramework:2",
            "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:1",
            "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2"
          ]
        }
  • 37. Creating an Environment
        $ ./blox-create-environment.py --environment WebEnvironment --cluster WebCluster \
            --task-definition "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2" \
            --stack Blox --apigateway --region ap-southeast-1
        == Blox Demo CLI - Create Blox Environment ==
        HTTP Response Code: 200
        {
          "taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2",
          "deploymentToken": "17248257-08ec-4438-888f-e0ac28397653",
          "health": "healthy",
          "name": "WebEnvironment",
          "instanceGroup": {
            "cluster": "arn:aws:ecs:ap-southeast-1:ACCOUNT:cluster/WebCluster"
          }
        }
  • 38. Listing Environments
        $ ./blox-list-environments.py --stack Blox --apigateway --region ap-southeast-1
        == Blox Demo CLI - List Blox Environments ==
        HTTP Response Code: 200
        {
          "items": [
            {
              "taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2",
              "deploymentToken": "17248257-08ec-4438-888f-e0ac28397653",
              "health": "healthy",
              "name": "WebEnvironment",
              "instanceGroup": {
                "cluster": "arn:aws:ecs:ap-southeast-1:ACCOUNT:cluster/WebCluster"
              }
            }
          ]
        }
  • 39. Creating a Deployment
        $ ./blox-create-deployment.py --environment WebEnvironment \
            --deployment-token "17248257-08ec-4438-888f-e0ac28397653" \
            --stack Blox --apigateway --region ap-southeast-1
        == Blox Demo CLI - Create Blox Deployment ==
        HTTP Response Code: 200
        {
          "status": "pending",
          "environmentName": "WebEnvironment",
          "id": "7a05ea99-27a9-4339-a7a6-f4120065aea3",
          "failedInstances": [],
          "taskDefinition": "arn:aws:ecs:ap-southeast-1:613904931467:task-definition/nginx:2"
        }
  • 40. Listing Deployments
        $ ./blox-list-deployments.py --environment WebEnvironment --stack Blox --apigateway --region ap-southeast-1
        == Blox Demo CLI - List Blox Deployments ==
        HTTP Response Code: 200
        {
          "items": [
            {
              "status": "completed",
              "environmentName": "WebEnvironment",
              "id": "7a05ea99-27a9-4339-a7a6-f4120065aea3",
              "failedInstances": [],
              "taskDefinition": "arn:aws:ecs:ap-southeast-1:ACCOUNT:task-definition/nginx:2"
            }
          ]
        }
  • 41. Scaling a Deployment
        $ ecs-cli ps
        Name                                        State    Ports  TaskDefinition
        26313cbe-d929-49de-9cc3-873bf5f32a91/nginx  RUNNING         nginx:2
        98442432-fd5c-434d-b93c-0737bd06aaab/nginx  RUNNING         nginx:2
        ce9bf217-4b34-4f31-9c7b-a8c3402f1ffd/nginx  RUNNING         nginx:2
        $ ecs-cli scale --size 4 --capability-iam
        $ ecs-cli ps
        Name                                        State    Ports  TaskDefinition
        26313cbe-d929-49de-9cc3-873bf5f32a91/nginx  RUNNING         nginx:2
        98442432-fd5c-434d-b93c-0737bd06aaab/nginx  RUNNING         nginx:2
        c404ac9a-0948-4cc8-b5b0-2238ccdf4035/nginx  RUNNING         nginx:2
        ce9bf217-4b34-4f31-9c7b-a8c3402f1ffd/nginx  RUNNING         nginx:2
  • 42. Additional resources
      Tech articles by Werner Vogels, CTO, Amazon.com
        http://www.allthingsdistributed.com/2014/11/amazon-ec2-container-service.html
        http://www.allthingsdistributed.com/2015/04/state-management-and-scheduling-with-ecs.html
        http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html
      Blox
        https://blox.github.io/
      Amazon ECS videos @ AWS re:Invent 2016
        https://aws.amazon.com/blogs/compute/amazon-ec2-container-service-at-aws-reinvent-2016-wrap-up/
  • 43. Thank you! Julien Simon Principal Technical Evangelist, AWS @julsimon