Scaling horizontally on AWS
Bozhidar Bozhanov, LogSentinel
About me
• Senior software engineer and architect
• Founder & CEO @ LogSentinel
• Blog: techblog.bozho.net
• Twitter: @bozhobg
• Stack Overflow top 50
Why?
• Why high availability?
• Why scalability?
• To account for increased load
• If you have decent HA, you’re likely scalable
• Don’t overdesign
• Why AWS (or any cloud provider)?
AWS
• IaaS (Infrastructure as a service) originally (EC2)
• Virtual machines
• Load balancers
• Security groups
• PaaS services on top
• Multiple regions – US, EU, Asia, etc.
• Each region has multiple availability zones (roughly equal to “data centers”)
• Cross-availability zone is easy
• Cross-region is harder
• Similar to Azure, Google Cloud, etc.
Rule of thumb: stateless applications
• No persistent state on the application nodes
• Caches and temporary files are okay
• Distributed vs local cache
• Session state: distributed vs no session state (e.g. JWT)
• Makes the application layer horizontally scalable
• Application nodes are disposable
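To illustrate the "no session state" option: a minimal sketch of a signed token that any application node can verify without shared storage. It is not a real JWT implementation, only the same idea reduced to an HMAC over a payload. The secret, the claim names and the function names are assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical shared secret; in practice every node would load the same key
# from configuration or a key management service.
SECRET = b"shared-secret-loaded-from-config"


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    """Reverse of _b64: restore the padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(user_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Create a token carrying the user id and an expiry timestamp."""
    now = time.time() if now is None else now
    claims = {"sub": user_id, "exp": now + ttl_seconds}
    payload = _b64(json.dumps(claims).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the signature is valid and the token has not expired."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    claims = json.loads(_unb64(payload))
    if claims["exp"] < now:
        return None
    return claims["sub"]
```

Because verification needs only the shared key, a request can land on any node, including one launched a minute ago.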
Executing only once in a cluster
• Sometimes you need to execute a scheduled piece of code only once in a cluster
• Database-backed scheduled job management
• Distributed locks (Hazelcast)
• Using queues (SQS, AmazonMQ, RabbitMQ)?
Scaling
• Autoscaling groups
• Groups of virtual machines (instances) with identical configuration
• Scale-out – configure criteria for launching new virtual machines – e.g. “more than 5 minutes of CPU utilization over 80%”
• Scale-in – configure criteria for destroying virtual machines
• Allows for handling spikes, or gradual increase of load
• Spot instances
• Cheap instances you “bid” for. Can be reclaimed at any time
• Useful for heavy background processes.
• Useful for test environments.
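The kind of rule an autoscaling group evaluates can be sketched in a few lines. This is an illustration of the criterion quoted above, not how AWS implements it. It assumes one CPU sample per minute; the 30% scale-in threshold and the one-instance step are made up for the example.

```python
from typing import Sequence


def should_scale_out(cpu_samples: Sequence[float],
                     threshold: float = 80.0, minutes: int = 5) -> bool:
    """True when each of the last `minutes` per-minute samples exceeds the threshold."""
    if len(cpu_samples) < minutes:
        return False
    return all(sample > threshold for sample in cpu_samples[-minutes:])


def desired_capacity(current: int, cpu_samples: Sequence[float],
                     min_size: int, max_size: int,
                     scale_in_threshold: float = 30.0, minutes: int = 5) -> int:
    """Return the new instance count, clamped to the group's size limits."""
    if should_scale_out(cpu_samples, minutes=minutes):
        return min(current + 1, max_size)
    recent = cpu_samples[-minutes:]
    if len(recent) == minutes and all(s < scale_in_threshold for s in recent):
        return max(current - 1, min_size)
    return current
```

The minimum and maximum sizes matter as much as the thresholds: the maximum caps cost during a spike, and the minimum keeps enough nodes for availability.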
Data stores
• Managed
• RDBMS (AWS RDS) – MySQL, MariaDB, Postgres, Oracle, MS SQL
• Search engines – Elasticsearch
• Caches – ElastiCache (Redis and memcached)
• Custom:
• Amazon Aurora
• CloudSearch
• S3, SimpleDB, DynamoDB
• Own installation: spin up VMs, install anything you like (e.g. Cassandra, HBase, own
Postgres, own Elasticsearch, own caching solution)
Scaling data stores
• The custom ones are automatically scaled (S3, SimpleDB)
• The managed ones are scaled by configuration
• Own deployments are scaled via auto-scaling groups
• Data sharding vs replication with consistent hashing
• Resharding is not trivial
• Replication with consistent hashing can handle scaling up automatically *
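A minimal consistent-hash ring, to show why adding a node is cheap compared to resharding: only the keys that fall into the new node's ranges move. This sketch covers key placement only and leaves out replication (usually done by also writing to the next nodes on the ring). The class name, the virtual-node count and the choice of MD5 as the placement hash are assumptions for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (first 64 bits of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self._vnodes = vnodes  # virtual nodes per physical node, for even spread
        self._points = []      # sorted ring positions
        self._owners = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place the node's virtual nodes on the ring."""
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        """Drop the node's positions; its keys fall to the next node clockwise."""
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring position after the key's hash."""
        if not self._points:
            raise ValueError("ring is empty")
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[index]]
```

With plain modulo sharding (hash(key) % node_count), going from three nodes to four would remap most keys. On the ring, the existing nodes keep everything that the new node does not take over.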
Elastic load balancer
• AWS-provided software load balancer
• Points to specified target machines or groups of machines (roughly ASGs)
• Configurable: protocols, ports, healthcheck, monitoring metrics
• TLS termination
• AWS-managed certificates
• Load balancer in front of application nodes
• Load balancer in front of data store nodes
• vs application-level load-balancing (configuration vs fetching db nodes dynamically)
Things to automate
• Hardware and network resources (CloudFormation)
• Application and database configuration (OpsWorks: Puppet, Chef, S3+bash, Capistrano)
• Instances
• launch configurations + bash
• docker containers + bash (Elastic Container Service vs Fargate, Kubernetes)
• Why automate?
• because autoscaling benefits from automated instance creation
Scripted stacks
• You can create all instances, load balancers, auto-scaling groups, launch configurations,
security groups, domains, Elasticsearch domains, etc. manually
• But CloudFormation is way better
• JSON or YAML
• CloudFormation manages upgrade
• Stack parameters (instance types, number of nodes, domains used, s3 buckets, etc.)
"DatabaseLaunchConfiguration": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "AssociatePublicIpAddress": true,
    "IamInstanceProfile": { "Ref": "InstanceRoleInstanceProfile" },
    "ImageId": {
      "Fn::FindInMap": [
        { "Ref": "DatabaseStorageType" },
        { "Ref": "AWS::Region" },
        "Linux"
      ]
    },
    "InstanceType": { "Ref": "DatabaseInstanceType" },
    "SecurityGroups": [
      { "Ref": "DatabaseSecurityGroup" }
    ]
  }
},
"WebAppLoadBalancer": {
  "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
  "Properties": {
    "Scheme": "internet-facing",
    "Type": "application",
    "Subnets": [
      { "Ref": "PublicSubnetA" },
      { "Ref": "PublicSubnetB" },
      { "Ref": "PublicSubnetC" }
    ],
    "SecurityGroups": [
      { "Ref": "WebAppLoadBalancerSecurityGroup" }
    ]
  }
},
"WebAppTargetGroup": {
  "Type": "AWS::ElasticLoadBalancingV2::TargetGroup",
  "Properties": {
    "HealthCheckIntervalSeconds": 30,
    "HealthCheckProtocol": "HTTP",
    "HealthCheckTimeoutSeconds": 10,
    "HealthyThresholdCount": 2,
    "HealthCheckPath": "/healthcheck",
    "Matcher": { "HttpCode": "200" },
    "Port": 8080,
    "Protocol": "HTTP",
    "TargetGroupAttributes": [
      {
        "Key": "deregistration_delay.timeout_seconds",
        "Value": "20"
      }
    ],
    "UnhealthyThresholdCount": 3,
    "VpcId": { "Ref": "VPC" }
  }
},
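The target group above polls /healthcheck on port 8080 and expects HTTP 200. On the application side, that could be served by something as small as the sketch below. It uses only the Python standard library and does no real dependency checks; the handler and function names are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def health_response(path: str) -> Tuple[int, bytes]:
    """Decide the status code and body for a request path."""
    if path == "/healthcheck":
        # A fuller check might also verify database or cache connectivity.
        return 200, b"OK"
    return 404, b"not found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the load balancer's polling out of the access log


def make_server(port: int = 8080) -> HTTPServer:
    """Build a server on the port the target group is configured to probe."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling make_server().serve_forever() starts it. With the thresholds above, an instance that fails three consecutive checks is marked unhealthy and taken out of rotation, and needs two consecutive successes to be marked healthy again.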
Why CloudFormation?
• Replicable stacks
• Used for different customers
• Used for different environments
• Used for disaster recovery
• Clear documentation of your entire infrastructure
• DevOps friendly
• Not that hard to learn
• Drawbacks: slow change-and-test cycles, proprietary
• Alternatives: Terraform
• Tries to abstract stack creation independent of provider, but you still depend on
proprietary concepts like ELB, security groups, etc.
Configuration provisioning
• OpsWorks – hosted Puppet or Chef
• Capistrano – “login to all machines and do x, y, z”
• S3 – simple, no learning curve
• Instance launch configuration includes files to fetch from S3 (app.properties,
db.properties, cassandra.conf, mysql.conf, etc.)
• CloudFormation can write dynamic values to conf files (e.g. ELB address)
"UserData": {
  "Fn::Base64": {
    "Fn::Join": [
      "",
      [
        "#!/bin/bash -x\n",
        "yum update -y aws-cfn-bootstrap\n",
        "yum install -y aws-cli\n",
        "cat <<EOF >> /var/app/app.properties\n",
        {
          "Fn::Join": [
            "",
            [
              "\n",
              "db.host=",
              { "Ref": "DatabaseELBAddress" },
              "\n",
              "elasticsearch.url=https://",
              { "Ref": "ElasticSearchDomainName" },
              "\n",
              "root.url=https://",
              { "Ref": "DomainName" },
              "\n"
            ]
          ]
        },
        "EOF\n"
      ]
    ]
  }
}
Automated instance setup
• Elastic Container Service (ECS)
• Deploy docker containers on EC2 instances
• Fargate abstracts the need to manage the underlying EC2 instance
• Kubernetes – vendor-independent
• But don’t rush into using Kubernetes (or Docker for that matter).
• Packer – creates images
• Manual
• Launch configuration to fetch and execute setup.sh
• Allows for easy zero downtime blue-green deployment
• Instance setup changed? Destroy the instance and launch a new one
• Simple. Simple is good.
Blue-green deployment
• Two S3 “folders” – blue and green
• Shared database
• Two autoscaling groups – blue (currently active) and green (currently passive)
• Upload new release artifact (e.g. fat jar) to s3://setup-bucket/green
• Activate the green ASG (increase required number of instances)
• Wait for nodes to launch
• Execute acceptance tests
• Switch DNS record (Route53) from blue ELB to green ELB
• Turquoise (intermediate deployment in case of breaking database changes)
• Can be automated via script that uses AWS CLI or APIs
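The activation, wait, test and DNS-switch steps above can be sketched as one function. The two client arguments are meant to be boto3 clients (boto3.client("autoscaling") and boto3.client("route53")); they are passed in so the flow can be exercised with stand-ins. All resource names are hypothetical. The sketch assumes the artifact is already uploaded to the green S3 folder, that the green group's maximum size allows the requested count, and that the DNS record is a plain CNAME.

```python
import time


def in_service_count(autoscaling, asg_name: str) -> int:
    """Count instances of the group that have finished launching."""
    response = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name])
    instances = response["AutoScalingGroups"][0]["Instances"]
    return sum(1 for i in instances if i["LifecycleState"] == "InService")


def deploy_green(autoscaling, route53, *, green_asg, instances, hosted_zone_id,
                 record_name, green_elb_dns, run_acceptance_tests,
                 poll_seconds=15, timeout_seconds=600):
    # 1. Activate the green ASG by raising its required number of instances.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=green_asg,
        MinSize=instances,
        DesiredCapacity=instances,
    )

    # 2. Wait for the nodes to launch.
    deadline = time.monotonic() + timeout_seconds
    while in_service_count(autoscaling, green_asg) < instances:
        if time.monotonic() > deadline:
            raise TimeoutError("green instances did not come into service")
        time.sleep(poll_seconds)

    # 3. Run acceptance tests against green before any traffic reaches it.
    if not run_acceptance_tests():
        raise RuntimeError("acceptance tests failed; DNS still points at blue")

    # 4. Switch the DNS record from the blue ELB to the green ELB.
    route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "TTL": 60,
                "ResourceRecords": [{"Value": green_elb_dns}],
            },
        }]},
    )
```

Scaling the old blue group down to zero is left out on purpose: keeping it running for a while after the switch makes rollback a single DNS change.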
Other useful services
• IAM – user and role management (each instance knows its role, no need for passwords)
• S3 – distributed storage / key-value store / universally applicable
• CloudTrail – audit trail of all infrastructure changes
• CloudWatch – monitoring of resources
• KMS – key management
• Glacier – cold storage
• Lambda – “serverless” a.k.a. function execution
General best practices
• Security groups
• Only open ports that you need
• Bastion host – entry point to the stack via SSH
• VPC (virtual private cloud)
• Your own virtual network: private address space, subnets (e.g. one per availability zone), etc.
• Multi-factor authentication
Conclusion
• Scalability is a function of your application first and infrastructure second
• AWS is pretty straightforward to learn
• You can have scalable, scripted infrastructure without big investments
• New services appear often – check them out
• Vendor lock-in is almost inevitable
• But concepts are (almost) identical across cloud providers
• If something can be done easily without an AWS-specific service, prefer that
• Bash is inevitable
Thank you!
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 

Scaling horizontally on AWS

  • 1. Scaling horizontally on AWS Bozhidar Bozhanov, LogSentinel
  • 2. About me • Senior software engineer and architect • Founder & CEO @ LogSentinel • Blog: techblog.bozho.net • Twitter: @bozhobg • Stack Overflow top 50
  • 3. Why? • Why high availability? • Why scalability? • To account for increased load • If you have decent HA, you’re likely scalable • Don’t overdesign • Why AWS (or any cloud provider)?
  • 4. AWS • IaaS (Infrastructure as a service) originally (EC2) • Virtual machines • Load balancers • Security groups • PaaS services on top • Multiple regions – US, EU, Asia, etc. • Each region has multiple availability zones (roughly equal to “data centers”) • Cross-availability zone is easy • Cross-region is harder • Similar to Azure, Google Cloud, etc.
  • 5. Rule of thumb: stateless applications • No persistent state on the application nodes • Caches and temporary files are okay • Distributed vs local cache • Session state: distributed vs no session state (e.g. JWT) • Makes the application layer horizontally scalable • Application nodes are disposable
  • 6. Executing only once in a cluster • Sometimes you need to execute a scheduled piece of code only once in a cluster • Database-backed schedule job management • Distributed locks (Hazelcast) • Using queues (SQS, AmazonMQ, RabbitMQ)?
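A minimal sketch of the database-backed approach from the slide above, assuming every node in the cluster wakes up for the same schedule slot and tries to claim it in a shared table. The table name `job_runs`, the column names, and the helpers `make_schema` and `try_claim` are hypothetical, and SQLite stands in for the shared database only so the example runs on its own. The primary key is what guarantees a single winner.

```python
import sqlite3


def make_schema(conn):
    # One row per (job, schedule slot); the primary key rejects a second claim.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_runs ("
        " job_name TEXT NOT NULL,"
        " run_slot TEXT NOT NULL,"
        " node_id TEXT NOT NULL,"
        " PRIMARY KEY (job_name, run_slot))"
    )


def try_claim(conn, job_name, run_slot, node_id):
    """Return True if this node won the right to run job_name for run_slot."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO job_runs (job_name, run_slot, node_id)"
                " VALUES (?, ?, ?)",
                (job_name, run_slot, node_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another node already inserted the row for this slot.
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_schema(conn)
    for node in ("node-a", "node-b"):
        if try_claim(conn, "nightly-report", "2024-01-01T00:00", node):
            print(node, "runs the job")
        else:
            print(node, "skips, already claimed")
```

Each node calls `try_claim` before doing the work and simply skips the run when it returns False. In a real cluster the connection would point at the shared RDBMS rather than an in-memory database.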
  • 7. Scaling • Autoscaling groups • Groups of virtual machines (instances) with identical configuration • Scale-up - configure criteria for launching new virtual machines – e.g. “more than 5 minutes of CPU utilization over 80%” • Scale-down – configure criteria for destroying virtual machines • Allows for handling spikes, or gradual increase of load • Spot instances • Cheap instances you “bid” for. Can be reclaimed at any time • Useful for heavy background processes. • Useful for test environments.
  • 8. Data stores • Managed • RDBMS (AWS RDS) – MySQL, MariaDB, Postgres, Oracle, MS SQL • Search engines – Elasticsearch • Caches – ElastiCache (Redis and memcached) • Custom: • Amazon Aurora • CloudSearch • S3, SimpleDB, DynamoDB • Own installation: spin up VMs, install anything you like (e.g. Cassandra, HBase, own Postgres, own Elasticsearch, own caching solution)
  • 9. Scaling data stores • The custom ones are automatically scaled (S3, SimpleDB) • The managed ones are scaled by configuration • Own deployments are scaled via auto-scaling groups • Data sharding vs replication with consistent hashing • Resharding is not trivial • Replication with consistent hashing can handle scaling up automatically *
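To illustrate why consistent hashing copes with scaling up, here is a small sketch of a hash ring with virtual nodes. It is an illustration under assumptions, not how any particular data store implements it: the class name `HashRing`, the node names, and the choice of MD5 with 100 virtual nodes per machine are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Maps keys to nodes; adding a node only moves keys onto that node."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []   # sorted ring positions
        self._owner = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Place several virtual nodes per machine to even out the load.
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if pos not in self._owner:
                bisect.insort(self._keys, pos)
                self._owner[pos] = node

    def remove(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def node_for(self, key):
        if not self._keys:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"user-{i}" for i in range(1000)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add("node-d")
    moved = [k for k in keys if ring.node_for(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

When a node joins, only the keys that now fall on the new node's ring positions change owner, and everything else stays put. With plain modulo sharding, changing the node count would remap most keys, which is why resharding is the harder path.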
  • 10. Elastic load balancer • AWS-provided software load balancer • Points to specified target machines or group of machines (roughly ASGs) • Configurable: protocols, ports, healthcheck, monitoring metrics • TLS termination • AWS-managed certificates • Load balancer in front of application nodes • Load balancer in front of data store nodes • vs application-level load-balancing (configuration vs fetching db nodes dynamically)
  • 11. Things to automate • Hardware and network resources (CloudFormation) • Application and database configuration (OpsWorks: Puppet, Chef, S3+bash, Capistrano) • Instances • launch configurations + bash • docker containers + bash (Elastic Container Service vs Fargate, Kubernetes) • Why automate? • because autoscaling benefits from automated instance creation
  • 12. Scripted stacks • You can create all instances, load balancers, auto-scaling groups, launch configurations, security groups, domains, Elasticsearch domains, etc. manually • But CloudFormation is way better • JSON or YAML • CloudFormation manages upgrades • Stack parameters (instance types, number of nodes, domains used, S3 buckets, etc.)
  • 13. "DatabaseLaunchConfiguration": { "Type": "AWS::AutoScaling::LaunchConfiguration", "Properties": { "AssociatePublicIpAddress": true, "IamInstanceProfile": { "Ref": "InstanceRoleInstanceProfile" }, "ImageId": { "Fn::FindInMap": [ { "Ref": "DatabaseStorageType" }, { "Ref": "AWS::Region" }, "Linux" ] }, "InstanceType": { "Ref": "DatabaseInstanceType" }, "SecurityGroups": [ { "Ref": "DatabaseSecurityGroup" } ] }
  • 14. "WebAppLoadBalancer": { "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer", "Properties": { "Scheme": "internet-facing", "Type": "application", "Subnets": [ { "Ref": "PublicSubnetA" }, { "Ref": "PublicSubnetB" }, { "Ref": "PublicSubnetC" } ], "SecurityGroups": [ { "Ref": "WebAppLoadBalancerSecurityGroup" } ] } },
  • 15. "WebAppTargetGroup": { "Type": "AWS::ElasticLoadBalancingV2::TargetGroup", "Properties": { "HealthCheckIntervalSeconds": 30, "HealthCheckProtocol": "HTTP", "HealthCheckTimeoutSeconds": 10, "HealthyThresholdCount": 2, "HealthCheckPath": "/healthcheck", "Matcher": { "HttpCode": "200" }, "Port": 8080, "Protocol": "HTTP", "TargetGroupAttributes": [ { "Key": "deregistration_delay.timeout_seconds", "Value": "20" } ], "UnhealthyThresholdCount": 3, "VpcId": { "Ref": "VPC" } } },
  • 16. Why CloudFormation? • Replicable stacks • Used for different customers • Used for different environments • Used for disaster recovery • Clear documentation of your entire infrastructure • DevOps friendly • Not that hard to learn • Drawbacks: slow change-and-test cycles, proprietary • Alternatives: Terraform • Tries to make stack creation independent of the provider, but you still depend on proprietary concepts like ELB, security groups, etc.
  • 17. Configuration provisioning • OpsWorks – hosted Puppet or Chef • Capistrano – “login to all machines and do x, y, z” • S3 – simple, no learning curve • Instance launch configuration includes files to fetch from S3 (app.properties, db.properties, cassandra.conf, mysql.conf, etc.) • CloudFormation can write dynamic values to conf files (e.g. ELB address)
  • 18. "UserData": { "Fn::Base64": { "Fn::Join": [ "", [ "#!/bin/bash -x\n", "yum update -y aws-cfn-bootstrap\n", "yum install -y aws-cli\n", "cat <<EOF >> /var/app/app.properties\n", { "Fn::Join": [ "", [ "\n", "db.host=", { "Ref": "DatabaseELBAddress" }, "\n", "elasticsearch.url=https://", { "Ref": "ElasticSearchDomainName" }, "\n", "root.url=https://", { "Ref": "DomainName" }, "\n" ] ] }, "EOF"
  • 19. Automated instance setup • Elastic Container Service • Deploy Docker containers on EC2 instances • Fargate removes the need to manage the underlying EC2 instances • Kubernetes – vendor-independent • But don’t rush into using Kubernetes (or Docker for that matter). • Packer – creates images • Manual • Launch configuration to fetch and execute setup.sh • Allows for easy zero-downtime blue-green deployment • Instance setup changed? Destroy the instance and launch a new one • Simple. Simple is good.
  • 20. Blue-green deployment • Two S3 “folders” – blue and green • Shared database • Two autoscaling groups – blue (currently active) and green (currently passive) • Upload new release artifact (e.g. fat jar) to s3://setup-bucket/green • Activate the green ASG (increase required number of instances) • Wait for nodes to launch • Execute acceptance tests • Switch DNS record (Route53) from blue ELB to green ELB • Turquoise (intermediate deployment in case of breaking database changes) • Can be automated via script that uses AWS CLI or APIs
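The blue-green steps above can be sketched as a small orchestration function. This is only an outline under stated assumptions: the function `blue_green_deploy` and its parameters are hypothetical, and each step is passed in as a callable that, in a real setup, would wrap the AWS CLI or an SDK. Scaling the passive group back to zero when acceptance tests fail is an added assumption, not something the slide specifies.

```python
def blue_green_deploy(active, upload, scale, wait_healthy,
                      run_acceptance_tests, switch_dns, instances=2):
    """Run one blue-green release and return the colour that is live afterwards.

    active: "blue" or "green", the colour currently serving traffic.
    The remaining arguments are callables supplied by the caller (hypothetical
    wrappers around S3 upload, ASG resizing, health polling, tests, Route53).
    """
    passive = "green" if active == "blue" else "blue"

    upload(passive)            # release artifact goes to the passive S3 "folder"
    scale(passive, instances)  # activate the passive autoscaling group
    wait_healthy(passive)      # wait for the new nodes to launch

    if not run_acceptance_tests(passive):
        # Assumption: on failure, shut the passive group down and keep serving
        # from the currently active colour.
        scale(passive, 0)
        return active

    switch_dns(passive)        # repoint the DNS record to the passive ELB
    return passive


if __name__ == "__main__":
    log = []
    live = blue_green_deploy(
        "blue",
        upload=lambda colour: log.append(f"upload to {colour}"),
        scale=lambda colour, n: log.append(f"scale {colour} to {n}"),
        wait_healthy=lambda colour: log.append(f"wait for {colour}"),
        run_acceptance_tests=lambda colour: True,
        switch_dns=lambda colour: log.append(f"DNS -> {colour}"),
    )
    print(live, log)
```

Keeping the AWS calls behind plain callables means the ordering logic can be exercised locally with fakes before it touches real infrastructure.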
  • 21. Other useful services • IAM – user and role management (each instance knows its role, no need for passwords) • S3 – distributed storage / key-value store / universally applicable • CloudTrail – audit trail of all infrastructure changes • CloudWatch – monitoring of resources • KMS – key management • Glacier – cold storage • Lambda – “serverless” a.k.a. function execution
  • 22. General best practices • Security groups • Only open ports that you need • Bastion host – entry point to the stack via SSH • VPC (virtual private cloud) • your own virtual network, private address space, subnets (e.g. one per availability zone), etc. • Multi-factor authentication
  • 23. Conclusion • Scalability is a function of your application first and infrastructure second • AWS is pretty straightforward to learn • You can have scalable, scripted infrastructure without big investments • New services appear often – check them out • Vendor lock-in is almost inevitable • But concepts are (almost) identical across cloud providers • If something can be done easily without an AWS-specific service, prefer that • Bash is inevitable