
AWS Solutions Architect Associate Certification Notes

• AWS Budgets gives you the ability to set custom budgets that alert you when your costs or usage
exceed (or are forecasted to exceed) your budgeted amount. You can further refine your budget
to track costs associated with multiple dimensions, such as AWS service, linked account, tag, and
others.
• The Cost Explorer tool helps you visualize and manage your AWS costs and usage over time.

Identity and Access Management


• IAM consists of users, groups (collections of users), policies (permissions that a user/group/role has) and roles (assigned to AWS resources)
• IAM is universal and not limited to regions
• New users have no permissions when first created; permissions must be granted explicitly
• Billing alarms can be created by going into CloudWatch and creating a CloudWatch alarm that uses an SNS topic

IAM Roles
• Roles can be created for many AWS services. When creating a role, a policy is attached to the role
which determines what the role can do.
• Roles are more secure than storing access key and secret access key on individual EC2 instances
• Roles are easier to manage
• Roles are universal and can be used in any region
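• A minimal boto3 sketch of creating a role for EC2 and attaching a managed policy (the role and profile names here are illustrative placeholders, not from these notes):

    import json
    import boto3

    iam = boto3.client('iam')

    # Trust policy letting EC2 assume the role
    trust = {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow",
                       "Principal": {"Service": "ec2.amazonaws.com"},
                       "Action": "sts:AssumeRole"}]
    }
    iam.create_role(RoleName='MyEC2S3Role',
                    AssumeRolePolicyDocument=json.dumps(trust))

    # The attached policy determines what the role can do
    iam.attach_role_policy(RoleName='MyEC2S3Role',
                           PolicyArn='arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess')

    # EC2 uses the role through an instance profile
    iam.create_instance_profile(InstanceProfileName='MyEC2S3Profile')
    iam.add_role_to_instance_profile(InstanceProfileName='MyEC2S3Profile',
                                     RoleName='MyEC2S3Role')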

S3 (Simple Storage Service)

• It is an object storage where flat files can be stored


• Files can be a maximum of 5 TB and are stored in buckets. The largest object that can be uploaded in a single PUT is 5 GB. For objects larger than 100 MB, multipart upload should be considered
• S3 uses a universal namespace, so bucket names must be unique globally (each bucket name becomes part of a unique DNS address/URL)
• An object consists of a key (name of the object), value (data), version ID and metadata. S3 is essentially a key-value store
• S3 has read-after-write consistency for PUTs of new objects. New objects can be read as soon as they are written
• S3 has eventual consistency for overwrite PUTs and DELETEs (it can take some time to propagate the change)
• Guarantee of 99.99% availability and 99.999999999% (11 9's) durability
• S3 storage classes
o S3 Standard: 99.99% availability and 99.999999999% durability. Designed to sustain the loss of 2 facilities
o S3 IA (infrequently accessed): 99.9% availability. Lower fee than S3 Standard but there is a retrieval fee
o S3 One Zone - IA: 99.5% availability. Same as IA but less resilient because data is kept in a single AZ
o S3 - Intelligent Tiering: Automatically moves data to most cost-effective tier
o S3 Glacier: For data archiving. Retrieval times range from minutes to hours, with 3 retrieval options: Expedited, Standard and Bulk. Glacier also encrypts the data stored in it by default. You can also put a vault lock policy on it. You cannot upload data to Glacier directly through the management console (the CLI or an SDK can be used)
o S3 Glacier Deep Archive: Retrieval takes up to 12 hours
• S3 Cross region replication can be done for high availability for an extra charge
• Transfer acceleration can be done for extra charge. Data is sent to edge locations through
optimized routes and users can access data through edge locations
• MFA can be applied to delete S3 objects
• By default, all newly created buckets are private. Access control can be setup using
o bucket policies - applies to the whole bucket
o access control lists - can be drilled down to objects instead of the whole bucket
• S3 buckets can be configured to create access logs by enabling "server access logs" which can be
sent to another bucket even in another account
• S3 Select is an Amazon S3 feature that makes it easy to retrieve specific data from the contents of
an object using simple SQL expressions without having to retrieve the entire object.
• Transferring data from EC2 to S3 within the same region is free
• S3 can be used to host static websites. The bucket that is used to host the files must have the
same name as your domain or subdomain.
• S3 pre-signed URLs can be generated to give external users temporary access to do PUT and GET requests to an S3 bucket (e.g. premium users downloading a video). By default, pre-signed URLs are valid for 3600 seconds, which can be changed (a sketch follows at the end of this list)
• CloudFront is great for static content that needs to be available everywhere and S3 Cross Region
Replication is great for dynamic content that needs to be available at low latency in few regions
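• A minimal boto3 sketch of the pre-signed URL bullet above (the bucket and key names are placeholders):

    import boto3

    s3 = boto3.client('s3')

    # Temporary download link for a private object; expires after 1 hour
    url = s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': 'my-premium-content', 'Key': 'videos/episode-01.mp4'},
        ExpiresIn=3600,  # seconds; the default mentioned above
    )
    print(url)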

S3 Versioning
• Versioning can be enabled. Once enabled it can only be suspended and not disabled
• Versioning stores all versions of an object (including all writes, and even if you delete an object)
• Versioning can be integrated with lifecycle management
• Lifecycle management rules can be used to move files across different storage tiers automatically (see the sketch after this list)
• Cross Region Replication can be turned on. Existing files are not replicated, only new uploads; delete markers and deletes are not replicated either
• Versioning has to be turned on in both buckets (source and destination) to enable cross region replication
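• A minimal boto3 sketch of enabling versioning and a lifecycle rule on a bucket (the bucket name, rule ID and prefix are placeholders):

    import boto3

    s3 = boto3.client('s3')

    # Versioning: once enabled it can only be suspended, not disabled
    s3.put_bucket_versioning(
        Bucket='my-bucket',
        VersioningConfiguration={'Status': 'Enabled'},
    )

    # Lifecycle rule: move objects under logs/ to IA after 30 days, Glacier after 90
    s3.put_bucket_lifecycle_configuration(
        Bucket='my-bucket',
        LifecycleConfiguration={'Rules': [{
            'ID': 'archive-logs',
            'Filter': {'Prefix': 'logs/'},
            'Status': 'Enabled',
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},
                {'Days': 90, 'StorageClass': 'GLACIER'},
            ],
        }]},
    )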

Encryption
• Encryption in transit is achieved with SSL/TLS (e.g. HTTPS)
• Encryption at rest is achieved with either server-side or client-side encryption
• Server-side encryption can be done with
o S3 Managed Keys (SSE S3)
o AWS Key Management Service (SSE KMS). It is like SSE S3 but has some additional features
like audit trails and added protection
o Customer provided Keys (SSE C)
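• A minimal boto3 sketch of uploading objects with server-side encryption (the bucket, keys and KMS alias are placeholders):

    import boto3

    s3 = boto3.client('s3')

    # SSE-S3: S3-managed keys
    s3.put_object(Bucket='my-bucket', Key='reports/q1.csv',
                  Body=b'col1,col2\n1,2\n',
                  ServerSideEncryption='AES256')

    # SSE-KMS: adds audit trails and additional key controls
    s3.put_object(Bucket='my-bucket', Key='reports/q2.csv',
                  Body=b'col1,col2\n3,4\n',
                  ServerSideEncryption='aws:kms',
                  SSEKMSKeyId='alias/my-app-key')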

S3 Transfer Acceleration
• Utilizes the CloudFront Edge Network to accelerate your uploads to S3
• Users upload files to edge locations, which then use Amazon's backbone network to transfer the files to the bucket

CloudFront
• It is a CDN (content delivery network). CDN is a system of distributed servers that deliver
webpages and other web content to a user based on the geographic locations of the user, the
origin of the webpage, and a content delivery server
• Edge location: It is the location where content will be cached.
• Distribution is the name given to the CDN which consists of a collection of edge locations
• CloudFront can be used to deliver your entire website or media streaming (RTMP)
• Edge locations are not read-only; you can also write to them
• Objects are cached for the life of the TTL (Time to Live)
• Cached objects can be cleared before the TTL expires (invalidating the cache), e.g. if misinformation needs to be removed, but you will be charged for invalidations

Snowball
• It is a large-scale data transport solution to reduce costs and increase security. It comes in a 50TB
or 80TB (of which only 72TB can be used) size
• Snowball edge is 100TB (of which only 83TB can be used) which has an onboard storage and
compute capabilities which can run lambda functions
• AWS Snowmobile is an exabyte-scale data transfer service that can move up to 100PB per Snowmobile
• Snowball can import to/from S3
• Snowball Edge devices have three options for device configurations – storage optimized, compute
optimized, and with GPU.

Storage gateway
• It is a virtual machine or physical device that replicates your on-premises data to the cloud. Data is encrypted when sent to any AWS storage
• Three types
o File gateway (files are stored in S3 buckets and accessed through NFS)
o Volume gateway (stored, cached) (stores your virtual hard drives in S3 using the iSCSI protocol)
o Tape Gateway (backing up tape-based backup application infrastructure to store data on
cloud)

EC2 (elastic compute cloud)

• It is a web service that provides resizable compute capacity in the cloud; you can obtain and boot new server instances in minutes, allowing you to scale quickly.
• Pricing models
o On demand: Fixed rate by the hour with no commitment. Good for applications with short
term and unpredictable workloads that cannot be interrupted
o Reserved: Contract terms are 1 or 3 years and offers a significant discount on the hourly
charge. Good for applications with predictable usage
• Standard: can't be converted to a different instance type
• Convertible: can be converted to a different instance type
• Scheduled: available within a specific time window
o Spot: Bid the price you want to pay for instance capacity. Good for applications that have flexible start and end times and are only feasible at a very low price. If you terminate a spot instance yourself, Amazon charges for the whole hour; if AWS terminates it, the partial hour is not charged.
o Dedicated hosts: Physical servers dedicated to you. Good for applications subject to government regulations or for licensing that does not support multi-tenancy. Pricing is lower for reserved dedicated hosts
• Termination protection is turned off by default when creating an instance
• On an EBS backed instance the default action is for the root EBS volume to be deleted when the
instance is terminated
• All inbound traffic is blocked, and all outbound traffic is allowed by default
• For all new AWS accounts there is a soft limit of creating 20 instances per region which can be
increased by requesting it via AWS support
• Security group changes take effect immediately.
• Multiple security groups can be applied to the same instance
• Security groups are STATEFUL, which means that if you allow traffic in with an inbound rule, the return traffic is automatically allowed out
• You cannot block specific IP addresses using security groups
• You can assign up to 5 security groups to an instance in a VPC
• By default, security groups deny everything; you add specific allow rules (see the sketch at the end of this section)
• The underlying hypervisor for EC2 is either Nitro or Xen. Nitro is more recent
• To bring IP addresses from on premise servers to cloud, create a Route Origin Authorization (ROA)
then once done, provision and advertise your whitelisted IP address range to your AWS account
• EC2 instances are billed by the second
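• A minimal boto3 sketch of adding an allow rule to a security group, as described in the bullets above (the group ID is a placeholder):

    import boto3

    ec2 = boto3.client('ec2')

    # Allow inbound HTTPS from anywhere; return traffic is allowed automatically
    # because security groups are stateful
    ec2.authorize_security_group_ingress(
        GroupId='sg-0123456789abcdef0',
        IpPermissions=[{
            'IpProtocol': 'tcp',
            'FromPort': 443,
            'ToPort': 443,
            'IpRanges': [{'CidrIp': '0.0.0.0/0', 'Description': 'HTTPS from anywhere'}],
        }],
    )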

EBS (elastic block store)

• Provides persistent block storage volumes for use with EC2 instances. It’s a virtual hard disk on
cloud
• EBS is a network drive and not a physical drive so there might be a little bit of latency.
• You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time
snapshots. Snapshots are point in time copies of volumes
• Snapshots are incremental - only the blocks that have changed since your last snapshot are saved to S3. That's why the first snapshot takes some time to create
• To create a snapshot for Amazon EBS volumes that serve as root devices, you should stop the
instance before taking the snapshot
• An in-progress snapshot is not affected by ongoing reads and writes to the volume and one can
still use the volume
• 5 types of EBS volumes
o General Purpose SSD (gp2)
o Provisioned IOPS SSD (io1)
o Throughput Optimized HDD (st1)
o Cold HDD (sc1): With a lower throughput limit than Throughput Optimized HDD, this is a
good fit ideal for large, sequential cold-data workloads. If you require infrequent access to
your data and are looking to save costs, Cold HDD provides inexpensive block storage.
o EBS Magnetic (standard)
• HDD volumes (st1, sc1) cannot be used as boot volumes; SSD (gp2, io1) and Magnetic (standard) volumes can
• When you create an EBS volume in an Availability Zone, it is automatically replicated within that
zone to prevent data loss due to a failure of any single hardware component
• Wherever the EC2 instance is, the EBS volume will also be in the same availability zone
• EBS cannot tolerate availability zone failure because all volumes are stored and replicated in the
same AZ only
• An EBS volume is off-instance storage that can persist independently from the life of an instance.
You can specify not to terminate the EBS volume when you terminate the EC2 instance during
instance creation
• Volume properties can be changed after an ec2 instance is launched
• To move a volume to another availability zone you have to create a snapshot of the volume. That
snapshot can be used to create an image (AMI). Another instance can then be launched using this
image and another availability zone as preference. (A way to do data migration. Another way is to
copy the AMI to another region)
• To increase the write throughput of an EBS volume, increase the size of the volume or arrange
multiple EBS volumes in Raid 0 configuration
• RAID 0 (striping) configuration is used for higher I/O performance and RAID 1 (mirroring) configuration is used for higher redundancy
• EBS volumes support live configuration changes while in production which means that you can
modify the volume type, volume size, and IOPS capacity without service interruptions
• You can use Amazon Data Lifecycle Manager (Amazon DLM) to automate the creation, retention,
and deletion of snapshots taken to back up your Amazon EBS volumes
AMI Types (EBS backed volume and Instance Store)
• For EBS volumes: The root device for an instance launched from the AMI is an Amazon EBS volume
created from an Amazon EBS snapshot
• For instance store volumes: The root device for an instance launched from the AMI is an instance store volume created from a template stored in Amazon S3. This is also called ephemeral storage: it is a physical drive attached to the host (which gives better I/O), but the data does not persist
• When a new EC2 instance is launched you can select an AMI backed by either an EBS volume or an instance store volume
• Instances with instance store volumes can't be stopped (only rebooted or terminated) and that's
why the storage is ephemeral. EBS volume backed instances can be stopped if a hypervisor
underneath is not working and restarted on another hypervisor without the data being lost.
• Both can be rebooted without losing data

Encryption with EBS (encrypted root device volumes and snapshots)


• An encrypted volume can be created while launching an EC2 instance
• To encrypt an existing unencrypted volume, create a snapshot of the EC2 volume. Once the snapshot is created, go to Actions, copy the snapshot and encrypt it while creating the copy. Next create an image (AMI) from the encrypted snapshot. This AMI can be used to launch an EC2 instance with an encrypted volume
• Snapshots of an encrypted EBS volume are also encrypted and the data in transit from the
instance to the volume is also encrypted
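• A minimal boto3 sketch of the same approach (snapshot the volume, then make an encrypted copy); the volume ID and region are placeholders:

    import boto3

    ec2 = boto3.client('ec2', region_name='us-east-1')

    # 1. Snapshot the unencrypted volume
    snap = ec2.create_snapshot(VolumeId='vol-0123456789abcdef0',
                               Description='pre-encryption snapshot')
    ec2.get_waiter('snapshot_completed').wait(SnapshotIds=[snap['SnapshotId']])

    # 2. Copy the snapshot with encryption enabled (default EBS KMS key here)
    encrypted = ec2.copy_snapshot(SourceSnapshotId=snap['SnapshotId'],
                                  SourceRegion='us-east-1',
                                  Encrypted=True,
                                  KmsKeyId='alias/aws/ebs')
    # 3. Register an AMI from the encrypted snapshot, or create a new volume from it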

CloudWatch
• It is a performance monitoring service to monitor the AWS resources as well as the applications
that run on AWS
• It can monitor things like
o Compute (EC2, ASG, ELB, Route53 health checks)
o Storage and content delivery (EBS Volumes, Storage gateways, CloudFront)
• It can monitor host level metrics like CPU usage, Inbound Network Traffic, Disk Read Operation,
Status Check (underlying hypervisor or the instance) but it does not provide memory usage. To get
additional metrics you have to install a CloudWatch agent on all instances
• CloudWatch with EC2 will monitor events every 5 minutes by default which can be changed to 1
minute by turning on detailed monitoring
• Regional and global dashboards can be created on CloudWatch
• CloudWatch event streams can also be created
• Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your EC2 instances (see the sketch at the end of this section)
• CloudWatch gathers metrics about CPU utilization from the hypervisor for a DB instance, and
Enhanced Monitoring gathers its metrics from an agent on the instance
• To collect logs from your Amazon EC2 instances and on-premises servers into CloudWatch Logs,
AWS offers a new unified CloudWatch agent
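• A minimal boto3 sketch of a CloudWatch alarm on EC2 CPU utilization that notifies an SNS topic (the instance ID and topic ARN are placeholders); the same pattern, with the AWS/Billing namespace, is used for billing alarms:

    import boto3

    cw = boto3.client('cloudwatch')

    cw.put_metric_alarm(
        AlarmName='web-server-high-cpu',
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],
        Statistic='Average',
        Period=300,              # 5-minute default monitoring interval
        EvaluationPeriods=2,
        Threshold=80.0,
        ComparisonOperator='GreaterThanThreshold',
        AlarmActions=['arn:aws:sns:us-east-1:123456789012:ops-alerts'],
    )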

CloudTrail
• AWS CloudTrail is different from CloudWatch: it is like a CCTV camera that keeps track of all the activity performed on AWS through API calls (i.e. auditing)
• CloudTrail stores all of its logs on S3 which are encrypted using Amazon S3 server side encryption
AWS Command Line Interface (CLI)
• You can interact with AWS from anywhere in the world by using the CLI
• You can SSH into an EC2 instance and run "aws configure", providing an access key ID and secret access key, to start using the CLI, but that is not secure (the credentials file can be read if the instance is compromised). A better alternative is to attach an IAM role to the EC2 instance

Using Boot Strap Scripts


• While launching an instance, "User Data" section under advanced details can be used to run a
bash script when the EC2 instance is booting. It is a way to automate your infrastructure
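• A minimal boto3 sketch of launching an instance with a bootstrap script passed as user data (the AMI ID and the script contents are placeholder examples):

    import boto3

    ec2 = boto3.client('ec2')

    # Bash bootstrap script run by cloud-init on first boot
    user_data = """#!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl enable --now httpd
    """

    ec2.run_instances(
        ImageId='ami-0123456789abcdef0',
        InstanceType='t3.micro',
        MinCount=1,
        MaxCount=1,
        UserData=user_data,   # boto3 base64-encodes this for you
    )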

Instance Metadata
• Used to get information about an instance (such as public ip)
• curl http://169.254.169.254/latest/meta-data/
• curl http://169.254.169.254/latest/user-data/

EFS (elastic file system)


• File storage service for EC2 instances. Storage capacity is elastic, growing and shrinking
automatically as you add and remove files
• It is different from EBS in that an EFS file system can be mounted by multiple EC2 instances at the same time
• It supports the Network File System version 4 protocol
• You only pay for the storage you use
• Data is stored across multiple AZ's within a region
• When deciding between EFS and S3: if the data changes rapidly and concurrently accessible storage is needed from up to thousands of Linux servers, select EFS

EC2 Placement Group


• 3 types
o Clustered: grouping within a single availability zone. Low network latency and high network
throughput
o Spread: group of instances that are placed on distinct hardware (different racks to prevent
problems due to hardware failure). Recommended for applications which have small
number of critical instances that should be kept separate from each other. This placement
group can only have 7 running instances per availability zone
o Partition: Different from spread in that it is a group of EC2 instances on different partitions.
Spread is for single instances and partition is for multiple instances. Can have maximum of 7
partitions per AZ and 100s of EC2s per partition

Databases in AWS
• Online transaction processing (OLTP) is used to pull a particular record (e.g. in RDS) whereas
Online analytics processing (OLAP) is used to run multiple queries to get an answer (e.g. in Data
Warehousing)

RDS (OLTP)
6 types of RDS databases in AWS
o SQL Server
o Oracle
o MySQL
o PostgreSQL
o Aurora
o MariaDB
• RDS (OLTP) has multi AZ and Read Replicas.
• Multi AZ is useful for backup as there is always a synchronized copy which can act as a failover
(used for Disaster Recovery).
• Read replicas allow you to have a read-only copy of your production database. They are useful when most of the requests are reads. There can be up to 5 read replicas (used for increasing performance and scaling, not for DR).
• To increase write capacity the best option is to increase the size of the RDS instance, because the primary database usually can't be distributed
• There can be read replicas of read replicas, but this can increase replication latency
• Read replicas can be promoted to their own databases but this breaks the replication
• Read replica option can be turned on only if backups are turned on
• RDS runs on virtual machines but you cannot log in to these operating systems
• RDS is not serverless (except Aurora). Even though we can't access the machines, Amazon can
• There are 2 types of backup in RDS. Automated and Database Snapshots
• Automated backups allow you to recover your database to any point in time within a retention
period accurate up to a second
• Automated backups are enabled by default. Backup data is stored in S3 and you get free storage
space equal to the size of your database. Backups are taken during a defined window and there
may be elevated latency during this period
• Database snapshots are done manually and are stored even after you delete your original RDS
instance, unlike automated backups
• The restored version of a database will be a new RDS instance with a new DNS endpoint
• Encryption at rest is supported for all RDS databases and uses Amazon KMS service
• RDS reserved instances are available for multi AZ deployments
• You manage your DB engine configuration through the use of parameters in a DB parameter
group.
Aurora
• It is a MySQL- and PostgreSQL-compatible RDS engine. It delivers up to 5 times the throughput of standard MySQL, at roughly one tenth of the cost of commercial databases
• 2 copies of your data are kept in each availability zone, across a minimum of 3 availability zones, i.e. 6 copies of your data
• Two types of read replicas are available: Aurora replicas and MySQL replicas
• Automated backups are always enabled by default
• An Aurora read replica can be created from a MySQL RDS instance and then promoted to its own Aurora database. This is a good way to migrate data from MySQL to Aurora

DynamoDB (NoSQL)
• NoSQL is used to store document type data or key value pairs. Amazon has DynamoDB as a NoSQL
database
• NoSQL databases are great when the schema, or the structure of your data, changes frequently.
NoSQL provides a non-rigid and flexible way of adding or removing new types of data
• It is used for applications that need consistent, single-digit millisecond latency at any scale
• DynamoDB is stored on SSD, spread across 3 geographically distinct data centers
• DynamoDB can have eventually consistent reads or strongly consistent reads (If multiple reads are
happening within 1 second)
• There will always be a charge for provisioning read and write capacity and the storage of data
within DynamoDB. There is no charge for transfer of data into DynamoDB if you stay in a single
region and no charge for the actual number of tables you create
• Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache that
can reduce Amazon DynamoDB response times from milliseconds to microseconds, even at
millions of requests per second.
• DynamoDB is managed by AWS and you can't access the underlying servers
• In DynamoDB a partition key design that doesn't distribute I/O requests evenly can create "hot"
partitions that result in throttling and use your provisioned I/O capacity inefficiently
• DynamoDB Time-to-Live (TTL) mechanism enables you to manage web sessions of your application
easily. It lets you set a specific timestamp to delete expired items from your tables. This makes
DynamoDB a great solution to maintain state data and make an architecture stateless
• DynamoDB can be used in conjunction with Amazon S3, where DynamoDB stores the metadata and the actual files are stored in S3
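• A minimal boto3 sketch of writing and reading a session item, with a strongly consistent read and a TTL attribute for expiry (the table and attribute names are placeholders):

    import time
    import boto3

    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Sessions')

    # TTL on 'expires_at' lets DynamoDB delete expired sessions automatically
    dynamodb.meta.client.update_time_to_live(
        TableName='Sessions',
        TimeToLiveSpecification={'Enabled': True, 'AttributeName': 'expires_at'},
    )

    table.put_item(Item={
        'session_id': 'abc123',
        'user': 'alice',
        'expires_at': int(time.time()) + 3600,  # epoch seconds, 1 hour from now
    })

    # Strongly consistent read instead of the default eventually consistent read
    resp = table.get_item(Key={'session_id': 'abc123'}, ConsistentRead=True)
    print(resp.get('Item'))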

Redshift (Data Warehousing - OLAP)


• Data warehouse is for business intelligence. Used to pull in large and complex datasets.
• Redshift is amazon's data warehousing solution
• Redshift can be a single node or multi-node
• Redshift is a columnar data store and can be compressed much more than row-based data store
because similar data is stored sequentially on discs
• Redshift comes with Massively Parallel Processing (MPP)
• Redshift does not have read replicas and does not scale automatically (scaling requires resizing the cluster)
• Backups on redshift are enabled by default with a 1-day retention period with a max of 35 days
• It always attempts to maintain at least 3 copies of your data (the original and replica on the
compute nodes and a backup in Amazon S3)
• It is currently available in only 1 availability zone but can be restored to another one in case of a
disaster
• Amazon Redshift Spectrum is a feature of Amazon Redshift that enables you to run queries against
exabytes of unstructured data in Amazon S3 with no loading or ETL required
• In Amazon Redshift, you use workload management (WLM) to define the number of query queues
that are available, and how queries are routed to those queues for processing. WLM is part of
parameter group configuration. A cluster uses the WLM configuration that is specified in its
associated parameter group
• You can configure Amazon Redshift to copy snapshots for a cluster to another region

ElastiCache
• ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. It caches the most common web queries.
• It supports two engines Redis and Memcached.
• Memcached is used when you need a simple cache to offload DB with multi-threaded
performance
• Redis is used when you need more advanced features

Route53
• DNS works on port 53 and hence the name
• DNS is used to convert human-friendly domain names into Internet Protocol (IP) addresses
• A registrar is an authority that can assign domain names directly under one or more top level
domains like .com, .gov etc. Popular domain registrars include amazon, godaddy etc.
• Each domain name becomes registered in a central database known as whois database
• When a user looks up a domain, the request goes to the top-level domain server, which provides the name servers (NS records); the name servers hold the SOA (start of authority), which leads to the DNS records
• Alias records are used to map resource record sets in your hosted zone to ELBs, CloudFront
distributions or S3 buckets that are configured as websites
• To route domain traffic to an ELB load balancer, use Amazon Route 53 to create an alias record that points to your load balancer. An alias record is a Route 53 extension to DNS. It's similar to a CNAME record, but you can create an alias record both for the root domain, such as tutorialsdojo.com, and for subdomains, such as portal.tutorialsdojo.com. (You can create CNAME records only for subdomains.) A sketch of creating an alias record appears at the end of this section.
• An A record (address record) is the fundamental type of DNS record, used to translate a domain name to an IPv4 address
• A CNAME (canonical name) record can be used to resolve one domain name to another
• ELBs do not have pre-defined IPv4 addresses; you resolve to them using a DNS name
• Given a choice, always choose an alias record instead of a CNAME
• With Route53 there is a default limit of 50 domain names that you can manage which can be
increased by contacting AWS support
• In AWS, the most common records are:
• A: URL to IPv4
• AAAA: URL to IPv6
• CNAME: URL to URL
• Alias: URL to AWS resource.
• The following routing policies are available with Route53
o Simple routing: You can only have one record, with multiple IP addresses. The IP addresses are returned to the requester in a random order
o Weighted routing: Allows you to split your traffic based on different weights assigned
o Latency based routing: Allows you to route your traffic based on the lowest network latency
for your end user
o Failover routing: Used when you want to create an active/passive setup for disaster
recovery. Uses a health check
o Geolocation routing: Lets you choose where your traffic will be sent based on the
geographic location of your users
o Geoproximity routing: lets route53 route traffic to your resources based on the geographic
location of your users and your resources
o Multivalue answer routing: Same as simple routing but also does a healthcheck of the
resource before sending its IP
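• A minimal boto3 sketch of the alias-record bullet above, pointing the root domain at an ALB (the hosted zone IDs and load balancer DNS name are placeholders):

    import boto3

    r53 = boto3.client('route53')

    r53.change_resource_record_sets(
        HostedZoneId='Z1234567890ABC',          # your hosted zone
        ChangeBatch={'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'tutorialsdojo.com.',
                'Type': 'A',
                'AliasTarget': {
                    'HostedZoneId': 'Z0987654321XYZ',   # the load balancer's zone ID
                    'DNSName': 'my-alb-123456.us-east-1.elb.amazonaws.com.',
                    'EvaluateTargetHealth': False,
                },
            },
        }]},
    )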

VPCs
• Glossary
o Subnet: A subnetwork or a subnet is basically a smaller network within a larger network. The
process of partitioning a network into subnets is called subnetting. Each computer on the
same subnet can communicate directly with each other but not directly with computers on
a different subnet
o Network Prefix and Host Identifier: IPV4 addresses consist of 2 different parts. First part is
network prefix which identifies the network the address belongs to and the second part is
the host identifier which identifies the host within the network
o Internet Gateway: It is a logical connection between an Amazon VPC and the Internet. It allows the resources in the VPC to access the internet
o Routing Tables: It is a data table stored in a router that lists the routes to a particular
network destination. The routing table contains information about the topology of the
network immediately around it. Routing tables apply to the whole subnet instead of just
instances
o Router: It is a networking device that forwards data packets between computer networks
o NAT: Network Address Translation

• Amazon Virtual Private Cloud lets you provision a logically isolated section of the Amazon Web
Services (AWS) Cloud where you can launch AWS resources in a virtual network that you define.
You have complete control over your virtual networking environment, including selection of your
own IP address range, creation of subnets, and configuration of route tables and network
gateways
• 1 subnet = 1 availability zone. A subnet can't span multiple AZs, but one AZ can contain multiple subnets
• Subnets can communicate with other subnets across availability zones
• Security groups are stateful (for an allowed inbound rule, the return traffic is automatically allowed out); network access control lists are stateless (allow/deny rules must be defined for both inbound and outbound). Security groups and network ACLs are used as security layers in and between subnets
• You can create a Hardware Virtual Private Network (VPN) connection between your corporate
datacenter and your VPC and leverage the AWS cloud as an extension of your corporate
datacenter
• VPC Peering: Allows you to connect one VPC with another via a direct network route using private IP addresses. Instances behave as if they were on the same private network. Peering can happen between different accounts as well as different regions
• When you create a VPC, AWS creates a default route table, network ACL and security group for it
• The maximum VPC size is a /16 CIDR block and the minimum is /28
• Amazon always reserves 5 IP addresses within each subnet
• Steps to create a sample custom VPC with public and private subnets (a boto3 sketch of these steps follows the list)
1. Create a new VPC with the CIDR block mentioned (e.g. 10.0.0.0/24 will give 256 IPs)
2. Create two subnets in the custom VPC, one private and one public with the CIDR blocks
mentioned for both of them (e.g. 10.0.0.0/25 and 10.0.0.128/25)
3. Create an internet gateway for the custom VPC
4. A routing table is created by default for the VPC. Create another routing table to give the
public subnet access to the internet
5. Add a route to the internet gateway (destination 0.0.0.0/0) in this routing table to give the public subnet access to the internet
6. Go to the public subnet and change the routing table association to the new routing table
(by default it was associated with the default routing table which didn't have internet
access)
7. Change the public subnet settings so that EC2 Instances in it get a public IP address
8. Create EC2 Instances in both the subnets with different security groups
9. You will notice that you can't SSH into the private EC2 from the public because they are in
different security groups.
10. Let's create another security group which allows HTTP, HTTPS, SSH, MySQL/Aurora and
ICMP access from our public CIDR block
11. Go to the private EC2 instance and attach the new security group to it.
12. You can now SSH into your public instance, copy the key on the instance and SSH into the
private instance using the key
13. However, the private instance still doesn't have access to the internet. This can be fixed by creating a NAT gateway in the public subnet (because it connects to the internet through the internet gateway) and editing the main route table used by the private subnet to add a route to the NAT gateway
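• A minimal boto3 sketch of steps 1-7 above (the CIDR blocks follow the example; the availability zones are placeholders):

    import boto3

    ec2 = boto3.client('ec2')

    vpc = ec2.create_vpc(CidrBlock='10.0.0.0/24')['Vpc']

    public = ec2.create_subnet(VpcId=vpc['VpcId'], CidrBlock='10.0.0.0/25',
                               AvailabilityZone='us-east-1a')['Subnet']
    private = ec2.create_subnet(VpcId=vpc['VpcId'], CidrBlock='10.0.0.128/25',
                                AvailabilityZone='us-east-1b')['Subnet']

    igw = ec2.create_internet_gateway()['InternetGateway']
    ec2.attach_internet_gateway(InternetGatewayId=igw['InternetGatewayId'],
                                VpcId=vpc['VpcId'])

    # Second route table with a default route to the internet, for the public subnet
    rt = ec2.create_route_table(VpcId=vpc['VpcId'])['RouteTable']
    ec2.create_route(RouteTableId=rt['RouteTableId'],
                     DestinationCidrBlock='0.0.0.0/0',
                     GatewayId=igw['InternetGatewayId'])
    ec2.associate_route_table(RouteTableId=rt['RouteTableId'],
                              SubnetId=public['SubnetId'])

    # Give instances launched in the public subnet a public IP by default
    ec2.modify_subnet_attribute(SubnetId=public['SubnetId'],
                                MapPublicIpOnLaunch={'Value': True})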
• An elastic network interface (ENI) is a logical networking component in a VPC that represents a
virtual network card. You can attach a network interface to an EC2 instance in the following ways:
1. When it's running (hot attach)
2. When it's stopped (warm attach)
3. When the instance is being launched (cold attach).

CloudFront
• CloudFront is used to increase content delivery performance by caching requests at edge locations
around the world.
• You can both write and read to CloudFront
• CloudFront also works on AWS's global network backbone, enabling efficient transfer of requests between CloudFront edge locations and other AWS services in different regions and applications
• CloudFront supports both static and dynamic content

AWS Global Accelerator


• AWS Global Accelerator is a service that improves the availability and performance of your
applications with local or global users. It provides static IP addresses that act as a fixed entry point
to your application endpoints in a single or multiple AWS Regions, such as your Application Load
Balancers, Network Load Balancers or Amazon EC2 instances.
• AWS Global Accelerator uses the AWS global network to optimize the path from your users to
your applications, improving the performance of your TCP and UDP traffic. AWS Global
Accelerator continually monitors the health of your application endpoints and will detect an
unhealthy endpoint and redirect traffic to healthy endpoints in less than 1 minute.
• Unlike CloudFront, it has the capability to route the traffic to the closest edge location via an
Anycast static IP address

Network Address Translation


• NAT (Network Address Translation) instance is a single EC2 server whereas NAT Gateway is a
highly available gateway that allows us to have our private subnets communicate to the internet
without becoming public
• NAT instance is acting as a bridge between the private and public subnets. For that to happen we
have to disable source/destination checks because the instance is neither a source nor destination
• NAT gateway is a better alternative to a NAT instance because the NAT instance can easily get
overwhelmed if it doesn't have enough capacity. Gateways can scale as required
• NAT instances/gateways must be in a public subnet, and there must be a route out of the private subnet to the NAT instance/gateway in order for it to work
• NAT gateway cannot span multiple availability zones. They will have to be created in each zone
and routing can be configured accordingly

Network ACL
• VPC automatically comes with a default network ACL which allows all inbound and outbound
traffic. When creating a new NACL all inbound/outbound rules are denied by default
• Each subnet in your VPC must be associated with a network ACL. If not, the subnet is
automatically associated with the default ACL
• A network ACL can be associated with multiple subnets but not vice versa
• Network ACL rules are evaluated in numerical order from the lowest rule number to the highest, e.g. if an allow rule comes before a deny rule, the allow rule wins
• Specific IP addresses can be denied by placing them before the allow rules

Network ACL vs Security Group


• Network ACLs are evaluated before security groups; a deny at the NACL level blocks traffic even if the security group allows it
Security group
o Operates at the instance level
o Supports allow rules only
o Is stateful: return traffic is automatically allowed, regardless of any rules
o All rules are evaluated before deciding whether to allow traffic
o Applies to an instance only if someone specifies the security group when launching the instance, or associates the security group with the instance later on
Network ACL
o Operates at the subnet level
o Supports allow rules and deny rules
o Is stateless: return traffic must be explicitly allowed by rules
o Rules are processed in number order when deciding whether to allow traffic
o Automatically applies to all instances in the subnets that it's associated with (therefore, it provides an additional layer of defense if the security group rules are too permissive)
• When creating an internet facing ELB, you need at least two subnets to create it
• VPC flow logs is a feature that enables you to capture information about the IP traffic going to and
from your network interfaces. This is done using CloudWatch. It can be done at a VPC, subnet or
networking interface level. You cannot enable flow logs for VPCs that are peered with your VPC.
Once created, configuration for the flow log cannot be changed

Bastion Hosts (Jump Boxes)


• It is a special purpose computer on the network specifically designed to withstand attacks. The
computer generally hosts a single application and all other services are removed or limited to
reduce the threat to the computer. Instead of hardening all the EC2 servers on the private subnet,
only the bastion server needs to be hardened. This reduces the surface area for attacks

VPC Endpoint
• It enables you to privately connect your VPC to supported AWS services without requiring an
internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. Traffic
between your VPC and the other service does not leave the Amazon network. There are two types
of VPC endpoints: Interface Endpoints and Gateway Endpoints (only for Amazon S3 and
DynamoDB)

Security
• You can authenticate to your DB instance by enabling IAM DB authentication. IAM database
authentication works with MySQL and PostgreSQL. With this authentication method, you don't
need to use a password when you connect to a DB instance.
• The AWS Security Token Service (STS) is a web service that enables you to request temporary,
limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users
that you authenticate (federated users).
• If a company is using Active Directory for authentication, SAML (Security Assertion Markup Language) based federation can be used for access to the AWS cloud
• AWS Directory Service for Microsoft Active Directory, also known as AWS Managed Microsoft AD,
enables your directory-aware workloads and AWS resources to use managed Active Directory in
the AWS Cloud.
• For higher levels of protection against attacks targeting your applications running on Amazon Elastic Compute Cloud (EC2), Elastic Load Balancing (ELB), Amazon CloudFront, and Amazon Route 53 resources, you can subscribe to AWS Shield Advanced. The standard tier of AWS Shield is included free of charge by default.
• Perfect Forward Secrecy is a feature that provides additional safeguards against the
eavesdropping of encrypted data, through the use of a unique random session key. CloudFront
and Elastic Load Balancing are the two AWS services that support Perfect Forward Secrecy
• S3 doesn't provide AES-128 encryption, only AES-256
• AWS WAF is a web application firewall that enables you to create custom, application-specific
rules that block common attack patterns that can affect application availability, compromise
security, or consume excessive resources. You can administer AWS WAF completely through APIs,
which makes security automation easy, enabling rapid rule propagation and fast incident
response. AWS shield advanced can be added on top of AWS WAF if auditing is required
• Shared Responsibility Model: AWS is responsible for security OF the cloud (infrastructure, hardware, managed-service software), while the customer is responsible for security IN the cloud (data, IAM, OS patching, network and firewall configuration)
ELB
• It is a physical or virtual device that is used to balance load across web servers
• 3 types
o Application Load Balancer: Best suited for balancing HTTP and HTTPS traffic; works at Layer 7. It is application aware and can route based on the request content (e.g. if the language is changed to French, traffic can be sent to the French-language servers). It has features such as the X-Forwarded-For header (so web servers see the user's IP address instead of just the ELB's IP) and sticky sessions
o Network Load Balancer: Best suited for balancing TCP traffic where extreme performance is required. It works at Layer 4.
o Classic Load Balancer: Legacy elastic load balancer and it is not application aware. To ensure that a Classic Load Balancer stops sending requests to instances that are de-registering or unhealthy while keeping the existing connections open, use connection draining. This enables the load balancer to complete in-flight requests made to instances that are de-registering or unhealthy.
• A 504 error means the gateway has timed out. This means that either the web server or database
layer is not functioning. ELB is fine
• Target groups can be created for EC2 instances which will be used as targets for the ELBs
• Load balancers always have a DNS name. You are never given an IP address
• ELBs can scale, but not instantaneously. Contact AWS for a "pre-warm"

Advanced Load Balancing


• Sticky sessions allow you to bind a user's session to a specific EC2 instance (e.g. user saves
something in shopping cart and comes back to it later). For Application Load Balancer the
stickiness happens at the target level
• Cross-zone load balancing: if enabled, each load balancer node distributes traffic across the registered EC2 instances in all enabled availability zones, not just its own zone
• Path patterns can be enabled when traffic needs to be sent to specific web servers based on the path in the URL

Launch Configurations and Auto Scaling Groups


• Launch configuration provides all the config details (e.g. bootstrap script, memory size, IAM,
security groups etc.) to the auto scaling group to be able to launch more EC2 instances if the load
increases
• A launch configuration once created cannot be modified. A new launch configuration can be
created and the ASG can be moved to the new launch config
• A scheduled scaling policy can be configured for the ASG to start new instances at a specified time if you predict a sudden increase in load
• When the ASG is scaling in, it finds the AZ with the greatest number of instances and then the EC2
instance with the oldest launch configuration will shut down first
• ASG has a minimum size, desired size and maximum size
• CloudWatch alarms can be used as scaling policies for the ASG
• Scaling policies can be based on CPU, Network etc. and can also be based on schedule
• ASG are free. You only get charged for the resources under it
• In Auto Scaling, the following statements are correct regarding the cooldown period:
1. It ensures that the Auto Scaling group does not launch or terminate additional EC2 instances
before the previous scaling activity takes effect.
2. Its default value is 300 seconds.
3. It is a configurable setting for your Auto Scaling group.
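• A minimal boto3 sketch of a launch configuration, an auto scaling group and a target-tracking scaling policy (all names and the AMI/subnet/security-group IDs are placeholders):

    import boto3

    asg = boto3.client('autoscaling')

    # Launch configurations are immutable; create a new one to change settings
    asg.create_launch_configuration(
        LaunchConfigurationName='web-lc-v1',
        ImageId='ami-0123456789abcdef0',
        InstanceType='t3.micro',
        SecurityGroups=['sg-0123456789abcdef0'],
    )

    asg.create_auto_scaling_group(
        AutoScalingGroupName='web-asg',
        LaunchConfigurationName='web-lc-v1',
        MinSize=2, MaxSize=6, DesiredCapacity=2,
        VPCZoneIdentifier='subnet-aaa111,subnet-bbb222',
        DefaultCooldown=300,   # the default cooldown period mentioned above
    )

    # Scale out/in to keep average CPU around 70%
    asg.put_scaling_policy(
        AutoScalingGroupName='web-asg',
        PolicyName='cpu-target-70',
        PolicyType='TargetTrackingScaling',
        TargetTrackingConfiguration={
            'PredefinedMetricSpecification': {'PredefinedMetricType': 'ASGAverageCPUUtilization'},
            'TargetValue': 70.0,
        },
    )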

CloudFormation
• It is a way of completely scripting your cloud environment
• Templates are created which can deploy environments e.g. WordPress. It will create all the
resources required to run the application

Elastic Beanstalk
• It is a way for developers to easily deploy and manage a web application without knowing much of
AWS. It provisions AWS resources in the background
• Once the resources are deployed their configuration can be modified.
• Developers can upload their code and get the application up and running instantly

SQS (Simple Queue Service)


• It was the first AWS service
• It is a web service that gives you access to a message queue to store messages while waiting for a
computer (e.g. EC2 instances) to process them. This way requests are not lost if EC2 instances are
down and can't handle requests on the go
• The queue resolves issues that arise if the producer is producing work faster than the consumer
can process it
• SQS service helps decouple the components of an application so they run independently.
• SQS is pull based, not pushed based. Services will pull the messages from the queue and process
them. SNS is used for push
• Messages can contain up to 256 KB of text in any format, but can go up to 2 GB if required (the payloads are then stored in S3)
• Messages can be kept in the queue from 1 minute to 14 days; default retention period is 4 days
• Visibility Time Out is the amount of time that messages are invisible in the queue after a reader
picks up the message. It becomes visible again after the time out if the Job hasn't been processed
in that time. Timeout default is 30 seconds and maximum is 12 hours.
• Amazon SQS long polling can be used to return a response only if a message arrives in the queue
• Two types
o Standard queue: Nearly unlimited number of messages. Messages are generally in the same
order but because of the distributed nature there can be some discrepancies. It guarantees
that messages are delivered at least once but there can be more than one copy of a
message that is delivered
o FIFO queues (first in, first out): This guarantees that there are no duplicates and messages are delivered in the same order in which they were sent. They are limited to 300 transactions per second
• If you're using messaging with existing applications and want to move your messaging service to
the cloud quickly and easily, it is recommended that you consider Amazon MQ
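• A minimal boto3 sketch of a producer/consumer on a standard queue, using long polling and the visibility timeout described above (the queue name is a placeholder):

    import boto3

    sqs = boto3.client('sqs')
    queue_url = sqs.create_queue(QueueName='orders')['QueueUrl']

    # Producer
    sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": 42}')

    # Consumer: long polling (wait up to 20s) and a 30s visibility timeout
    resp = sqs.receive_message(QueueUrl=queue_url,
                               MaxNumberOfMessages=1,
                               WaitTimeSeconds=20,
                               VisibilityTimeout=30)
    for msg in resp.get('Messages', []):
        # ... process the message, then delete it so it is not redelivered
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg['ReceiptHandle'])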

SWF (Simple Work Flow Service)


• Web service that makes it easy to coordinate work across distributed application components. It is
a process combining automatic processing and manual processing (e.g. how amazon warehouses
run)
• Workflow executions have a retention period/execution time of up to 1 year
• It ensures that a task is never duplicated and is assigned only once.

SNS (Simple Notification Service)


• Web service that makes it easy to set up, operate and send notifications from the cloud
• Allows you to do push notifications
• Notifications can be pushed to mobile devices, SMS, emails, SQS or to a http endpoint
• Multiple recipients can be grouped using topics. A topic can support delivery to multiple endpoint
types.
• All messages published are redundant across multiple AZs
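• A minimal boto3 sketch of creating a topic, subscribing an endpoint and publishing a notification (the topic name and email address are placeholders):

    import boto3

    sns = boto3.client('sns')

    topic_arn = sns.create_topic(Name='order-events')['TopicArn']

    # A topic can deliver to several endpoint types (email, SMS, SQS, HTTP, Lambda...)
    sns.subscribe(TopicArn=topic_arn, Protocol='email',
                  Endpoint='[email protected]')

    sns.publish(TopicArn=topic_arn,
                Subject='New order',
                Message='Order 42 has been received')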

AWS Step Functions


• AWS Step Functions provides serverless orchestration for modern applications. Orchestration
centrally manages a workflow by breaking it into multiple steps, adding flow logic, and tracking
the inputs and outputs between the steps. As your applications execute, Step Functions maintains
application state, tracking exactly which workflow step your application is in, and stores an event
log of data that is passed between application components.

Elastic Transcoder
• It is a media transcoder in the cloud. It converts media files from their source format into versions that will play on other devices
• Pay based on minutes and resolution at which you transcode

API Gateway
• It is a fully managed service that makes it easy for developers to publish, maintain, monitor and secure APIs at any scale
• It is a doorway into your AWS environment
• It is serverless and scalable, and is commonly used in front of AWS Lambda. You pay only for the API calls you receive and the amount of data transferred out
• Users do a call to the API gateway to perform various functions
• Requests can be throttled to prevent attacks
• Can be connected to CloudWatch to log all requests made to the API
• API caching can be enabled to cache endpoint's response for a specified TTL. This reduces the
number of calls made to the endpoint
• In computing, the same-origin policy ensures that a web page can only share data with another page if both have the same origin (same domain). This helps prevent cross-site scripting attacks. However, in AWS the services usually have different endpoint names and have to talk to each other. CORS (cross-origin resource sharing) can be enabled on the API gateway to relax the same-origin policy.

Kinesis 101
• Streaming data is data that is generated continuously by thousands of data sources (e.g. transaction data from online stores, stock prices, game data)
• Kinesis is a platform on AWS to send your streaming data to
• 3 types
o Kinesis Streams: All kinds of devices can send streaming data to Kinesis Streams, which stores it for 24 hours by default (up to 7 days). Data is stored in shards, and consumers can analyze data from these shards. Once analyzed, the results can be stored in other places like S3, DynamoDB, Redshift etc.
o Kinesis Firehose: All kinds of devices send streaming data to kinesis firehose but there is no
persistent data storage. There can be lambda functions which can analyze and perform
computations on the go and store the results somewhere. It can capture, transform, and
load streaming data into Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and
Splunk, enabling near real-time analytics with existing business intelligence tools and
dashboards you’re already using today.
o Kinesis Analytics: If you want to run SQL analysis on the data flowing through Streams or Firehose, Kinesis Analytics can be used
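• A minimal boto3 sketch of a producer putting a record onto a Kinesis stream (the stream name and payload are placeholders):

    import json
    import boto3

    kinesis = boto3.client('kinesis')

    # Records with the same partition key go to the same shard
    kinesis.put_record(
        StreamName='clickstream',
        Data=json.dumps({'page': '/home', 'user': 'user-123'}).encode('utf-8'),
        PartitionKey='user-123',
    )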

Athena
• Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3
using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only
for the queries that you run.
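• A minimal boto3 sketch of running an Athena query over data in S3 (the database, table and result bucket are placeholders):

    import boto3

    athena = boto3.client('athena')

    query = athena.start_query_execution(
        QueryString='SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status',
        QueryExecutionContext={'Database': 'weblogs'},
        ResultConfiguration={'OutputLocation': 's3://my-athena-results/'},
    )

    # Results land in the S3 output location; poll for completion with
    # get_query_execution(QueryExecutionId=query['QueryExecutionId'])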

Web Identity Federation and Cognito


• Web identity federation lets you give your users access to AWS resources after they have
successfully authenticated with a web-based identity provider like google, Facebook etc.
• Cognito is a web identity federation service. It acts as an identity broker between your application and web ID providers (user pools handle sign-up/sign-in; identity pools grant AWS credentials)
• The user authenticates with the identity provider (e.g. Facebook), which returns an authentication token. Cognito exchanges this token for a JWT (JSON Web Token) and sends it back to the user. The user then presents the JWT to the identity pool, which grants temporary access to the AWS resources.
• Cognito tracks the association between user identity and the various different devices they sign-in
from. It uses SNS to send out notifications to all the devices if any information on the cloud
changes
Lambda
• It is a serverless compute service where you can upload your code and create a lambda function
• It can be used as an event-driven compute service or as a compute service to run your code in
response to HTTP requests using API gateway or API calls made using AWS SDKs
• Traditional vs Serverless Architecture: Serverless is instantly scalable without any configuration
changes. Everything is managed by AWS
• Lambda supports Node.js, Java, Python, C#, Go, PowerShell and Ruby
• Pricing is based on no. of requests and duration: First 1 million requests are free. $0.20 per 1
million requests. There is a price for every GB-second used (memory and duration)
• Serverless Technologies: Lambda, Aurora Serverless (only RDS which is serverless), DynamoDB, S3,
API Gateway, NAT Gateway
• AWS X-ray allows you to debug serverless applications like Lambda.
• Lambda can act globally, e.g. it can be used to back up S3 buckets in other regions
• When you create or update Lambda functions that use environment variables, AWS Lambda
encrypts them automatically using the AWS Key Management Service. However, this data is still
visible to other users who have access to the lambda console. The best option in this scenario is to
use encryption helpers to secure your environment variables.
• Lambda functions can consume events from a variety of AWS sources, such as Amazon DynamoDB
update streams and Amazon S3 event notifications. You don’t have to worry about implementing
a queuing or other asynchronous integration method because Lambda handles this for you
• The maximum execution time was 5 minutes and has since been increased to 15 minutes
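• A minimal sketch of a Python Lambda handler behind API Gateway (the proxy-integration event shape is assumed; field names are placeholders):

    import json

    def lambda_handler(event, context):
        # API Gateway proxy integration passes query parameters in the event
        params = event.get('queryStringParameters') or {}
        name = params.get('name', 'world')

        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'message': f'hello {name}'}),
        }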

ECS (Elastic Container Service)


• Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service.
ECS is a great choice to run containers for several reasons. First, you can choose to run your ECS
clusters using AWS Fargate which is serverless compute for containers.

AWS XRay
• You can use AWS X-Ray to trace and analyze user requests as they travel through your Amazon API
Gateway APIs to the underlying services. API Gateway supports AWS X-Ray tracing for all API
Gateway endpoint types: regional, edge-optimized, and private. You can use AWS X-Ray with
Amazon API Gateway in all regions where X-Ray is available.

Other content
• To connect your private network to a VPC, use a site-to-site VPN connection or AWS Direct Connect
• The term pilot light is often used to describe a DR scenario in which a minimal version of an
environment is always running in the cloud.
• If you got your certificate from a third-party CA, import the certificate into ACM or upload it to the
IAM certificate store

Other Reads
• AWS Whitepapers – best practices
• AWS FAQs for popular services
