EMR Workshop Lab 0: Create VPC

This document provides steps to create an EMR cluster in AWS with a VPC, including: 1) Creating a VPC with a single public subnet and selecting it when launching the EMR cluster. 2) Configuring the cluster software (Hadoop, Ganglia, Hive, etc.), hardware (m3.xlarge instance types and counts), security (key pair), and general settings. 3) Updating the security group for the master node to allow SSH access from your IP address.

Uploaded by

Weslei Brasil
Copyright
© All Rights Reserved
EMR Workshop Lab 0 - Cluster Creation
(Updated 27-Oct-16)

This lab demonstrates the steps involved in cluster creation.
 

Create VPC

In the AWS Management Console, click on VPC.
In the VPC Dashboard, choose Start VPC Wizard.
In Step 1: Select a VPC Configuration, choose VPC with a Single Public Subnet.
In Step 2: VPC with a Single Public Subnet, enter a VPC name. Keep the defaults on everything else.
Click Create VPC.
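
The wizard steps above can also be scripted. The following is a minimal sketch, not the lab's own method: it assumes a boto3 EC2 client is passed in, the function name create_public_vpc is hypothetical, and the two CIDR blocks are illustrative assumptions (the lab only says to keep the wizard defaults).

```python
import ipaddress


def create_public_vpc(ec2, name, vpc_cidr="10.0.0.0/16", subnet_cidr="10.0.0.0/24"):
    """Create a VPC with one public subnet, roughly mirroring the wizard.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The CIDR defaults are assumptions, not values taken from the lab.
    """
    # Fail early if the subnet range does not sit inside the VPC range.
    vpc_net = ipaddress.ip_network(vpc_cidr)
    if not ipaddress.ip_network(subnet_cidr).subnet_of(vpc_net):
        raise ValueError("subnet CIDR must be inside the VPC CIDR")

    vpc_id = ec2.create_vpc(CidrBlock=vpc_cidr)["Vpc"]["VpcId"]
    ec2.create_tags(Resources=[vpc_id], Tags=[{"Key": "Name", "Value": name}])

    subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock=subnet_cidr)
    subnet_id = subnet["Subnet"]["SubnetId"]

    # An internet gateway plus a default route is what makes the subnet public.
    igw = ec2.create_internet_gateway()
    igw_id = igw["InternetGateway"]["InternetGatewayId"]
    ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

    rtb_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
    ec2.create_route(
        RouteTableId=rtb_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id
    )
    ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=subnet_id)

    return {"vpc_id": vpc_id, "subnet_id": subnet_id}
```

The returned subnet ID is the value you would later pick as the EC2 Subnet when launching the cluster.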
 
 
EC2 key pair
Make sure you have an EC2 key pair in the region you are using.

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair
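
One way to confirm that a key pair exists in the region is to list the key pairs through the EC2 API. This is a small sketch under assumptions: the helper name has_key_pair is hypothetical, and `ec2` is assumed to be a boto3 EC2 client created for the region you plan to use.

```python
def has_key_pair(ec2, key_name):
    """Return True if a key pair called `key_name` exists in the client's region.

    `ec2` is expected to be a boto3 EC2 client, e.g.
    boto3.client("ec2", region_name="us-east-1")  # region is an assumption
    """
    # describe_key_pairs() with no arguments lists the key pairs in the region.
    pairs = ec2.describe_key_pairs().get("KeyPairs", [])
    return any(pair.get("KeyName") == key_name for pair in pairs)
```

Because key pairs are regional, the check only says something about the region the client was created for.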

 
Launch EMR Cluster
Open the Amazon EMR console at
https://console.aws.amazon.com/elasticmapreduce/

Click Create cluster.

In Create Cluster, click ‘Go to advanced options’.

Step 1: Software and Steps

Vendor                   Leave as ‘Amazon’
Release                  Leave as default
Software Configuration   Ensure that the following are checked:
                         • Hadoop
                         • Ganglia
                         • Hive
                         • Zeppelin
                         • Presto
                         • Tez
                         • Pig
                         • Hue
                         • Spark
Edit Software Settings   Leave as default
Add Steps                Leave as default

Click ‘Next’
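
For readers who prefer the API to the console, the software selection above corresponds to the Applications field of an EMR RunJobFlow request. The sketch below only builds that field; the names LAB_APPLICATIONS and applications_param are hypothetical. An API request would typically also need a release label, which the lab leaves at the console default, so none is shown here.

```python
# The nine applications the lab asks you to check in the console.
LAB_APPLICATIONS = [
    "Hadoop", "Ganglia", "Hive", "Zeppelin", "Presto",
    "Tez", "Pig", "Hue", "Spark",
]


def applications_param(names=LAB_APPLICATIONS):
    """Build the Applications list in the shape a RunJobFlow request uses."""
    return [{"Name": name} for name in names]


if __name__ == "__main__":
    for app in applications_param():
        print(app["Name"])
```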

 
Step 2: Hardware Config

Network      Choose the previously created VPC
EC2 Subnet   Choose the public subnet
Instances    Set the cluster instances and counts as follows:
             Master: m3.xlarge, count = 1
             Core:   m3.xlarge, count = 2
             Task:   m3.xlarge, count = 10

Click ‘Next’
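
The hardware choices above can be expressed as the instance-group part of a RunJobFlow request. This is a sketch, assuming the boto3 request shape; the helper name instances_param and the subnet ID in the usage line are placeholders, not values from the lab.

```python
def instances_param(subnet_id, instance_type="m3.xlarge"):
    """Build the instance-group settings for 1 master, 2 core and 10 task nodes."""
    # (display name, EMR instance role, node count) as listed in the lab.
    groups = [
        ("Master", "MASTER", 1),
        ("Core", "CORE", 2),
        ("Task", "TASK", 10),
    ]
    return {
        "Ec2SubnetId": subnet_id,  # the public subnet of the VPC created earlier
        "InstanceGroups": [
            {
                "Name": name,
                "InstanceRole": role,
                "InstanceType": instance_type,
                "InstanceCount": count,
            }
            for name, role, count in groups
        ],
    }


if __name__ == "__main__":
    config = instances_param("subnet-EXAMPLE")  # placeholder subnet ID
    total = sum(g["InstanceCount"] for g in config["InstanceGroups"])
    print("total instances:", total)
```

Adding up the counts shows that this lab launches 13 instances in total, which is worth keeping in mind for account limits and cost.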

Step 3: General Cluster Settings

Cluster Name             Name your cluster
Logging                  Leave checked; choose a bucket in this region
Debugging                Leave checked
Termination Protection   Leave checked
Tags                     Leave blank
EMRFS Consistent View    Leave unchecked
Bootstrap actions        Leave alone

Click ‘Next’
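
Some of the general settings above map directly onto RunJobFlow request fields. The sketch below covers only the cluster name, the log location and termination protection; the function name general_settings and the "emr-logs/" key prefix are hypothetical, and the Debugging and EMRFS options are deliberately left out because the lab does not describe them further.

```python
def general_settings(cluster_name, log_bucket):
    """Return request fields for the name, logging and termination protection.

    `log_bucket` is the bare name of an S3 bucket in the cluster's region.
    """
    if not cluster_name:
        raise ValueError("the cluster needs a name")
    return {
        "Name": cluster_name,
        # Logging checked: EMR writes cluster logs under this S3 location.
        # The "emr-logs/" prefix is an arbitrary choice for this sketch.
        "LogUri": f"s3://{log_bucket}/emr-logs/",
        # Termination Protection checked; in a real request this flag sits
        # inside the Instances structure next to the instance groups.
        "Instances": {"TerminationProtected": True},
    }
```

Tags are left blank in the lab, so the sketch sends no Tags field at all.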

 
Step 4: Security

EC2 Key Pair          Choose a key pair in the region
Cluster visible       Leave checked
Permissions           Choose Default
EC2 Security Groups   Leave as default
Encryption Options    Leave as default

Click ‘Create Cluster’.
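
As a rough API counterpart to the security choices above, the sketch below builds the matching RunJobFlow fields and shows how an assembled request could be submitted. It assumes that "Default" permissions correspond to the EMR default roles EMR_DefaultRole and EMR_EC2_DefaultRole already existing in the account; the function names are hypothetical, and security groups and encryption are omitted so that the service defaults apply.

```python
def security_request_fields(key_name):
    """Return the RunJobFlow fields that correspond to the Step 4 choices."""
    return {
        # The key pair is set inside the Instances structure.
        "Instances": {"Ec2KeyName": key_name},
        # "Cluster visible" left checked.
        "VisibleToAllUsers": True,
        # Assumed to be what "Default" permissions means: the default EMR roles.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(emr, request):
    """Submit a fully assembled request and return the new cluster's ID.

    `emr` is expected to be a boto3 EMR client, e.g. boto3.client("emr").
    `request` must already combine every field the API requires.
    """
    return emr.run_job_flow(**request)["JobFlowId"]
```

The returned ID identifies the cluster whose details page is used in the next section.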

 
Update Security Group
• In the details page for your cluster, scroll down and click on the security group shown for ‘Security Group for Master’.
• Click on the security group for ‘ElasticMapReduce-master’.
• Click on the ‘Inbound’ tab.
• Click the ‘Edit’ button.
• Click the Add Rule button.
• Add a rule that allows SSH from your IP address.
• Click Save.

This will allow you to SSH into the cluster once it comes up, in about 10-15 minutes.
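
The inbound rule described above amounts to opening TCP port 22 to a single address. The sketch below builds such a rule and applies it with a boto3 EC2 client; the function names are hypothetical, the /32 range reflects the "your IP address only" intent, and the sketch handles IPv4 addresses only.

```python
import ipaddress


def ssh_ingress_permission(my_ip):
    """Build an inbound rule allowing SSH (TCP port 22) from one IPv4 address."""
    address = ipaddress.ip_address(my_ip)  # raises ValueError if malformed
    if address.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        # /32 restricts the rule to exactly this address.
        "IpRanges": [{"CidrIp": f"{address}/32"}],
    }


def allow_ssh(ec2, group_id, my_ip):
    """Add the rule to a security group, e.g. the ElasticMapReduce-master group.

    `ec2` is expected to be a boto3 EC2 client.
    """
    ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=[ssh_ingress_permission(my_ip)]
    )
```

Once the cluster is running, the connection itself would usually look like `ssh -i <your key file> hadoop@<master public DNS>`, where both placeholders depend on your own key pair and cluster.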

 
 
 
