AWS Training Partner
Big Data on AWS
AWS-BIGD
In this course, you'll learn about cloud-based big data solutions like Amazon EMR, Amazon
Redshift, Amazon Kinesis, and the rest of the AWS big data platform. Learn to use Amazon EMR
to process data using the broad ecosystem of Hadoop tools like Hive and Hue, create big data
environments, work with Amazon DynamoDB, Amazon Redshift, Amazon QuickSight, Amazon
Athena and Amazon Kinesis, and design big data environments for security and
cost-effectiveness.
Course objectives
- Use Apache Hadoop with Amazon EMR
- Launch and configure an Amazon EMR cluster
- Use common programming frameworks for Amazon EMR,
  including Hive, Pig, and Streaming
- Use Hue to improve the ease-of-use of Amazon EMR
- Use in-memory analytics with Spark on Amazon EMR
- Understand how services like AWS Glue, Amazon Kinesis,
  Amazon Redshift, Amazon Athena, and Amazon
  QuickSight can be used with big data workloads
Audience
This course is intended for:
- Individuals responsible for designing and implementing
  big data solutions, namely Solutions Architects and SysOps
  Administrators
- Data Scientists and Data Analysts interested in learning
  about big data solutions on AWS
We recommend that attendees of this course have:
- Basic familiarity with big data technologies, including Apache
  Hadoop, HDFS, and SQL/NoSQL querying
- Completed the Data Analytics Fundamentals free digital training
  or equivalent experience
- Working knowledge of core AWS services and public cloud
  implementation
- Completed the AWS Technical Essentials classroom training or
  equivalent experience
- Basic understanding of data warehousing, relational database
  systems, and database design
Course outline
Day 1
Module 1: Overview of Big Data
- What is big data
- The big data pipeline
- Big data architectural principles
Module 2: Big Data ingestion and transfer
- Overview: Data ingestion
- Transferring data
Module 3: Big data streaming and Amazon Kinesis
- Stream processing of big data
- Amazon Kinesis
- Amazon Kinesis Data Firehose
- Amazon Kinesis Video Streams
- Amazon Kinesis Data Analytics
- Hands-on lab 1: Streaming and Processing Apache Server Logs Using Amazon Kinesis
Module 4: Big data storage solutions
- AWS data storage options
- Storage solutions concepts
- Factors in choosing a data store
Module 5: Big data processing and analytics
- Big data processing and analytics
- Amazon Athena
- Hands-on lab 2: Using Amazon Athena to Analyze Log Data
Day 2
Module 6: Apache Hadoop and Amazon EMR
- Introduction to Amazon EMR and Apache Hadoop
- Best practices for ingesting data
- Amazon EMR
- Amazon EMR architecture
- Hands-on lab 3: Storing and Querying Data on Amazon DynamoDB
Module 7: Using Amazon EMR
- Developing and running your application
- Launching your cluster
- Handling output from your completed jobs
Module 8: Hadoop programming frameworks
- Hadoop frameworks
- Other frameworks for use on Amazon EMR
- Hands-on lab 4: Processing Server Logs with Hive on Amazon EMR
Module 9: Web interfaces on Amazon EMR
- Hue on Amazon EMR
- Monitoring your cluster
- Hands-on lab 5: Running Pig Scripts in Hue on Amazon EMR
Module 10: Apache Spark on Amazon EMR
- Apache Spark
- Using Spark
- Hands-on lab 6: Processing NY Taxi Data Using Apache Spark
Day 3
Module 11: Using AWS Glue to automate ETL workloads
- What is AWS Glue?
- AWS Glue: Job orchestration
Module 12: Amazon Redshift and big data
- Data warehouses vs. traditional databases
- Amazon Redshift
- Amazon Redshift architecture
Module 13: Securing your Amazon deployments
- Securing your Amazon deployments
- Amazon EMR security overview
- AWS Identity and Access Management (IAM) overview
- Securing data
- Amazon Kinesis security overview
- Amazon DynamoDB security overview
- Amazon Redshift security overview
Module 14: Managing big data costs
- Total cost considerations for Amazon EMR
- Amazon EC2 pricing models
- Amazon Kinesis pricing models
- Cost considerations for Amazon DynamoDB
- Cost considerations and pricing models for Amazon Redshift
- Optimizing cost with AWS
Module 15: Visualizing and orchestrating big data
- Visualizing big data
- Amazon QuickSight
- Orchestrating a big data workflow
- Hands-on lab 7: Using TIBCO Spotfire to visualize data
Module 16: Big data design patterns
- Common architectures
Module 17: Course wrap-up
- What's next?
Because the vendor continually updates course content, the contents of this syllabus may
differ from those published on the official site; however, Netec will always deliver the
updated version.