Big Data Technology


Bridge Course in BIG DATA Technologies

A programme to create an up-skilling/re-skilling ecosystem in Big Data Technologies, facilitating continuous enhancement of the skills and knowledge of IT professionals in line with their aspirations and aptitude.

Instructions for Bridge Course:

Course Overview:

1. Total course duration: 110 hours over 60 days, comprising:

Theory: 35 hours

Lab: 45 hours

Project: 15 hours

Offline: 15 hours

2. A total of 15 hours of sessions with faculty will be conducted for discussions on:

• Case study for each module

• Latest industry trends

• Assignments and clearing of concepts

3. A dedicated 15 hours is allotted for project work, and the candidate will give an online demo of the project.

4. The modules in the course will be enabled based on the candidate’s progress, i.e., the next module becomes accessible only when the previous module is completed.

5. Doubt-clearing sessions (two modes):

• Online meeting: Monday and Thursday, 4 PM to 5 PM (working days).

• Mail: responded to by the CDAC Noida team within 24 hours (working days).

6. Eligibility Criteria to Qualify for Assessment:

• The candidate should complete at least 80% of the course to be eligible for the assessment.

• Candidates should attempt the quiz and assignment after each module, with a 60% passing rate.

For contact: Tushar Patnaik: [email protected]

About the Course:

The training programme in Big Data Analytics is designed to develop multi-disciplinary skill sets in Big Data Technologies, enhance technical acumen in policy- and decision-making roles, and steer big projects efficiently with speed, scale and agility across sectors. The course includes working with the advanced Hadoop framework, the MapReduce programming technique, Hive, predictive analytics using Python, visualization, statistical techniques and more, along with practical and real-time case studies on Cloudera. Participants who successfully complete the training programme will be provided with a certification.

The programme runs in online/blended mode, which is its salient feature, and aims to facilitate the continuous enhancement of skills, create awareness amongst IT professionals about the impact of the evolving technological ecosystem, and train them with relevant skill sets.

Applications:
With most business transactions and customer conversations happening online, a huge amount of data is generated every day. This data holds the key to improving business processes and identifying new innovations. Big Data is a collection of data that is huge in volume and growing exponentially with time; its size and complexity are so great that traditional data management tools cannot store or process it efficiently. Big Data techniques help classify, sort and analyze data to find hidden insights. Completing a course in Big Data will equip you with the knowledge and expertise to analyze Big Data. Big Data has found many applications today in fields such as banking, agriculture, chemistry, data mining, cloud computing, finance, marketing, healthcare, stock exchanges, social media sites and jet engines.

Outcomes:
After completing this course, candidates shall be experts in:

• Making data-driven decisions

• Data analytics and predictive analytics

• Being industry-ready, with an edge over their peers

Studying Big Data will broaden their horizons, helping them surpass market forecasts and uncover new opportunities.

Course USP:

• Hadoop HDFS and YARN

• MapReduce framework and Pig

• Apache Hive

• NoSQL and MongoDB

• Tableau

• Spark and Scala operations

• Python

• Data Science and Data Analytics using Python

What will you get?

• Joint certification from MeitY, Govt. of India and NASSCOM.

• Highly qualified faculty and experts from academia and industry, who will impart knowledge by sharing their expertise and experience in Big Data Technologies.

• A ‘Skills Passport’ concept is introduced for learners, built up during their re-skilling/up-skilling journey with incentives and badges. A learner who successfully completes the course is awarded a badge, which is added to their Skills Passport; this acts as a repository for their future employability.

• The badges and certificates earned can also be shared with peer networks such as LinkedIn, adding further value for learners.

Job Profile:

Business Analyst - Provides end-to-end solutions to the client

Solution Architect - Converts business solutions into technical requirements

Data Integrator - Combines data from different sources to present it collectively

Data Architect - Handles collection, storage, transfer and extraction of data

Data Analyst - Uses software to analyze data and develop insights

Data Scientist - Uses technology to analyze data for decision-making

Career opportunities in Big Data are numerous, and identifying which is most suitable for a given individual depends on interests, career path, skills and abilities. Some well-known Big Data career paths are Data Analyst, Data Scientist, Big Data Engineer, Data Modeler, Solution Architect and many more. With the rising demand that industries are witnessing, it is an ideal time to add Big Data skills to your curriculum vitae and give yourself wings in the job market, with ample Big Data jobs available today.

Centre Preference for Offline / Online mentoring

Course Introduction

Big Data and Hadoop


Unit 1.1.1 Introduction to Big Data Technologies

Unit 1.1.2 Introduction to Big Data and Big Data Challenges

Unit 1.1.3 Big Data Applications

Unit 1.1.4 Types of Big Data Technologies

Unit 1.2.1 Introduction to Hadoop

Unit 1.2.2 Limitations and Solutions to Big Data Architecture

Unit 1.2.3 What is Hadoop

Unit 1.2.4 Hadoop Ecosystem

Unit 1.2.5 Hadoop Features

Unit 1.2.6 Hadoop Core Components Part-1

Unit 1.2.7 Hadoop Core Components Part-2

Unit 1.2.8 HDFS Architecture

Unit 1.2.9 HDFS Daemons Properties

Unit 1.2.10 Replication and Rack Awareness


Unit 1.2.11 Rack Awareness Example

Unit 1.2.12 HDFS Read Write Operation Part-1

Unit 1.3.1 Hadoop Architecture and HDFS

Unit 1.3.2 Hadoop 2.x Cluster Architecture

Unit 1.3.3 Federation and High Availability Architecture

Unit 1.3.4 Hadoop Cluster Modes

Unit 1.3.5.1 Common Hadoop Shell Commands Part-I

Unit 1.3.5.2 Common Hadoop Shell Commands Part-II

Unit 1.3.6.1 Common Hadoop Shell Commands Part-III

Unit 1.3.6.2 Common Hadoop Shell Commands Part-IV

Unit 1.3.7 Hadoop 2.x Configuration Files

Unit 1.3.10 Basic Hadoop Administration
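The HDFS shell commands covered in Units 1.3.5-1.3.6 can also be scripted. Below is a minimal Python sketch that wraps a few common hdfs dfs commands with subprocess; it assumes a working Hadoop installation with hdfs on the PATH, and the paths used are purely illustrative.

    import subprocess

    def hdfs(*args):
        """Run an 'hdfs dfs' command and return its output (assumes 'hdfs' is on the PATH)."""
        result = subprocess.run(["hdfs", "dfs", *args],
                                capture_output=True, text=True, check=True)
        return result.stdout

    # Illustrative paths -- adjust to your own cluster layout.
    hdfs("-mkdir", "-p", "/user/demo/input")                 # create a directory
    hdfs("-put", "localfile.txt", "/user/demo/input")        # copy a local file into HDFS
    print(hdfs("-ls", "/user/demo/input"))                   # list the directory
    print(hdfs("-cat", "/user/demo/input/localfile.txt"))    # print file contents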

Unit 1.4.1 Hadoop Map Reduce Framework

Unit 1.4.2 Why Map Reduce

Unit 1.4.3 YARN Components and Architecture

Unit 1.4.4 YARN MapReduce Application Execution Flow

Unit 1.4.5 Anatomy of Map Reduce Program

Unit 1.4.6 Input Splits, Relation between Input Splits and HDFS Blocks

Unit 1.4.7 Map Reduce Combiner and Partitioner

Unit 1.4.8 Demo of Map Reduce
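To complement the MapReduce demo above, here is a minimal word-count example written for Hadoop Streaming, which allows plain Python scripts to act as the mapper and reducer; the file names mapper.py and reducer.py are illustrative.

    # mapper.py -- emits "<word>\t1" for every word read from standard input
    import sys

    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

    # reducer.py -- sums the counts for each word (Hadoop sorts mapper output by key)
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")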

Unit 1.5.1 Advanced Hadoop MapReduce

Unit 1.5.2 Counters

Unit 1.5.3 Distributed Cache

Unit 1.5.4 Reduce Join

Unit 1.5.5 Custom Input Format

Unit 1.5.6 Output Formats

Unit 1.5.7 Sequence Input Format

Quiz on Hadoop Eco system

Unit 1.6.1 Apache PIG Learning Objectives

Unit 1.6.2 Introduction to Apache Pig

Unit 1.6.3 Map Reduce vs Pig

Unit 1.6.4 Pig Components and Execution

Unit 1.6.5.1 Pig Latin Programs Fundamentals Part-I

Unit 1.6.5.2 Pig Latin Programs Fundamentals Part-II

Unit 1.7.1 Pig built in functions Learning Objectives

Unit 1.7.2 Introduction to Pig Built In Functions

Unit 1.7.3 Shell and Utility Commands

Unit 1.7.4.1 Pig Latin Commands Part-I

Unit 1.7.4.2 Pig Latin Commands Part-I (Continued)

Unit 1.7.5.1 Pig Latin Commands Part-II

Unit 1.7.5.2 Pig Latin Commands Part-II (Continued)

Unit 1.7.6.1 PIG UDF


Unit 1.7.6.2 PIG UDF PRACTICAL

Unit 1.7.7 Analytics using PIG

Unit 1.8 Apache Hive Learning Objectives

Unit 1.8.1 Introduction to Apache Hive

Unit 1.8.2 Hive vs Pig

Unit 1.8.3 Hive Architecture and Components

Unit 1.8.4 Limitations of Hive

Unit 1.8.5 Hive Data Types and Data Models

Unit 1.9.1 Hive Tables (Managed Tables and External Tables)

Unit 1.9.2 Hive Tables Practical Part-I

Unit 1.9.3 Hive Tables Practical Part-II

Unit 1.9.4 Importing Data Querying Data and Managing Outputs

Unit 1.9.5 Importing Data Querying Data and Managing Outputs(Practical)

Unit 1.9.6 Hive Partition

Unit 1.9.7 Hive Partition Practical

Unit 1.9.10 Hive Bucketing
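A brief illustration of the Hive concepts above, queried from Python. This is a sketch only: it assumes the PyHive package and a reachable HiveServer2, and the connection details, table name and HDFS path are placeholders.

    from pyhive import hive  # assumed: PyHive installed and HiveServer2 running

    # Placeholder connection details -- adjust for your own cluster.
    conn = hive.Connection(host="localhost", port=10000, database="default")
    cur = conn.cursor()

    # An external table (Unit 1.9.1); the HiveQL is passed as a plain string.
    cur.execute("""
        CREATE EXTERNAL TABLE IF NOT EXISTS sales (id INT, amount DOUBLE)
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION '/user/demo/sales'
    """)

    # Query the table and fetch the results.
    cur.execute("SELECT id, amount FROM sales LIMIT 10")
    for row in cur.fetchall():
        print(row)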

Unit 1.11.1 Overview of RDBMS

Unit 1.11.2 Limitations of RDBMS

Unit 1.11.3 Introduction to NoSQL

Unit 1.11.4 NoSQL in Brief Part 1

Unit 1.11.5 NoSQL in Brief Part 2

Unit 1.11.6 SQL and NoSQL A Comparative Study

Unit 1.11.7 Categories of NoSQL Databases

Unit 1.12.1 Introduction to MongoDB

Unit 1.12.2 Installation Guidelines of MongoDB

Unit 1.12.3 Components of MongoDB

Unit 1.12.4 The MongoDB Data Model

Unit 1.12.5 JSON and MongoDB

Unit 1.13.1. Define Database in MongoDB

Unit 1.13.2. Define a Collection in MongoDB

Unit 1.13.3. CRUD Operation in MongoDB- (Create)-Part-1

Unit 1.13.4. CRUD Operation in MongoDB- (Create)-Part-2

Unit 1.13.5. CRUD Operation in MongoDB- (Read)

Unit 1.13.6. CRUD Operation in MongoDB- (Update)

Unit 1.13.7. CRUD Operation in MongoDB- (Update)-Part-2

Unit 1.13.8. CRUD Operation in MongoDB- (Delete)
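A minimal sketch of the MongoDB CRUD operations above using the pymongo driver; it assumes a mongod instance on localhost, and the database, collection and documents are illustrative.

    from pymongo import MongoClient  # assumed: pymongo installed, mongod running locally

    client = MongoClient("mongodb://localhost:27017")
    db = client["school"]            # database (Unit 1.13.1)
    students = db["students"]        # collection (Unit 1.13.2)

    # Create
    students.insert_one({"name": "Asha", "marks": 82})
    students.insert_many([{"name": "Ravi", "marks": 67}, {"name": "Meena", "marks": 91}])

    # Read
    print(students.find_one({"name": "Asha"}))

    # Update
    students.update_one({"name": "Ravi"}, {"$set": {"marks": 72}})

    # Delete
    students.delete_one({"name": "Meena"})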

Unit 1.14.1 The find() in MongoDB

Unit 1.14.2 Key projection in find()

Unit 1.14.3 Querying in find()

Unit 1.14.4 Querying in find() part 2

Unit 1.14.5 Querying in find() Part 3


Unit 1.14.6 Find method with limit and skip

Unit 1.14.7 Mongoimport in MongoDB

Unit 1.14.8 Mongoexport
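The find() features listed above (query documents, key projection, limit and skip) in a short pymongo sketch; the collection and field names are illustrative.

    from pymongo import MongoClient  # assumed: pymongo installed, mongod running locally

    students = MongoClient("mongodb://localhost:27017")["school"]["students"]

    # find() with a query document and a key projection, then skip and limit
    cursor = (students.find({"marks": {"$gte": 60}},            # query
                            {"name": 1, "marks": 1, "_id": 0})  # projection
                      .sort("marks", -1)                        # highest marks first
                      .skip(0)
                      .limit(5))
    for doc in cursor:
        print(doc)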

Unit 1.15.1 Aggregation in MongoDB

Unit 1.15.2 Project stage operator in aggregation part 1

Unit 1.15.3 Project stage operator in aggregation part 2

Unit 1.15.4 Group stage operator in MongoDB part 1

Unit 1.15.5 Group stage operator in MongoDB part 2

Unit 1.15.6 Match stage operator in MongoDB

Unit 1.15.7 Indexing in MongoDB
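A compact pymongo sketch of the aggregation stages and indexing covered above; field names such as grade and marks are illustrative.

    from pymongo import MongoClient, ASCENDING  # assumed: pymongo installed, mongod running locally

    students = MongoClient("mongodb://localhost:27017")["school"]["students"]

    # Aggregation pipeline with $match, $group and $project stages
    pipeline = [
        {"$match": {"marks": {"$gte": 40}}},                             # filter documents
        {"$group": {"_id": "$grade", "avg_marks": {"$avg": "$marks"}}},  # group by grade
        {"$project": {"grade": "$_id", "avg_marks": 1, "_id": 0}},       # reshape the output
    ]
    for doc in students.aggregate(pipeline):
        print(doc)

    # Indexing: a single-field index on 'name' to speed up queries on that key
    students.create_index([("name", ASCENDING)])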

Contents on Big Data and Hadoop

Quiz on Hadoop

Restricted: Not available unless you achieve a required score in the Quiz on Hadoop Eco system.

Click here to join the virtual lab

Working with Spark


Unit 2.1.0 Learning Objective of Introduction to Apache Spark

Unit 2.1.1 Introduction to Spark

Unit 2.1.2 Introduction to Apache Spark

Unit 2.1.3 Spark Components and Architecture

Unit 2.1.4 Spark deployment Modes

Unit 2.1.5 Learning Objective of Introduction to Spark

Unit 2.1.6 Introduction to Apache Spark animated

Unit 2.1.7 Spark Components and its Architecture animated

Unit 2.1.8 Spark Deployment Modes animated

Unit 2.1.9 Introduction to Spark

Unit 2.1.10 Hands on Spark 1

Unit 2.1.11 Hands on Spark 2

Unit 2.1.12 Hands on Spark 3
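The course demonstrates Spark with Scala; the sketch below shows the same ideas in PySpark (assuming the pyspark package is installed), with a classic RDD word count and a small DataFrame operation.

    from pyspark.sql import SparkSession  # assumed: pyspark installed

    spark = SparkSession.builder.appName("spark-handson-sketch").getOrCreate()
    sc = spark.sparkContext

    # RDD word count
    lines = sc.parallelize(["big data on spark", "spark runs on yarn"])
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    print(counts.collect())

    # A tiny DataFrame operation
    df = spark.createDataFrame([("Asha", 82), ("Ravi", 67)], ["name", "marks"])
    df.filter(df.marks > 70).show()

    spark.stop()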

Data Science
Unit 3.1.1 Introduction to NumPy

Unit 3.1.2 Installation of NumPy

Unit 3.1.3 Import NumPy

Unit 3.1.4 Creation of array (Part 1)

Unit 3.1.5 Creation of Array ( Part 2)

Unit 3.1.6. Installation Guideline of Anaconda

Unit 3.1.7 Hands on NumPy
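A short NumPy sketch covering the import and array-creation topics above; the values are illustrative.

    import numpy as np  # Unit 3.1.3: import NumPy

    # Creating arrays (Units 3.1.4 and 3.1.5)
    a = np.array([1, 2, 3, 4])          # from a Python list
    b = np.arange(0, 12).reshape(3, 4)  # a 3x4 matrix of the values 0..11
    c = np.zeros((2, 3))                # a 2x3 array of zeros
    d = np.linspace(0, 1, 5)            # 5 evenly spaced values between 0 and 1

    print(a.shape, b.shape, c.dtype, d)
    print(b.sum(axis=0))                # column sums -- vectorised arithmetic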

Unit 3.2.1 Introduction to Pandas

Unit 3.2.2 More on Pandas


Unit 3.2.3 Hands on Pandas

Unit 3.2.4 Hands on Handling Missing Data

Unit 3.2.5 Hands on Data Preprocessing
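A minimal pandas sketch of handling missing data and simple preprocessing, matching the hands-on units above; the data frame is purely illustrative.

    import pandas as pd  # assumed: pandas installed

    df = pd.DataFrame({"age": [25, None, 31, 40],
                       "city": ["Delhi", "Pune", None, "Noida"]})

    print(df.isna().sum())                          # count missing values per column
    df["age"] = df["age"].fillna(df["age"].mean())  # impute the numeric column with its mean
    df = df.dropna(subset=["city"])                 # drop rows still missing a city

    # Simple preprocessing: one-hot encode the categorical column
    df = pd.get_dummies(df, columns=["city"])
    print(df)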

Unit 3.3.1 Introduction to Data Visualization in Python

Unit 3.3.2 Hands on Data Visualization through Matplotlib
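A small Matplotlib sketch in the spirit of the visualization units above; the figures plotted are illustrative.

    import matplotlib.pyplot as plt  # assumed: matplotlib installed

    months = ["Jan", "Feb", "Mar", "Apr"]
    sales = [120, 135, 150, 160]     # illustrative values

    plt.figure(figsize=(6, 4))
    plt.plot(months, sales, marker="o", label="sales")
    plt.xlabel("Month")
    plt.ylabel("Sales")
    plt.title("Hands-on data visualization sketch")
    plt.legend()
    plt.show()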

Unit 3.4.1 Introduction to Machine learning

Unit 3.4.2.1 Types of Learning

Unit 3.4.2.2 Types of Machine learning Algorithm

Unit 3.4.3 Uses of machine learning Algorithm

Unit 3.4.4 Introduction to data Science

Unit 3.4.5 Predictive Analytics

Unit 3.4.6 Data preprocessing

Unit 3.5.1 Unsupervised Learning

Unit 3.5.2 Clustering

Unit 3.5.3 K-means Clustering Algorithm

Unit 3.5.4 Working Example of K-Means

Unit 3.5.5 Practical Implementation of K-Means

Unit 3.5.6 Implementation of K-means Clustering Algorithm

Unit 3.5.7 Determine Optimal Value of K

Unit 3.5.8 Elbow Method Implementation
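A compact scikit-learn sketch of K-means and the elbow method described above, using synthetic data (scikit-learn is assumed to be installed).

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    # Synthetic data with 3 natural clusters (illustrative only)
    X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

    # Fit K-means for a chosen k
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
    print(kmeans.cluster_centers_)

    # Elbow method: inertia (within-cluster sum of squares) for several values of k;
    # the "elbow" in this curve suggests a good choice of k.
    inertias = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
                for k in range(1, 10)]
    print(inertias)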

Unit 3.5.9.1 Hierarchical Clustering

Unit 3.5.9.2 Hierarchical Clustering Example

Unit 3.5.10 Linkages in Hierarchical Clustering

Unit 3.5.11 Implementing Agglomerative Clustering

Unit 3.5.12 Measuring Quality of Clusters
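A matching sketch for hierarchical (agglomerative) clustering and one way to measure cluster quality, again with scikit-learn and synthetic data.

    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score  # one measure of cluster quality

    X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

    # Agglomerative (bottom-up) clustering; 'linkage' chooses how cluster distances
    # are computed ("ward", "complete", "average" or "single").
    model = AgglomerativeClustering(n_clusters=3, linkage="ward")
    labels = model.fit_predict(X)

    print(silhouette_score(X, labels))  # closer to 1 means better-separated clusters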

Unit 3.5.13 Dimensionality Reduction

Unit 3.5.14 Principal Component Analysis

Unit 3.5.15 Steps Used in Calculating PCA

Unit 3.5.16 Implementation of PCA
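A short PCA sketch reflecting the steps above (standardise, then project); it uses scikit-learn's bundled Iris data for illustration.

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    X = load_iris().data                      # 4 features per sample

    # Standardise the features, then project onto the top 2 principal components
    X_scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X_scaled)

    print(X_reduced.shape)                    # (150, 2)
    print(pca.explained_variance_ratio_)      # variance captured by each component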

Unit 3.7.1 Linear Regression

Unit 3.7.2 Implementation of Linear Regression
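A minimal linear-regression sketch with scikit-learn, fitted on synthetic data whose true relationship is roughly y = 3x + 2.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(50, 1))
    y = 3 * X.ravel() + 2 + rng.normal(0, 1, size=50)   # y = 3x + 2 plus noise

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)   # should come out close to 3 and 2
    print(model.predict([[5.0]]))          # prediction for x = 5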

Unit 3.7.3 Classification

Unit 3.7.4.1 Decision Trees

Unit 3.7.4.2 Decision Tree Classifier

Unit 3.7.5 Specify Test Condition in Decision Tree

Unit 3.7.6.1 Determine Best Split Part 1

Unit 3.7.6.2 Determine the Best Split Part 2

Unit 3.7.7 Stopping Condition

Unit 3.7.8 Implementing DT Algorithm

Unit 3.7.9 Overfitting Underfitting

Unit 3.7.10 Controlling Overfitting
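A decision-tree sketch that also shows one simple way to control overfitting (limiting tree depth and leaf size), using scikit-learn and its bundled Iris data.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # 'criterion' selects the split measure (gini or entropy); capping max_depth and
    # min_samples_leaf is a simple way to control overfitting.
    tree = DecisionTreeClassifier(criterion="gini", max_depth=3, min_samples_leaf=5,
                                  random_state=0)
    tree.fit(X_train, y_train)
    print("train accuracy:", tree.score(X_train, y_train))
    print("test accuracy:", tree.score(X_test, y_test))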

Unit 3.8.1 Naive Bayes Classifier


Unit 3.8.2 Naive Bayes Example

Unit 3.8.3 Types of Naive Bayes Classifier

Unit 3.8.4 Implementation of Naive Bayes Classifier

Unit 3.8.5 K-Nearest Neighbour Classifier

Unit 3.8.6 KNN Implementation
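A short scikit-learn sketch of the two classifiers above, Gaussian Naive Bayes and K-nearest neighbours, evaluated on the bundled Iris data.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Gaussian Naive Bayes -- the variant suited to continuous features
    nb = GaussianNB().fit(X_train, y_train)
    print("Naive Bayes accuracy:", nb.score(X_test, y_test))

    # K-nearest neighbours with k = 5
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    print("KNN accuracy:", knn.score(X_test, y_test))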

10. Clustering: Unsupervised Learning

11. Hands on K-means

12. Hands on Agglomerative

13. Naive Bayes Algorithm

14. Hands on Naive Bayes Algorithm

15. Regression Analysis

16. Hands on Linear Regression

17. Hands on Multiple Linear Regression

Quiz on Data Science

Restricted: Not available unless you achieve a required score in the Quiz on Hadoop and the Quiz on Hadoop Eco system.

Quiz on Big Data

Restricted: Not available unless you achieve a required score in the Quiz on Data Science, the Quiz on Hadoop Eco system and the Quiz on Hadoop.
