Data Science Portfolio
• 85000+ Candidates / Engineers Trained
• 11600+ OEM Training Days, 7400+ Operator Training Days
• 2100+ Training Sessions
• 94% Excellent Review on Trainers
• 15+ Countries
Company Overview
Why Us?
Updated, High-Quality Course Materials
Our portfolio of courses is developed utilizing the latest learning and delivery techniques to meet the needs of corporates seeking a broad understanding.
Experienced and Engaging Instructors
Our customized programs are practical, our content is relevant and our trainers are engaging.
One Stop Shop
The learning solutions delivered by us have proven successful for telecommunication equipment manufacturers, service providers, educational institutions and management consulting firms.
What is Data Science?
Wikipedia says “Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data”.
Today, professionals in the field of data clearly understand that they must go above and beyond the traditional skills of data analysis, data mining, and programming. In order to uncover useful information for their companies, these professionals must master the complete life cycle of data science.
List of Courses
This portfolio showcases different courses under the umbrella of Data Science that will help your employees and
clients understand the basics of this domain and also take a deep dive into the world of Data Science. Our courses
are curated by experts in this field and can be further customized to cater to your needs.
Contents
• Data Science in Python - Basic
• Data Science in Python - Advanced
• Data Science in R - Basic to Intermediate
• Data Science in R - Advanced
• Hadoop 2.0 Internals, Ecosystem and Analytics
• HBase
• Modern Natural Language Processing with Deep Learning
• Spark Internals
• Spark 2 Internals & Performance Tuning
• MongoDB
Note
The following course outlines are not final; instead, they give an overview of what these courses have to offer. The actual course contents are more detailed and can be customized to your expectations.
Data Science in Python – Basic
04 Days
This program is designed to give participants a strong flavor of using Python for data management and analytics. We will use the Anaconda distribution of Python 3 for the hands-on parts of the training. The program has no programming prerequisites and focuses exclusively on data manipulation and visualization. We will use 3 – 4 datasets (Excel sheets) extensively to understand how Python 3 can be used to process financial data.
Course Objectives
Course Content
Day 1 Day 2
Day 3 Day 4
Data Management in Python
• NumPy Arrays
• Indexing
• Processing Arrays
• Introduction to Pandas
• Merge
• Combine
• Reshape
• Pivot
• Groupby
• Aggregate
• Cross-tabs
Visualization in Python
• Line Plot
• Scatter Plot
• Histogram
• Customizations
• Plotting 2D Arrays
• Statistical Plots using Seaborn
• Dealing with time series
• Introduction to Interactive plots with Bokeh
• Introduction to Interactive plots with Plotly
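To give a feel for the data management topics listed above (merge, groupby, aggregate, pivot, cross-tabs), here is a minimal pandas sketch. The tiny dataset, its column names and its values are made up for illustration and are not the course datasets.

```python
import pandas as pd

# Hypothetical toy data (not from the course material)
trades = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB", "BBB", "CCC"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1"],
    "revenue": [10.0, 12.0, 7.0, 9.0, 4.0],
})
sectors = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],
    "sector": ["Tech", "Tech", "Energy"],
})

# Merge: attach the sector to every revenue row
merged = trades.merge(sectors, on="ticker", how="left")

# Groupby + aggregate: total revenue per sector
by_sector = merged.groupby("sector")["revenue"].sum()

# Pivot: one row per ticker, one column per quarter
pivot = merged.pivot_table(index="ticker", columns="quarter",
                           values="revenue", aggfunc="sum")

# Cross-tab: number of rows per sector and quarter
counts = pd.crosstab(merged["sector"], merged["quarter"])

print(by_sector)
print(pivot)
print(counts)
```

In the pivot, a ticker with no row for a quarter (here the hypothetical "CCC" in Q2) appears as a missing value rather than zero.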
Data Science in Python - Advanced
05 Days
This program will expose the participants to various techniques in predictive modelling and machine learning using
Python. The program will start with the application of statistics using Excel and then proceed into different aspects
of Machine Learning. Upon completion, the participants will be comfortable with topics in Supervised,
Unsupervised and Reinforcement Learning.
Course Objectives
Course Content
Day 4: Unsupervised Learning and Text Mining
• Clustering
• PCA
• Text Processing
• Text Mining models
• Artificial Neural Networks
• Support Vector Machine
Day 5: Deep Learning
• Introduction to Deep Learning
• CNN
• RNN
• SOM
• Auto Encoders
• LSTM
• GAN
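As one example of the unsupervised learning topics above, PCA can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper name `pca` and the synthetic data are hypothetical, and the course itself may use a library implementation instead.

```python
import numpy as np

def pca(X, n_components):
    """Project X (rows = samples) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each direction (sample variance, n - 1 denominator)
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    scores = X_centered @ components.T
    return scores, components, explained_variance[:n_components]

# Synthetic data: points lying very close to the line y = 2x
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + 0.01 * rng.normal(size=200)])

scores, components, variance = pca(X, n_components=1)
print(components)  # first principal direction, roughly proportional to (1, 2)
```

Because the two columns are almost perfectly correlated, a single component captures nearly all of the variance, which is the situation where dimensionality reduction pays off.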
Data Science in R – Basic to Intermediate
03 Days
The Data Science with R course has been designed to give you in-depth knowledge of the various data analytics techniques that can be performed using R. The course is packed with real-life projects and case studies, includes an RStudio lab for practice, and provides an in-depth understanding of the R language, RStudio, and R packages. You will learn the various types of apply functions and the dplyr package, gain an understanding of data structures in R, and perform data visualizations using the various graphics available in R.
Course Objectives
Course Content
Day 1
Module 01 - Introduction to R
Module 02 – R Programming
Module 03 – R Data Structure
Day 2
Module 04 - Apply Functions
• Introduction
• Objectives
• Types of Apply Functions
• apply() Function
• lapply() Function
• sapply() Function
• tapply() Function
• vapply() Function
• mapply() Function
Module 05 - Dplyr Package - An Overview
• Dplyr Package - The Five Verbs
• Installing the Dplyr Package
• Use the Select Function
• Functions of Dplyr Package - filter()
• Functions of Dplyr Package - arrange()
• Use Arrange Function
• Functions of Dplyr Package - mutate()
• Functions of Dplyr Package - summarise()
• Use Summarise Function
• Quiz
• Summary
Module 06 - Group Projects on Financial Data / Sales Data
• Project Discussion
• Action Plan to Implement
Day 3
Module 07 – Data Visualization
• Introduction
• Objectives
• Graphics in R
• Types of Graphics
• Bar Charts
• Pie Charts
• Histograms
• Kernel Density Plots
• Line Charts
• Box Plots
• Heat Maps
• Word Clouds
• File Formats for Graphic Outputs
• Saving a Graphic Output as a File
• Exporting Graphs in RStudio
• Exporting Graphs as PDFs in RStudio
• Demo - Save Graphics Using RStudio
Module 08 - Decision Trees and Random Forest [Introduction]
• Decision Trees with Package party
• Decision Trees with Package rpart
• Random Forest
Module 09 – Regression [Introduction]
• Linear Regression
• Logistic Regression
• Generalized Linear Regression
• Non-linear Regression
Module 10 – Group Projects on Financial Data / Sales Data
• Group Wise Project Presentation
• Summary
Data Science in R – Advanced
04 Days
The Data Science with R course has been designed to give you in-depth knowledge of the various data analytics techniques that can be performed using R. The course is packed with real-life projects and case studies, and includes an RStudio lab for practice. This course will help you in:
Mastering the R language: The course provides an in-depth understanding of the R language, RStudio, and R packages. You will learn the various types of apply functions and the dplyr package, gain an understanding of data structures in R, and perform data visualizations using the various graphics available in R.
Mastering advanced statistical concepts: The course also covers statistical concepts such as linear and logistic regression, cluster analysis, forecasting and hypothesis testing.
Group case study: Solving real-time problems on existing datasets.
Course Objectives
Course Content
Day 1
Module 01 – R Programming (Quick Review with Assignments and Examples)
• Introduction
• Objectives
• Operators in R
• Arithmetic Operators
• Relational Operators
• Logical Operators
• Assignment Operators
• Conditional Statements in R
• ifelse() Function
Module 02 – R Data Structure
• Introduction
• Objectives
• Types of Data Structures in R
• Vectors
• Matrices
• Accessing Matrix Elements
• Arrays
• Data Frames
• Elements of Data Frames
Module 03 – Apply Functions
• Introduction
• Objectives
• Types of Apply Functions
• apply() Function
• lapply() Function
• sapply() Function
• tapply() Function
• vapply() Function
• mapply() Function
• Dplyr Package - An Overview
Module 04 – Data Visualization
• Introduction
• Objectives
• Graphics in R
• Types of Graphics
• Bar Charts
• Pie Charts
• Histograms
• Kernel Density Plots
• Line Charts
• Box Plots
• Heat Maps
• Word Clouds
Day 2
Module 05 – Introduction to Statistics
Module 06 - Hypothesis Testing
Module 07 - Hypothesis Testing II
Day 3
Regression Analysis
• Introduction
• Objectives
• Introduction to Regression Analysis
• Use of Regression Analysis - Examples
• Types of Regression Analysis
• Simple Linear Regression Model
• Correlation
• Find Correlation
• Method of Least Squares Regression Model
• Coefficient of Multiple Determination Regression Model
• Standard Error of the Estimate Regression Model
• Dummy Variable Regression Model
• Interaction Regression Model
• Non-Linear Regression Models
• Demo - Perform Regression Analysis with Multiple Variables
• Perform Regression Analysis with Multiple Variables
• Non-Linear Models to Linear Models
Classification
• Introduction
• Objectives
• Introduction to Classification
• Examples of Classification
• Classification vs. Prediction
• Issues Regarding Classification and Prediction
• Data Preparation Issues
• Evaluating Classification Methods Issues
• Decision Tree
• Classification Rules of Trees
• Overfitting in Classification
• Tips to Find the Final Tree Size
• Basic Algorithm for a Decision Tree
• Statistical Measure - Information Gain
• Enhancing a Basic Tree
• Demo - Model a Decision Tree
• Model a Decision Tree
• Naive Bayes Classifier Model
• Bayes' Theorem
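The information gain measure mentioned in the decision tree topics above can be illustrated with a short sketch. It is written in Python rather than R purely for illustration; the function names and the four-row toy dataset are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on feature."""
    parent = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted

# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" not at all
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
gain_outlook = information_gain(rows, "outlook", "play")  # 1.0 bit
gain_windy = information_gain(rows, "windy", "play")      # 0.0 bits
print(gain_outlook, gain_windy)
```

A basic decision tree algorithm would choose the feature with the highest gain (here "outlook") for the first split and then repeat the calculation within each branch.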
Day 4
Module 10 – Clustering and Segmentation
• Introduction
• Objectives
• Introduction to Clustering and Segmentation
• Clustering vs. Classification
• Use Cases of Clustering
• Clustering Models
• K-means Clustering
• Pseudo Code of K-means
• K-means Clustering - Case Study
• Perform Clustering Using K-means
• Hierarchical Clustering
• Requirements of Hierarchical Clustering Algorithms
• Agglomerative Clustering Process
• Perform Hierarchical Clustering
• DBSCAN Clustering
• K-means Clustering for Customer Segmentation: A Practical Example
• Which Customers are in Each Segment?
• Determining the Preferences of the Customer Segments
• Quiz
• Summary
Module 11 – Shiny Apps and Dashboard
• Introduction
• App architecture
• App template
• Hello Shiny
• Shiny Text
• Inputs and Outputs
• Reactivity
• Building an App
• The server function
• Sharing apps
• Shinyapps.io
• Shiny servers
• UI & Server
• Inputs & Outputs
• Run & Debug
• Tooling Up
• Sliders
• Tabsets
• DataTables
• More Widgets
• Uploading Files and Downloading Data
• HTML UI and Dynamic UI
• Scoping
• Client Data
• Sending Images
• Case Study
• Conclusion and Summary
Module 12 – Text Mining and NLP
• Introduction to text mining using R
• Text mining definition
• Text preprocessing
• Sentiment Analysis
• N-grams
• Case Study
• Conclusion and Summary
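The K-means pseudo code covered in Module 10 boils down to two alternating steps, assignment and update. A minimal NumPy sketch follows, in Python rather than R for illustration only; the function name `kmeans`, the random initialisation scheme and the six sample points are assumptions, not course material.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct data points chosen at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each point
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Two well-separated groups of points (hypothetical customer features)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centroids = kmeans(X, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 depends on the random initialisation, so only the grouping itself is meaningful, not the label numbers.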
Hadoop 2.0 Internals, Ecosystem and Analytics
05 Days
Hadoop is a widely used solution to Big Data problems. It stores massive datasets across a cluster of inexpensive machines in a distributed manner, and it also provides Big Data analytics through a distributed computing framework. It is an open-source data platform developed in Java, dedicated to storing and analyzing large sets of unstructured data.
Course Content
Streaming in Hadoop
• Streaming with key/value pairs
• Streaming with Unix commands
• Streaming with the Aggregate package
• Streaming with scripts
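To illustrate streaming with key/value pairs and scripts, the sketch below shows a word-count mapper and reducer in the style used with Hadoop Streaming, where each stage exchanges tab-separated key/value lines. The function names and sample lines are hypothetical, and the shuffle-and-sort that Hadoop performs between the stages is imitated here with a plain `sorted` call.

```python
from itertools import groupby

def mapper(lines):
    """Emit one 'word<TAB>1' pair per word in the input lines."""
    for line in lines:
        for word in line.strip().lower().split():
            yield f"{word}\t1"

def reducer(sorted_pairs):
    """Sum the counts per key; the input must already be sorted by key."""
    parsed = (pair.rstrip("\n").split("\t", 1) for pair in sorted_pairs)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Local simulation of map -> shuffle/sort -> reduce on made-up input
lines = ["big data big cluster", "data node"]
shuffled = sorted(mapper(lines))  # stands in for Hadoop's shuffle-and-sort
result = list(reducer(shuffled))
print(result)
```

In an actual streaming job, the two functions would typically live in separate scripts that read from standard input and write to standard output, passed to the streaming jar through its `-mapper` and `-reducer` options.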
Day 4 Day 5
Synopsis of other additional courses offered
HBase
05 Days
Course Content
• Introduction
• HBase Storage Architecture
• Log-Structured Merge Tree
• Future Directions
• HBase Operations
• Designing HBase Tables and Schemas
• Advanced MapReduce
• Real-Time Case-Study
Spark Internals
03 Days
Course Content
• Spark Introduction
• RDDs
• RDD Internals: Part 1
• RDD Internals: Part 2
• Data ingress and egress
• Spark Internals
• Advanced Spark Programming
• High Performance Spark patterns and programming
• Spark SQL, Dataset, DataFrame Internals
• Spark Streaming
• Kafka Internals
• Tuning and Debugging Spark
• Hardware profiling of Spark cluster at Runtime
MongoDB
02 Days
Course Content
• MongoDB Introduction
• CRUD with MongoDB
• Querying
• MongoDB Application Development
• Replication
• Components of a Replica Set
• MongoDB Internals
Contact Us