BIG DATA
simplified!
Pravin Hanchinal
pravinhanchinal.com
Before we start...
Big Data
What can you do with Big Data?
Big Data
Big Data is a
collection of technologies and tools
that are combined in different ways for different scenarios, for example:
(Hadoop + HDFS + HCatalog + Flume + PowerView)
(Hortonworks + PowerView)
What can you do with Big Data?
Fetching
Processing
Visualizing
How Big is Big Data?
Byte of data: one grain of rice
Kilobyte: cup of rice
Megabyte: 8 bags of rice
Gigabyte: 3 container lorries
Terabyte: 2 container ships
Petabyte: covers Mumbai
Exabyte: covers India
Zettabyte: fills Indian Ocean
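The rice pictures above are rough illustrations rather than exact ratios. As a small sketch of the arithmetic behind the unit names, assuming the decimal (SI) convention where each step is a factor of 1000 (the binary convention uses 1024 instead), the sizes can be computed like this. The names `UNITS`, `bytes_in` and `ratio` are made up for this example.

```python
# Decimal (SI) byte units, smallest to largest, in the order used on the slide.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte", "zettabyte"]


def bytes_in(unit):
    """Return how many bytes one `unit` holds, assuming SI steps of 1000."""
    return 1000 ** UNITS.index(unit.lower())


def ratio(larger, smaller):
    """Return how many `smaller` units fit into one `larger` unit."""
    return bytes_in(larger) // bytes_in(smaller)


if __name__ == "__main__":
    for unit in UNITS:
        print(f"1 {unit} = {bytes_in(unit):,} bytes")
    print("gigabytes per petabyte:", ratio("petabyte", "gigabyte"))
```

Under this assumption one petabyte is a million gigabytes, which is why data at that scale is spread across many machines.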
Big Data Industry Overview
MapReduce
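• MapReduce is a processing technique and a programming
model for distributed computing; Hadoop's implementation is written in Java.
• A MapReduce job consists of two main tasks, Map and Reduce:
Map turns input records into key-value pairs, and Reduce combines the values that share a key.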
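The two tasks can be illustrated with the classic word-count example. This is a minimal single-machine sketch in plain Python, not Hadoop's Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are chosen for this illustration, and the `shuffle` step stands in for the grouping that a real framework performs between Map and Reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data simplified", "big data is big"]))
    # prints {'big': 3, 'data': 2, 'simplified': 1, 'is': 1}
```

On a cluster, each `map_phase` call could run on a different node against its own slice of the input, and the reducers would each handle a subset of the keys.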
MapReduce
Hadoop Cluster
What can you do with Big Data?
Get Started with this:
Cloudera
Hortonworks
Why Big Data?
Business Intelligence
Hortonworks
Cloudera
Why Hadoop?
-> Hadoop modeling and development: MapReduce, Pig, Mahout
-> Hadoop storage and data management: HDFS, HBase, Cassandra
-> Hadoop data warehousing, summarization and query: Hive, Sqoop
-> Hadoop data collection, aggregation and analysis: Chukwa, Flume
-> Hadoop metadata, table and schema management: HCatalog
-> Hadoop cluster management, job scheduling and workflow: ZooKeeper, Oozie and Ambari
-> Hadoop data serialization: Avro
Big Data in a Nutshell
Got questions?
Text/WhatsApp on 974-086-1099
Stay connected
pravinhanchinal.com
What Next?
Dive in and Explore
Typical Use Case
Resources
http://pravinhanchinal.com/what-is-for-what-hadoop-tools
https://blog.cloudera.com/blog/2014/01/how-to-create-a-simple-hadoop-cluster-with-virtualbox/
http://pingax.com/install-apache-hadoop-ubuntu-cluster-setup/
https://de.slideshare.net/EdurekaIN/ha-webinar-48976388
Resources
https://ayende.com/blog/4435/map-reduce-a-visual-explanation
MultiNode on Amazon: https://dzone.com/articles/how-set-multi-node-hadoop
Run Sample MapReduce Examples:
MapReduce examples:
http://www.informit.com/articles/article.aspx?p=2190194&seqNum=3
https://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig/
