
Hadoop - A Solution For Big Data

Last Updated: 10 Jul, 2020

Letting the useful information hidden in your data go to waste can be a dangerous roadblock for industries; ignoring it eventually pulls your growth back. Data? Big data? How big do you think it is? It really is huge, not only in volume but also in velocity, variety, veracity, and value. So how did we find a solution for dealing with this big data? Let's discuss the approaches one by one.

Traditional Approach

In the traditional approach, the big tech companies handled data on a single system, storing and processing it with the help of the various database vendors available in the market, such as IBM and Oracle. The databases of that era were RDBMSs (Relational Database Management Systems), designed for storing structured data. Developers wrote short applications that communicated with the database and helped them maintain, analyze, modify, and visualize the stored data.
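A minimal sketch of such an application in Java, using JDBC against a single database server, might look like the following. The connection URL, credentials, and the sales table are hypothetical placeholders, not any specific vendor's setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SalesReport {
    public static void main(String[] args) throws Exception {
        // Connect to one dedicated RDBMS server (URL and credentials are
        // hypothetical placeholders).
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbserver:1521/ORCL", "user", "password");
             Statement stmt = conn.createStatement();
             // Analyze structured data with plain SQL, all on one machine.
             ResultSet rs = stmt.executeQuery(
                 "SELECT region, SUM(amount) FROM sales GROUP BY region")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " -> " + rs.getDouble(2));
            }
        }
    }
}
```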

But there is a problem with this traditional approach: the database server of that time, which is really just commodity hardware, can store and maintain only a limited amount of data, and the data can be processed only as fast as the processors of the era allow. The servers are also not very efficient at handling the velocity and variety of the data, because there is no cluster of computer systems; a single database server is dedicated to handling all of it.
How Did Google Find Its Solution for Big Data?

Google then introduced an algorithm named MapReduce. MapReduce works on a master-slave architecture, meaning that rather than dedicating a single database server to handling the data, there is a master that guides the other slave nodes. The task is divided into blocks and distributed among the slaves; once the slaves have processed their data, the master gathers the results from the various slave nodes and assembles the final result dataset.
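To make the division of labor concrete, here is a toy, single-process sketch in Java: the "master" splits the work into blocks, worker threads stand in for the slave nodes that each process one block, and the master gathers the partial results. This only illustrates the shape of the computation, not Google's implementation, which distributes the blocks across machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MasterSlaveSum {
    public static void main(String[] args) throws Exception {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i % 10;

        int numSlaves = 4;
        ExecutorService slaves = Executors.newFixedThreadPool(numSlaves);
        List<Future<Long>> partials = new ArrayList<>();

        // Master: divide the task into blocks and hand one to each slave.
        int blockSize = data.length / numSlaves;
        for (int s = 0; s < numSlaves; s++) {
            int from = s * blockSize;
            int to = (s == numSlaves - 1) ? data.length : from + blockSize;
            partials.add(slaves.submit(() -> {  // a slave processes its block
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                return sum;
            }));
        }

        // Master: gather the slaves' partial results into the final answer.
        long total = 0;
        for (Future<Long> f : partials) total += f.get();
        slaves.shutdown();
        System.out.println("total = " + total);
    }
}
```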

Later, in 2005, Doug Cutting and his co-worker Mike Cafarella decided to build open-source software that could work on this MapReduce algorithm. This is where Hadoop enters the picture for the first time, as a way to deal with very large datasets. Hadoop is a framework written in Java that runs over a collection of simple commodity hardware and handles large datasets using a very basic programming model.
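To give a feel for that programming model, here is the classic word-count job written against Hadoop's Java MapReduce API, as a sketch rather than production code; the input and output paths come from the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map: each slave node turns its block of lines into (word, 1) pairs.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reduce: the framework groups the pairs by word; we sum the counts.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) sum += val.get();
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Notice that the program says nothing about which machines run the map or reduce tasks; Hadoop splits the input, schedules the work across the cluster, and gathers the results, which is exactly the master-slave pattern described above.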
