
Understanding Data Lakes EMC

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. Data from various sources flows into the data lake and is stored, allowing users to analyze the data through queries to gain business insights. Data lakes allow companies to store both structured and unstructured data for exploration and analysis to address business issues and build predictive models.

Uploaded by

pedro_luna_43
Copyright: © All Rights Reserved

UNDERSTANDING DATA LAKES

WHAT IS A DATA LAKE?


A data lake is a single place to store all the data an enterprise may want to use, including both structured and unstructured data.

HOW DO DATA LAKES WORK?


The concept can be compared to a lake: water flows in, fills a reservoir, and flows out.

The incoming flow represents multiple raw data archives, ranging from emails and spreadsheets to social media content.

STRUCTURED DATA
1. Information in rows and columns
2. Easily ordered and processed with data mining tools

UNSTRUCTURED DATA
1. Raw, unorganized data
2. Emails
3. PDF files
4. Images, video and audio
5. Social media tools
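
The difference between the two kinds of data can be illustrated with a minimal Python sketch. The sample order table and email text are invented for illustration and are not taken from the infographic. Structured data can be addressed by column name right away, while unstructured text needs an extraction step first.

```python
import csv
import io
import re

# Structured data: rows and columns, so fields can be addressed by name.
# (The order table below is made-up sample data.)
structured = io.StringIO("order_id,amount\n1,19.99\n2,5.00\n")
rows = list(csv.DictReader(structured))
total = sum(float(row["amount"]) for row in rows)

# Unstructured data: free text, so anything useful must be extracted first.
# (The email body below is made-up sample data.)
email_body = "Hi, my order 2 arrived late. Please refund 5.00 to my card."
amounts = [float(m) for m in re.findall(r"\d+\.\d{2}", email_body)]

print(round(total, 2))
print(amounts)
```

The regular expression is only one crude way to pull a value out of free text; real unstructured sources usually need far more elaborate processing.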

The reservoir of water is a dataset, where you run analytics on all the data.

The outflow of water is the analyzed data. Through this process, you are able to “sift” through all the data quickly to gain key business insights.
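
The inflow, reservoir, and outflow analogy can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not a description of any EMC product: the `lake` list, the `ingest` and `query` functions, and the sample records are all hypothetical. The point it illustrates is that data is stored untouched in its native format and only parsed when it is queried.

```python
import json
from datetime import datetime, timezone

# Hypothetical in-memory "reservoir": raw items kept in their native format.
lake = []

def ingest(source, fmt, payload):
    """Inflow: store the payload untouched, tagged with minimal metadata."""
    lake.append({
        "source": source,
        "format": fmt,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    })

def query(fmt, parse, predicate):
    """Outflow: parse items of one format at read time, then filter them."""
    parsed = (parse(item["payload"]) for item in lake if item["format"] == fmt)
    return [record for record in parsed if predicate(record)]

# Inflow from different (made-up) sources, structured and unstructured alike.
ingest("crm", "json", '{"customer": "A", "spend": 120}')
ingest("crm", "json", '{"customer": "B", "spend": 40}')
ingest("support", "email", "Subject: late delivery\n\nMy order arrived late.")

# Outflow: the analyzed data, here customers who spent more than 100.
big_spenders = query("json", json.loads, lambda r: r["spend"] > 100)
print(big_spenders)
```

The email stays in the lake unparsed until some later query knows how to read it, which is the sense in which raw data is held "until it is needed".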

The information in the Digital Universe will grow 10 times by 2020.
http://singapore.emc.com/about/news/press/2014/20140409-01.htm

In the last 10 years, companies have started using data lakes to deal with the enormous amounts of data.

Data lakes help reveal complex business issues and build predictive models to address these. Companies ranging from restaurants to mining corporations use data lake solutions in their everyday analytics.

WHO IS USING DATA LAKES?

BUSINESS & DATA ANALYSTS
Analyze reports on specific data in the organization to provide business insight

DATA ARCHITECTS & APP DEVELOPERS
Responsible for designing, creating, deploying and managing an organization’s data architecture

DATA SCIENTISTS
Perform statistical analysis on big data to identify trends, solve business problems and optimize performance

WHY ARE DATA LAKES IMPORTANT?

BUILD APPLICATIONS
Platform for businesses to get at the data and quickly build the views and data-driven applications they really need

FLEXIBILITY & ACCESSIBILITY
Provide flexibility and accessibility in moving large amounts of data from data warehouse to perform analytics

RETAIN DATA AUTHENTICITY
Data Lakes allow you to store and analyze the information in different formats, retaining data authenticity

SPEED
Ability to sift through immense quantities of data quickly

EXPLORE & ANALYZE
Ability to explore and analyze data to derive business value and benefit
