IIP Online Class

Big Data

Delivered by:
Daw Zin Mar Soe
Institute of International Professionalism
What is Big Data?
• Big Data is data of very large size. The term describes a
collection of data that is huge in volume and still growing
exponentially with time.
• In short, such data is so large and complex that traditional
data management tools cannot store or process it
efficiently.
Examples Of Big Data
• The New York Stock Exchange generates about one
terabyte of new trade data per day.
• 500+ terabytes of new data are ingested into the
databases of the social media site Facebook every day.
• A single jet engine can generate 10+ terabytes of data
in 30 minutes of flight time. With many thousands of
flights per day, data generation reaches many
petabytes.
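The jet engine figure can be checked with a rough calculation. The sketch below is only an illustration: the 10 TB per 30 minutes comes from the slide, while the number of flights, the average flight length and the one-engine-per-flight simplification are assumptions chosen for the example, not real statistics.

```python
# Back-of-the-envelope estimate of daily jet engine data volume.
TB_PER_HALF_HOUR = 10      # from the slide: 10+ TB per 30 minutes of flight
FLIGHTS_PER_DAY = 5_000    # assumption: stands in for "many thousands of flights"
AVG_FLIGHT_HOURS = 2       # assumption: average flight duration in hours
TB_PER_PB = 1_000          # decimal units: 1 petabyte = 1,000 terabytes


def daily_volume_pb(flights, hours_per_flight, tb_per_half_hour=TB_PER_HALF_HOUR):
    """Return the estimated data volume in petabytes, counting one engine per flight."""
    half_hours = hours_per_flight * 2
    total_tb = flights * half_hours * tb_per_half_hour
    return total_tb / TB_PER_PB


# 5,000 flights x 4 half-hours x 10 TB = 200,000 TB
print(daily_volume_pb(FLIGHTS_PER_DAY, AVG_FLIGHT_HOURS))  # 200.0
```

Even with these modest assumed numbers the total is in the hundreds of petabytes per day, which is consistent with the slide's claim.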
Types Of Big Data
'Big Data' could be found in three forms:

• Structured
• Unstructured
• Semi-structured
Structured
• Any data that can be stored, accessed and processed in a fixed
format is termed 'structured' data.
• Example of structured data: an 'Employee' table in a database.
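As a minimal sketch of what "fixed format" means, the example below builds a small 'Employee' table with Python's built-in sqlite3 module. The column names and the sample rows are hypothetical; the slides only name the table.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table definition fixes the format: every row has the same typed columns.
# Column names are hypothetical, chosen for illustration only.
conn.execute(
    """
    CREATE TABLE Employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL,
        salary      REAL NOT NULL
    )
    """
)

# Made-up sample rows.
rows = [
    (1, "A. Example", "Finance", 52000.0),
    (2, "B. Example", "Admin", 41000.0),
    (3, "C. Example", "Finance", 60000.0),
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", rows)

# Because the structure is known in advance, querying is straightforward.
finance = conn.execute(
    "SELECT name FROM Employee WHERE department = ? ORDER BY employee_id",
    ("Finance",),
).fetchall()
print(finance)  # [('A. Example',), ('C. Example',)]
```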
Unstructured
• Any data whose form or structure is unknown is classified as
unstructured data. In addition to its huge size, unstructured
data poses multiple challenges when it is processed to derive
value from it.
• A typical example of unstructured data is a heterogeneous data
source containing a combination of simple text files, images, videos,
etc.
• Example of unstructured data: the output returned by a 'Google
Search'.
Semi-structured
• Semi-structured data can contain both forms of data. Semi-structured
data appears structured in form, but it is not actually defined by, for
example, a table definition in a relational DBMS.
• An example of semi-structured data is data represented in an XML file.
• Example of semi-structured data: personal data stored in an XML file:
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
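The records above carry their own tags, so a program can read them without a table definition. The following sketch parses them with Python's standard xml.etree.ElementTree module. The wrapping `<people>` element is an assumption added here, because an XML parser needs a single root element and the records as shown have none.

```python
import xml.etree.ElementTree as ET

# The five records exactly as they appear on the slide.
records_xml = """
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
"""

# Assumption: wrap the records in a hypothetical <people> root element.
root = ET.fromstring(f"<people>{records_xml}</people>")

# The tags tell us what each value means, even though no schema was declared.
people = [
    {
        "name": rec.findtext("name"),
        "sex": rec.findtext("sex"),
        "age": int(rec.findtext("age")),
    }
    for rec in root.findall("rec")
]

average_age = sum(p["age"] for p in people) / len(people)
print(len(people), average_age)  # 5 33.2
```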
Characteristics Of Big Data
• Volume: how big must a data set be before traditional data handling
methods cannot cope? [Volume of data]
• Velocity: how quickly will data arrive, and how quickly must it be
evaluated and acted upon? This includes the problems of real-time data.
[Speed of collection]
• Variety: the problems of dealing with unstructured data, including
additional processing and the use of metadata. [Range of data types
collected]
• Veracity: how reliable is the data? Costs of finding errors versus costs
of accepting errors. Legal consequences could also be considered.
[Accuracy or quality of data]
• Value: it is often said that all data has value, but there are costs in
collecting, storing, processing and analysing it. Students should look at
the cost-benefit equation. [Actual or potential usefulness of analysing
the data]
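One simple way to read the cost-benefit equation under Value is: net value = expected benefit minus the total cost of collecting, storing, processing and analysing. The sketch below illustrates that reading only; the class name, function name and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataProjectCosts:
    """The four cost headings named on the slide (amounts are hypothetical)."""
    collecting: float
    storing: float
    processing: float
    analysing: float

    def total(self) -> float:
        return self.collecting + self.storing + self.processing + self.analysing


def net_value(expected_benefit: float, costs: DataProjectCosts) -> float:
    """Net value = expected benefit - total cost. Positive means worthwhile."""
    return expected_benefit - costs.total()


# Made-up figures for illustration only.
costs = DataProjectCosts(collecting=20_000, storing=15_000,
                         processing=30_000, analysing=25_000)
print(costs.total())              # 90000
print(net_value(120_000, costs))  # 30000
```

A negative result would suggest that, for that data set, the cost of handling the data outweighs what analysing it is expected to return.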
Benefits of Big Data Processing

• Businesses can utilize outside intelligence while taking decisions
• Improved customer service
• Early identification of any risk to products or services
• Better operational efficiency

There are some interesting examples of real life uses of Big Data here:

https://www.datapine.com/blog/big-data-examples-in-real-life/
One characteristic of Big Data is the
volume of data collected. Give two other
characteristics of Big Data. (2021 Oct)
• Velocity / Speed of collection
• Variety / Range of data types collected / Mix of
structured and unstructured data
• Veracity / Accuracy or quality
• Value / Actual or potential usefulness of analysing
the data
High capacity storage devices are required to
handle Big Data.
State one other infrastructure requirement
for Big Data. (2021 Oct)

• Award one mark for any of the following infrastructure requirements:
• Processing power/capacity (1)
• Complexity of algorithms / Software for analysis (1)
• Storing/Analysing related data over several sites (1)
• Fast/high capacity WAN (1)
Thank You !!
