BDA-Lec4
01. Detailed Big Data Analytics Lifecycle with Case Study (cont.)
02. Pipeline
● The CEO and the directors are eager to see Big Data in action.
○ In response, the IT team, in partnership with the business
personnel, takes on ETI’s first Big Data project.
● The team then follows the step-by-step approach set forth by the Big
Data Analytics Lifecycle in pursuit of this objective.
Stage 5: Data Validation and Cleansing
● This stage is dedicated to establishing often complex
validation rules and removing any known
invalid data.
● Invalid data can skew and falsify analysis
results.
● Unlike traditional enterprise data (databases),
where the data structure is pre-defined
and the data is pre-validated, data input into
Big Data analyses can be unstructured,
without any indication of validity.
○ Its complexity can further make it
difficult to arrive at a set of suitable
validation constraints.
Stage 5: Data Validation and Cleansing
● Big Data solutions often receive redundant data
across different datasets.
○ This redundancy can be exploited to
explore interconnected datasets in order to
assemble validation parameters and fill in
missing valid data.
○ For example, as illustrated in this Figure (and in the sketch below):
■ The first value in Dataset B is validated against its
corresponding value in Dataset A.
■ The second value in Dataset B is not validated against its
corresponding value in Dataset A.
■ If a value is missing in Dataset B, it is inserted from
Dataset A.
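To make the idea concrete, here is a minimal sketch of redundancy-based validation and fill-in using pandas; the toy datasets and the "id" and "region" fields are assumptions for illustration, not values from the figure or the ETI case.

```python
# Minimal sketch: validate Dataset B against Dataset A and fill missing values.
# (Toy data; column names "id" and "region" are assumptions for illustration.)
import pandas as pd

dataset_a = pd.DataFrame({"id": [1, 2, 3], "region": ["North", "South", "East"]})
dataset_b = pd.DataFrame({"id": [1, 2, 3], "region": ["North", "West", None]})

merged = dataset_b.merge(dataset_a, on="id", suffixes=("_b", "_a"))

# Validate each value in Dataset B against its counterpart in Dataset A.
merged["valid"] = merged["region_b"] == merged["region_a"]

# If a value is missing in Dataset B, insert it from Dataset A.
merged["region_clean"] = merged["region_b"].fillna(merged["region_a"])
print(merged[["id", "region_b", "region_a", "valid", "region_clean"]])
```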
Stage 5: Data Validation and Cleansing
● For batch analytics,
○ data validation and cleansing can be
achieved via an offline ETL operation.
● For real-time analytics,
○ a more complex in-memory system is
required to validate and cleanse the
data as it arrives from the source.
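As a rough illustration of this contrast, the sketch below places an offline batch cleansing step next to the kind of per-record check an in-memory streaming system would apply; the file layout, field names and rules are assumptions, not ETI's pipeline.

```python
# Batch: cleanse a whole file offline (ETL-style). Streaming: validate each
# record in memory as it arrives. Field names and rules are illustrative.
import pandas as pd

def batch_cleanse(path: str) -> pd.DataFrame:
    """Offline pass over a complete dataset: drop rows that fail validation."""
    df = pd.read_csv(path)
    return df.dropna(subset=["policy_id"]).query("claim_value > 0")

def validate_record(record: dict) -> bool:
    """Per-record check applied as each event arrives from the source."""
    return record.get("policy_id") is not None and record.get("claim_value", 0) > 0
```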
Stage 6: Data Aggregation and Representation
● The large volumes processed by Big Data solutions can make data
aggregation a time- and effort-intensive operation.
Figure: a simple example of data aggregation where two datasets are aggregated together
using the Id field.
● This Figure shows the same piece of data stored in two different formats.
○ Dataset A contains the desired piece of data, but it is part of a BLOB (Binary
Large Object) that is not readily accessible for querying.
○ Dataset B contains the same piece of data organized in column-based
storage, enabling each field to be queried individually.
Datasets A and B can be combined to create a standardized data structure with a Big Data
solution, as in the sketch below.
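A minimal sketch of this combination with pandas, assuming the BLOB holds JSON; the field names and values are illustrative only.

```python
# Parse Dataset A's BLOB into individually queryable fields, then aggregate it
# with the column-based Dataset B using the Id field. (Toy data for illustration.)
import json
import pandas as pd

dataset_a = pd.DataFrame({
    "id": [1, 2],
    "blob": ['{"claim_value": 1200, "status": "open"}',
             '{"claim_value": 450, "status": "closed"}'],
})
dataset_b = pd.DataFrame({"id": [1, 2], "customer_age": [34, 57]})

# Expand the BLOB so each field can be queried individually.
parsed = dataset_a["blob"].apply(json.loads).apply(pd.Series)
dataset_a_columns = pd.concat([dataset_a[["id"]], parsed], axis=1)

# Aggregate the two datasets into one standardized structure via the Id field.
combined = dataset_a_columns.merge(dataset_b, on="id")
print(combined)
```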
Stage 6: Data Aggregation and Representation (Case Study)
● For meaningful analysis of data,
○ it is decided to join together policy data, claim data and call center
agent notes in a single dataset that is tabular in nature where each
field can be referenced via a data query.
○ It is thought that this will not only help with the current data analysis
task of detecting fraudulent claims but will also help with other data
analysis tasks, such as risk evaluation and speedy settlement of
claims.
○ The resulting dataset is stored in a NoSQL database.
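A minimal sketch of such a join, assuming pandas and (purely for illustration) MongoDB via pymongo as the NoSQL store; the file names, key columns and collection names are assumptions, not details from the case study.

```python
# Join policy data, claim data and call center agent notes into one tabular
# dataset, then store it in a NoSQL database. (Names below are illustrative.)
import pandas as pd
from pymongo import MongoClient

policies = pd.read_csv("policies.csv")        # assumed key: policy_id
claims = pd.read_csv("claims.csv")            # assumed keys: claim_id, policy_id
agent_notes = pd.read_csv("agent_notes.csv")  # assumed key: claim_id

# Single tabular dataset in which every field can be referenced via a query.
combined = (claims
            .merge(policies, on="policy_id", how="left")
            .merge(agent_notes, on="claim_id", how="left"))

# Persist the result to a NoSQL store (MongoDB chosen here only as an example).
client = MongoClient("mongodb://localhost:27017")
client["eti"]["claims_combined"].insert_many(combined.to_dict("records"))
```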
Stage 7: Data Analysis
● The Data Analysis stage is dedicated to carrying out the
actual analysis task, which typically involves one or more
types of analytics.
● This stage can be iterative in nature,
○ especially if the data analysis is predictive analytics,
in which case analysis is repeated until the
appropriate pattern or correlation is uncovered.
● Depending on the type of analytic result required,
○ this stage can be as simple as querying a dataset to
compute an aggregation for comparison.
○ On the other hand, it can be as challenging as
combining data mining and complex statistical
analysis techniques to discover patterns and
anomalies or to generate a statistical or
mathematical model to depict relationships between
variables.
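At the simple end of that spectrum, the sketch below queries a toy dataset to compute an aggregation for comparison; the columns and values are invented for illustration.

```python
# Querying a dataset to compute an aggregation for comparison.
import pandas as pd

claims = pd.DataFrame({
    "claim_type":  ["auto", "auto", "home", "home"],
    "claim_value": [1200, 800, 5000, 700],
})

# Average claim value per claim type, e.g. to compare segments side by side.
print(claims.groupby("claim_type")["claim_value"].mean())
```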
Stage 7: Data Analysis
● Before making inferences from data, it is essential to
examine all of your variables. Why?
○ By visualizing the data, we gain clearer knowledge about the data.
Stage 7: Data Analysis (Case Study)
● The IT team applies Exploratory Data Analysis (EDA) as follows:
○ By visualizing the data, the team gains clearer knowledge about the
data, for example through categorical data plotting (see the sketch below).
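A minimal EDA sketch of categorical data plotting with pandas and matplotlib; the column and its values are assumptions, not the team's actual data.

```python
# Categorical data plotting: frequency of each category as a bar chart.
import pandas as pd
import matplotlib.pyplot as plt

claims = pd.DataFrame({
    "claim_status": ["approved", "rejected", "approved", "pending", "approved"],
})

claims["claim_status"].value_counts().plot(kind="bar")
plt.xlabel("claim_status")
plt.ylabel("count")
plt.show()
```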
Stage 7: Data Analysis (Case Study)
● The IT team involves the data analysts at this stage as it does not have the right
skillset for analyzing data in support of detecting fraudulent claims.
● In order to be able to detect fraudulent transactions,
○ first the nature of fraudulent claims needs to be analyzed in order to find
which characteristics differentiate a fraudulent claim from a legitimate claim.
○ For this, the predictive data analysis approach is taken. As part of this
analysis, a range of analysis techniques are applied.
● This stage is repeated a number of times, as the results generated after the first
pass are not conclusive enough to comprehend what makes a
fraudulent claim different from a legitimate claim.
● As part of this exercise, attributes that are less indicative of a fraudulent claim are
dropped while attributes that carry a direct relationship are kept or added.
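As one way to picture this iterative, predictive exercise, the sketch below fits a small decision tree with scikit-learn and inspects attribute importances; the features, labels, toy values and choice of model are assumptions, not the analysts' actual technique.

```python
# Fit a simple classifier and inspect which attributes are indicative of fraud.
# Attributes with low importance are candidates to drop in the next iteration.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

claims = pd.DataFrame({
    "customer_age":     [25, 40, 33, 58, 47, 29],
    "policy_age_years": [1, 10, 2, 15, 7, 1],
    "num_prior_claims": [4, 0, 3, 1, 0, 5],
    "is_fraudulent":    [1, 0, 1, 0, 0, 1],
})

X = claims.drop(columns="is_fraudulent")
y = claims["is_fraudulent"]

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(dict(zip(X.columns, model.feature_importances_)))
```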
Stage 8: Data Visualization
● The ability to analyze massive amounts of data and
find useful insights carries little value if the only
ones who can interpret the results are the analysts.
● The Data Visualization stage is dedicated to using
data visualization techniques and tools to
graphically communicate the analysis results for
effective interpretation by business users.
● The results of completing the Data Visualization
stage provide users with the ability to perform
visual analysis, allowing for the discovery of
answers to questions that users have not yet even
formulated.
Stage 8: Data Visualization (Case Study)
● The team has discovered some interesting findings and now needs to
convey the results to the actuaries, underwriters and claim adjusters.
● Different visualization methods are used including bar and line graphs
and scatter plots.
○ Scatter plots are used to analyze groups of fraudulent and
legitimate claims in the light of different factors, such as customer
age, age of policy, number of claims made and value of claim.
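A minimal sketch of one such scatter plot with matplotlib; the column names and data points are invented for illustration.

```python
# Scatter plot of claims by customer age and value of claim, with fraudulent
# and legitimate claims plotted as separate groups. (Toy data.)
import pandas as pd
import matplotlib.pyplot as plt

claims = pd.DataFrame({
    "customer_age":  [25, 40, 33, 58, 47, 29],
    "claim_value":   [9000, 1200, 7500, 800, 1500, 8800],
    "is_fraudulent": [1, 0, 1, 0, 0, 1],
})

for label, name, marker in [(0, "legitimate", "o"), (1, "fraudulent", "x")]:
    subset = claims[claims["is_fraudulent"] == label]
    plt.scatter(subset["customer_age"], subset["claim_value"],
                marker=marker, label=name)

plt.xlabel("customer age")
plt.ylabel("value of claim")
plt.legend()
plt.show()
```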
Stage 9: Utilization of Analysis Results
● Subsequent to analysis results being made available to
business users to support business decision-making, such
as via dashboards, there may be further opportunities to
utilize the analysis results.
● This stage is dedicated to determining how and where
processed analysis data can be further leveraged.
● Depending on the nature of the analysis problems being
addressed, it is possible for the analysis results to produce
“models” that encapsulate new insights and understandings
about the nature of the patterns and relationships that exist
within the data that was analyzed.
○ A model may look like a mathematical equation or a set of rules.
○ Models can be used to improve business process logic and
application system logic, and they can form the basis of a new
system or software program.
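To illustrate how a "model as a set of rules" might be embedded in application system logic, here is a hedged sketch; the thresholds and field names are invented, not rules produced by the case study.

```python
# A rule-style model embedded in claims-settlement application logic.
# The rule below is purely illustrative.
def flag_for_fraud_review(claim: dict) -> bool:
    """Return True if the claim should be routed for manual fraud review."""
    return (claim.get("num_prior_claims", 0) >= 3
            and claim.get("claim_value", 0) > 5000)

# Example use inside business process logic:
print(flag_for_fraud_review({"num_prior_claims": 4, "claim_value": 9000}))  # True
```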
Stage 9: Utilization of Analysis Results (Case Study)
● Based on the data analysis results, the underwriting and the claims
settlement users have now developed an understanding of the nature of
fraudulent claims.
https://medium.com/@HassanFaheem/what-is-data-pipeline-in-big-data-6c0989cc4877
Why do we need a Pipeline?
● The main goal of using data pipelines is to simplify and automate the
process of extracting, transforming, and loading (ETL) data from various
sources into a central location for analysis.
● Data pipelines are often used in conjunction with big data technologies
like Apache Hadoop and Apache Spark, which are distributed systems for
processing large amounts of data.
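A minimal sketch of such a pipeline using PySpark; the input path, column names and output location are assumptions for illustration.

```python
# Extract raw data, transform (cleanse/reshape) it, and load it into a central
# location for analysis. Paths and columns are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-pipeline").getOrCreate()

# Extract: read raw data from a source.
raw = spark.read.csv("raw_claims.csv", header=True, inferSchema=True)

# Transform: drop invalid rows and normalize a field.
clean = (raw.dropna(subset=["policy_id"])
            .withColumn("claim_value", F.col("claim_value").cast("double")))

# Load: write the result to a central repository.
clean.write.mode("overwrite").parquet("warehouse/claims")
```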
Pipeline vs ETL
● You should think of an ETL pipeline as a subcategory of data
pipelines.
● ETL pipelines follow a specific sequence. As the abbreviation implies,
they extract data, transform data, and then load and store the data in a data
repository. Not all data pipelines need to follow this sequence.
● Finally, data pipelines as a whole do not necessarily need
to undergo data transformations, unlike ETL pipelines. In practice, though,
it is rare to see a data pipeline that does not utilize transformations to
facilitate data analysis.
Types of data pipelines
● There are several main types of data pipelines, each appropriate for
specific tasks on specific platforms.
○ Batch processing
○ Streaming data
○ Data integration pipelines
Types of data pipelines
● Batch pipeline
○ Batch pipelines process data in large, scheduled chunks via offline
operations, as with the batch analytics described in Stage 5.
Types of data pipelines
● Streaming pipeline
○ Streaming pipelines process data continuously as it arrives from the
source, supporting real-time analytics.
Types of data pipelines
● Data integration pipelines (ETL pipelines)
○ Data integration pipelines concentrate on merging data from multiple sources
into a single unified view.
○ These pipelines often involve extract, transform and load (ETL) processes that
clean, enrich, or otherwise modify raw data before storing it in a centralized
repository such as a data warehouse or data lake.
○ Data integration pipelines are essential for handling disparate systems that
generate incompatible formats or structures.
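A minimal data-integration sketch with pandas, merging two disparate source extracts into a single unified view and writing it to a central repository; the sources, columns and output path are assumptions.

```python
# Merge data from two sources with different formats into one unified view,
# then store it centrally (e.g. a data lake path). Names are illustrative.
import pandas as pd

crm = pd.read_csv("crm_customers.csv")         # e.g. customer_id, name
billing = pd.read_json("billing_export.json")  # e.g. customer_id, balance

unified = crm.merge(billing, on="customer_id", how="outer")
unified.to_parquet("data_lake/customers_unified.parquet")
```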
03. Exam-like Questions
Sample questions
● In Big Data, the term "Velocity" refers to:
• A) The number of different types of data generated