CLASSIFICATION OF DATA
Dr. C.V. Suresh Babu
(Centre for Knowledge Transfer) institute
OBJECTIVES
• To understand the various classifications of data
• To know what structured data is
• To know what unstructured data is
• To know what semistructured data is
• To understand the key differences between structured and unstructured data
DISCUSSION TOPICS
• Classification of Data
• What is Structured Data?
• What is Unstructured Data?
• What is Semistructured Data?
• Structured vs Unstructured Data: 5 Key Differences
CLASSIFICATION OF DATA
• Data classification is broadly defined as the process of organizing data by relevant
categories so that it may be used more efficiently. On a basic level, the classification
process makes data easier to locate and retrieve. Data classification is of particular
importance when it comes to risk management, compliance, and data security.
• Data classification involves tagging data to make it easily searchable and trackable. It
also helps eliminate duplicate copies of data, which can reduce storage and backup
costs while speeding up the search process.
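The tagging and de-duplication idea described above can be sketched in a few lines of Python. This is only an illustration: the record names, the extension-based tagging rule, and the helper functions `classify` and `tag_and_deduplicate` are assumptions made for the example, not part of the original slides.

```python
import hashlib

# Hypothetical records; the names and contents are illustrative only.
records = [
    {"name": "marks_2021.csv", "content": "reg_no,marks\n18132001,91\n"},
    {"name": "marks_copy.csv", "content": "reg_no,marks\n18132001,91\n"},
    {"name": "notice.txt", "content": "Classes resume on Monday."},
]


def classify(record):
    """Assign a category tag from the file extension (an assumed, simplistic rule)."""
    if record["name"].endswith(".csv"):
        return "student-records"
    return "general"


def tag_and_deduplicate(items):
    """Tag every record and keep only the first copy of identical content."""
    seen = set()
    unique = []
    for item in items:
        digest = hashlib.sha256(item["content"].encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # identical content is already stored once
        seen.add(digest)
        unique.append({**item, "tag": classify(item)})
    return unique


catalog = tag_and_deduplicate(records)

# Group file names by tag so they can be located quickly.
by_tag = {}
for entry in catalog:
    by_tag.setdefault(entry["tag"], []).append(entry["name"])
print(by_tag)
```

The duplicate file is stored only once, and the tag index makes the remaining records easy to locate by category.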
PYRAMID OF DATA
WHAT IS STRUCTURED DATA?
• The term structured data refers to data that resides in a fixed field within a file or
record. Structured data is typically stored in a relational database (RDBMS). It can
consist of numbers and text, and sourcing can happen automatically or manually, as
long as it's within an RDBMS structure. It depends on the creation of a data model,
defining what types of data to include and how to store and process it.
• The language typically used to query structured data is SQL (Structured Query
Language). Typical examples of structured data are names, Reg. No., Marks,
Attendance, and so on.
S.No.  First Name     Last Name   Reg. No.
1      Priya          Dharshini   18132001
2      Mawa           Chouhan     18132002
3      Sai phanindra  Muvvala     18132003
4      Nandhini       Venkatesan  18132004
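As a minimal sketch of how such fixed-field data might sit in a relational database and be queried with SQL, the example below loads the sample rows into an in-memory SQLite database using Python's standard `sqlite3` module. The table name `students` and the column names are assumptions chosen for illustration.

```python
import sqlite3

# In-memory database; the table and column names are assumed for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    " s_no INTEGER PRIMARY KEY,"
    " first_name TEXT NOT NULL,"
    " last_name TEXT NOT NULL,"
    " reg_no TEXT NOT NULL UNIQUE)"
)

# The data model is fixed up front: every row has the same four fields.
rows = [
    (1, "Priya", "Dharshini", "18132001"),
    (2, "Mawa", "Chouhan", "18132002"),
    (3, "Sai phanindra", "Muvvala", "18132003"),
    (4, "Nandhini", "Venkatesan", "18132004"),
]
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?)", rows)

# Because the fields are predefined, a lookup is a single SQL query.
match = conn.execute(
    "SELECT first_name, last_name FROM students WHERE reg_no = ?",
    ("18132003",),
).fetchone()
total = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(match, total)
```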
WHAT IS UNSTRUCTURED DATA?
• Unstructured data is more or less all the data that is not structured. Even though
unstructured data may have a native, internal structure, it's not structured in a
predefined way. There is no data model; the data is stored in its native format.
• Typical examples of unstructured data are rich media, text, social media activity,
surveillance imagery, and so on.
• The amount of unstructured data is much larger than that of structured data.
Unstructured data is commonly estimated to make up about 80% of all enterprise data,
and the percentage keeps growing. This means that companies not taking unstructured
data into account are missing out on a lot of valuable business intelligence.
EXAMPLES: UNSTRUCTURED DATA
WHAT IS SEMI-STRUCTURED DATA?
• Semistructured data is a third category that falls somewhere between the other two.
It's a type of structured data that does not fit into the formal structure of a relational
database. But while not matching the description of structured data entirely, it still
employs tagging systems or other markers, separating different elements and enabling
search. Sometimes, this is referred to as data with a self-describing structure.
• A typical example of semistructured data is smartphone photos. Every photo taken
with a smartphone contains unstructured image content as well as the tagged time,
location, and other identifiable (and structured) information. Semi-structured data
formats include JSON, CSV, and XML file types.
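The "self-describing" nature of semistructured data can be illustrated with a small JSON record parsed by Python's standard `json` module. The photo metadata below (file name, timestamp, coordinates) is invented for the example; real smartphone photos carry their tags in other formats, so treat this as a sketch of the idea only.

```python
import json

# Hypothetical metadata for one photo; every value here is made up.
photo_json = """
{
  "file": "IMG_0001.jpg",
  "taken_at": "2021-03-14T09:30:00",
  "location": {"lat": 13.08, "lon": 80.27},
  "camera": "phone"
}
"""

photo = json.loads(photo_json)

# The tags travel with the data, so fields can be discovered at read time.
fields = sorted(photo.keys())
latitude = photo["location"]["lat"]

# No fixed schema: a tag that was never written is simply absent.
caption = photo.get("caption", "no caption")
print(fields, latitude, caption)
```

Unlike a relational table, nothing forces every record to carry the same fields, which is why the reader has to handle missing tags explicitly.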
STRUCTURED VS UNSTRUCTURED DATA:
• Defined vs Undefined Data
• Quantitative vs Qualitative Data
• Storage in Data Warehouses vs Data Lakes
• Easy vs Hard to Analyze
• Predefined format vs a variety of formats
DEFINED VS UNDEFINED DATA
Defined data (structured):
• Structured data consists of clearly defined types of data in a structure.
• Structured data lives in rows and columns, and it can be mapped into pre-defined fields.
Undefined data (unstructured):
• Unstructured data is usually stored in its native format.
• Unlike structured data, which is organized and easy to access in relational databases,
unstructured data does not have a predefined data model.
QUANTITATIVE VS QUALITATIVE DATA
Quantitative data (structured):
• Structured data is often quantitative data, meaning it usually consists of hard numbers
or things that can be counted.
• Methods for analysis include regression (to predict relationships between variables);
classification (to estimate probability); and clustering of data (based on different
attributes).
Qualitative data (unstructured):
• Unstructured data, on the other hand, is often categorized as qualitative data, and
cannot be processed and analyzed using conventional tools and methods.
• In a business context, qualitative data can, for example, come from customer surveys,
interviews, and social media interactions. Extracting insights from qualitative data
requires advanced analytics techniques like data mining and data stacking.
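As a small illustration of regression on quantitative data, the sketch below fits a straight line by ordinary least squares in plain Python. The attendance and marks figures are invented for the example, and `fit_line` is a hypothetical helper, not something taken from the slides.

```python
# Hypothetical quantitative data: attendance (%) and marks for four students.
attendance = [60.0, 70.0, 80.0, 90.0]
marks = [50.0, 60.0, 70.0, 80.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(attendance, marks)

# Use the fitted relationship to predict marks for 85% attendance.
predicted = slope * 85.0 + intercept
print(slope, intercept, predicted)
```

This works directly because every value is already a number in a known field; qualitative data would first need to be converted into such features.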
STORAGE IN DATA WAREHOUSES VS DATA LAKES
Storage in data warehouses (structured):
• Structured data is often stored in data warehouses.
• A data warehouse is the endpoint for the data's journey through an ETL pipeline.
• Structured data requires less storage space.
• As for databases, structured data is usually stored in a relational database (RDBMS).
Storage in data lakes (unstructured):
• Unstructured data is stored in data lakes.
• A data lake, on the other hand, is an almost limitless repository where data is stored
in its original format or after undergoing a basic "cleaning" process.
• Unstructured data requires more storage space. For example, even a tiny image takes up
more space than many pages of text.
• The best fit for unstructured data is instead so-called non-relational, or NoSQL,
databases.
Both data warehouses and data lakes have the potential for cloud use.
EASE OF ANALYSIS
One of the most significant differences between structured and unstructured data is how
well it lends itself to analysis.
Structured data:
• Structured data is easy to search, both for humans and for algorithms.
• There is a wide array of sophisticated analytics tools for structured data.
Unstructured data:
• Unstructured data, on the other hand, is intrinsically more difficult to search and
requires processing to become understandable. It's challenging to deconstruct since it
lacks a predefined data model and hence doesn't fit in relational databases.
• Most analytics tools for mining and arranging unstructured data are still in the
developing phase. The lack of predefined structure makes data mining tricky, and
developing best practices on how to handle data sources like rich media, blogs, social
media data, and ...
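The difference in search effort can be sketched with a toy comparison in Python. The student records, the free-text note, and the assumption that a registration number is any standalone run of eight digits are all made up for this illustration.

```python
import re

# Structured: the field to search is known in advance.
students = [
    {"reg_no": "18132001", "first_name": "Priya"},
    {"reg_no": "18132002", "first_name": "Mawa"},
]
structured_hit = [s["first_name"] for s in students if s["reg_no"] == "18132002"]

# Unstructured: the same facts buried in free text (a made-up note).
note = "Met Priya (reg 18132001) today; Mawa, 18132002, was absent."

# Processing step: assume a registration number is a standalone run of 8 digits.
REG_NO = re.compile(r"\b\d{8}\b")
unstructured_hits = REG_NO.findall(note)

print(structured_hit, unstructured_hits)
```

The structured lookup needs no interpretation, while the free-text search depends on a pattern that is only a guess about how the information was written down and would miss any other phrasing.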
PREDEFINED FORMAT VS VARIETY OF FORMATS
Predefined format (structured):
• The most common format for structured data is text and numbers.
• Structured data has been defined beforehand in a data model.
• Structured data requires less storage space.
• As for databases, structured data is usually stored in a relational database.
Variety of formats (unstructured):
• Unstructured data, on the other hand, comes in a variety of shapes and sizes. It can
consist of everything from audio, video, and imagery to sensor data.
• There is no data model for the unstructured data; it is stored natively or in a data
lake that doesn't require any transformation.
• Unstructured data requires more storage space. For example, even a tiny image takes up
more space than many pages of text.
• The best fit for unstructured data is instead so-called non-relational, or NoSQL,
databases.
Ad

More Related Content

What's hot (20)

Exploratory data analysis data visualization
Exploratory data analysis data visualizationExploratory data analysis data visualization
Exploratory data analysis data visualization
Dr. Hamdan Al-Sabri
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
Gajanand Sharma
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
Sulman Ahmed
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
Sunita Sahu
 
3 Data Mining Tasks
3  Data Mining Tasks3  Data Mining Tasks
3 Data Mining Tasks
Mahmoud Alfarra
 
Data analytics vs. Data analysis
Data analytics vs. Data analysisData analytics vs. Data analysis
Data analytics vs. Data analysis
Dr. C.V. Suresh Babu
 
Data mining: Classification and prediction
Data mining: Classification and predictionData mining: Classification and prediction
Data mining: Classification and prediction
DataminingTools Inc
 
Data Quality
Data QualityData Quality
Data Quality
Michael Collins
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profiling
Shailja Khurana
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
Amanda Whitmire
 
Classification techniques in data mining
Classification techniques in data miningClassification techniques in data mining
Classification techniques in data mining
Kamal Acharya
 
01 Data Mining: Concepts and Techniques, 2nd ed.
01 Data Mining: Concepts and Techniques, 2nd ed.01 Data Mining: Concepts and Techniques, 2nd ed.
01 Data Mining: Concepts and Techniques, 2nd ed.
Institute of Technology Telkom
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
SOMASUNDARAM T
 
Decision tree
Decision treeDecision tree
Decision tree
ShraddhaPandey45
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
DataminingTools Inc
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
Gopal Sakarkar
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with Python
Davis David
 
Rule Based Algorithms.pptx
Rule Based Algorithms.pptxRule Based Algorithms.pptx
Rule Based Algorithms.pptx
RoshanSuvedi1
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
moni sindhu
 
Decision tree
Decision treeDecision tree
Decision tree
R A Akerkar
 
Exploratory data analysis data visualization
Exploratory data analysis data visualizationExploratory data analysis data visualization
Exploratory data analysis data visualization
Dr. Hamdan Al-Sabri
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
Sulman Ahmed
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
Sunita Sahu
 
Data mining: Classification and prediction
Data mining: Classification and predictionData mining: Classification and prediction
Data mining: Classification and prediction
DataminingTools Inc
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profiling
Shailja Khurana
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
Amanda Whitmire
 
Classification techniques in data mining
Classification techniques in data miningClassification techniques in data mining
Classification techniques in data mining
Kamal Acharya
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
SOMASUNDARAM T
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
DataminingTools Inc
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
Gopal Sakarkar
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with Python
Davis David
 
Rule Based Algorithms.pptx
Rule Based Algorithms.pptxRule Based Algorithms.pptx
Rule Based Algorithms.pptx
RoshanSuvedi1
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
moni sindhu
 

Similar to Classification of data (20)

the study of data to extract meaningful insights for business
the study of data to extract meaningful insights for businessthe study of data to extract meaningful insights for business
the study of data to extract meaningful insights for business
EyobTemesgen3
 
BA4206 UNIT 2.pptx business analytics ppt
BA4206 UNIT 2.pptx business analytics pptBA4206 UNIT 2.pptx business analytics ppt
BA4206 UNIT 2.pptx business analytics ppt
LogeshThondamar
 
Big data Analytics Unit - CCS334 Syllabus
Big data Analytics Unit - CCS334 SyllabusBig data Analytics Unit - CCS334 Syllabus
Big data Analytics Unit - CCS334 Syllabus
Sunanthini Rajkumar
 
Understanding the Types of Data in Data Science|ashokveda.pdf
Understanding the Types of Data in Data Science|ashokveda.pdfUnderstanding the Types of Data in Data Science|ashokveda.pdf
Understanding the Types of Data in Data Science|ashokveda.pdf
df2608021
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
Johnson Ubah
 
Data Exploration and Transformation.pptx
Data Exploration and Transformation.pptxData Exploration and Transformation.pptx
Data Exploration and Transformation.pptx
lovepreet33653
 
Big data analytics(BAD601) module-1 ppt
Big data analytics(BAD601)  module-1 pptBig data analytics(BAD601)  module-1 ppt
Big data analytics(BAD601) module-1 ppt
AmbikaVenkatesh4
 
MANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITY
MANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITYMANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITY
MANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITY
RhemaJoy2
 
computer.pdf
computer.pdfcomputer.pdf
computer.pdf
ShivamYadav886008
 
What are the implications of unstructured data to database design- Sup.docx
What are the implications of unstructured data to database design- Sup.docxWhat are the implications of unstructured data to database design- Sup.docx
What are the implications of unstructured data to database design- Sup.docx
loisj1
 
Introduction of Data Science and Data Analytics
Introduction of Data Science and Data AnalyticsIntroduction of Data Science and Data Analytics
Introduction of Data Science and Data Analytics
VrushaliSolanke
 
Chapter 2.ppt on Types of Digital f Data
Chapter 2.ppt on Types of Digital f DataChapter 2.ppt on Types of Digital f Data
Chapter 2.ppt on Types of Digital f Data
FatimaNaqvi47
 
Navigating the BI Stack _
Navigating the BI Stack _Navigating the BI Stack _
Navigating the BI Stack _
Michael Phipps
 
Unit_II_1_Types_of_Data.pptx
Unit_II_1_Types_of_Data.pptxUnit_II_1_Types_of_Data.pptx
Unit_II_1_Types_of_Data.pptx
Shahdrashti4
 
Data Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data QualityData Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data Quality
Precisely
 
INTRODUCTION TO DATA ANALYTICS -MODULE 1.pptx
INTRODUCTION TO DATA ANALYTICS -MODULE 1.pptxINTRODUCTION TO DATA ANALYTICS -MODULE 1.pptx
INTRODUCTION TO DATA ANALYTICS -MODULE 1.pptx
paathuu04
 
AWC Career Bootcamp- August 21, 2013
AWC Career Bootcamp- August 21, 2013AWC Career Bootcamp- August 21, 2013
AWC Career Bootcamp- August 21, 2013
Patricia A Gilson
 
Introduction to Big Data Analytics.ppsx
Introduction to Big Data Analytics.ppsxIntroduction to Big Data Analytics.ppsx
Introduction to Big Data Analytics.ppsx
JSujatha2
 
Database
DatabaseDatabase
Database
Vaibhav Bajaj
 
Database
DatabaseDatabase
Database
wwaqas2007
 
the study of data to extract meaningful insights for business
the study of data to extract meaningful insights for businessthe study of data to extract meaningful insights for business
the study of data to extract meaningful insights for business
EyobTemesgen3
 
BA4206 UNIT 2.pptx business analytics ppt
BA4206 UNIT 2.pptx business analytics pptBA4206 UNIT 2.pptx business analytics ppt
BA4206 UNIT 2.pptx business analytics ppt
LogeshThondamar
 
Big data Analytics Unit - CCS334 Syllabus
Big data Analytics Unit - CCS334 SyllabusBig data Analytics Unit - CCS334 Syllabus
Big data Analytics Unit - CCS334 Syllabus
Sunanthini Rajkumar
 
Understanding the Types of Data in Data Science|ashokveda.pdf
Understanding the Types of Data in Data Science|ashokveda.pdfUnderstanding the Types of Data in Data Science|ashokveda.pdf
Understanding the Types of Data in Data Science|ashokveda.pdf
df2608021
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
Johnson Ubah
 
Data Exploration and Transformation.pptx
Data Exploration and Transformation.pptxData Exploration and Transformation.pptx
Data Exploration and Transformation.pptx
lovepreet33653
 
Big data analytics(BAD601) module-1 ppt
Big data analytics(BAD601)  module-1 pptBig data analytics(BAD601)  module-1 ppt
Big data analytics(BAD601) module-1 ppt
AmbikaVenkatesh4
 
MANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITY
MANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITYMANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITY
MANAGING RESOURCES FOR BUSINESS ANALYTICS BA4206 ANNA UNIVERSITY
RhemaJoy2
 
What are the implications of unstructured data to database design- Sup.docx
What are the implications of unstructured data to database design- Sup.docxWhat are the implications of unstructured data to database design- Sup.docx
What are the implications of unstructured data to database design- Sup.docx
loisj1
 
Introduction of Data Science and Data Analytics
Introduction of Data Science and Data AnalyticsIntroduction of Data Science and Data Analytics
Introduction of Data Science and Data Analytics
VrushaliSolanke
 
Chapter 2.ppt on Types of Digital f Data
Chapter 2.ppt on Types of Digital f DataChapter 2.ppt on Types of Digital f Data
Chapter 2.ppt on Types of Digital f Data
FatimaNaqvi47
 
Navigating the BI Stack _
Navigating the BI Stack _Navigating the BI Stack _
Navigating the BI Stack _
Michael Phipps
 
Unit_II_1_Types_of_Data.pptx
Unit_II_1_Types_of_Data.pptxUnit_II_1_Types_of_Data.pptx
Unit_II_1_Types_of_Data.pptx
Shahdrashti4
 
Data Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data QualityData Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data Quality
Precisely
 
INTRODUCTION TO DATA ANALYTICS -MODULE 1.pptx
INTRODUCTION TO DATA ANALYTICS -MODULE 1.pptxINTRODUCTION TO DATA ANALYTICS -MODULE 1.pptx
INTRODUCTION TO DATA ANALYTICS -MODULE 1.pptx
paathuu04
 
AWC Career Bootcamp- August 21, 2013
AWC Career Bootcamp- August 21, 2013AWC Career Bootcamp- August 21, 2013
AWC Career Bootcamp- August 21, 2013
Patricia A Gilson
 
Introduction to Big Data Analytics.ppsx
Introduction to Big Data Analytics.ppsxIntroduction to Big Data Analytics.ppsx
Introduction to Big Data Analytics.ppsx
JSujatha2
 
Ad

More from Dr. C.V. Suresh Babu (20)

Data analytics with R
Data analytics with RData analytics with R
Data analytics with R
Dr. C.V. Suresh Babu
 
Association rules
Association rulesAssociation rules
Association rules
Dr. C.V. Suresh Babu
 
Clustering
ClusteringClustering
Clustering
Dr. C.V. Suresh Babu
 
Classification
ClassificationClassification
Classification
Dr. C.V. Suresh Babu
 
Blue property assumptions.
Blue property assumptions.Blue property assumptions.
Blue property assumptions.
Dr. C.V. Suresh Babu
 
Introduction to regression
Introduction to regressionIntroduction to regression
Introduction to regression
Dr. C.V. Suresh Babu
 
DART
DARTDART
DART
Dr. C.V. Suresh Babu
 
Mycin
MycinMycin
Mycin
Dr. C.V. Suresh Babu
 
Expert systems
Expert systemsExpert systems
Expert systems
Dr. C.V. Suresh Babu
 
Dempster shafer theory
Dempster shafer theoryDempster shafer theory
Dempster shafer theory
Dr. C.V. Suresh Babu
 
Bayes network
Bayes networkBayes network
Bayes network
Dr. C.V. Suresh Babu
 
Bayes' theorem
Bayes' theoremBayes' theorem
Bayes' theorem
Dr. C.V. Suresh Babu
 
Knowledge based agents
Knowledge based agentsKnowledge based agents
Knowledge based agents
Dr. C.V. Suresh Babu
 
Rule based system
Rule based systemRule based system
Rule based system
Dr. C.V. Suresh Babu
 
Formal Logic in AI
Formal Logic in AIFormal Logic in AI
Formal Logic in AI
Dr. C.V. Suresh Babu
 
Production based system
Production based systemProduction based system
Production based system
Dr. C.V. Suresh Babu
 
Game playing in AI
Game playing in AIGame playing in AI
Game playing in AI
Dr. C.V. Suresh Babu
 
Diagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AIDiagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AI
Dr. C.V. Suresh Babu
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
Dr. C.V. Suresh Babu
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
Dr. C.V. Suresh Babu
 
Ad

Recently uploaded (20)

Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 

Classification of data

  • 1. CLASSIFICATION OF DATA. Dr. C.V. Suresh Babu (Centre for Knowledge Transfer) institute
  • 2. OBJECTIVES
      • To understand the various classifications of data
      • To know what structured data is
      • To know what unstructured data is
      • To know what semi-structured data is
      • To understand the key differences between structured and unstructured data
  • 3. DISCUSSION TOPICS
      • Classification of Data
      • What is Structured Data?
      • What is Unstructured Data?
      • What is Semi-structured Data?
      • Structured vs Unstructured Data: 5 Key Differences
  • 4. CLASSIFICATION OF DATA
      • Data classification is broadly defined as the process of organizing data into relevant categories so that it can be used more efficiently. At a basic level, classification makes data easier to locate and retrieve. It is particularly important for risk management, compliance, and data security.
      • Data classification involves tagging data to make it easily searchable and trackable. It also helps eliminate duplicate copies of data, which can reduce storage and backup costs while speeding up search.
  • 6. WHAT IS STRUCTURED DATA?
      • The term structured data refers to data that resides in a fixed field within a file or record. Structured data is typically stored in a relational database (RDBMS). It can consist of numbers and text, and it can be sourced automatically or manually, as long as it fits within an RDBMS structure. It depends on the creation of a data model that defines what types of data to include and how to store and process them.
      • The language typically used to query and manage structured data is SQL (Structured Query Language). Typical examples of structured data are names, Reg. No., Marks, Attendance, and so on.

      S.No.   First Name      Last Name     Reg. No.
      1       Priya           Dharshini     18132001
      2       Mawa            Chouhan       18132002
      3       Sai Phanindra   Muvvala       18132003
      4       Nandhini        Venkatesan    18132004
  • 7. WHAT IS UNSTRUCTURED DATA?
      • Unstructured data is, more or less, all the data that is not structured. Even though unstructured data may have a native, internal structure, it is not structured in a predefined way. There is no data model; the data is stored in its native format.
      • Typical examples of unstructured data are rich media, text, social media activity, surveillance imagery, and so on. The amount of unstructured data is much larger than that of structured data. Unstructured data is commonly estimated to make up about 80% of all enterprise data, and the percentage keeps growing. This means that companies that do not take unstructured data into account are missing out on a lot of valuable business intelligence.
  • 9. WHAT IS SEMI-STRUCTURED DATA?
      • Semi-structured data is a third category that falls somewhere between the other two. It is a type of structured data that does not fit into the formal structure of a relational database. Although it does not match the description of structured data entirely, it still uses tags or other markers to separate different elements and enable search. This is sometimes referred to as data with a self-describing structure.
      • A typical example of semi-structured data is a smartphone photo. Every photo taken with a smartphone contains unstructured image content as well as the tagged time, location, and other identifiable (and structured) information. Semi-structured data formats include JSON, CSV, and XML file types.
  • 10. STRUCTURED VS UNSTRUCTURED DATA:
      • Defined vs Undefined Data
      • Quantitative vs Qualitative Data
      • Storage in Data Warehouses vs Data Lakes
      • Easy vs Hard to Analyze
      • Predefined Format vs a Variety of Formats
  • 11. DEFINED VS UNDEFINED DATA
      • Defined (structured data): Structured data consists of clearly defined types of data held in a structure. It lives in rows and columns and can be mapped into predefined fields.
      • Undefined (unstructured data): Unstructured data is usually stored in its native format. Unlike structured data, which is organized and easy to access in relational databases, unstructured data does not have a predefined data model.
  • 12. QUANTITATIVE VS QUALITATIVE DATA
      • Quantitative data: Structured data is often quantitative data, meaning it usually consists of hard numbers or things that can be counted. Methods for analysis include regression (to predict relationships between variables), classification (to estimate probability), and clustering of data (based on different attributes).
      • Qualitative data: Unstructured data, on the other hand, is often categorized as qualitative data, and cannot be processed and analyzed using conventional tools and methods. In a business context, qualitative data can, for example, come from customer surveys, interviews, and social media interactions. Extracting insights from qualitative data requires advanced analytics techniques like data mining and data stacking.
  • 13. STORAGE IN DATA WAREHOUSES VS DATA LAKES
      • Storage in data warehouses: Structured data is often stored in data warehouses. A data warehouse is the endpoint for the data’s journey through an ETL pipeline. Structured data requires less storage space. As for databases, structured data is usually stored in a relational database (RDBMS).
      • Storage in data lakes: Unstructured data is stored in data lakes. A data lake is an almost limitless repository where data is stored in its original format or after undergoing a basic “cleaning” process. Unstructured data requires more storage space; for example, even a tiny image takes up more space than many pages of text. The best fit for unstructured data is a so-called non-relational, or NoSQL, database.
      • Both have the potential for cloud use.
  • 14. EASE OF ANALYSIS
      One of the most significant differences between structured and unstructured data is how well each lends itself to analysis.
      • Structured data: Structured data is easy to search, both for humans and for algorithms. There is a wide array of sophisticated analytics tools for structured data.
      • Unstructured data: Unstructured data, on the other hand, is intrinsically more difficult to search and requires processing to become understandable. It is challenging to deconstruct because it lacks a predefined data model and hence does not fit into relational databases. Most analytics tools for mining and arranging unstructured data are still in the developing phase. The lack of predefined structure makes data mining tricky, and developing best practices on how to handle data sources like rich media, blogs, social media data, and ...
  • 15. PREDEFINED FORMAT VS VARIETY OF FORMATS
      • Predefined format: The most common format for structured data is text and numbers. Structured data has been defined beforehand in a data model. Structured data requires less storage space. As for databases, structured data is usually stored in a relational database.
      • Variety of formats: Unstructured data, on the other hand, comes in a variety of shapes and sizes. It can consist of everything from audio, video, and imagery to sensor data. There is no data model for unstructured data; it is stored natively or in a data lake that doesn't require any transformation. Unstructured data requires more storage space; for example, even a tiny image takes up more space than many pages of text. The best fit for unstructured data is a non-relational, or NoSQL, database.
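
The contrast between structured and semi-structured data described in the slides can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the slides: an in-memory SQLite database stands in for an RDBMS, the `students` table and its column names are hypothetical (populated with the first two rows of the example table), and the photo metadata record is made up for illustration.

```python
import json
import sqlite3


def build_student_table() -> sqlite3.Connection:
    """Structured data: the schema (data model) is fixed before any row is stored."""
    conn = sqlite3.connect(":memory:")  # assumption: in-memory stand-in for an RDBMS
    conn.execute(
        "CREATE TABLE students ("  # hypothetical table and column names
        "s_no INTEGER PRIMARY KEY, "
        "first_name TEXT NOT NULL, "
        "last_name TEXT NOT NULL, "
        "reg_no TEXT NOT NULL UNIQUE)"
    )
    rows = [
        (1, "Priya", "Dharshini", "18132001"),
        (2, "Mawa", "Chouhan", "18132002"),
    ]
    conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?)", rows)
    return conn


def find_student(conn: sqlite3.Connection, reg_no: str):
    """Look up a student by a predefined field using SQL; returns None if absent."""
    cursor = conn.execute(
        "SELECT first_name, last_name FROM students WHERE reg_no = ?",
        (reg_no,),
    )
    return cursor.fetchone()


# Semi-structured data: a made-up photo metadata record. The keys (tags)
# describe the values, but no schema is declared up front, and the image
# bytes themselves would remain unstructured.
PHOTO_JSON = """
{
  "file": "IMG_0001.jpg",
  "taken_at": "2021-03-14T09:26:53",
  "location": {"lat": 13.08, "lon": 80.27},
  "tags": ["campus", "library"]
}
"""


def describe_photo(raw: str) -> list:
    """Return the field names discovered in the record itself, sorted."""
    record = json.loads(raw)
    return sorted(record.keys())


if __name__ == "__main__":
    conn = build_student_table()
    print(find_student(conn, "18132001"))
    print(describe_photo(PHOTO_JSON))
```

In the SQL half, the fields exist before the data does, so a query can rely on them. In the JSON half, the fields are discovered by reading the record, which is what "self-describing" means in practice.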