• Rough set (RS) theory, introduced by Zdzisław Pawlak in his seminal 1982 paper (Pawlak 1982), is extensively used for pattern recognition.
• RS has emerged as a powerful mathematical tool for handling uncertainty, that is, indiscernibility between objects in a set.
• The utility of RS is also well recognized in various knowledge discovery processes.
• In computer science, a rough set is a formal approximation of a crisp set (i.e., a conventional set) in terms of a pair of sets that give the lower and the upper approximation of the original set.
• In the standard version of rough set theory (Pawlak 1991), the lower- and upper-approximation sets are crisp sets, but in other variations the approximating sets may be fuzzy sets.
• It is a formal theory derived from fundamental research on the logical properties of information systems.
• Rough set theory has been used as a methodology for database mining and knowledge discovery in relational databases.
• In its abstract form, it is a new area of uncertainty mathematics closely related to fuzzy theory.
• The rough set approach can be used to discover structural relationships within imprecise and noisy data. Rough sets and fuzzy sets are complementary generalizations of classical sets: the approximation spaces of rough set theory are sets with multiple memberships, while fuzzy sets are concerned with partial memberships.
• The rapid development of these two approaches provided a basis for "soft computing," a term initiated by Lotfi A. Zadeh. Along with rough sets, soft computing includes at least fuzzy logic, neural networks, probabilistic reasoning, belief networks, machine learning, evolutionary computing, and chaos theory.
• Goals of Rough Set Theory –
• The main goal of rough set analysis is the induction of (learning of) approximations of concepts. Rough sets constitute a sound basis for KDD, offering mathematical tools to discover patterns hidden in data.
• It can be used for feature selection, feature extraction, data reduction, decision rule generation, and pattern extraction (templates, association rules), etc.
• It identifies partial or total dependencies in data, eliminates redundant data, and provides approaches to handling null values, missing data, dynamic data, and more.
A survey on rough set theory and its applications
• Proposed by Professor Pawlak in 1982, rough set theory is an important mathematical tool for dealing with imprecise, inconsistent, and incomplete information and knowledge.
• Originating from a simple information model, the basic idea of rough set theory can be divided into two parts.
• The first part is to form concepts and rules through the classification of a relational database.
• The second part is to discover knowledge through classification by equivalence relations and through the approximation of the target concept.
• As a theory of data analysis and processing, rough set theory is a mathematical tool for dealing with uncertain information, following probability theory, fuzzy set theory, and evidence theory.
• Owing to its novel ideas, unique method, and ease of operation, rough set theory has become an important tool in the field of intelligent information processing.
• It has been widely used in machine learning, knowledge discovery, data mining, decision support and analysis, etc.
• In 1992, the first International Workshop on rough set theory was held in Poland, and rough set theory was recognized as a new research topic in computer science by the ACM in 1995.
• 2. Basic concepts of rough sets
• One of the main research problems of rough sets is the approximation of sets; the other is the design of algorithms for analyzing and reasoning about the related data. Some basic concepts of rough set theory are reviewed as follows.
• Consider a simple knowledge representation scheme in which a finite set of objects is described by a finite set of attributes. Formally, it can be defined by an information system S expressed as the 4-tuple
• S = 〈U, R, V, f〉, R = C ∪ D,
• where U is a finite nonempty set of objects, R is a finite nonempty set of attributes, and the subsets C and D are called the condition attribute set and the decision attribute set, respectively.
• V = ∪_{a∈R} V_a, where V_a is the set of values of attribute a with card(V_a) > 1, and f: U × R → V is an information (or description) function that assigns to each object the value of each of its attributes.
• In Table 1, U = {x1, x2, …, x6} is a finite nonempty set, also called a universe, and R = {Headache, Myalgia, Temperature, Flu} is a finite nonempty set, also called an attribute set.
• Definition 1
• (Indiscernibility relation) [5] Given a subset of the attribute set B ⊆ R, an indiscernibility relation ind(B) on the universe U can be defined as follows:
• ind(B) = {(x, y) | (x, y) ∈ U², ∀b ∈ B (b(x) = b(y))}
• Table 1. An information table [5].

  Individual   Headache   Myalgia   Temperature   Flu
  x1           Yes        Yes       Normal        No
  x2           Yes        Yes       High          Yes
  x3           Yes        Yes       Very high     Yes
  x4           No         Yes       Normal        No
  x5           No         No        High          No
  x6           No         Yes       Very high     Yes

• The indiscernibility relation is an equivalence relation. The equivalence class of an object x is denoted by [x]_ind(B), or simply [x]_B or [x] if no confusion arises. The pair (U, ind(B)) is called an approximation space.
• Definition 2
• (Upper and lower approximation sets) [5] Given an information system S = 〈U, R, V, f〉, for a subset X ⊆ U and B ⊆ R, its lower and upper approximation sets are defined, respectively, by
• B_*(X) = {x ∈ U | [x]_B ⊆ X} and B^*(X) = {x ∈ U | [x]_B ∩ X ≠ ∅}.
• The lower approximation B_*(X) collects the objects that certainly belong to X, the upper approximation B^*(X) collects the objects that possibly belong to X, and their difference is the boundary region of X.
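To make Definitions 1 and 2 concrete, here is a minimal Python sketch that computes the ind(B)-classes and the lower and upper approximations for Table 1. The helper names (equivalence_classes, lower_approximation, upper_approximation) are our own, not from [5]; we take B = {Temperature} and the target concept X = {x | Flu = Yes}.

```python
# A minimal sketch (helper names are our own, not from the cited source)
# computing ind(B)-classes and the approximations of Definition 2 for Table 1.

TABLE = {
    "x1": {"Headache": "Yes", "Myalgia": "Yes", "Temperature": "Normal",    "Flu": "No"},
    "x2": {"Headache": "Yes", "Myalgia": "Yes", "Temperature": "High",      "Flu": "Yes"},
    "x3": {"Headache": "Yes", "Myalgia": "Yes", "Temperature": "Very high", "Flu": "Yes"},
    "x4": {"Headache": "No",  "Myalgia": "Yes", "Temperature": "Normal",    "Flu": "No"},
    "x5": {"Headache": "No",  "Myalgia": "No",  "Temperature": "High",      "Flu": "No"},
    "x6": {"Headache": "No",  "Myalgia": "Yes", "Temperature": "Very high", "Flu": "Yes"},
}

def equivalence_classes(universe, B):
    """Partition U by ind(B): x ~ y iff b(x) == b(y) for every b in B."""
    classes = {}
    for x in universe:
        key = tuple(TABLE[x][b] for b in B)
        classes.setdefault(key, set()).add(x)
    return list(classes.values())

def lower_approximation(X, classes):
    """B_*(X): union of the equivalence classes entirely contained in X."""
    return set().union(*(c for c in classes if c <= X))

def upper_approximation(X, classes):
    """B^*(X): union of the equivalence classes that intersect X."""
    return set().union(*(c for c in classes if c & X))

U = sorted(TABLE)
B = ["Temperature"]                              # condition subset B ⊆ R
X = {x for x in U if TABLE[x]["Flu"] == "Yes"}   # target concept X = {x2, x3, x6}
classes = equivalence_classes(U, B)

print(sorted(map(sorted, classes)))              # [['x1','x4'], ['x2','x5'], ['x3','x6']]
print(sorted(lower_approximation(X, classes)))   # ['x3', 'x6']            -- certainly Flu
print(sorted(upper_approximation(X, classes)))   # ['x2','x3','x5','x6']   -- possibly Flu
```

With B = {Temperature}, objects x2 and x5 are indiscernible yet differ on Flu, so they fall into the boundary region B^*(X) \ B_*(X): this is exactly the roughness of X with respect to B.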
Decision Tree Classification Algorithm
• Introduction
• Decision Trees are a type of Supervised Machine Learning (that is, you specify what the input is and what the corresponding output is in the training data) in which the data is continuously split according to a certain parameter.
• The tree can be explained by two entities, namely decision nodes and leaves. The leaves are the decisions or final outcomes, and the decision nodes are where the data is split.
• Decision Tree is a Supervised learning technique that can be used for both Classification and Regression problems, but it is mostly preferred for solving Classification problems. It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.
• In a Decision tree, there are two kinds of nodes: the Decision Node and the Leaf Node. Decision nodes are used to make decisions and have multiple branches, whereas Leaf nodes are the outputs of those decisions and do not contain any further branches.
• The decisions or tests are performed on the basis of the features of the given dataset.
• It is a graphical representation for obtaining all the possible solutions to a problem/decision based on given conditions.
• It is called a decision tree because, similar to a tree, it starts at the root node, which expands into further branches and constructs a tree-like structure.
• To build the tree, we use the CART algorithm, which stands for Classification And Regression Tree algorithm.
• A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees.
• [Diagram omitted: the general structure of a decision tree.]
• Note: A decision tree can handle categorical data (YES/NO) as well as numeric data.
• Decision Tree Terminologies
• Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is further divided into two or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split further once a leaf node is reached.
• Splitting: Splitting is the process of dividing a decision node (or the root node) into sub-nodes according to the given conditions.
• Branch/Sub-Tree: A tree formed by splitting the tree.
• Pruning: Pruning is the process of removing unwanted branches from the tree.
• Parent/Child node: The root node of the tree is called the parent node, and the other nodes are called child nodes.
• Example: Suppose a candidate has a job offer and wants to decide whether to accept it or not.
• To solve this problem, the decision tree starts with the root node (the Salary attribute, selected by an Attribute Selection Measure, ASM).
• The root node splits further into the next decision node (distance from the office) and one leaf node, based on the corresponding labels.
• The next decision node further splits into one decision node (cab facility) and one leaf node.
• Finally, that decision node splits into two leaf nodes (Accept offer and Decline offer). [Diagram omitted: the job-offer decision tree; a hand-coded sketch of this tree follows the list of advantages and disadvantages below.]
• Advantages of the Decision Tree
• It is simple to understand, as it follows the same process a human follows when making a decision in real life.
• It can be very useful for solving decision-related problems.
• It helps in thinking through all the possible outcomes of a problem.
• It requires less data cleaning compared to other algorithms.
• Disadvantages of the Decision Tree
• A decision tree may contain many layers, which makes it complex.
• It may have an overfitting issue, which can be mitigated using the Random Forest algorithm.
• With more class labels, the computational complexity of the decision tree may increase.
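The job-offer tree from the example above can be written out directly as nested if/else tests, which makes the decision node / leaf node distinction explicit. This is a hand-coded illustration only: the thresholds and argument names are assumptions, since the text gives no concrete values.

```python
# A hand-coded version of the job-offer decision tree described above.
# Thresholds and parameter names are illustrative assumptions.

def evaluate_offer(salary, distance_km, cab_facility):
    # Root node: test the Salary attribute.
    if salary < 50_000:                  # assumed threshold
        return "Decline offer"           # leaf node
    # Decision node: distance from the office.
    if distance_km > 30:                 # assumed threshold
        # Decision node: cab facility.
        if cab_facility:
            return "Accept offer"        # leaf node
        return "Decline offer"           # leaf node
    return "Accept offer"                # leaf node

print(evaluate_offer(salary=60_000, distance_km=40, cab_facility=True))  # Accept offer
```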
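In practice the tree is not written by hand but learned from data with CART, as described above. The following minimal sketch uses scikit-learn's DecisionTreeClassifier, an assumed library choice (the text names only the CART algorithm), on a tiny dataset invented purely for illustration; max_depth acts as a simple pruning-like control against overfitting.

```python
# A minimal sketch of training a CART-style classifier with scikit-learn.
# The library choice and the toy dataset are assumptions for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [salary_in_thousands, distance_km, cab_facility (0/1)]
X = [
    [40, 10, 0],
    [60, 40, 1],
    [60, 40, 0],
    [70,  5, 0],
    [45, 25, 1],
    [80, 35, 1],
]
y = ["Decline", "Accept", "Decline", "Accept", "Decline", "Accept"]

# CART-style tree using Gini impurity as the attribute selection measure (ASM).
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X, y)

# Inspect the learned decision nodes and leaves as text.
print(export_text(clf, feature_names=["salary_k", "distance_km", "cab"]))
print(clf.predict([[65, 20, 0]]))  # e.g. ['Accept'] on this toy data
```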