DAV Unit 1

Uploaded by

Khushbu Pandya


IAR University

Department of Computer Sciences and Engineering

B.Tech (CE-AI) & B.Tech IT SEM VI
Subject: CE412_Data Analytics and Visualization
CS707_Data Analytics

Unit 1

What is Data in Data Analytics?

Data in data analytics refers to raw figures or facts that are collected, stored, and processed to
derive meaningful insights. It can be numbers, text, images, videos, or other types of
measurable information used to analyze trends, patterns, and behaviors.

In simple terms:

• Data is like raw material.
• Once processed, it helps people make decisions, solve problems, and predict future outcomes.

Types of Data

1. Structured Data:
o Organized in rows and columns like in a table or database.
o Example: An Excel sheet with student names, ages, and marks.
2. Unstructured Data:
o Not organized or stored in a predefined format.
o Example: Photos, social media posts, emails, or videos.
3. Semi-structured Data:
o Not fully organized but has some structure (like tags or metadata).
o Example: JSON or XML files.
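
The difference between structured and semi-structured data can be sketched in a few lines of Python. The student records below are made-up values for illustration, and only the standard-library csv and json modules are used.

```python
import csv
import io
import json

# Structured data: every row has the same columns (hypothetical student table).
csv_text = "name,age,marks\nAsha,20,82\nRavi,21,74\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured data: keys (tags) give some structure, but records may differ.
json_text = '[{"name": "Asha", "hobbies": ["chess"]}, {"name": "Ravi", "city": "Surat"}]'
records = json.loads(json_text)

# Every structured row exposes the same fields.
structured_fields = [sorted(row.keys()) for row in rows]

# Semi-structured records can carry different fields from one record to the next.
semi_structured_fields = [sorted(rec.keys()) for rec in records]

print(structured_fields)
print(semi_structured_fields)
```

Unstructured data (photos, free text, video) has no such field names at all, which is why it usually needs extra processing before analysis.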

Characteristics of Data

1. Qualitative Data (Categorical):

o Non-numerical data.
o Example: Colors of cars (red, blue, black), customer feedback.
2. Quantitative Data (Numerical):
o Numbers or measurements.
o Example: Height, weight, sales figures.

Importance of Data in Analytics

• Data is the foundation of data analytics.
• Without data, it is impossible to perform analysis or make predictions.
• Data helps businesses understand customer behavior, improve operations, and increase
efficiency.

Dr. Pankti Bhatt

Example of Data in Data Analytics

Imagine a retail store wants to understand customer buying habits. Here’s the data they might
collect:

1. Customer details: Name, age, gender.

2. Sales transactions: Items bought, price, payment method.
3. Feedback: Customer ratings and reviews.

Using this data, the store can:

• Identify its most popular products.
• Predict seasonal demand for items.
• Improve customer experience by addressing complaints.

What is Data Analytics?

Data Analytics is the process of examining, organizing, and interpreting data to extract useful
insights and patterns. These insights help businesses, organizations, and individuals make
better decisions.

In simple terms:

• Data analytics is like solving a puzzle where the pieces are bits of data.
• When you put the pieces together, you can understand a bigger picture, like trends,
behaviors, or outcomes.

Key Steps in Data Analytics

1. Collecting Data:
o Gather data from various sources like surveys, sales reports, sensors, or
websites.
o Example: A company collects data about product sales, customer feedback, and
website visits.
2. Cleaning Data:
o Remove errors, duplicates, or irrelevant data to ensure accuracy.
o Example: Fix missing data or correct typos in a customer database.
3. Analyzing Data:
o Use statistical and computational methods to find patterns or relationships in
the data.
o Example: Calculate average sales or identify which products are most popular.
4. Visualizing Data:
o Present findings using charts, graphs, or dashboards to make them easy to
understand.
o Example: Create a pie chart showing the percentage of sales from different
product categories.
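
The four steps above can be sketched on a tiny in-memory dataset. Everything here is an illustrative assumption: the sales records are invented, and a text bar chart stands in for a real plotting library.

```python
from collections import defaultdict
from statistics import mean

# Step 1 (collect): hypothetical sales records; None marks a missing amount.
raw_sales = [
    {"order_id": 1, "category": "Snacks", "amount": 120.0},
    {"order_id": 2, "category": "Drinks", "amount": 80.0},
    {"order_id": 2, "category": "Drinks", "amount": 80.0},   # duplicate row
    {"order_id": 3, "category": "Snacks", "amount": None},   # missing value
    {"order_id": 4, "category": "Dairy", "amount": 200.0},
]

# Step 2 (clean): drop duplicate order IDs and rows with a missing amount.
seen, clean_sales = set(), []
for row in raw_sales:
    if row["order_id"] in seen or row["amount"] is None:
        continue
    seen.add(row["order_id"])
    clean_sales.append(row)

# Step 3 (analyze): average order value and total sales per category.
average_sale = mean(row["amount"] for row in clean_sales)
totals = defaultdict(float)
for row in clean_sales:
    totals[row["category"]] += row["amount"]

# Step 4 (visualize): a text bar chart of each category's share of sales.
grand_total = sum(totals.values())
shares = {cat: 100 * amt / grand_total for cat, amt in totals.items()}
for cat, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{cat:<8}{'#' * round(pct / 5)} {pct:.0f}%")
```

Dropping incomplete rows is only one cleaning policy; filling in missing values is another, and the right choice depends on the data.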

Types of Data Analytics

1. Descriptive Analytics:
o Tells what happened in the past.
o Example: "Our sales increased by 20% last quarter."
2. Diagnostic Analytics:
o Explains why something happened.
o Example: "Sales increased because of a holiday season promotion."
3. Predictive Analytics:
o Predicts future outcomes based on past data.
o Example: "Sales are likely to increase by 30% next quarter."
4. Prescriptive Analytics:
o Suggests actions to achieve a desired outcome.
o Example: "Offer discounts during the holiday season to boost sales."
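
A minimal sketch of three of these types on made-up quarterly figures follows. The forecast simply assumes last quarter's growth rate repeats, which is a deliberately naive stand-in for a real predictive model, and diagnostic analytics is left out because it needs extra context such as promotion dates.

```python
# Hypothetical quarterly sales figures (in thousands).
quarterly_sales = [100.0, 110.0, 125.0, 150.0]

# Descriptive: what happened? Percentage change over the last quarter.
previous, latest = quarterly_sales[-2], quarterly_sales[-1]
growth_pct = 100 * (latest - previous) / previous

# Predictive (naive): assume next quarter grows at the same rate.
forecast = latest * (1 + growth_pct / 100)

# Prescriptive (rule of thumb): suggest an action from the forecast.
action = "increase stock" if forecast > latest else "hold stock"

print(f"growth {growth_pct:.0f}%, forecast {forecast:.0f}, action: {action}")
```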

Importance of Data Analytics

• Helps businesses understand customers better.
• Improves decision-making by providing data-driven insights.
• Detects patterns or trends to predict future events.
• Optimizes processes and performance.

Example of Data Analytics

Scenario: A restaurant wants to improve its business.

1. Data Collection:
o The restaurant collects data on daily sales, popular dishes, customer
demographics, and feedback.
2. Analysis:
o They find that:
• Most customers order burgers and fries.
• Sales peak during weekends.
• Customers complain about slow service.
3. Insights:
o The restaurant learns that it should:
• Focus on improving its burger menu.
• Hire extra staff on weekends to handle the rush and speed up service.
4. Action:
o Introduce a new burger combo and streamline kitchen processes to improve
service speed.

What is Data Mining?

Data Mining is the process of discovering patterns, trends, and useful information from large
sets of data. It uses techniques from statistics, machine learning, and database systems to
analyze and extract hidden insights that might not be immediately obvious.

In simple terms:

• Data mining is like digging into a mountain of data to find valuable "gold nuggets" of
information.
• It helps organizations make better decisions by uncovering trends and patterns in data.

Steps in Data Mining

1. Data Collection:
o Gather large amounts of data from various sources like databases, websites, or
devices.
o Example: An online retailer collects data on customer purchases, browsing
habits, and product reviews.
2. Data Cleaning:
o Remove errors, duplicates, or irrelevant data to ensure the analysis is accurate.
o Example: Fix typos in customer names or remove incomplete records.
3. Data Integration:
o Combine data from multiple sources into a single dataset.
o Example: Merge customer purchase data with demographic information.
4. Data Analysis:
o Apply algorithms to identify patterns, trends, or relationships in the data.
o Example: Use clustering to group customers with similar buying habits.
5. Interpretation and Action:
o Translate the patterns into actionable insights.
o Example: Use findings to recommend products to customers or improve
marketing strategies.

Techniques in Data Mining

1. Classification:
o Assign data into predefined categories.
o Example: Classifying emails as "spam" or "not spam."
2. Clustering:
o Group similar data points together.
o Example: Grouping customers based on shopping habits.
3. Association Rules:
o Find relationships between items in a dataset.
o Example: "Customers who buy bread often buy butter."
4. Regression:
o Predict a numeric value based on existing data.
o Example: Predicting house prices based on size and location.
5. Outlier Detection:
o Identify data points that don't fit the usual pattern.
o Example: Detecting fraudulent transactions in a credit card dataset.
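
Of the techniques above, regression is the easiest to sketch without a library. The example below fits a straight line by ordinary least squares to invented house sizes and prices; the numbers and the predict_price helper are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical training data: house size (square metres) and price (thousands).
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 200.0, 250.0, 300.0]

# Least-squares fit of price = intercept + slope * size.
size_mean, price_mean = mean(sizes), mean(prices)
slope = sum((s - size_mean) * (p - price_mean) for s, p in zip(sizes, prices)) / sum(
    (s - size_mean) ** 2 for s in sizes
)
intercept = price_mean - slope * size_mean


def predict_price(size):
    """Predict a price from the fitted line."""
    return intercept + slope * size


print(predict_price(100.0))
```

A real house-price model would also use location and other features, as the example in the list suggests.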

Importance of Data Mining

• Helps businesses understand customer behavior.
• Identifies trends and patterns to improve decision-making.
• Detects anomalies or risks, such as fraud or equipment failure.
• Increases efficiency by uncovering hidden opportunities.

Example of Data Mining

Scenario: A supermarket wants to increase sales.

1. Data Collection:
o The supermarket collects data on customer purchases over a year.
2. Analysis Using Association Rules:
o They find that customers who buy diapers often buy beer.
3. Insight:
o There is a strong association between these two products.
4. Action:
o The supermarket places beer and diapers closer to each other to encourage
combined purchases, leading to increased sales.
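
How strong such an association is can be measured with support, confidence, and lift. The sketch below computes them for the rule {diapers} -> {beer} on five invented baskets, so the numbers illustrate the arithmetic rather than any real store.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)


# Rule under test: {diapers} -> {beer}
rule_support = support({"diapers", "beer"})       # how often both appear together
confidence = rule_support / support({"diapers"})  # share of diaper baskets with beer
lift = confidence / support({"beer"})             # above 1 suggests a positive association

print(f"support={rule_support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift close to 1 would mean the two items appear together about as often as chance alone predicts.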

Difference Between Data Mining and Data Analytics

• Data Mining focuses on finding hidden patterns in large datasets.
• Data Analytics focuses on interpreting data, including mined patterns, to solve problems or
make decisions.

What is Knowledge Discovery?

Knowledge Discovery is the overall process of finding useful and meaningful information or
patterns in large datasets. It involves multiple steps to extract insights that can help in decision-
making. Data Mining is a key step in this process.

In simple terms:

• Knowledge Discovery is like finding a hidden treasure in a sea of information.
• It transforms raw data into knowledge that can be understood and used effectively.

Steps in Knowledge Discovery

1. Data Selection:
o Identify and choose the relevant data needed for analysis.
o Example: A retail store selects sales data for the last two years.
2. Data Preprocessing:
o Clean and organize the data to remove errors, duplicates, or missing values.
o Example: Correct typos in customer names or remove records with incomplete
purchase details.
3. Data Transformation:
o Convert the data into a suitable format for analysis.
o Example: Transform raw sales data into monthly sales summaries.
4. Data Mining:
o Apply algorithms and techniques to uncover patterns or relationships in the data.
o Example: Use clustering to group customers based on their shopping habits.
5. Pattern Evaluation:
o Assess the patterns discovered to ensure they are meaningful and useful.
o Example: A pattern showing that customers buy more during holiday seasons is
useful for planning promotions.
6. Knowledge Representation:
o Present the findings in an easy-to-understand format like charts, graphs, or
reports.
o Example: Create a bar chart showing peak sales months.
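
The six steps can be traced end to end on a toy dataset. The records, the 1.5x threshold used in pattern evaluation, and the text bar chart are all assumptions chosen to keep the sketch short, and the mining step is reduced to finding the peak month.

```python
from collections import defaultdict

# 1. Selection: hypothetical raw sales records (date, amount).
raw = [
    ("2024-01-15", 200.0), ("2024-01-20", None), ("2024-02-03", 150.0),
    ("2024-12-05", 400.0), ("2024-12-18", 350.0), ("2024-12-18", 350.0),
]

# 2. Preprocessing: drop missing amounts and exact duplicates, keeping order.
#    (Treating identical rows as duplicates is an assumption of this sketch.)
cleaned = list(dict.fromkeys(r for r in raw if r[1] is not None))

# 3. Transformation: roll daily records up into monthly totals.
monthly = defaultdict(float)
for date, amount in cleaned:
    monthly[date[:7]] += amount  # "YYYY-MM" key

# 4. Data mining (trivially simple here): find the peak sales month.
peak_month = max(monthly, key=monthly.get)

# 5. Pattern evaluation: keep the pattern only if the peak clearly stands out.
average = sum(monthly.values()) / len(monthly)
is_useful = monthly[peak_month] > 1.5 * average

# 6. Knowledge representation: a small text bar chart.
for month, total in sorted(monthly.items()):
    print(f"{month} {'#' * int(total // 50)} {total:.0f}")
```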

Importance of Knowledge Discovery

• Helps in decision-making by providing actionable insights.
• Uncovers hidden patterns or trends that may not be obvious.
• Improves efficiency by identifying opportunities and risks.
• Helps organizations adapt to changing environments by understanding data better.

Example of Knowledge Discovery

Scenario: A bank wants to reduce loan defaults.

1. Data Selection:
o The bank collects customer data, including income, credit history, and previous
loan repayments.
2. Data Preprocessing:
o Remove incomplete records and correct errors in the dataset.
3. Data Transformation:
o Organize the data into categories like "low risk" and "high risk."
4. Data Mining:
o Use classification algorithms to identify characteristics of customers who are
likely to default on loans.
5. Pattern Evaluation:
o Identify that customers with a credit score below 600 and income below $50,000
are more likely to default.
6. Knowledge Representation:
o Create a report highlighting these risk factors and suggest strategies to reduce
defaults, like offering smaller loans to high-risk customers.

Difference Between Data Mining and Knowledge Discovery

• Data Mining: A step in the knowledge discovery process, focusing on finding patterns
in data.
• Knowledge Discovery: The complete process, from selecting data to presenting
actionable insights.

What are Relations in Data Analytics?

Relations refer to the connections or associations between different pieces of data.

Understanding these relationships is essential for discovering patterns and insights in data
analytics.

Types of Relations in Data

1. One-to-One Relationship:
o Each item in one dataset is related to only one item in another dataset.
o Example: A person and their unique passport number.
2. One-to-Many Relationship:
o One item in one dataset is related to multiple items in another dataset.
o Example: A customer can make multiple purchases at a store.
3. Many-to-Many Relationship:
o Multiple items in one dataset are related to multiple items in another dataset.
o Example: Students enrolled in multiple courses, and each course having
multiple students.
4. Hierarchical Relationships:
o Data arranged in a tree-like structure.
o Example: A company’s organizational structure where one manager supervises
several employees.
5. Network Relationships:
o Complex, interconnected relationships among data points.
o Example: Social media connections where users are linked to their friends.
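
The first three relation types map naturally onto Python dictionaries and lists of pairs. The names, passport numbers, and course titles below are made up for illustration.

```python
# One-to-one: each person maps to exactly one passport number (made-up values).
passport_of = {"Asha": "P1001", "Ravi": "P1002"}

# One-to-many: one customer, many purchases.
purchases_of = {"Asha": ["order-1", "order-2"], "Ravi": ["order-3"]}

# Many-to-many: stored as (student, course) pairs, then viewed from either side.
enrolments = [("Asha", "Maths"), ("Asha", "Physics"), ("Ravi", "Maths")]

courses_of, students_in = {}, {}
for student, course in enrolments:
    courses_of.setdefault(student, set()).add(course)
    students_in.setdefault(course, set()).add(student)

# A one-to-one mapping has no repeated values on the "other" side.
is_one_to_one = len(set(passport_of.values())) == len(passport_of)

print(courses_of["Asha"], students_in["Maths"], is_one_to_one)
```

Storing a many-to-many relation as a list of pairs mirrors the junction table commonly used in relational databases.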

Why are Relations Important in Data Analytics?

• Understanding Connections: Relations help identify how different pieces of data are
connected.
• Finding Patterns: Relations reveal trends or patterns, such as customer behavior or
product preferences.
• Making Predictions: Relations enable predictive analytics, such as forecasting sales
based on customer interactions.

Example of Data and Relations in Data Analytics

Scenario: An e-commerce company wants to analyze its customer data to improve sales.

1. Data:
o Collect customer data such as name, age, gender, purchase history, and
feedback.
2. Relations:
o One-to-One: Each customer has a unique customer ID.
o One-to-Many: A single customer may have multiple orders in their purchase
history.
o Many-to-Many: Customers can buy multiple products, and each product can
be bought by multiple customers.
3. Analysis:
o Identify that younger customers (18-25 years) frequently buy trendy gadgets.
o Analyze relations between product categories and purchase frequency to
recommend products.

4. Outcome:
o Use these insights to create personalized marketing campaigns or recommend
related products, boosting sales.

What is the Iris Dataset in Data Analytics?

The Iris dataset is one of the most famous and widely used datasets in data analytics and
machine learning. It is a small dataset that is simple to work with, making it ideal for beginners
learning data analysis, classification, and clustering techniques.

Key Features of the Iris Dataset

1. Dataset Size:
o It contains information about 150 samples of iris flowers.
o These samples are equally distributed across three species of iris flowers:
• Iris-setosa
• Iris-versicolor
• Iris-virginica
2. Attributes (Features): The dataset has four numerical features for each flower:
o Sepal Length: Length of the sepal (the leaf-like outer part that encloses the bud) in cm.
o Sepal Width: Width of the sepal in cm.
o Petal Length: Length of the petal in cm.
o Petal Width: Width of the petal in cm.
3. Target Variable (Label):
o The species of the flower (setosa, versicolor, or virginica).
4. Data Format: It is often stored in a tabular format, like this (measurements in cm):

Sepal Length | Sepal Width | Petal Length | Petal Width | Species
5.1          | 3.5         | 1.4          | 0.2         | Iris-setosa
7.0          | 3.2         | 4.7          | 1.4         | Iris-versicolor
6.3          | 3.3         | 6.0          | 2.5         | Iris-virginica

Why is the Iris Dataset Popular in Data Analytics?

1. Simplicity:
o The dataset is small and easy to understand.
o Perfect for beginners learning data visualization, classification, and machine
learning.
2. Variety:
o It contains both continuous features (like petal length) and a categorical target
variable (species).
3. Balanced Classes:
o Each species has an equal number of samples (50), so no class dominates training
or evaluation in supervised learning tasks.

Applications of the Iris Dataset

1. Data Visualization:
o Helps create scatter plots, histograms, and pair plots to observe relationships
between features.
2. Classification:
o Used to train machine learning models like k-Nearest Neighbors (k-NN),
Support Vector Machines (SVM), and Decision Trees to classify the flower
species.
3. Clustering:
o Helps in unsupervised learning to group flowers based on their features (e.g., k-
means clustering).
4. Feature Analysis:
o Allows analyzing which features (like petal length) are most useful for
distinguishing between species.

Example Analysis Using the Iris Dataset

Scenario: You want to classify an iris flower based on its sepal and petal dimensions.

1. Step 1: Data Exploration:

o Visualize the dataset to see how features like petal length differ across species.
o Example: A scatter plot showing petal length vs. petal width may show that Iris-
setosa is distinct from the other species.
2. Step 2: Train a Model:
o Use a classification algorithm like Decision Trees.
o Train the model on 80% of the dataset and test it on the remaining 20%.
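3. Step 3: Predict:
o Input all four measurements of a new flower; values such as sepal length 5.8 and
petal width 1.8 are not enough on their own for a model trained on four features.
o The model returns a predicted species, such as Iris-versicolor. A petal width near
1.8 cm lies close to the versicolor/virginica boundary, so the result depends on the
other measurements.
4. Step 4: Evaluate:
o Measure performance on the held-out 20% using metrics such as accuracy, precision,
or recall.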
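
The workflow can be sketched without any machine learning library. Note the swap: instead of a Decision Tree, this sketch uses k-Nearest Neighbors (also listed under Applications) because it fits in a few lines. The 15 rows are a small hand-copied sample of Iris measurements, split 12/3 to mirror the 80/20 idea; the predict helper and k=3 are assumptions.

```python
import math
from collections import Counter

# A small sample of Iris rows:
# (sepal length, sepal width, petal length, petal width) in cm, then species.
train = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"), ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"), ((4.6, 3.1, 1.5, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"), ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"), ((5.5, 2.3, 4.0, 1.3), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"), ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"), ((6.3, 2.9, 5.6, 1.8), "virginica"),
]
# Held-out rows used only for evaluation.
held_out = [
    ((5.0, 3.6, 1.4, 0.2), "setosa"),
    ((6.5, 2.8, 4.6, 1.5), "versicolor"),
    ((6.5, 3.0, 5.8, 2.2), "virginica"),
]


def predict(features, k=3):
    """Majority vote among the k training rows nearest to `features`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


predictions = [predict(features) for features, _ in held_out]
accuracy = sum(p == label for p, (_, label) in zip(predictions, held_out)) / len(held_out)
print(predictions, accuracy)
```

With only three held-out rows the accuracy figure says very little; the full 150-sample dataset gives a more trustworthy estimate.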

What are Data Scales in Data Analytics?

Data scales describe the different ways that data can be measured or classified. In data
analytics, understanding data scales is important because it helps determine which statistical or
analytical methods can be used.

There are four main types of data scales:

1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale

1. Nominal Scale

• Definition:
Data is categorized into distinct groups or categories without any order or ranking.
• Characteristics:
o No numerical value or order.
o Categories are mutually exclusive (no overlap).
• Example:
o Types of fruits: Apple, Banana, Orange.
o Gender: Male, Female.
• Usage:
o Used for classification, grouping, and counting.
o Example in Analytics: Counting how many people prefer each type of fruit.

2. Ordinal Scale

• Definition:
Data is categorized into ordered categories, but the intervals between the categories are
not known or not uniform.
• Characteristics:
o There is an order or ranking.
o The size of the gap between ranks is not measurable (the step from Poor to Average
need not equal the step from Good to Excellent).
• Example:
o Customer satisfaction levels: Poor, Average, Good, Excellent.
o Education level: High School, Bachelor's, Master's, Ph.D.
• Usage:
o Used for ranking or prioritizing.
o Example in Analytics: Analyzing customer satisfaction trends over time.

3. Interval Scale

• Definition:
Data is measured on a scale where intervals between values are meaningful, but there
is no true zero point.
• Characteristics:
o Differences between values are meaningful.
o No "absolute zero" (e.g., zero does not mean "nothing"), so ratios are not meaningful:
30°C is not "1.5 times as hot" as 20°C.
• Example:
o Temperature in Celsius or Fahrenheit: 20°C, 30°C (difference of 10°C is
meaningful, but 0°C does not mean "no temperature").
o Time of day: 2 PM, 3 PM (intervals are consistent).
• Usage:
o Used for comparing differences.
o Example in Analytics: Analyzing temperature changes over a period.

4. Ratio Scale

• Definition:
Data is measured on a scale with meaningful intervals and a true zero point, where zero
indicates "nothing."
• Characteristics:
o Differences and ratios are meaningful.
o Allows for all mathematical operations (addition, subtraction, multiplication,
division).
• Example:
o Weight: 0 kg, 50 kg, 100 kg (0 kg means no weight).
o Income: $0, $10,000, $50,000.
• Usage:
o Used for quantitative analysis, like calculating averages or percentages.
o Example in Analytics: Analyzing the average income of a group.

Why Are Data Scales Important in Data Analytics?

1. Choosing the Right Method:

o Different scales require different statistical and analytical techniques.
o Example: You can calculate averages for ratio data but not for nominal data.
2. Data Visualization:
o The choice of graph or chart depends on the data scale.
o Example: A bar chart suits nominal data (counts per category), while a line chart
suits interval or ratio data tracked over an ordered axis such as time.
3. Accurate Analysis:
o Using the wrong methods for a specific data scale can lead to incorrect results.
o Example: Feeding ordinal codes (1 = Poor, 2 = Average, ...) into a linear regression
treats the gaps between categories as equal, which the ordinal scale does not guarantee.
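
One way to see why the scale matters is to pick a summary statistic that each scale supports. The survey values below are invented, and the pairing (mode for nominal, median for ordinal, mean for interval, ratios for ratio data) is a common rule of thumb rather than a strict law.

```python
from statistics import mean, mode

# Hypothetical survey columns, one per measurement scale.
fruit = ["Apple", "Banana", "Apple", "Orange"]                    # nominal
satisfaction = ["Poor", "Good", "Excellent", "Good", "Average"]   # ordinal
temperature_c = [20.0, 30.0, 25.0]                                # interval
income = [0.0, 10000.0, 50000.0]                                  # ratio

# Nominal: only counting makes sense, so report the most frequent category.
most_common_fruit = mode(fruit)

# Ordinal: map categories to ranks, then take the middle (median) rank.
order = ["Poor", "Average", "Good", "Excellent"]
ranks = sorted(order.index(s) for s in satisfaction)
median_satisfaction = order[ranks[len(ranks) // 2]]

# Interval: differences and means are meaningful, ratios are not.
mean_temperature = mean(temperature_c)

# Ratio: a true zero makes statements like "five times as much" valid.
income_ratio = income[2] / income[1]

print(most_common_fruit, median_satisfaction, mean_temperature, income_ratio)
```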

Examples in Simple Scenarios

Scenario 1: Analyzing Customer Demographics

• Nominal: Gender (Male, Female).
• Ordinal: Education level (High School, Bachelor's, Master's).
• Ratio: Annual income ($50,000, $60,000).

Scenario 2: Weather Analysis

• Nominal: Type of weather (Sunny, Rainy, Cloudy).
• Interval: Temperature (25°C, 30°C).
• Ratio: Amount of rainfall (0 mm, 10 mm, 20 mm).

What is Set and Matrix Representation in Data Analytics?

What are Dissimilarity Measures in Data Analytics?

Dissimilarity measures in data analytics are methods used to quantify how different two data
points are. These measures help in understanding the "distance" or "difference" between
objects, which is critical for tasks like clustering, classification, and recommendation systems.

Key Points About Dissimilarity Measures:

1. Purpose:
To identify how "similar" or "different" two data points are based on their attributes.
2. Types of Data:
Dissimilarity measures can be applied to:
o Numerical data (e.g., age, height, income).
o Categorical data (e.g., gender, color, product type).
o Mixed data (both numerical and categorical).
3. Use Case:
Dissimilarity measures are used in algorithms like k-means clustering, nearest
neighbor classification, and hierarchical clustering.
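
The specific formulas are not reproduced in this text, so the sketch below shows three widely used choices as an assumption about what is meant: Euclidean and Manhattan distance for numerical data, and a simple mismatch ratio for categorical data. The records are invented.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences along each attribute."""
    return sum(abs(x - y) for x, y in zip(a, b))


def simple_mismatch(a, b):
    """Fraction of categorical attributes on which two records disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical records: (age, height in cm) and (gender, colour, product type).
p, q = (25, 170), (28, 174)
r, s = ("F", "red", "phone"), ("F", "blue", "phone")

print(euclidean(p, q), manhattan(p, q), simple_mismatch(r, s))
```

In practice, numerical attributes are usually rescaled first so that one with a large range (such as income) does not dominate the distance.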

Why Are Dissimilarity Measures Important?

1. Clustering:
Clustering algorithms use dissimilarity measures to place data points that are close
together in the same cluster (e.g., grouping customers with similar purchasing behavior).
2. Recommendation Systems:
Services such as Netflix or Amazon can use dissimilarity measures to find users with
similar tastes and recommend movies or products accordingly.
3. Outlier Detection:
Identifies data points that lie far from all others.
4. Classification:
Helps assign new data points to predefined categories based on their closeness to
existing labeled data.

What are Similarity Measures in Data Analytics?

Similarity measures in data analytics quantify how "similar" two data points are. They are
used to compare and find relationships between objects based on their features. Unlike
dissimilarity measures (which measure how different two points are), similarity measures focus
on how close or related the objects are.

Key Points About Similarity Measures

1. Purpose:
To evaluate the degree of resemblance between two data points.
2. Applications:
o Clustering (e.g., grouping similar customers).
o Recommendation systems (e.g., Netflix recommending movies).
o Information retrieval (e.g., finding similar documents).
o Classification (e.g., categorizing data based on similarity).
3. Types of Data:
o Numerical data (e.g., age, height, income).
o Categorical data (e.g., gender, preferences).
o Text data (e.g., documents, reviews).
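
Two commonly used similarity measures are sketched below as an assumption about what this section covers: cosine similarity for numeric vectors and Jaccard similarity for sets such as the words in a document. The ratings and word sets are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two numeric vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Shared items divided by all distinct items in two sets."""
    return len(a & b) / len(a | b)


# Hypothetical ratings two users gave to the same three movies.
user_a, user_b = [5, 3, 0], [4, 3, 0]

# Hypothetical word sets from two short reviews.
doc_a = {"great", "fast", "delivery"}
doc_b = {"fast", "delivery", "cheap", "price"}

print(round(cosine_similarity(user_a, user_b), 3), jaccard_similarity(doc_a, doc_b))
```

Both functions divide by a quantity that is zero for an all-zero vector or two empty sets, so real code should guard against those inputs.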

Why Are Similarity Measures Important?

1. Clustering:
Clustering algorithms use similarity measures to group data points, such as customers
with similar buying patterns.
2. Recommendation Systems:
Suggest products, movies, or books based on similarity to user preferences.
Example: "People who bought this also bought..."
3. Text Analysis:
Compare documents for plagiarism or recommend similar articles.
4. Pattern Recognition:
Identify similar patterns in time-series data (e.g., stock prices or weather patterns).
