6720 Labs Chapter 2

This document contains review questions for Chapter 2 of a data mining textbook. The questions cover topics like identifying supervised vs unsupervised learning tasks, the roles of validation and test partitions in modeling, overfitting of models to training data, and choosing between models based on their performance on validation vs training data. Additional questions involve using data mining software to pre-process categorical variables by converting them to dummy variables and partitioning a dataset into training and validation samples.

Uploaded by

sweetie05


Data Mining Review Questions / XLMiner Labs

Chapter 2 Overview of the Data Mining Process

1. Assuming that data mining techniques are to be used in the following cases,
identify whether the task required is supervised or unsupervised learning
(textbook reference - 2.1).
a. Deciding whether or not to issue a loan to an applicant based on
demographic and financial data (with reference to a database of similar
data on prior customers).
b. In an online bookstore, making recommendations to customers concerning
additional items to buy based on the buying patterns of prior transactions.
c. Identifying a network data packet as dangerous (e.g., virus, hacker attack)
based on comparison to other packets whose threat status is known.
d. Identifying segments of similar customers.
e. Predicting whether a company will go bankrupt based on comparing its
financial data to those of similar bankrupt and non-bankrupt firms.
f. Estimating the repair time required for an aircraft based on a trouble
ticket.
g. Automated sorting of mail by zip code scanning.

h. Printing of customer discount coupons at the conclusion of a grocery store
checkout based on what you just bought and what others have bought
previously.
2. Describe the difference in roles assumed by the validation partition and the test
partition (textbook reference - 2.2).
3. Using the concept of overfitting, explain why, when a model is fit to training
data, zero error on those data is not necessarily good (textbook reference 2.5).
4. Two models are applied to a dataset that has been partitioned. Model A is
considerably more accurate than Model B on the training data but slightly less
accurate than Model B on the validation data. Which model are you more likely
to consider for final deployment? Explain your choice. (textbook reference 2.10)
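
As a rough illustration of the ideas behind questions 3 and 4, the sketch below compares a model that memorizes its training records with a simple cutoff rule. The data, the function names (nearest_neighbor_predict, threshold_predict, error_rate), and the cutoff of 5.0 are all invented for this example; none of them come from the textbook or from XLMiner.

```python
# Made-up 1-D data: the underlying rule is "label 1 when x >= 5",
# but one training record (x = 3.0) carries a noisy label.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
valid = [(1.5, 0), (2.9, 0), (3.2, 0), (4.4, 0),
         (5.6, 1), (6.5, 1), (7.5, 1), (8.8, 1)]


def nearest_neighbor_predict(x):
    """Memorizing model: copy the label of the closest training record."""
    closest = min(train, key=lambda rec: abs(rec[0] - x))
    return closest[1]


def threshold_predict(x, cutoff=5.0):
    """Simple model: a single cutoff (hypothetical value) on x."""
    return 1 if x >= cutoff else 0


def error_rate(predict, records):
    """Fraction of records whose predicted label differs from the actual one."""
    wrong = sum(1 for x, y in records if predict(x) != y)
    return wrong / len(records)


models = {"memorizer": nearest_neighbor_predict, "threshold": threshold_predict}

for name, model in models.items():
    print(name, "train:", error_rate(model, train), "valid:", error_rate(model, valid))

# Choose between the models using validation error, not training error.
best = min(models, key=lambda name: error_rate(models[name], valid))
print("lowest validation error:", best)
```

On this toy data the memorizing model reproduces the noisy training label exactly, so its training error is zero, yet it misclassifies the validation records that fall near that noisy point. The cutoff rule makes one training error but none on the validation records.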


5. The next two questions require the use of the XLMiner data mining software and the
UniversalBank.xls dataset . . .
a. Use XLMiner's Convert to Dummies utility to convert the categorical
variable Education to binary dummy variables. After the conversion, how
many resulting columns exist for the Education variable? Why is this
conversion performed?
b. Using the newly created dataset (with binary dummy variables), use
XLMiner's Partitioning function to perform Standard Partitioning (accept
the default percentages for partitioning). How many records were
assigned to the Training Partition? How many records were assigned to
the Validation Partition? Why was a Test Partition not created?
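
For readers without XLMiner at hand, the two steps in question 5 can be sketched in plain Python. This is only an approximation of the idea, not a reproduction of XLMiner's utilities: the records are toy rows invented here (not taken from UniversalBank.xls), the helper names to_dummies and partition are hypothetical, and the 0.6 training fraction and seed 12345 are assumed values rather than confirmed XLMiner defaults.

```python
import random

# Toy records invented for illustration; they are not rows from UniversalBank.xls.
levels = ["Undergrad", "Graduate", "Advanced", "Graduate", "Undergrad",
          "Advanced", "Undergrad", "Graduate", "Undergrad", "Advanced"]
records = [{"ID": i, "Education": level} for i, level in enumerate(levels, start=1)]


def to_dummies(rows, column):
    """Replace one categorical column with a 0/1 column per distinct category."""
    categories = sorted({row[column] for row in rows})
    converted = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        converted.append(new_row)
    return converted


def partition(rows, train_fraction, seed):
    """Shuffle a copy of the rows, then split it into training and validation lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


dummy_rows = to_dummies(records, "Education")

# train_fraction and seed are assumptions made for this sketch.
training, validation = partition(dummy_rows, train_fraction=0.6, seed=12345)

print(sorted(dummy_rows[0]))
print(len(training), "training records,", len(validation), "validation records")
```

Each converted row carries exactly one 1 across its dummy columns, which is what lets methods that expect numeric inputs use a categorical variable. Fixing the seed makes the random split repeatable.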
