Data Mining Using SAS Applications, Chapman & Hall/CRC Data Mining and Knowledge Discovery Series, 1st Edition, George Fernandez
Data Mining
Using
SAS Applications
George Fernandez
This book contains information obtained from authentic and highly regarded sources. Reprinted material
is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable
efforts have been made to publish reliable data and information, but the authors and the publisher cannot
assume responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying, microfilming, and recording, or by any information storage or
retrieval system, without prior permission in writing from the publisher.
The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for
creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC
for such copying.
Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation, without intent to infringe.
Objective
The objective of this book is to introduce data mining concepts, describe methods
in data mining from sampling to decision trees, demonstrate the features of
user-friendly data mining SAS tools, and, above all, allow readers to download data
mining SAS macro-call files and use them to perform complete data mining. The
user-friendly SAS macro approach integrates the statistical and graphical analysis
tools available in SAS systems and offers complete data mining solutions without
requiring users to write SAS program code or use the point-and-click approach.
Step-by-step instructions for using the SAS macros and interpreting the results are
provided in each chapter. Thus, by downloading the user-friendly SAS macros
described in the book and following the step-by-step instructions, data analysts
can perform complete data mining analysis quickly and effectively.
Coverage
The following types of analyses can be performed using the user-friendly SAS
macros:
Potential Audience
■ This book is suitable for data analysts who need to apply data mining
techniques using existing SAS modules for successful data mining, without
investing a lot of time to research and buy new software products or to
learn how to use additional software.
■ Experienced SAS programmers can use the SAS macro source code
available on the companion CD-ROM and customize it to fit their
business goals and different computing environments.
■ Graduate students in business and the natural and social sciences can
complete data analysis projects quickly using these SAS macros.
■ Large business enterprises can use data mining SAS macros in pilot studies
that assess the feasibility of conducting a successful data mining endeavor,
before making a significant investment in full-scale data mining.
■ Finally, any SAS users who want to impress their supervisors can do so
with quick and complete data analysis presented in PDF, RTF, or HTML
formats.
Additional Resources
■ Book website: A website has been set up at
http://www.ag.unr.edu/gf/dm.html
Users can find information regarding downloading the sample data files
used in the book and the necessary SAS macro-call files. Readers are
encouraged to visit this site for information on any errors in the book,
SAS macro updates, and links for additional resources.
■ Companion CD-ROM: For experienced SAS programmers, a companion
CD-ROM is available for purchase that contains sample datasets, macro-call
George Fernandez
1.1 Introduction
Data mining, or knowledge discovery in databases (KDD), is a powerful information
technology tool with great potential for extracting previously unknown and
potentially useful information from large databases. Data mining automates the
process of finding relationships and patterns in raw data and delivers results that
can be either utilized in an automated decision support system or assessed by
decision makers. Many successful organizations practice data mining for intelligent
decision-making.1 Data mining allows the extraction of nuggets of knowledge from
business data that can help enhance customer relationship management (CRM)2
and can help estimate the return on investment (ROI).3 Using powerful analytical
techniques, data mining enables institutions to turn raw data into valuable
information to gain a critical competitive advantage.
With data mining, the possibilities are endless. Although data mining applications
are popular among forward-thinking businesses, other disciplines that maintain
large databases could reap the same benefits from properly carried out data
mining. Some of the potential applications of data mining include characterizations
of genes in animal and plant genomics, clustering and segmentation in remote
sensing of satellite image data, and predictive modeling in wildfire incidence
databases.
The purpose of this chapter is to introduce data mining concepts, provide
some examples of data mining applications, list the most commonly used data
mining techniques, and briefly discuss the data mining applications available in
■ Multiple linear regression (MLR). In MLR, the association between the two
sets of variables is described by a linear equation that predicts the
continuous response variable from a function of predictor variables.
■ Logistic regression. This type of regression uses a binary or an ordinal variable
as the response variable and allows construction of more complex models
than straight linear models do.
■ Neural net (NN) modeling. Neural net modeling can be used for both
prediction and classification. NN modeling enables users to construct, train,
and validate multilayer feed-forward network models for modeling large data
and complex interactions with many predictor variables. NN models usually
contain more parameters than a typical statistical model, the results are not
easily interpreted, and no explicit rationale is given for the prediction. All
variables are considered to be numeric and all nominal variables are coded
as binary. Relatively more training time is needed to fit NN models.
■ Classification and regression tree (CART). These models are useful in generating
binary decision trees by repeatedly splitting subsets of the dataset, using all
predictor variables, to create two child nodes, beginning with the
entire dataset. The goal is to produce subsets of the data that are as
homogeneous as possible with respect to the target variable. Continuous,
binary, and categorical variables can be used as response variables in CART.
■ Discriminant function analysis. This is a classification method used to
determine which predictor variables discriminate between two or more naturally
occurring groups. Only a categorical variable can be the response
variable, and both continuous and ordinal variables can be used as
predictors.
■ Chi-square automatic interaction detector (CHAID) decision tree. This is a
classification method used to study the relationships between a categorical
response measure and a large series of possible predictor variables that may
interact with each other. For qualitative predictor variables, a series of
chi-square analyses is conducted between the response and predictor variables
to see if splitting the sample based on these predictors leads to a statistically
significant discrimination in the response.
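The chi-square test behind a CHAID-style split can be illustrated with a small sketch. The example below is not taken from the book or from its SAS macros; it is a minimal Python illustration, using made-up counts, of how a Pearson chi-square statistic might be computed for one candidate split of a binary response by a two-level predictor. The function names and the data are hypothetical, and the p-value shortcut applies only to a 2 x 2 table (one degree of freedom). A full CHAID procedure also merges predictor categories and adjusts p-values for multiple testing, which this sketch omits.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and response
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts: rows are predictor levels (A, B),
# columns are response levels (yes, no).
counts = [[30, 10],
          [15, 25]]

stat = chi_square_statistic(counts)
p = p_value_df1(stat)
print(f"chi-square = {stat:.3f}, p-value = {p:.4f}")
```

A small p-value here would suggest that splitting the sample on this predictor separates the response categories better than chance, which is the criterion the CHAID description above refers to.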
By assessing the results gained from each stage of the SEMMA (sample, explore,
modify, model, assess) process, users can determine how to model new questions
raised by previous results and thus proceed back to the exploration phase for
additional refinement of the data. The SAS data mining solution integrates
everything necessary for discovery at each stage of the SEMMA process. These
data mining tools indicate patterns or exceptions and mimic human abilities for
comprehending spatial, geographical, and visual information sources. Complex
mining techniques are carried out in a totally code-free environment, allowing
analysts to concentrate on visualization of the data, discovery of new patterns,
and new questions to ask.
■ Users can perform comprehensive data mining tasks by inputting the macro
parameters in the macro-call window and by running the SAS macro.
■ The SAS code required for performing data exploration, model fitting, model
assessment, validation, prediction, and scoring is included in each macro,
so complete results can be obtained quickly.
The fact that these SAS macros do not use Enterprise Miner is something of a
limitation in that SAS macros could not be included for performing neural net,
CART, and market basket analysis, as these data mining tools require the use of
Enterprise Miner.
1.10 Summary
Data mining is a journey — a continuous effort to combine business knowledge
with information extracted from acquired data. This chapter briefly introduces the
concept and applications of data mining, which is the secret and intelligent weapon
that unleashes the power hidden in data. The SAS Institute, the industry leader in
analytical and decision support solutions, provides the powerful software Enterprise
Miner to perform complete data mining solutions; however, because of the high
price tag for Enterprise Miner, application of this software is not feasible for all
business analysts and academic institutions. As alternatives to the point-and-click
menu interface modules and Enterprise Miner, user-friendly SAS macro applications
for performing several data mining tasks are included in this book. Instructions
are given in the book for downloading and applying these user-friendly SAS macros
for producing quick and complete data mining solutions.
References
1. SAS Institute, Inc., Customer Success Stories (http://www.sas.com/news/success/solutions.html).