Data Mining and Warehousing

The document discusses Chernoff faces, a data visualization technique from the 1970s that represents multivariate data by mapping variables to facial features. The technique can reveal clusters, but quantitative values are hard to read from the faces, and implementation is difficult because the data must be normalized and scaled. Other techniques depict as many dimensions and are easier to interpret. Faces also make unbiased representation hard, because facial features carry connotations.

Uploaded by

Malvika Singh
Copyright
© All Rights Reserved

DATA MINING AND WAREHOUSING

ASSIGNMENT – CHERNOFF FACES

INTRODUCTION - The Chernoff faces method is a data visualization technique
introduced in the 1970s. It was developed by Herman Chernoff to
represent multivariate data and can ostensibly represent up to 18
variables. Facial features (eyes, nose, eyebrows) are mapped to multiple
variables, with size, orientation, shape, color, and placement potentially
representing different attributes of a single observation.

EVALUATION - The Chernoff faces technique is an interesting way to
represent multivariate data. It can be used to detect similarities between
different items, but it is not the most efficient or the most accurate way
to do so. Other techniques, such as parallel coordinates, star graphs, or
radar charts, depict as many dimensions as Chernoff faces, but are easier
to interpret.
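
For comparison, a star plot can be drawn with stars() from R's built-in graphics package. The sketch below uses the built-in mtcars data set purely as a stand-in for multivariate data; it is not the survey data from this assignment, and the choice of the first seven columns is only for illustration.

```r
# Stand-in data: seven numeric variables from the built-in mtcars data set
cars <- mtcars[, 1:7]

# stars() rescales each column to [0, 1] by default (scale = TRUE),
# then draws one star per row with one ray per variable.
# key.loc places a key that names the variable behind each ray.
stars(cars,
      key.loc = c(14, 2),
      main = "Star plot of mtcars (7 variables)")
```

Each ray has a fixed direction and a length proportional to the rescaled value, so a reader can compare one variable across observations without decoding facial features.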

Implementing visualizations of Chernoff faces is quite challenging. Data
must be normalized, and often binned, in order to convey meaning.
Scaling and normalization can be complicated, particularly if variables
represent many different kinds of data. On the viewer's end, it becomes
difficult to recover meaningful quantitative values from normalized
and binned representations.
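
The normalization and binning steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the method used in this assignment: the helper names rescale01 and bin_values, the choice of three bins, and the sample values are all hypothetical.

```r
# Hypothetical helper: min-max normalization to the range [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  # A constant column has no spread to show, so map it to the midpoint
  if (rng[1] == rng[2]) return(rep(0.5, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical helper: cut normalized values into equal-width bins,
# returning integer bin numbers
bin_values <- function(x, n_bins = 3) {
  cut(x, breaks = seq(0, 1, length.out = n_bins + 1),
      include.lowest = TRUE, labels = FALSE)
}

# Made-up values on very different scales
pairs_bought <- c(1, 2, 4, 6, 11)
age          <- c(19, 23, 31, 45, 60)

pairs_scaled <- rescale01(pairs_bought)
age_scaled   <- rescale01(age)

pairs_binned <- bin_values(pairs_scaled)
age_binned   <- bin_values(age_scaled)

print(data.frame(pairs_bought, pairs_scaled, pairs_binned,
                 age, age_scaled, age_binned))
```

In this made-up sample, 1, 2, and 4 pairs all land in the same bin, which shows why a viewer cannot recover the original quantities from a binned face.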

Faces do not make it easy to present data without perceived bias. For
example, a mouth curved upward reads as positive and one curved downward
as negative, which must be considered in order to avoid unwanted implications.
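
One way to limit this kind of bias, sketched here under assumptions, is to control which variable drives the mouth. As far as the aplpack documentation describes it, faces() assigns columns to features in a fixed order, with the sixth column controlling the curve of the smile. The helper move_to_position and the sample data frame below are hypothetical, and the feature order should be checked against the table that faces() prints before relying on it.

```r
# Hypothetical helper: move one column of a data frame to a given position
move_to_position <- function(df, col_name, position) {
  stopifnot(col_name %in% names(df), position >= 1, position <= ncol(df))
  others <- setdiff(names(df), col_name)
  new_order <- append(others, col_name, after = position - 1)
  df[, new_order, drop = FALSE]
}

# Made-up survey codes; "recommend" is the one variable where
# "higher is better" matches the usual reading of a smile
survey <- data.frame(
  gender = c(1, 2, 1), age = c(2, 3, 1), category = c(1, 1, 3),
  pairs = c(2, 4, 1), brand = c(5, 2, 2), attribute = c(3, 1, 2),
  recommend = c(5, 2, 4)
)

# Assumption: column 6 drives the smile in aplpack::faces()
smile_position <- 6
survey_ordered <- move_to_position(survey, "recommend", smile_position)
print(names(survey_ordered))

# With aplpack installed, the faces could then be drawn with:
# aplpack::faces(survey_ordered)
```

Placing a variable with a natural good-to-bad direction on the smile keeps the emotional reading of the face aligned with the data instead of attaching it to a neutral variable such as gender or brand.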

To summarize the concept:

Used for:

• Detecting clusters
• Representing multivariate data

Disadvantages:

• No quantitative value associations
• Hard to implement
• Hard to interpret

Footwear buying behavior analysis-


Questionnaire-

https://goo.gl/forms/7ki2xPBweB2txleq1

Responses

https://drive.google.com/open?id=1nkIJpdXW5vRFp8GtIK0-dKp66aAMOjfr

Code in R

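# Install aplpack only if it is missing (it provides faces())
if (!requireNamespace("aplpack", quietly = TRUE)) install.packages("aplpack")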

library(aplpack)

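# stringsAsFactors = TRUE reads text answers as factors, so the as.numeric()
# calls below return integer level codes instead of NA (R >= 4.0 defaults to FALSE)
data <- read.csv("E:/Data Mining/New folder/chernoff faces.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)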

data$Gender<-as.numeric(data$Gender)

data$Age.of.the.Respondent<-as.numeric(data$Age.of.the.Respondent)
data$Which.category.of.footwear.u.buy.the.most..<-
as.numeric(data$Which.category.of.footwear.u.buy.the.most..)

data$How.many.pairs.of.footwear.you.have.purchased.in.last.six.months.<-
as.numeric(data$How.many.pairs.of.footwear.you.have.purchased.in.last.six.months.)

data$Which.brand.of.footwear.u.buy.often.<-
as.numeric(data$Which.brand.of.footwear.u.buy.often.)

data$On.which.attribute.is.your.buying.behavior.more.influenced.for.a.specific.brand.of.footwear.<-
as.numeric(data$On.which.attribute.is.your.buying.behavior.more.influenced.for.a.specific.brand.of.footwear.)

data$Which.attribute.has.a.higher.level.of.importance.while.u.make.a.purchase.<-
as.numeric(data$Which.attribute.has.a.higher.level.of.importance.while.u.make.a.purchase.)

data$How.likely.do.you.recommend.your.current.brand.to.your.close.friends.<-
as.numeric(data$How.likely.do.you.recommend.your.current.brand.to.your.close.friends.)

data$Given.a.chance...would.you.switch.to.some.another.brand.of.footwear.<-
as.numeric(data$Given.a.chance...would.you.switch.to.some.another.brand.of.footwear.)

str(data)

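# Draw one face per respondent from columns 2 to 10, labelled with the respondent's name
faces(data[, 2:10], labels = as.character(data$Name))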

- MALVIKA SINGH

- M.Sc. Operational Research (SEM-4)

- 1893450
