dsr8,9

The document describes implementing a K-Means clustering algorithm in R to identify teen market segments from a dataset. It details preprocessing the data, standardizing variables, running K-Means clustering with 5 clusters, and evaluating the results by examining cluster sizes, centers, and characteristics.

Uploaded by

poorvaja.r


Ex No: 8 FINDING TEEN SEGMENT OF MARKET
Date:

AIM:
To implement an R program that finds teen market segments using K-Means clustering.

ALGORITHM:
1. Start
2. Import the snsdata dataset using read.csv() and display its structure using str() to
explore and prepare the data.
3. To handle the missing values in gender, create two dummy columns: female (1 if the
gender is "F") and no_gender (1 if the gender is missing).
4. To find the average age for each graduation year, use the aggregate() function with
mean as the function and na.rm = TRUE.
5. To obtain these group means as a vector aligned with the rows of the original data, use
the ave() function with FUN set to a mean that ignores missing values. Store it in ave_age.
6. If the age value is missing, replace it with the corresponding ave_age value using ifelse().
7. As the main focus is interests, create a data frame interests from the 36 interest
features in columns 5 to 40.
8. Apply z-score standardization to the interests using lapply() with scale() and store the
result in interests_z.
9. Set the seed and apply k-means clustering with k = 5 to interests_z. Store the result in
‘teen_clusters’, which holds the properties of each of the five clusters.
10. To evaluate the model, look at the cluster sizes and cluster centers.
11. Analyze the centers: each row of the output is a cluster, and each number is that
cluster's average standardized value for the interest named at the top of the column.
12. To build on the model, add the cluster IDs to the original data frame as a new column.
13. Look at the mean age by cluster, the proportion of females by cluster and the mean
number of friends by cluster.
14. Stop
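
Steps 3 to 6 can be tried on a small table before running them on the full dataset. The sketch below is only an illustration: the data frame toy and its values are made up and are not part of snsdata.csv. It shows how the two dummy columns are built and how ave() returns one group mean per row, so that ifelse() can fill each missing age with the mean of the same graduation year.

```r
# Hypothetical toy data standing in for snsdata.csv (values are invented)
toy <- data.frame(
  gradyear = c(2006, 2006, 2006, 2007, 2007, 2007),
  gender   = c("F", "M", NA, "F", NA, "M"),
  age      = c(18, NA, 19, 17, 16, NA)
)

# Dummy-code gender: a missing gender gives female = 0 and no_gender = 1
toy$female    <- ifelse(toy$gender == "F" & !is.na(toy$gender), 1, 0)
toy$no_gender <- ifelse(is.na(toy$gender), 1, 0)

# ave() returns a vector as long as the data: each row gets the mean age
# of its own graduation year, computed without the missing values
ave_age <- ave(toy$age, toy$gradyear,
               FUN = function(x) mean(x, na.rm = TRUE))

# Fill only the missing ages; observed ages are left unchanged
toy$age <- ifelse(is.na(toy$age), ave_age, toy$age)
print(toy)   # age becomes 18, 18.5, 19, 17, 16, 16.5
```

Because ave() keeps the original row order, no merge step is needed to line the group means up with the rows they belong to.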

CODE:
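# Load the data and inspect its structure
teens <- read.csv("snsdata.csv")
str(teens)
# Gender counts, without and with missing values
table(teens$gender)
table(teens$gender, useNA = "ifany")
summary(teens$age)
# Dummy-code gender: female = 1 for "F", no_gender = 1 when gender is missing
teens$female <- ifelse(teens$gender == "F" & !is.na(teens$gender), 1, 0)
teens$no_gender <- ifelse(is.na(teens$gender), 1, 0)
table(teens$gender, useNA = "ifany")
table(teens$female, useNA = "ifany")
table(teens$no_gender, useNA = "ifany")
# Mean age overall and by graduation year, ignoring missing ages
mean(teens$age, na.rm = TRUE)
aggregate(data = teens, age ~ gradyear, mean, na.rm = TRUE)
# Per-row mean age of the matching graduation year, used to fill missing ages
ave_age <- ave(teens$age, teens$gradyear,
               FUN = function(x) mean(x, na.rm = TRUE))
teens$age <- ifelse(is.na(teens$age), ave_age, teens$age)
summary(teens$age)
# Keep the 36 interest columns and z-score standardize them
interests <- teens[5:40]
interests_z <- as.data.frame(lapply(interests, scale))
# Fix the seed so the random starting centers are reproducible, then cluster with k = 5
set.seed(2345)
teen_clusters <- kmeans(interests_z, 5)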

AI19542 201501035
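# Number of teens in each cluster and the cluster centers
teen_clusters$size
teen_clusters$centers
# Attach the cluster assignment to each teen and inspect the first five rows
teens$clusters <- teen_clusters$cluster
teens[1:5, c("clusters", "gender", "age", "friends")]
# Mean age, proportion of females and mean number of friends by cluster
aggregate(data = teens, age ~ clusters, mean)
aggregate(data = teens, female ~ clusters, mean)
aggregate(data = teens, friends ~ clusters, mean)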
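
The standardization and clustering steps in the code above can be checked on synthetic data where the true groups are known. This is a minimal sketch under stated assumptions: the data frame toy, its two columns and the choice of two clusters are invented for illustration, and nstart = 10 is an optional extra that the program above does not use.

```r
set.seed(2345)

# Hypothetical interest counts for two groups of 50 teens each:
# the first group mentions music more, the second mentions sports more
toy <- data.frame(
  sports = c(rpois(50, 1), rpois(50, 8)),
  music  = c(rpois(50, 6), rpois(50, 1))
)

# z-score standardization: each column gets mean 0 and standard deviation 1
toy_z <- as.data.frame(lapply(toy, scale))
print(round(colMeans(toy_z), 10))
print(sapply(toy_z, sd))

# k-means with k = 2; nstart = 10 tries several random starts and keeps the best
fit <- kmeans(toy_z, centers = 2, nstart = 10)
print(fit$size)      # number of rows assigned to each cluster
print(fit$centers)   # cluster means, in standard deviation units

# Attach the cluster IDs and summarize the original counts by cluster
toy$cluster <- fit$cluster
print(aggregate(cbind(sports, music) ~ cluster, data = toy, FUN = mean))
```

A positive value in fit$centers means the cluster is above the overall average for that interest, and a negative value means it is below, which is how the centers of teen_clusters are read as well.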

OUTPUT:

RESULT:
Thus the R program to find teen market segments with the K-Means algorithm is
executed successfully and the output is verified.

Ex No: 9 TUNING STOCK MODELS FOR BETTER PERFORMANCE
Date:

AIM:
To implement an R program to tune stock models for better performance.

ALGORITHM:
1. Start
2. Import the credit dataset using read.csv().
3. Load the ‘caret’ library to train different machine learning models using train().
4. Set the seed to initialize the random number generator.
5. Train a C5.0 decision tree model with ‘default’ as the target variable and store it in m.
Display the model ‘m’.
6. To use the model for predictions, create a vector of predictions using
predict(m, credit).
7. Display a table of the predictions against the actual values to analyse the proportions.
8. To customize the tuning process, use trainControl() to create a control object ‘ctrl’ that
uses 10-fold cross-validation and the one-standard-error rule (selectionFunction = "oneSE")
to select the best model.
9. Create a data frame of the combinations of model, trials and winnow using
expand.grid(). Give the grid 8 different values of trials, with model set to "tree" and
winnow set to FALSE.
10. Train a new model with Kappa as the metric, the ctrl object as trControl and the grid as
tuneGrid.
11. Display the new model.
12. Stop
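
Step 9 relies on expand.grid() producing one row for every combination of the values it is given. The sketch below uses base R only and illustrates that behaviour; the trial values are illustrative assumptions, and the second grid (with both winnow settings) is hypothetical and is not used in this exercise.

```r
# One model type, eight values of trials, one winnow setting: 1 * 8 * 1 = 8 rows
grid <- expand.grid(model  = "tree",
                    trials = c(1, 5, 10, 15, 20, 25, 30, 35),
                    winnow = FALSE)
print(grid)
print(nrow(grid))   # 8 candidate models

# Hypothetical variation: allowing both winnow settings doubles the candidates
grid_both <- expand.grid(model  = "tree",
                         trials = c(1, 5, 10, 15, 20, 25, 30, 35),
                         winnow = c(TRUE, FALSE))
print(nrow(grid_both))   # 16 candidate models
```

Each row of the grid is one candidate model that train() evaluates, so the number of rows multiplied by the number of cross-validation folds gives a rough idea of how many models are fitted during tuning.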

CODE:

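# Load the credit data; read text columns as factors, which train() needs for a
# classification target
credit <- read.csv('../input/credit-2/credit.csv', stringsAsFactors = TRUE)
str(credit)
library(caret)
# Train a C5.0 model with caret's default tuning settings
set.seed(300)
m <- train(default ~ ., data = credit, method = "C5.0")
m
# Predict on the training data and compare with the actual classes
p <- predict(m, credit)
table(p, credit$default)
head(predict(m, credit))
head(predict(m, credit, type = "prob"))
# 10-fold cross-validation with the one-standard-error selection rule
ctrl <- trainControl(method = "cv", number = 10, selectionFunction = "oneSE")
# Tuning grid: 8 different values of trials, tree model only, no winnowing
grid <- expand.grid(model = "tree", trials = c(1, 5, 10, 15, 20, 25, 30, 35),
                    winnow = FALSE)
grid
# Retrain, using Kappa to choose among the candidate models
set.seed(300)
m <- train(default ~ ., data = credit, method = "C5.0", metric = "Kappa",
           trControl = ctrl, tuneGrid = grid)
m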
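
The tuned model is selected with metric = "Kappa". As a reminder of what that statistic measures, the sketch below computes Cohen's kappa by hand as (observed agreement - expected agreement) / (1 - expected agreement). The confusion table is made up for illustration and is not output from the credit model.

```r
# Hypothetical confusion table: rows are predictions, columns are actual classes
tab <- matrix(c(40, 10,
                 5, 45),
              nrow = 2,
              dimnames = list(predicted = c("no", "yes"),
                              actual    = c("no", "yes")))

n  <- sum(tab)
po <- sum(diag(tab)) / n                       # observed agreement (accuracy)
pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
kappa <- (po - pe) / (1 - pe)

print(po)      # 0.85
print(pe)      # 0.5
print(kappa)   # 0.7
```

Kappa corrects accuracy for agreement that would occur by chance, which matters when one class is much more common than the other.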

OUTPUT:

RESULT:
Thus, the R program to tune the stock model to improve performance is
executed successfully and the output is verified.
