dsr8,9
AIM:
To implement an R program to find teen market segments using k-means clustering.
ALGORITHM:
1. Start
2. Import the snsdata dataset using read.csv() and display its structure using str() to
explore and prepare the data.
3. To handle the missing (NA) values in gender, dummy-code it into two new indicator
columns: female and no_gender (unknown gender).
4. To find the average age for each graduation year, use the aggregate() function with
the arguments FUN = mean and na.rm = TRUE.
5. To obtain these group means as a vector with one value per row of the original data,
use the ave() function with FUN set to the mean of the non-missing values. Store it in ave_age.
6. If an age value is missing, replace it with the corresponding ave_age value using ifelse().
7. As the main focus is interests, create a data frame interests from the 36 features in
columns 5 to 40.
8. Apply z-score standardization to interests using lapply() and scale(), and store the
result in interests_z.
9. Apply k-means clustering with k = 5 to interests_z and store the result in teen_clusters,
which holds the properties of each of the five clusters.
10. To evaluate the model, look at the cluster sizes and the cluster centers.
11. Analyze the output: each row is a cluster, and each number is the average standardized
value, within that cluster, of the interest named at the top of the column.
12. To get more insight from the model, assign the cluster IDs back to the original data frame.
13. Look at the mean age, the proportion of females and the mean number of friends by
cluster.
14. Stop
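Steps 3 to 6 can be tried in isolation on a small example. The sketch below is only an illustration: the data frame toy and its values are hypothetical and are not taken from snsdata.csv. It shows how ave() returns one group mean per row, so that ifelse() can fill only the missing ages.

```r
# Hypothetical toy data (not from snsdata.csv)
toy <- data.frame(
  gradyear = c(2006, 2006, 2006, 2007, 2007),
  gender   = c("F", "M", NA, "F", NA),
  age      = c(18, NA, 19, 17, NA)
)

# Dummy-code gender into two indicator columns; "M" is the implied reference level
toy$female    <- ifelse(toy$gender == "F" & !is.na(toy$gender), 1, 0)
toy$no_gender <- ifelse(is.na(toy$gender), 1, 0)

# Mean age per graduation year, repeated for every row of that year
ave_age <- ave(toy$age, toy$gradyear,
               FUN = function(x) mean(x, na.rm = TRUE))

# Replace only the missing ages with the mean of their graduation year
toy$age <- ifelse(is.na(toy$age), ave_age, toy$age)
print(toy)
```

In this toy data the 2006 group has a mean age of 18.5 and the 2007 group a mean of 17, so the two missing ages become 18.5 and 17 while the observed ages are left unchanged.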
CODE:
# Load the data and inspect its structure
teens <- read.csv("snsdata.csv")
str(teens)
# Gender counts, first without and then with missing values
table(teens$gender)
table(teens$gender, useNA = "ifany")
summary(teens$age)
# Dummy-code gender: female and unknown, with male as the reference level
teens$female <- ifelse(teens$gender == "F" & !is.na(teens$gender), 1, 0)
teens$no_gender <- ifelse(is.na(teens$gender), 1, 0)
table(teens$gender, useNA = "ifany")
table(teens$female, useNA = "ifany")
table(teens$no_gender, useNA = "ifany")
# Mean age overall and by graduation year
mean(teens$age, na.rm = TRUE)
aggregate(data = teens, age ~ gradyear, mean, na.rm = TRUE)
# Per-row group mean, used to fill in the missing ages
ave_age <- ave(teens$age, teens$gradyear,
               FUN = function(x) mean(x, na.rm = TRUE))
teens$age <- ifelse(is.na(teens$age), ave_age, teens$age)
summary(teens$age)
# Standardize the 36 interest columns and cluster them into 5 groups
interests <- teens[5:40]
interests_z <- as.data.frame(lapply(interests, scale))
set.seed(2345)
teen_clusters <- kmeans(interests_z, 5)
AI19542 201501035
# Cluster sizes and cluster centers
teen_clusters$size
teen_clusters$centers
# Attach the cluster IDs to the original data and summarize by cluster
teens$clusters <- teen_clusters$cluster
teens[1:5, c("clusters", "gender", "age", "friends")]
aggregate(data = teens, age ~ clusters, mean)
aggregate(data = teens, female ~ clusters, mean)
aggregate(data = teens, friends ~ clusters, mean)
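To see what the standardization and clustering steps do without the full dataset, the following sketch runs them on synthetic data. Everything in it is an assumption made for illustration: the column names sports and music, the Poisson counts, k = 2 and the nstart = 10 restarts are not part of the program above.

```r
# Hypothetical synthetic data: two interest counts for 60 users in two loose groups
set.seed(1)
toy <- data.frame(
  sports = c(rpois(30, 5), rpois(30, 1)),
  music  = c(rpois(30, 1), rpois(30, 6))
)

# z-score each column: (x - mean(x)) / sd(x)
toy_z <- as.data.frame(lapply(toy, function(x) as.numeric(scale(x))))

# After scaling, every column has mean 0 and standard deviation 1
round(colMeans(toy_z), 10)
sapply(toy_z, sd)

# k-means with k = 2; nstart = 10 tries several random starts and keeps the best
fit <- kmeans(toy_z, centers = 2, nstart = 10)
fit$size      # number of rows in each cluster; sums to nrow(toy)
fit$centers   # mean z-score of each interest within each cluster
```

A positive entry in fit$centers means that the cluster is above the overall average for that interest, and a negative entry means it is below. The rows of teen_clusters$centers are read in the same way.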
OUTPUT:
RESULT:
Thus the R program to find teen market segment with K-Means algorithm is
executed successfully and the output is verified.
Ex No: 9 TUNING STOCK MODELS FOR BETTER PERFORMANCE
Date:
AIM:
To implement an R program to tune stock models for better performance.
ALGORITHM:
1. Start
2. Import the credit dataset using read.csv().
3. Load the caret library, which gives access to different machine learning models
through train().
4. Set the seed to initialize the random number generator.
5. Define a decision tree with default as the target, train the model and store it in m.
Display the model m.
6. To use the model for predictions, create a vector of predictions using
predict(m, credit).
7. Display a table of the predictions to analyse the proportions.
8. To customize the tuning process, use trainControl() to create a control object ctrl that
uses 10-fold cross-validation and the one-standard-error rule to select the best model.
9. To create a data frame of all combinations of model, trials and winnow, use
expand.grid(). Define the grid with 8 different values of trials and winnow set to FALSE.
10. Train a new tree model with Kappa as the metric, the ctrl object as trControl and the
grid as tuneGrid.
11. Display the new model.
12. Stop.
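The steps above can be sketched with caret as follows. This is a minimal sketch under stated assumptions, not the original listing: the file name credit.csv, the seed value 300, the choice of method = "C5.0" (inferred from the model, trials and winnow tuning parameters) and the eight trials values are all assumptions. It needs the caret and C50 packages to be installed.

```r
library(caret)   # train(), trainControl(); method "C5.0" also requires the C50 package

# Assumed file name; `default` is taken to be a factor column holding the outcome
credit <- read.csv("credit.csv", stringsAsFactors = TRUE)

# Baseline model with caret's default tuning settings (seed value is arbitrary)
set.seed(300)
m <- train(default ~ ., data = credit, method = "C5.0")
m

# Predictions on the training data, compared with the actual classes
p <- predict(m, credit)
table(p, credit$default)

# 10-fold cross-validation, choosing the simplest model within one standard error
ctrl <- trainControl(method = "cv", number = 10,
                     selectionFunction = "oneSE")

# Tuning grid: 8 assumed values of trials, tree models only, no winnowing
grid <- expand.grid(model = "tree",
                    trials = c(1, 5, 10, 15, 20, 25, 30, 35),
                    winnow = FALSE)

# Tuned model, selected by Kappa instead of accuracy
set.seed(300)
m_tuned <- train(default ~ ., data = credit, method = "C5.0",
                 metric = "Kappa",
                 trControl = ctrl,
                 tuneGrid = grid)
m_tuned
```

Because the table in this sketch is built from the training data, it shows resubstitution results, which are usually optimistic. The cross-validated Kappa printed with m_tuned is the better guide to performance on unseen data.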
CODE:
OUTPUT:
RESULT:
Thus, the R program to tune the stock model to improve performance is
executed successfully and the output is verified.