LP1 Oral Answers
- Range - It is sometimes useful to use the min and max to calculate the range of a dataset. The range is a numerical indication of the span of the data. To calculate it, subtract the min (13) from the max (110). The range for this dataset is 97.
- Standard Deviation - First, calculate the mean (average) of the data. Subtract the mean from every item in the set and square each difference. Sum the squared differences and divide this sum by the number of items; the result is the variance. Take the square root of the variance to find the standard deviation. (Dividing by the number of items gives the population form; for a sample, divide by one less than the number of items.)
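
The range and standard deviation steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `data_range` and `population_std_dev` are made up for this example, and the sample values are hypothetical apart from the min (13) and max (110) quoted in the notes.

```python
import math


def data_range(values):
    """Range: subtract the min from the max."""
    return max(values) - min(values)


def population_std_dev(values):
    """Standard deviation, following the steps in the notes (divide by N)."""
    mean = sum(values) / len(values)
    squared_diffs = [(x - mean) ** 2 for x in values]
    variance = sum(squared_diffs) / len(values)
    return math.sqrt(variance)


# Hypothetical data: only the min (13) and max (110) come from the notes.
data = [13, 27, 45, 62, 110]

print(data_range(data))          # 97
print(population_std_dev(data))
```

For a sample rather than a whole population, the divisor in the variance step would be `len(values) - 1`.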
-
28. How to create boxplot for each feature in the dataset?
-
29. How to create histogram?
-
30. What is dataset?
- A data set (or dataset) is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the data set in question.
-
31. What is Bayes theorem?
-
32. What is confusion matrix?
- A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
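
As a rough sketch of the idea, the table can be built by counting (actual, predicted) pairs. The helper name `confusion_matrix` and the labels below are hypothetical, invented only for illustration; they do not come from any particular library or dataset.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a nested dict: matrix[actual_label][predicted_label] = count."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


# Hypothetical test-set labels (true values) and classifier output.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

matrix = confusion_matrix(actual, predicted, labels=["yes", "no"])

# Correct predictions sit on the diagonal (actual label == predicted label).
correct = sum(matrix[label][label] for label in matrix)
accuracy = correct / len(actual)

for label, row in matrix.items():
    print(label, row)
print(accuracy)
```

Rows are the true classes and columns are the predicted classes, so the off-diagonal cells show which classes the classifier confuses.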
-
33. Which function is used to split the dataset in R?
-
34. What are the steps of Naive Bayes algorithm?
35. What is conditional probability?
36. What are decision trees?
- A decision tree is a type of flowchart that shows a clear pathway to a decision. In
terms of data analytics, it is a type of algorithm that includes conditional ‘control’
statements to classify data.
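
To make the "conditional control statements" concrete, here is a small hand-written tree in Python. The feature names (`outlook`, `humidity`, `windy`), the threshold, and the function `classify_play` are all hypothetical; a real decision tree algorithm would learn its splits from data rather than have them written by hand.

```python
def classify_play(outlook, humidity, windy):
    """Hand-written decision tree: each if/else is one decision node."""
    if outlook == "sunny":
        # Second-level split on humidity (threshold chosen arbitrarily).
        return "no" if humidity > 70 else "yes"
    if outlook == "overcast":
        return "yes"
    # Remaining branch: treat any other outlook as rainy and split on wind.
    return "no" if windy else "yes"


print(classify_play("sunny", 85, False))
print(classify_play("overcast", 60, True))
print(classify_play("rainy", 60, True))
```

Each path from the first condition to a `return` corresponds to one root-to-leaf pathway in the flowchart view of the tree.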
37. What is rPart?
- rpart stands for Recursive Partitioning and Regression Trees.
- rpart is an R package used for building classification and regression trees. It implements recursive partitioning: the data is split repeatedly into smaller groups, and the sequence of splits forms the tree.
38. What are applications of rPart?
39. Advantages of decision trees
- Compared to other algorithms, decision trees require less effort for data preparation during pre-processing.
-