Week 9 Lecture - Revision Test
KQC7015 Test
Date : 22 December 2024
Time : 1.00 – 2.00 pm (1 hour)
Venue : Block Y, Department of Electrical Engineering
What is C?
In this case, our model performs poorly on the training data: our classifier
is not able to model the relationship between the input data and the
output class labels.
Types of Logistic Regression
1. Binary Logistic Regression
• The categorical response has only two possible outcomes.
• Example: Spam or Not
3. Ordinal Logistic Regression
• Three or more categories with ordering.
• Example: Movie rating from 1 to 5
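For concreteness, here is a minimal sketch of binary logistic regression, assuming scikit-learn is available; the tiny feature matrix and spam / not-spam labels below are made up purely for illustration:

```python
# Minimal binary logistic regression sketch (illustrative toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy feature matrix (e.g., word counts) and binary labels: 1 = spam, 0 = not spam.
X = np.array([[3, 0], [1, 1], [0, 4], [2, 0], [0, 3], [1, 2]])
y = np.array([1, 0, 0, 1, 0, 0])

model = LogisticRegression()          # sigmoid link, two possible outcomes
model.fit(X, y)

print(model.predict([[2, 1]]))        # predicted class label
print(model.predict_proba([[2, 1]]))  # [P(not spam), P(spam)]
```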
EXAMPLES:
HOW DO YOU DO A PCA?
1. Standardization
❖ Standardize the range of the continuous initial variables so that each one contributes equally to the analysis.
❖ It is critical to perform standardization prior to PCA because PCA is quite sensitive to the variances of the initial variables.
❖ If there are large differences between the ranges of the initial variables, those variables with larger ranges will dominate over those with small ranges (for example, a variable that ranges between 0 and 100 will dominate over a variable that ranges between 0 and 1), which will lead to biased results.
❖ Mathematically, this can be done by subtracting the mean and dividing by the standard
deviation for each value of each variable.
❖ Z = (value − mean) / standard deviation
❖ Once the standardization is done, all the variables will be transformed to the same scale.
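A minimal NumPy sketch of this standardization step; the toy data matrix below is made up for illustration:

```python
import numpy as np

# Toy data: rows = samples, columns = variables on very different scales.
X = np.array([[ 10.0, 0.2],
              [120.0, 0.5],
              [ 55.0, 0.9],
              [ 80.0, 0.1]])

# Z = (value - mean) / standard deviation, applied column-wise.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_std.mean(axis=0))  # approximately 0 for every variable
print(X_std.std(axis=0))   # 1 for every variable
```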
2. COVARIANCE MATRIX COMPUTATION
➢ To understand how the variables of the input data set are varying from the
mean with respect to each other
➢ Sometimes variables are highly correlated (i.e., they contain redundant information).
➢ For example, for a 3-dimensional data set with 3 variables x, y, and z, the covariance matrix is a 3×3 matrix of this form:
Cov(x,x)  Cov(x,y)  Cov(x,z)
Cov(y,x)  Cov(y,y)  Cov(y,z)
Cov(z,x)  Cov(z,y)  Cov(z,z)
Since the covariance of a variable with itself is its variance (Cov(a,a) = Var(a)), the
main diagonal (top left to bottom right) actually contains the variances of each initial
variable.
And since the covariance is commutative (Cov(a,b)=Cov(b,a)), the entries of the
covariance matrix are symmetric with respect to the main diagonal, which means that
the upper and the lower triangular portions are equal.
What do the covariances that we have as entries of the matrix tell us about the
correlations between the variables? In short, it is the sign that matters: a positive
covariance means the two variables increase or decrease together (they are correlated),
while a negative covariance means one increases when the other decreases (they are
inversely correlated).
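A minimal NumPy sketch of the covariance matrix computation, using a made-up 3-variable data set:

```python
import numpy as np

# Toy data: rows = samples, columns = variables x, y, z.
X = np.array([[2.0, 8.0, 1.0],
              [4.0, 6.0, 3.0],
              [6.0, 4.0, 5.0],
              [8.0, 2.0, 7.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# rowvar=False: each column is a variable, each row an observation.
cov_matrix = np.cov(X_std, rowvar=False)
print(cov_matrix)  # 3x3, symmetric; the diagonal holds each variable's variance
```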
3. COMPUTE THE EIGENVECTORS AND EIGENVALUES OF THE
COVARIANCE MATRIX TO IDENTIFY THE PRINCIPAL
COMPONENTS
Principal components
▪ New variables: linear combinations or mixtures of the initial
variables. They are uncorrelated, and most of the information within the
initial variables is squeezed or compressed into the first
components.
▪ So, the idea is that 10-dimensional data gives you 10 principal
components, but PCA tries to put the maximum possible information in
the first component, then the maximum remaining information in the
second, and so on (see the sketch below).
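A minimal NumPy sketch of this step, using a made-up symmetric covariance matrix; np.linalg.eigh is used because the covariance matrix is symmetric:

```python
import numpy as np

# cov_matrix: the covariance matrix from step 2 (a small illustrative example here).
cov_matrix = np.array([[1.0, 0.9, 0.7],
                       [0.9, 1.0, 0.6],
                       [0.7, 0.6, 1.0]])

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Sort in descending order so the first component carries the most variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(eigenvalues / eigenvalues.sum())  # fraction of total variance per principal component
```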
Reduce dimensionality without losing much information,
by discarding the components with low information
and considering the remaining components as your
new variables.
The 2nd PC is computed in the same way, with the condition that it is uncorrelated with
(i.e., perpendicular to) the 1st PC and that it accounts for the next highest
variance.
The eigenvectors of the covariance matrix are
actually the directions of the axes along which there is the most
variance (the most information), and these are what we call the principal components.
4. FEATURE VECTOR
i. Choose whether to keep all these components or discard those of lesser
significance (those with low eigenvalues).
ii. Form a matrix with the remaining eigenvectors; we call this matrix the feature vector.
iii. The feature vector is simply a matrix that has as columns the eigenvectors of the
components we decide to keep.
iv. This makes it the first step towards dimensionality reduction, because if we
choose to keep only p eigenvectors (components) out of n, the final data set
will have only p dimensions.
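A minimal sketch of forming the feature vector from the sorted eigenvectors; the eigenvalues, eigenvector entries, and the 95% variance threshold below are made up for illustration:

```python
import numpy as np

# Sorted eigenvalues and eigenvectors from step 3 (made-up values; columns = eigenvectors).
eigenvalues = np.array([2.5, 0.4, 0.1])
eigenvectors = np.array([[0.58, -0.71,  0.40],
                         [0.58,  0.71,  0.40],
                         [0.58,  0.00, -0.82]])

# Keep enough components to explain roughly 95% of the variance (threshold is illustrative).
explained = np.cumsum(eigenvalues) / eigenvalues.sum()
p = int(np.searchsorted(explained, 0.95) + 1)

feature_vector = eigenvectors[:, :p]   # n x p matrix of the kept eigenvectors
print(p, feature_vector.shape)         # the reduced data set will have p dimensions
```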
5. RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS
AXES
❖ In the previous steps, you just select the PCs and form the feature vector,
but the input data set always remains in terms of the original axes.
❖ In this step, which is the last one, the aim is to use the feature vector
formed from the eigenvectors of the covariance matrix to reorient the
data from the original axes to the ones represented by the principal components.
❖ This can be done by multiplying the transpose of the original data set by
the transpose of the feature vector.
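A minimal sketch of this final projection; the standardized data and feature vector values are made up and simply stand in for the outputs of the earlier steps:

```python
import numpy as np

# X_std: standardized data (samples x n variables); feature_vector: n x p kept eigenvectors.
X_std = np.array([[ 1.2, -0.3,  0.5],
                  [-0.7,  1.1, -0.9],
                  [-0.5, -0.8,  0.4]])
feature_vector = np.array([[0.58, -0.71],
                           [0.58,  0.71],
                           [0.58,  0.00]])    # made-up values, n = 3, p = 2

# FinalData = FeatureVector^T x StandardizedData^T (written row-wise below).
final_data = (feature_vector.T @ X_std.T).T   # same result as X_std @ feature_vector
print(final_data.shape)                       # (samples, p): data expressed along the PCs
```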
The max pooling layer helps reduce the spatial size of the convolved features and also helps reduce over-fitting by providing an abstracted representation of them.
It is a sample-based discretization process. It is similar to the convolution layer, but instead of taking a dot product between the input and the kernel, we take the max of the region from the input overlapped by the kernel.
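A minimal NumPy sketch of 2×2 max pooling with stride 2 over a made-up feature map:

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Slide a size x size window over x and keep the maximum of each region."""
    h_out = (x.shape[0] - size) // stride + 1
    w_out = (x.shape[1] - size) // stride + 1
    out = np.empty((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            region = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = region.max()   # max of the region overlapped by the window
    return out

feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 0],
                        [7, 2, 9, 8],
                        [1, 0, 3, 4]])
print(max_pool2d(feature_map))  # [[6. 5.] [7. 9.]]: spatial size reduced from 4x4 to 2x2
```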
A random forest uses feature randomness when building each individual tree to try to create an uncorrelated forest of trees whose
prediction by committee is more accurate than that of any individual tree.
A random forest is created by training each tree on a randomly drawn (bootstrap) sample of the data.
Each decision tree is grown using attribute selection measures such as the information gain or gain
ratio of each feature.
For a classification problem, each tree casts a vote and the class with the most votes is chosen.
For a regression problem, the average of all the trees' outputs is taken as the result.
Assumptions for Random Forest
The working process can be explained in the following steps:
Step-5: For new data points, find the predictions of each decision tree, and
assign the new data point to the category that wins the majority of votes (a minimal sketch of this voting step follows below).
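A minimal scikit-learn sketch of this voting step on made-up data; note that scikit-learn's RandomForestClassifier aggregates by averaging the trees' predicted probabilities, which usually coincides with the hard majority vote shown here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy classification data standing in for a real data set (illustrative only).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Each tree is grown on a bootstrap sample of rows, with a random subset of
# features considered at every split ("feature randomness").
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X, y)

new_point = X[:1]  # a "new" data point, reused from the training data for illustration
per_tree_votes = np.array([tree.predict(new_point)[0] for tree in forest.estimators_])
print(np.bincount(per_tree_votes.astype(int)))  # how many trees voted for each class
print(forest.predict(new_point))                # the forest's aggregated prediction
```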
Four sectors where random forest is mostly used: Banking, Medicine, Land Use, and Marketing.