
Interview Questions

Q. Tell me about yourself?

Q. Tell me about your current project?

Q. Tell me about your roles and responsibilities?

Q. Why are you leaving your current organization?

I have been working on similar kinds of projects for some time now, but the
market is changing rapidly, and the skill set required to stay relevant in
the market is changing as well. The reason for seeking a new job is to work
on several kinds of projects and improve my skill set. <Mention the
company profile and, if you have it, the name of the project you are being
interviewed for as new learning opportunities for you>.

Q. What were your day-to-day tasks?

Q. How were you doing deployment?

Q. How did you optimize your solution?

Well, model optimization depends on a lot of factors.

 Train with better data (increase the quality), or perform the data
pre-processing steps more efficiently.
 Keep the resolution of the images identical.
 Increase the quantity of data used for training.
 Increase the number of epochs for which the model is trained.
 Tweak the batch size, the number of hidden layers, the learning rate,
the rate of decay, etc. to produce the best results.
 If you are not using transfer learning, you can alter the number of
hidden layers and the activation functions.
 Change the function used in the output layer based on the requirement:
the sigmoid function works well for binary classification problems,
whereas for multi-class problems we use softmax.
 Try multithreaded approaches, if possible.
 Reducing the learning rate when the loss plateaus optimizes the model
even further (see the sketch after this list).
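A minimal sketch of a few of these knobs, assuming TensorFlow/Keras; the layer sizes, learning rate, and callback parameters below are illustrative placeholders, not values from the project:

```python
import tensorflow as tf

# Small image classifier; every size and rate here is a tunable knob.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),              # keep image resolution identical
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),     # hidden-layer width is tunable
    tf.keras.layers.Dense(10, activation='softmax'),   # softmax for multi-class output
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # learning rate is tunable
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

# Reduce the learning rate when the validation loss plateaus.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss', factor=0.5, patience=3, min_lr=1e-6)

# x_train / y_train are placeholders for your dataset; batch size and
# epoch count are the remaining knobs from the list above.
# model.fit(x_train, y_train, validation_split=0.1,
#           epochs=100, batch_size=128, callbacks=[reduce_lr])
```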

Q. How much time did your model take to train?

With a batch size of 128, 100,000 epochs, and 7,000 images, it took
around 110 hours to train the model on an Nvidia Pascal Titan GPU.

Q. At what frequency are you retraining and updating your model?

Q. In which mode have you deployed your model?

I have deployed the model both in cloud environments and in on-premise
ones, based on the client and project requirements.

Q. What is L1 Regularization (L1 = lasso)?

The main objective of training a model is to make sure it fits the data
properly and reduces the loss. Sometimes the trained model fits the
training data well but performs poorly when analysing new (test) data;
this is overfitting. Regularization was introduced to overcome
overfitting. Lasso Regression (Least Absolute Shrinkage and Selection
Operator) adds the “absolute value of magnitude” of the coefficients as a
penalty term to the loss function.
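In standard notation, with λ as the penalty strength, the lasso loss is the ordinary least-squares loss plus an L1 penalty:

$$\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert$$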

Lasso shrinks the less important features’ coefficients to zero, thus
removing some features altogether. So it works well for feature selection
when we have a huge number of features.
Methods like cross-validation and stepwise regression also handle
overfitting and perform feature selection, but they work well only with a
small set of features; lasso is the better choice when we are dealing with
a large set of features. Along with shrinking coefficients, the lasso
performs feature selection as well (remember the ‘Selection’ in the lasso
full form?), because some of the coefficients become exactly zero, which
is equivalent to the particular feature being excluded from the model.
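A minimal sketch of this selection behaviour, assuming scikit-learn; the synthetic dataset and alpha value are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 100 features, but only 5 actually influence the target.
X, y = make_regression(n_samples=200, n_features=100,
                       n_informative=5, noise=10.0, random_state=0)

lasso = Lasso(alpha=1.0)  # alpha plays the role of the lambda penalty strength
lasso.fit(X, y)

# Many coefficients are driven exactly to zero; those features are
# effectively excluded from the model.
kept = np.flatnonzero(lasso.coef_)
print(f"{kept.size} of {lasso.coef_.size} features kept:", kept)
```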

Q. What is L2 Regularization (L2 = Ridge Regression)?

Overfitting happens when the model learns the noise in the training data
along with the signal and therefore doesn’t perform well on new/unseen
data the model wasn’t trained on. To avoid overfitting the model on the
training data, we can use techniques like cross-validation sampling,
reducing the number of features, pruning, regularization, etc.

So, to avoid overfitting, we perform Regularization.

The regression model that uses L2 regularization is called Ridge
Regression.

The formula for Ridge Regression:
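In standard notation, with λ as the penalty strength, the ridge loss is the ordinary least-squares loss plus an L2 penalty:

$$\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$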


Regularization adds a penalty as model complexity increases. The
regularization parameter (lambda) penalizes all the parameters except the
intercept, so that the model generalizes the data and does not overfit.

Ridge regression adds the “squared magnitude of the coefficients” as the
penalty term to the loss function; this is the $\lambda \sum_{j} \beta_j^2$
term in the formula above.

Lambda is a hyperparameter.

If lambda is zero, ridge is equivalent to OLS; but if lambda is very
large, it puts too much weight on the penalty term and leads to
under-fitting.

Ridge regularization forces the weights to be small but does not make
them zero, and so it does not give a sparse solution.

Ridge is not robust to outliers, as the squared terms blow up the error
differences of the outliers, and the regularization term then tries to fix
that by penalizing the weights.

Ridge regression performs better when all the input features influence
the output and all the weights are of roughly equal size.

L2 regularization can learn complex data patterns.
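A minimal sketch contrasting the two penalties, assuming scikit-learn; the dataset and alpha values are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 20 features, only 5 of which actually influence the target.
X, y = make_regression(n_samples=200, n_features=20,
                       n_informative=5, noise=10.0, random_state=0)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=10.0).fit(X, y)

# Ridge shrinks weights toward zero but keeps them all nonzero (no
# sparsity); lasso zeroes some of them out entirely.
print("ridge zero coefficients:", int(np.sum(ridge.coef_ == 0.0)))
print("lasso zero coefficients:", int(np.sum(lasso.coef_ == 0.0)))
```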
