
Untitled 10

The document outlines an examination for a Machine Learning course at Parul University, detailing various topics such as PAC Learning, data preprocessing techniques, model evaluation metrics, and classification methods. It includes questions on concepts like Version Space, Decision Trees, Multi-layer Perceptrons, and clustering algorithms like K-Means, as well as theoretical frameworks and learning paradigms. The document serves as a comprehensive guide for students preparing for their summer examination in 2024-25.

Uploaded by

trendcloth4u
Copyright
© All Rights Reserved
Parul University

FACULTY OF ENGINEERING & TECHNOLOGY


B.Tech/Int. B.Tech, Summer 2024–25 Examination
Subject Name: Machine Learning
Semester: VI/VII
Time: 2.00 p.m to 4.30 p.m
Total Marks: 60

SECTION - A
Q.1 Answer the following questions (Compulsory)

(6 Marks – CO1 – BT2)

A.1. What is PAC Learning? Explain with an example.


Answer: PAC (Probably Approximately Correct) Learning is a framework in machine learning
theory that evaluates whether an algorithm can learn a function that is approximately correct with
high probability. It defines conditions under which learning is feasible. The learning algorithm aims
to find a hypothesis that, with probability at least 1 - δ, has an error rate of at most ε.

Example: Consider a spam classifier. If trained with sufficient labeled emails, the algorithm should
classify new emails as spam/non-spam with high probability and within an acceptable error rate. If
this holds for any chosen ε and δ given enough training examples, the spam concept is PAC-learnable.

A.2. Explain data preprocessing techniques in Machine Learning.


Answer: Data preprocessing is a crucial step that involves cleaning and transforming raw data
before feeding it into a machine learning model. Common techniques include:

• Handling Missing Values: Replace missing data using mean, median, or mode, or remove
incomplete rows.

• Normalization: Scaling numerical values to a common scale, either to a fixed range (e.g., 0-1)
using Min-Max scaling or to zero mean and unit variance using Z-score normalization.

• Outlier Detection: Use statistical methods like Z-score to identify and handle outliers that
can skew results.

• Encoding Categorical Data: Convert categories to numerical values using one-hot
encoding or label encoding.

• Feature Selection: Remove irrelevant or redundant features that don’t contribute to the
prediction.
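
Two of the techniques listed above, mean imputation and one-hot encoding, can be sketched in a few lines. This is a minimal illustration and not a full preprocessing pipeline; the function names and the sample values are assumptions made for the example.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def one_hot(categories):
    """Map each category to a 0/1 vector with one position per distinct value."""
    vocabulary = sorted(set(categories))  # fixed column order
    return [[1 if c == v else 0 for v in vocabulary] for c in categories]

# Hypothetical sample data
ages = [20, None, 40]
colors = ["red", "green", "red"]

print(impute_mean(ages))  # the missing age becomes the mean of 20 and 40
print(one_hot(colors))    # columns are ordered as ['green', 'red']
```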

A.3. What is Z-Score and how is it used in outlier detection?


Answer: Z-score represents the number of standard deviations a data point is from the mean of the
dataset.
Formula:
Z = (X - μ) / σ
Where:

• X = data point

• μ = mean

• σ = standard deviation

Use in Outlier Detection: If a data point has a Z-score greater than 3 or less than -3, it is usually
considered an outlier.
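
As a small sketch of this rule, the function below computes the Z-score of every value and keeps those beyond the threshold. It assumes the population standard deviation is used; the sample readings are hypothetical.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [x for x in values if abs((x - mu) / sigma) > threshold]

# Hypothetical readings: twenty values near 10 and one extreme value
readings = [10.0] * 10 + [11.0] * 10 + [100.0]
print(zscore_outliers(readings))  # only the extreme reading is flagged
```

Note that in very small samples no point can reach a Z-score of 3, so the threshold is usually lowered or a different method is used.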

Q.2A. What is Version Space? Explain with example.

(4 Marks – CO1 – BT2)


Answer: Version Space is a concept that represents the set of all hypotheses in the hypothesis space
that are consistent with the observed training data.

• Specific Boundary (S): Most specific hypothesis that fits all positive examples.

• General Boundary (G): Most general hypothesis that still excludes all negative examples.

Example: If we are classifying whether an animal is a bird:

• S might be: Has feathers, can fly

• G might be: Can move, has wings

As more examples are provided, the version space narrows, ideally until a single consistent hypothesis remains.

Q.2B. What is the importance of model evaluation? Name and explain any two
metrics.

(5 Marks – CO2 – BT3)


Answer: Model evaluation is critical for measuring the performance of machine learning models
and comparing different algorithms.

Important Metrics:

1. Accuracy: The ratio of correctly predicted observations to the total observations. Formula:
(TP + TN) / (TP + TN + FP + FN)

2. Precision: The ratio of correctly predicted positive observations to total predicted positives.
Formula: TP / (TP + FP)

These metrics help assess the reliability and robustness of models in real-world scenarios.
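
The two formulas can be checked with a short sketch. The confusion-matrix counts below are hypothetical values chosen for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN)"""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """TP / (TP + FP); defined as 0.0 when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical counts from a binary classifier
tp, tn, fp, fn = 40, 45, 5, 10
print(accuracy(tp, tn, fp, fn))  # 85 correct out of 100
print(precision(tp, fp))         # 40 true positives out of 45 predicted positives
```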

Q.3A. What is the difference between Multi-Class and Multi-Label classification?

(4 Marks – CO2 – BT2)
Answer:

• Multi-Class Classification: Each instance is assigned to only one label from multiple
possible classes. Example: Classifying animals as cat, dog, or rabbit.

• Multi-Label Classification: Each instance can belong to multiple classes simultaneously.
Example: An email can be labeled as both "spam" and "promotional".

Q.3B. Explain the working of Decision Trees using ID3 algorithm.

(5 Marks – CO3 – BT3)


Answer:

• ID3 (Iterative Dichotomiser 3) builds a decision tree by selecting the attribute that yields the
highest information gain.

• It uses entropy to measure the impurity in a dataset and splits data accordingly.

• The process continues recursively until all data is classified or stopping conditions are met.

Steps:

1. Calculate entropy for the dataset.

2. For each attribute, calculate information gain.

3. Select attribute with highest gain as root.

4. Repeat for each branch.

Example: Classifying if someone will play tennis based on weather.
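
Steps 1 to 3 can be sketched as follows. This shows only the attribute-selection part of ID3, not the recursive tree construction, and the six-row weather table is a hypothetical toy dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def best_attribute(rows, attributes, target):
    """Step 3: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical toy data for the play-tennis example
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
print(best_attribute(rows, ["outlook", "windy"], "play"))  # outlook
```

Here splitting on outlook leaves only the rain branch impure, so it gains far more information than splitting on windy and becomes the root.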

SECTION - B
Q.1 (Compulsory)

(6 Marks – CO3 – BT3)

1. Explain how a Multi-layer Perceptron (MLP) works. Answer: An MLP is a type of feedforward artificial neural network. It consists of:

• Input Layer: Receives features

• Hidden Layers: Apply weighted sums and activation functions (ReLU, sigmoid)

• Output Layer: Outputs final predictions

Training: Uses backpropagation to minimize error through gradient descent by updating weights.
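
The forward pass through these layers can be sketched with a tiny 2-2-1 network. The weights below are set by hand (an assumption for illustration, not learned by backpropagation) so that the network computes XOR, a function a single-layer perceptron cannot represent.

```python
def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

# Hand-picked weights (hypothetical, chosen so the network computes XOR)
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]

def mlp_forward(x):
    hidden = dense(x, HIDDEN_W, HIDDEN_B, relu)                # hidden layer with ReLU
    output = dense(hidden, OUTPUT_W, OUTPUT_B, lambda z: z)    # linear output layer
    return output[0]

for a in (0, 1):
    for b in (0, 1):
        print((a, b), mlp_forward([a, b]))
```

In practice the weights start random and are adjusted by backpropagation; only the forward computation is shown here.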
2. How is Logistic Regression different from Linear Regression? Answer:

• Linear Regression predicts continuous numerical values.

• Logistic Regression predicts probabilities and is used for classification tasks using the
sigmoid function to map values between 0 and 1.
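
The difference shows up in a one-feature sketch: both models compute the same linear score, but logistic regression passes it through the sigmoid. The weight, bias and input values are hypothetical.

```python
import math

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_predict(x, w, b):
    return w * x + b            # unbounded continuous output

def logistic_predict(x, w, b):
    return sigmoid(w * x + b)   # probability of the positive class

# Hypothetical parameters and input
w, b, x = 2.0, -1.0, 3.0
print(linear_predict(x, w, b))    # 5.0
print(logistic_predict(x, w, b))  # a value between 0 and 1
```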

3. What is the use of the kernel in SVM? Answer: Kernels enable Support Vector Machines
(SVMs) to operate in high-dimensional spaces efficiently. They allow non-linear classification by
computing inner products in the transformed feature space without explicitly constructing that
space (the kernel trick).

Common kernels:

• Linear

• Polynomial

• Radial Basis Function (RBF)
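
The idea of "inner products in the transformed feature space" can be verified numerically. The sketch below assumes 2-D inputs and the homogeneous degree-2 polynomial kernel, for which an explicit feature map is known; the sample vectors are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z)^2"""
    return dot(x, z) ** 2

def poly2_features(x):
    """Explicit feature map for 2-D inputs that matches poly2_kernel."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - z||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = [1.0, 2.0], [3.0, 4.0]
print(poly2_kernel(x, z))                         # kernel computed in the input space
print(dot(poly2_features(x), poly2_features(z)))  # same value via the explicit mapping
```

Both prints agree up to floating-point rounding, but the kernel never builds the 3-D feature vectors. For the RBF kernel the implicit feature space is infinite-dimensional, so only the kernel form is practical.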

Q.2A. Describe the K-Means clustering algorithm.

(4 Marks – CO4 – BT2)


Answer: K-Means is an unsupervised clustering algorithm that partitions data into K clusters.

Steps:

1. Choose K initial centroids randomly.

2. Assign each point to the nearest centroid.

3. Recalculate the centroid of each cluster.

4. Repeat until convergence (no changes in assignments).

Applications: Customer segmentation, document clustering.
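
The four steps can be sketched in plain Python as below. This is a minimal illustration using squared Euclidean distance and a fixed random seed; the `seed` and `max_iter` parameters and the sample points are assumptions made for the example.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # step 1: random initial centroids
    assignments = None
    for _ in range(max_iter):
        # step 2: assign each point to its nearest centroid
        new_assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:     # step 4: stop when nothing changes
            break
        assignments = new_assignments
        # step 3: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignments

# Hypothetical data: two well-separated groups of three points
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

The cluster numbering depends on the random initialization, but the first three points end up together and the last three end up together.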

Q.2B. What is PCA? Explain how it works.

(5 Marks – CO4 – BT3)


Answer: Principal Component Analysis (PCA) is a technique for reducing the dimensionality of
large datasets while retaining most of the variance.

Steps:

1. Standardize the data.

2. Compute covariance matrix.

3. Calculate eigenvalues and eigenvectors.

4. Choose top components and project data.


Applications: Visualization, noise reduction, speeding up algorithms.
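
The PCA steps above can be sketched without external libraries for the first principal component. This is a simplified illustration: power iteration stands in for a full eigendecomposition, every feature is assumed to have non-zero variance, and the two feature columns are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardize(column):
    """Step 1: rescale a feature to zero mean and unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]

def covariance_matrix(columns):
    """Step 2: covariance of already-centered feature columns."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns] for ci in columns]

def top_eigenvector(matrix, iterations=100):
    """Step 3 (simplified): power iteration finds the dominant eigenvector."""
    v = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def first_principal_component(columns):
    z = [standardize(c) for c in columns]
    direction = top_eigenvector(covariance_matrix(z))
    # Step 4: project every sample onto the chosen direction
    n = len(z[0])
    scores = [sum(d * col[i] for d, col in zip(direction, z)) for i in range(n)]
    return direction, scores

# Hypothetical data: two positively correlated features
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [2.0, 4.0, 5.0, 4.0, 5.0]
direction, scores = first_principal_component([feature_a, feature_b])
print(direction)  # both components are about 0.707 for two standardized, positively correlated features
print(scores)     # the 1-D representation of the five samples
```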

Q.3A. Explain ROC Curve. How does it help in evaluating classification models?

(4 Marks – CO5 – BT2)


Answer:

• ROC (Receiver Operating Characteristic) curve plots True Positive Rate (TPR) vs False
Positive Rate (FPR).

• AUC (Area Under Curve) measures classifier's ability to distinguish classes.

AUC Score:

• 1: Perfect classifier

• 0.5: Random guessing

Helps in comparing models and selecting the best threshold.
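
A short sketch of both ideas: one ROC point (TPR, FPR) at a given threshold, and AUC computed as the fraction of positive/negative pairs that the classifier ranks correctly, which is equivalent to the area under the ROC curve. The labels and scores are hypothetical.

```python
def tpr_fpr(labels, scores, threshold):
    """One point on the ROC curve for the given decision threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true labels and predicted scores
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(tpr_fpr(labels, scores, threshold=0.5))  # (0.5, 0.0)
print(roc_auc(labels, scores))                 # 0.75
```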

Q.3B. What is Ensemble Learning? Explain Bagging and Boosting.

(5 Marks – CO6 – BT2)


Answer: Ensemble learning combines multiple models (often weak learners) to form a stronger predictor.

• Bagging (Bootstrap Aggregation):

◦ Trains multiple models in parallel.

◦ Reduces variance.

◦ Example: Random Forest

• Boosting:

◦ Trains models sequentially.

◦ Each model corrects the errors of its predecessor.

◦ Example: AdaBoost, XGBoost
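
Bagging can be sketched as bootstrap sampling followed by a majority vote. The base learner here is a deliberately simple threshold "stump" on a single feature, which is an assumption for illustration (Random Forest uses decision trees); boosting is not shown.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement."""
    return [rng.choice(data) for _ in data]

def train_stump(sample):
    """Weak learner: threshold halfway between the two class means."""
    xs0 = [x for x, y in sample if y == 0]
    xs1 = [x for x, y in sample if y == 1]
    if xs0 and xs1:
        threshold = (mean(xs0) + mean(xs1)) / 2
    else:  # the sample happened to contain only one class
        threshold = mean([x for x, _ in sample])
    return lambda x: 1 if x > threshold else 0

def bagging_fit(data, n_models, seed=0):
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical 1-D dataset: (feature, label)
data = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]
models = bagging_fit(data, n_models=5)
print(bagging_predict(models, 0))   # 0
print(bagging_predict(models, 10))  # 1
```

Each model sees a different resample, so their individual thresholds differ; averaging over them is what reduces variance.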

Introduction to Machine Learning: Detailed Answers

1. What is machine learning?


Machine learning (ML) is a branch of artificial intelligence (AI) that enables systems to
learn from data and improve their performance over time without being explicitly
programmed. It focuses on developing algorithms that can analyze data, identify patterns,
and make decisions or predictions. ML is used in a wide range of applications such as image
recognition, speech recognition, recommendation systems, and autonomous vehicles.
2. Explain the different types of machine learning paradigms.
There are four main types:

• Supervised Learning: The model is trained on labeled data. Example: Spam detection in
emails.

• Unsupervised Learning: The model finds patterns from unlabeled data. Example:
Customer segmentation.

• Semi-supervised Learning: A small amount of labeled data is used with a large amount of
unlabeled data. Example: Web page classification.

• Reinforcement Learning: The model learns by interacting with an environment and
receiving rewards or penalties. Example: Game playing agents like AlphaGo.

3. How is machine learning different from traditional programming?


In traditional programming, a developer writes code to specify exact instructions for the
computer. In machine learning, the algorithm learns patterns from data and creates a model
that makes predictions or decisions. ML focuses on data-driven learning, whereas traditional
programming is rule-based.

4. Define supervised learning and provide an example.


Supervised learning involves training a model on a labeled dataset, where the input data is
associated with the correct output. The model learns to map inputs to outputs. Example:
Predicting house prices based on features like size, location, and number of rooms.

5. Define unsupervised learning and provide an example.


Unsupervised learning deals with unlabeled data. The model tries to find hidden patterns or
structures in the input data. Example: Clustering customers into different groups based on
purchasing behavior.

6. What is semi-supervised learning?


Semi-supervised learning is a hybrid approach that uses a small amount of labeled data and
a large amount of unlabeled data. It helps improve model accuracy when labeling data is
expensive or time-consuming. Example: Classifying large volumes of social media posts
with minimal manual labeling.

7. Explain reinforcement learning with an example.


Reinforcement learning is a feedback-based learning process where an agent interacts with
an environment and learns to make decisions by receiving rewards or penalties. Example: A
robot learning to walk by trial and error, receiving rewards for successful steps.

8. What is the Probably Approximately Correct (PAC) learning framework?


PAC learning is a theoretical framework that analyzes the feasibility of learning algorithms.
It provides guarantees on the accuracy and confidence of a learning algorithm. A concept class is
PAC-learnable if a learner can, from a reasonable number of examples, produce a hypothesis
that is approximately correct (error at most ε) with high probability (at least 1 - δ).

9. How does PAC learning help in understanding machine learning algorithms?


PAC learning provides a mathematical basis for evaluating learning algorithms. It answers
whether a learning algorithm can achieve good generalization with a finite amount of data. It
helps define bounds on training data requirements and error rates.
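
One standard example of such a bound, stated here as an assumption beyond the original answer, applies to a finite hypothesis space H and a learner that always outputs a hypothesis consistent with the training data: m ≥ (1/ε)(ln|H| + ln(1/δ)) examples are sufficient. A minimal sketch with hypothetical numbers:

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Sufficient number of examples for a consistent learner over a finite hypothesis space."""
    if hypothesis_count < 1 or not 0 < epsilon < 1 or not 0 < delta < 1:
        raise ValueError("need |H| >= 1 and 0 < epsilon, delta < 1")
    m = (math.log(hypothesis_count) + math.log(1 / delta)) / epsilon
    return math.ceil(m)

# Hypothetical setting: 1000 hypotheses, error at most 10%, confidence 95%
print(pac_sample_bound(hypothesis_count=1000, epsilon=0.1, delta=0.05))  # 100
```

The bound grows only logarithmically with the size of the hypothesis space but linearly with 1/ε, which is why demanding a smaller error is the expensive part.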
10. What are the limitations of PAC learning?
PAC learning assumes ideal conditions (e.g., noise-free data, well-defined hypothesis space),
which may not reflect real-world scenarios. It may not account for computational feasibility
and often provides conservative error bounds.

Learning Paradigms

11. What is the difference between generative and discriminative models?


Generative models learn the joint probability distribution P(x, y) and can generate new data
instances. They model how the data is generated and can perform classification by applying
Bayes’ theorem. Examples: Naive Bayes, Hidden Markov Models. Discriminative models
learn the conditional probability P(y|x) or directly map inputs to outputs. They focus on the
decision boundary between classes. Examples: Logistic Regression, SVM, Neural
Networks.

12. Explain the concept of online learning.


Online learning is a learning paradigm where the model learns one instance at a time, rather
than on the entire dataset. It is useful when data arrives in a stream or cannot be stored
completely. It enables the model to adapt quickly to new data.
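
A perceptron is a simple example of an online learner, because its update rule touches one instance at a time and never needs the full dataset. The sketch below assumes labels in {-1, +1}; the function names and the small stream are hypothetical.

```python
def perceptron_update(weights, bias, x, y, lr=1.0):
    """Process one labeled instance (y in {-1, +1}) and return the updated parameters."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    if y * activation <= 0:  # misclassified or on the boundary: adjust
        weights = [w + lr * y * xi for w, xi in zip(weights, x)]
        bias += lr * y
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1

def train_online(stream):
    weights, bias = None, 0.0
    for x, y in stream:  # instances arrive one by one
        if weights is None:
            weights = [0.0] * len(x)
        weights, bias = perceptron_update(weights, bias, x, y)
    return weights, bias

# Hypothetical stream of (features, label) pairs
stream = [((2, 1), 1), ((-1, -2), -1), ((1, 3), 1), ((-3, -1), -1)]
weights, bias = train_online(stream)
print(weights, bias)
print([predict(weights, bias, x) for x, _ in stream])
```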

13. What is transfer learning and how is it used in machine learning?


Transfer learning leverages knowledge gained from solving one problem and applies it to a
different but related problem. For instance, a model trained to recognize animals can be
fine-tuned to identify birds. It is widely used in computer vision and NLP.

14. Define multi-task learning.


Multi-task learning is an approach where a model is trained on multiple related tasks
simultaneously. It shares common representations across tasks, leading to better
generalization and efficiency. Example: Joint learning of object detection and segmentation.

15. What is the concept of meta-learning in machine learning?


Meta-learning, or "learning to learn," involves designing models that learn new tasks more
efficiently based on prior experience. It is often applied in few-shot learning where the
model must adapt to new tasks with very few examples.

Basics of Probability

16. How is probability theory used in machine learning?


Probability theory underpins many ML models. It is used to model uncertainty, make
predictions, and derive learning algorithms. Probabilistic models like Naive Bayes, Bayesian
Networks, and Hidden Markov Models rely on probability theory.

17. Define and explain Bayes' Theorem.


Bayes’ Theorem is a way of finding a probability when we know certain other probabilities.
It is expressed as:
P(A|B) = [P(B|A) * P(A)] / P(B)
It helps in updating the probability of a hypothesis based on new evidence.
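
A worked example makes the update concrete. The sketch expands P(B) with the law of total probability, P(B) = P(B|A)P(A) + P(B|not A)P(not A); the spam-filter numbers are hypothetical.

```python
def posterior(prior, likelihood, likelihood_given_not_a):
    """P(A|B) from P(A), P(B|A) and P(B|not A)."""
    evidence = likelihood * prior + likelihood_given_not_a * (1 - prior)  # P(B)
    return likelihood * prior / evidence

# Hypothetical spam filter:
#   A = "email is spam", B = "email contains the word 'offer'"
p_spam = 0.2                 # prior P(A)
p_word_given_spam = 0.6      # likelihood P(B|A)
p_word_given_ham = 0.05      # P(B|not A)

print(posterior(p_spam, p_word_given_spam, p_word_given_ham))  # about 0.75
```

Seeing the word raises the probability of spam from the prior of 0.2 to a posterior of about 0.75.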

18. What is the difference between prior, likelihood, and posterior probabilities?

• Prior (P(A)): The initial belief before seeing the data.

• Likelihood (P(B|A)): The probability of observing the data given the hypothesis.
• Posterior (P(A|B)): The updated belief after observing the data.

19. What is a probability distribution?


A probability distribution describes how the probabilities are distributed over different
outcomes. It can be discrete (like a die roll) or continuous (like height of people).

20. Explain the difference between discrete and continuous probability distributions.

• Discrete distributions represent variables that take on a countable number of outcomes
(e.g., Binomial, Poisson).

• Continuous distributions represent variables that can take any value within a continuous
range (e.g., Normal, Exponential).

21. What is the role of probability in supervised learning?


Probability helps in making predictions and estimating confidence in outputs. For example,
classifiers output probabilities of each class, and probabilistic loss functions like
cross-entropy are widely used.

Version Spaces

22. What is a version space in machine learning?


A version space is the subset of all hypotheses that are consistent with the observed training
examples. It represents all possible models that explain the data correctly.
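
For a very small hypothesis space the version space can be found by brute force: enumerate every hypothesis and keep the consistent ones. The sketch assumes conjunctive hypotheses in which each attribute is either a specific value or the wildcard "?"; the two-attribute bird domain is hypothetical.

```python
from itertools import product

def covers(hypothesis, instance):
    """A hypothesis covers an instance if every constraint is '?' or matches."""
    return all(h == "?" or h == x for h, x in zip(hypothesis, instance))

def version_space(examples, domains):
    """All conjunctive hypotheses that classify every training example correctly."""
    candidates = product(*[list(values) + ["?"] for values in domains])
    return [
        h for h in candidates
        if all(covers(h, x) == label for x, label in examples)
    ]

# Hypothetical attributes: (has_feathers, can_fly)
domains = [("yes", "no"), ("yes", "no")]
examples = [
    (("yes", "yes"), True),   # sparrow: a bird
    (("no", "yes"), False),   # bat: not a bird
]
for h in version_space(examples, domains):
    print(h)
```

Of the nine candidate hypotheses only ('yes', 'yes') and ('yes', '?') survive: the first is the most specific consistent hypothesis and the second the most general. Real algorithms avoid this enumeration by tracking only those two boundaries.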

23. How does the version space algorithm work?


It maintains two sets of hypotheses:

• Specific boundary (S): Most specific hypotheses.

• General boundary (G): Most general hypotheses. As new training data arrives, the
algorithm updates S and G to retain only consistent hypotheses.

24. Explain the concept of the General-to-Specific ordering of hypotheses in a version space.
This ordering allows comparison of hypotheses based on their generality. One hypothesis is
more general than another if it covers every instance the other covers, i.e., it imposes fewer
constraints. The version space is bounded by the most
specific and most general consistent hypotheses.

25. What is the importance of version spaces in hypothesis selection?


Version spaces help narrow down the hypothesis space, ensuring only consistent models are
considered. This helps in selecting a better generalizing hypothesis and avoiding overfitting.

26. How can the version space be used for concept learning?
Concept learning involves identifying a general rule from specific examples. The version
space helps track consistent hypotheses, and learners refine it with each new example until
convergence.

Machine Learning in Practice - Data Collection

27. What are the common methods for data collection in machine learning?

• Manual data entry


• Sensors and IoT devices

• Web scraping

• Public datasets

• Surveys and feedback forms

• Logs from applications

28. How can data quality impact the performance of a machine learning model?
Poor quality data can lead to inaccurate models. Missing values, noise, outliers, and biased
data reduce performance. Clean, accurate, and relevant data ensures better generalization
and model reliability.

29. What are the ethical considerations in data collection?

• Privacy: Respect user consent and avoid unauthorized data collection.

• Bias: Avoid collecting biased or non-representative data.

• Transparency: Inform users how their data will be used.

• Security: Protect data from breaches.

Preprocessing

30. Why is data preprocessing important in machine learning?


Preprocessing transforms raw data into a clean format suitable for modeling. It improves
data quality, model performance, and reduces errors during training.

31. How can missing values be handled in a dataset?

• Removal of rows/columns

• Imputation (mean, median, mode)

• Using algorithms that support missing values (e.g., XGBoost)

• Predicting missing values using other features

32. Explain the concept of normalization in data preprocessing.


Normalization rescales features to a common scale so that no feature dominates merely
because of its units. Min-Max scaling maps values to a specific range (e.g., [0,1]), while Z-score
normalization (standardization) rescales them to zero mean and unit standard deviation.
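
Both methods fit in a few lines. This is a minimal sketch that assumes the feature is not constant (otherwise the denominators are zero); the sample values are hypothetical.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score_scale(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical feature column
incomes = [10, 20, 30, 40, 50]
print(min_max_scale(incomes))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z_score_scale(incomes))  # symmetric around 0
```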

33. How can you adapt a dataset to the chosen algorithm?

• For tree-based models: Less preprocessing is needed.

• For distance-based models: Normalize/standardize features.

• Encode categorical data (e.g., One-Hot, Label Encoding)

• Handle imbalanced data (e.g., SMOTE, undersampling)


Outlier Analysis (Z-Score)

34. What is an outlier in a dataset?


An outlier is a data point that significantly differs from other observations. It may indicate
variability in data or errors in data collection.

35. How can the Z-Score be used for outlier detection?


Z-score measures how many standard deviations a data point is from the mean. If the
absolute Z-score is greater than a threshold (e.g., 3), the point is considered an outlier.

36. Explain the importance of handling outliers in machine learning.


Outliers can skew model predictions and reduce accuracy. Removing or properly treating
outliers improves robustness and generalization of the model.

Model Selection & Evaluation

37. What are the common methods for model selection?

• Train-test split

• Cross-validation

• Grid search with validation

• Ensemble comparison

38. How can cross-validation be used for model evaluation?


Cross-validation splits the data into K folds. The model is trained on K-1 folds and tested on
the remaining fold. This process is repeated K times, and the results are averaged for
evaluation.
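
The splitting logic can be sketched as follows. The folds here are contiguous and unshuffled (an assumption made for simplicity; data is usually shuffled first), and `fit_and_score` is a hypothetical callback that trains on the training indices and returns a score on the test indices.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the K folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def cross_validate(n_samples, k, fit_and_score):
    """Average the score returned for each fold."""
    scores = [fit_and_score(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

for train, test in k_fold_indices(6, 3):
    print("train:", train, "test:", test)
```

Every sample appears in exactly one test fold, so each one is used for evaluation once and for training K-1 times.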

39. What is overfitting and how can it be avoided?


Overfitting occurs when a model learns the training data too well, including noise. It
performs poorly on new data. Prevention methods:

• Use simpler models

• Regularization (L1, L2)

• Pruning (for trees)

• Cross-validation

40. Explain the concept of bias-variance tradeoff.


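Bias is the error from overly simple or incorrect assumptions in the model, which leads to
underfitting. Variance is the error from sensitivity to small fluctuations in the training data,
which leads to overfitting. A good model balances bias and variance to minimize total error.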

41. What are the common evaluation metrics for regression models?

• Mean Absolute Error (MAE)

• Mean Squared Error (MSE)

• Root Mean Squared Error (RMSE)


• R-squared (R²)
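
All four metrics follow from the prediction errors, as this sketch shows. It assumes the true values are not all identical (otherwise R² is undefined); the sample values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n           # mean absolute error
    mse = sum(e * e for e in errors) / n            # mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot                     # 1 - SS_res / SS_tot
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}

# Hypothetical targets and predictions
print(regression_metrics([3, 5, 7, 9], [2, 5, 8, 9]))
```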

42. What are the common evaluation metrics for classification models?

• Accuracy

• Precision

• Recall

• F1 Score

• ROC-AUC

Optimization of Tuning Parameters

43. What is hyperparameter tuning in machine learning?


Hyperparameter tuning involves selecting the best configuration of hyperparameters (like
learning rate, depth) to optimize model performance.

44. How can grid search be used for hyperparameter tuning?


Grid search performs exhaustive search over a specified hyperparameter grid. It trains
models on each combination and selects the one with best performance.
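
A minimal sketch of the idea, where `toy_score` is a hypothetical stand-in for "train a model with these hyperparameters and return its validation score":

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(learning_rate, depth):
    """Hypothetical validation score that peaks at learning_rate=0.1, depth=3."""
    return -((learning_rate - 0.1) ** 2) - (depth - 3) ** 2

grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 3, 4]}
print(grid_search(grid, toy_score))
```

With three values per hyperparameter the grid already needs nine evaluations; the count multiplies with every added hyperparameter.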

45. Explain the concept of random search for hyperparameter optimization.


Random search samples random combinations from the hyperparameter space instead of
checking all combinations, making it faster and more scalable.
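
Random search can be sketched by drawing each hyperparameter from a sampler for a fixed number of trials. The samplers, the `seed` parameter and `toy_score` (a stand-in for a real validation score) are assumptions made for illustration.

```python
import random

def random_search(param_space, score_fn, n_trials, seed=0):
    """Sample n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: sampler(rng) for name, sampler in param_space.items()}
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(learning_rate, depth):
    """Hypothetical validation score that peaks at learning_rate=0.1, depth=3."""
    return -((learning_rate - 0.1) ** 2) - (depth - 3) ** 2

space = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform in [0.001, 1]
    "depth": lambda rng: rng.randint(1, 6),                 # integer in [1, 6]
}
print(random_search(space, toy_score, n_trials=20))
```

The budget (`n_trials`) is chosen directly rather than being dictated by the grid size, which is what makes the method scale to many hyperparameters.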

46. What are the challenges in hyperparameter tuning?

• High computation cost

• Curse of dimensionality

• Time-consuming

• Risk of overfitting to validation data

Setting the Environment

47. What are the common development environments for machine learning?

• Jupyter Notebooks

• Google Colab

• Anaconda

• VS Code with ML plugins

• PyCharm

48. Explain the importance of reproducibility in machine learning experiments.


Reproducibility ensures that results can be consistently repeated. It helps in debugging,
collaboration, and deploying reliable ML models. Using version control, fixing random
seeds, and documenting processes aid reproducibility.
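
The effect of fixing a random seed can be seen in a small sketch. The function name is hypothetical, and the random numbers stand in for any randomized step such as shuffling data or initializing weights.

```python
import random

def run_experiment(seed):
    """Stand-in for a randomized experiment: returns three pseudo-random numbers."""
    rng = random.Random(seed)  # a dedicated, seeded generator
    return [rng.random() for _ in range(3)]

print(run_experiment(42) == run_experiment(42))  # True: same seed, same results
print(run_experiment(42) == run_experiment(7))   # False: different seed
```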

Visualization of Results

49. Why is visualization important in machine learning?


Visualization helps interpret models, understand data, communicate results, and detect
patterns or issues. It bridges the gap between data and decision-making.

50. What are some common visualization techniques used to interpret machine learning
results?

• Confusion Matrix

• ROC Curve

• Feature Importance

• Scatter plots, histograms

• Heatmaps

• SHAP and LIME for model interpretation

