TEXT CLASSIFICATION USING NAIVE BAYES
BINARY
CLASSIFICATION:
Involves only two categories,
such as positive/negative.
It is the simplest form of text classification, where the model decides
between two possible outcomes.
MULTI-CLASS CLASSIFICATION:
Text is classified into one category out of more than two possible
classes (e.g., classifying news articles as sports, politics, tech).
Each text input gets only one label, even if others may seem
relevant.
MULTI-LABEL CLASSIFICATION:
A single text can be assigned multiple labels (e.g., a tweet labeled as
sports and health).
It reflects real-world scenarios where content may belong to more
than one category simultaneously (see the sketch below).
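As a minimal sketch of how these label structures are typically encoded (assuming scikit-learn; the labels below are made up for illustration):

from sklearn.preprocessing import LabelEncoder, MultiLabelBinarizer

# Multi-class: every document receives exactly one label
multi_class_labels = ['sports', 'politics', 'tech']
print(LabelEncoder().fit_transform(multi_class_labels))        # one integer class id per document

# Multi-label: a document may carry several labels at once
multi_label_sets = [['sports'], ['sports', 'health'], ['health']]
print(MultiLabelBinarizer().fit_transform(multi_label_sets))   # one binary indicator row per document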
NAIVE BAYES
• The Naïve Bayes classifier belongs to the family of probabilistic classifiers: it computes the probability of each predictive feature of the data belonging to each class in order to predict a probability distribution over all classes, as well as the most likely class the data sample is associated with.
• Bayes: it maps the probabilities of observing the input features given each class to a probability distribution over the classes, based on Bayes' theorem.
• Naïve: it simplifies the probability computation by assuming that the predictive features are mutually independent, as sketched below.
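A minimal sketch of a Naïve Bayes text classifier in scikit-learn (the toy corpus, labels, and variable names below are assumptions for illustration):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus with spam (1) / ham (0) labels
docs = ['win a free prize now', 'meeting agenda attached',
        'free lottery winner today', 'project status update']
labels = [1, 0, 1, 0]

vectorizer = CountVectorizer()                 # term-frequency (tf) features
X = vectorizer.fit_transform(docs)
clf = MultinomialNB(alpha=1.0).fit(X, labels)  # alpha is the Laplace smoothing term

# Probability distribution over both classes for a new message
print(clf.predict_proba(vectorizer.transform(['free prize meeting'])))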
BAYES THEOREM
• Let A and B denote two events. An event can be that it will rain tomorrow, that two kings are drawn from a deck of cards, or that a person has cancer.
• P(A|B), the probability that A occurs given that B is true, can be computed by:
• P(A|B) = P(B|A)P(A) / P(B)
• where P(B|A) is the probability of observing B given that A occurs, and P(A) and P(B) are the probabilities that A occurs and that B occurs, respectively. For example:
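Suppose (numbers assumed purely for illustration) that 20% of emails are spam, so P(A) = 0.2; the word "free" appears in 60% of spam emails, so P(B|A) = 0.6; and "free" appears in 15% of all emails, so P(B) = 0.15. Then the probability that an email containing "free" is spam is P(A|B) = 0.6 × 0.2 / 0.15 = 0.8.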
PERFORMANCE EVALUATION
Limitation of plain term-frequency (tf) features: they do not account for how widely terms appear across the entire collection. Common words (like "the", "get", "make") may appear frequently in many documents, which reduces their usefulness for classification. tf-idf addresses this by weighting the term frequency with the inverse document frequency, down-weighting terms that occur in most documents.
We can test the effectiveness of tf-idf on our existing spam email detection model by simply replacing the tf feature extractor, CountVectorizer, with the tf-idf feature extractor, TfidfVectorizer, from scikit-learn.
The best averaged 10-fold AUC of 0.9943 is achieved, which outperforms the 0.9856 obtained with tf features.
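A minimal sketch of this swap (the preprocessed texts cleaned_emails and their labels are assumed to come from the earlier spam example; the Naïve Bayes settings are assumptions):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score

# Assumed inputs: cleaned_emails (list of preprocessed texts) and labels (0/1 spam flags)
tfidf_vectorizer = TfidfVectorizer(stop_words='english', max_features=8000)
X = tfidf_vectorizer.fit_transform(cleaned_emails)

# 10-fold cross-validated AUC with tf-idf features in place of raw term counts
auc_scores = cross_val_score(MultinomialNB(alpha=1.0), X, labels, cv=10, scoring='roc_auc')
print(f'Averaged 10-fold AUC: {auc_scores.mean():.4f}')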
Support Vector Machine (SVM)
•SVM is a powerful classifier often used for text data classification
tasks.
•In classification, SVM finds an optimal hyperplane that separates data
points from different classes.
•A hyperplane is a decision boundary in an n-dimensional feature
space:
• In 2D, it’s a line.
• In 3D, it’s a surface.
• In n dimensions, it’s an (n-1)-dimensional plane.
•The goal is to find the hyperplane that maximizes the margin — the
distance between the hyperplane and the nearest data points from
each class.
•These nearest points to the hyperplane are called support vectors.
•Support vectors are critical because they define the position and
orientation of the hyperplane.
The Mechanics of SVM
•There can be infinite possible hyperplanes that separate data points from different
classes.
•The task is to find the optimal separating hyperplane — the one that correctly divides
the data and maximizes the margin.
•Scenario 1: Identifying the Separating Hyperplane
• A valid hyperplane must successfully separate data points based on their labels.
• In an example with hyperplanes A, B, and C:
• Only Hyperplane C correctly separates the classes.
• Hyperplanes A and B fail to segregate them properly.
•Mathematical Definition:
• In 2D, a line (hyperplane) is defined by:
• A slope vector w (a 2D vector)
• An intercept b
• In n-dimensional space, a hyperplane is similarly defined by:
• An n-dimensional vector w
• An intercept b
• Any point x lying on the hyperplane satisfies the equation wx + b = 0.
• A hyperplane is a separating hyperplane if, for any data point x from one class, it satisfies wx + b > 0, while for any data point x from the other class, it satisfies wx + b < 0.
There can be countless possible solutions for w and b. So, next we will learn how to identify the best hyperplane among the possible separating hyperplanes.
Scenario 2 — Determining the Optimal Hyperplane
•Among many separating hyperplanes, the optimal hyperplane is the one that maximizes the margin between classes.
•Margin = the sum of:
• The distance from the nearest data point on the positive side to the hyperplane.
• The distance from the nearest data point on the negative side to the hyperplane.
•These nearest points from each class define two additional hyperplanes:
• Positive hyperplane: passes through the closest positive class point(s).
• Negative hyperplane: passes through the closest negative class point(s).
•The perpendicular distance between the positive and negative hyperplanes is called the margin.
•A decision hyperplane is optimal when this margin is maximized.
•The points that lie exactly on the margin boundaries (on the positive or negative hyperplane) are called support vectors.
•Support vectors are the critical data points that influence the position and orientation of the optimal hyperplane.
The value of wx + b for a data point can be portrayed as the distance from the data point to the decision hyperplane, and also interpreted as the confidence of the prediction: the higher the absolute value, the further the point is from the decision boundary, and the more certain the prediction.
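A small toy sketch (data assumed for illustration) of the quantities just described: the hyperplane parameters w and b, the support vectors, and the value of wx + b as prediction confidence:

import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2D data (assumed for illustration)
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel='linear', C=1.0).fit(X, y)

print(svm.coef_, svm.intercept_)    # w and b of the decision hyperplane wx + b = 0
print(svm.support_vectors_)         # the support vectors that define the margin
print(svm.decision_function(X))     # wx + b per point: larger magnitude = higher confidence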
Although we cannot wait to implement the SVM algorithm, let's take a step back and look
at a frequent scenario where data points are not perfectly linearly separable.
Scenario 3 - handling outliers
To deal with a set of observations containing outliers that make it impossible to linearly segregate the entire dataset, we allow misclassification of such outliers and try to minimize the error introduced. The trade-off between a wide margin and few misclassifications is controlled by the penalty parameter C.
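A minimal sketch (toy data with one outlier, assumed for illustration) of how C influences the fitted hyperplane:

import numpy as np
from sklearn.svm import SVC

# Toy 2D data; the last point is a class-1 outlier sitting inside the class-0 cluster
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [6.0, 6.0], [7.0, 7.0], [6.0, 7.0], [1.5, 1.5]])
y = np.array([0, 0, 0, 1, 1, 1, 1])

soft = SVC(kernel='linear', C=0.1).fit(X, y)     # small C: tolerate the outlier, keep a wide margin
hard = SVC(kernel='linear', C=1000.0).fit(X, y)  # large C: penalize misclassification heavily

# The margin width equals 2 / ||w||, so compare the norms of the learned weight vectors
print(np.linalg.norm(soft.coef_), np.linalg.norm(hard.coef_))
print(soft.predict([[1.5, 1.5]]), hard.predict([[1.5, 1.5]]))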
Next, extract tf-idf features using the TfidfVectorizer extractor that we just acquired:
>>> tfidf_vectorizer = TfidfVectorizer(sublinear_tf=True,
...                                    max_df=0.5, stop_words='english', max_features=8000)
>>> term_docs_train = tfidf_vectorizer.fit_transform(cleaned_train)
>>> term_docs_test = tfidf_vectorizer.transform(cleaned_test)
Now we can apply our SVM algorithm with the features ready. Initialize an SVC model with the kernel parameter set to linear (we will explain this shortly) and the penalty C set to the default value 1, and fit it on the training set:
>>> from sklearn.svm import SVC
>>> svm = SVC(kernel='linear', C=1.0, random_state=42)
>>> svm.fit(term_docs_train, label_train)
Then predict on the testing set with the trained model and obtain the prediction accuracy directly:
>>> accuracy = svm.score(term_docs_test, label_test)
>>> print('The accuracy on testing set is: {0:.1f}%'.format(accuracy*100))
The accuracy on testing set is: 96.4%
Our first SVM model works very well, achieving 96.4% accuracy.
Scenario 4 - dealing with more than two classes
SVM and many other classifiers can be generalized to the multiple class case by two
common approaches, one-vs-rest (also called one-vs-all) and one-vs-one.
In scikit-learn, classifiers handle multiclass cases internally, and we do not need to explicitly write any additional code to enable it. We can see how simple it is in the following example of classifying five topics: comp.graphics, sci.space, alt.atheism, talk.religion.misc, and rec.sport.hockey:
>>> categories = [
... 'alt.atheism',
... 'talk.religion.misc',
... 'comp.graphics',
... 'sci.space',
... 'rec.sport.hockey'
... ]
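A minimal sketch of the rest of this multiclass example (it reuses the categories list above; the fetch_20newsgroups loading step and the tf-idf/SVC settings mirror the earlier binary case, with text cleaning omitted here for brevity):

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

data_train = fetch_20newsgroups(subset='train', categories=categories, random_state=42)
data_test = fetch_20newsgroups(subset='test', categories=categories, random_state=42)

tfidf_vectorizer = TfidfVectorizer(sublinear_tf=True, max_df=0.5,
                                   stop_words='english', max_features=8000)
term_docs_train = tfidf_vectorizer.fit_transform(data_train.data)
term_docs_test = tfidf_vectorizer.transform(data_test.data)

# SVC handles the five classes internally (one-vs-one under the hood)
svm = SVC(kernel='linear', C=1.0, random_state=42)
svm.fit(term_docs_train, data_train.target)
print(f'Accuracy: {svm.score(term_docs_test, data_test.target) * 100:.1f}%')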
Not bad! And we could, as usual, tweak the values of the parameters kernel='linear' and C=1.0 specified in our SVC model. As discussed, the parameter C controls the strictness of separation, and it can be tuned to achieve the best trade-off between bias and variance.
The kernels of SVM
Scenario 5 - solving linearly non-separable problems
The hyperplanes we have looked at so far are linear, for example, a line in a two-dimensional feature space, or a flat surface in a three-dimensional one.
However, in frequently seen scenarios like the following one, we are not able to find any linear hyperplane that separates the two classes. The solution is to map the data into a higher-dimensional space with a kernel function; the most popular choice is the RBF (radial basis function, or Gaussian) kernel. Again, its kernel coefficient γ can be fine-tuned via cross-validation to obtain the best performance, as sketched below. Some other common kernel functions include the polynomial kernel and the sigmoid kernel.
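A minimal sketch of tuning the RBF kernel (the toy dataset and parameter grid are assumptions for illustration):

from sklearn.datasets import make_circles
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy data that is not linearly separable: two concentric circles
X, y = make_circles(n_samples=200, noise=0.1, factor=0.4, random_state=42)

# Tune C and the RBF coefficient gamma jointly via cross-validation
parameters = {'C': [0.1, 1, 10], 'gamma': [0.1, 1, 10]}
grid_search = GridSearchCV(SVC(kernel='rbf'), parameters, cv=5)
grid_search.fit(X, y)

print(grid_search.best_params_, grid_search.best_score_)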
Choosing between the linear and RBF kernel
The rule of thumb, of course, is linear separability. However, this is most of the time very difficult to determine, unless you have sufficient prior knowledge or the features are of low dimension (1 to 3).
Case 1: both the number of features and the number of instances are large (more than 10^4 or 10^5). Since the dimension of the feature space is already high enough, the additional features resulting from the RBF transformation will not provide any performance improvement, but will increase the computational expense. Some examples from the UCI Machine Learning Repository are of this type:
• URL Reputation Data Set: https://archive.ics.uci.edu/ml/datasets/URL+Reputation (number of instances: 2,396,130; number of features: 3,231,961) for malicious URL detection based on lexical and host information
• YouTube Multiview Video Games Data Set: https://archive.ics.uci.edu/ml/datasets/YouTube+Multiview+Video+Games+Dataset (number of instances: 120,000; number of features: 1,000,000) for topic classification
Case 2: the number of features is noticeably large compared to the number of training samples. Apart from the reasons stated in Case 1, the RBF kernel is significantly more prone to overfitting. Such a scenario occurs in, for example:
• Dorothea Data Set: https://archive.ics.uci.edu/ml/datasets/Dorothea (number of instances: 1,950; number of features: 100,000) for drug discovery, classifying chemical compounds as active or inactive based on structural molecular features
• Arcene Data Set: https://archive.ics.uci.edu/ml/datasets/Arcene (number of instances: 900; number of features: 10,000), a mass-spectrometry dataset for cancer detection
Case 3: the number of instances is significantly large compared to the number of features. For a dataset of low dimension, the RBF kernel will, in general, boost performance by mapping it to a higher-dimensional space. However, due to its training complexity, it usually becomes inefficient on a training set with more than 10^6 or 10^7 samples. Some exemplar datasets include:
• Heterogeneity Activity Recognition Data Set: https://archive.ics.uci.edu/ml/datasets/Heterogeneity+Activity+Recognition (number of instances: 43,930,257; number of features: 16) for human activity recognition
• HIGGS Data Set: https://archive.ics.uci.edu/ml/datasets/HIGGS (number of instances: 11,000,000; number of features: 28) for distinguishing between a signal process producing Higgs bosons and a background process
News topic classification with support vector machine
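The snippet below assumes a GridSearchCV object named grid_search has already been fitted on the training tf-idf features; a minimal sketch of how it might be set up (the parameter grid and cross-validation settings are assumptions):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Assumed inputs: term_docs_train and label_train from the tf-idf pipeline above
parameters = {'C': (0.1, 1, 10, 100)}
svc_libsvm = SVC(kernel='linear')
grid_search = GridSearchCV(svc_libsvm, parameters, n_jobs=-1, cv=3)
grid_search.fit(term_docs_train, label_train)
print(grid_search.best_params_, grid_search.best_score_)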
svc_libsvm_best = grid_search.best_estimator_
accuracy = svc_libsvm_best.score(term_docs_test, label_test)
print(f'The accuracy on testing set is: {accuracy * 100:.1f}%')
Stock Price Prediction using Linear Regression
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error

# Load the historical stock data
df = pd.read_csv("NSE-TATAGLOBAL.csv")
df.columns = df.columns.str.strip()   # strip stray whitespace from column names
print(df.columns)

# Parse dates and index the frame chronologically
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df.set_index('Date', inplace=True)
df.dropna(inplace=True)
plt.figure(figsize=(14, 6))
plt.plot(df['Close'], color='blue')
plt.title('TATAGLOBAL Closing Price Over Time')
plt.xlabel('Date')
plt.ylabel('Closing Price (INR)')
plt.grid(True)
plt.show()
# Engineer predictive features from the raw prices
df['Open-Close'] = df['Open'] - df['Close']
df['High-Low'] = df['High'] - df['Low']
df['7day MA'] = df['Close'].rolling(window=7).mean()
df['14day MA'] = df['Close'].rolling(window=14).mean()
df.dropna(inplace=True)
df['Target'] = df['Close'].shift(-1)   # target: next day's closing price
df.dropna(inplace=True)
# Feature columns (assumed to be the engineered columns above)
features = ['Open-Close', 'High-Low', '7day MA', '14day MA']
X = df[features]
y = df['Target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
print("Training Set:", X_train.shape)
print("Test Set:", X_test.shape)
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"R2 Score: {r2:.2f}")
print(f"RMSE: {rmse:.2f}")
OUTPUT-
R2 Score: 0.99
RMSE: 0.94
plt.figure(figsize=(14, 6))
plt.plot(y_test.index, y_test, label='Actual', color='blue')
plt.plot(y_test.index, y_pred, label='Predicted', color='orange')
plt.title("Actual vs Predicted Stock Prices")
plt.xlabel("Date")
plt.ylabel("Price (INR)")
plt.legend()
plt.grid(True)
plt.show()
next_day_input = X.tail(1)
next_day_prediction = model.predict(next_day_input)
print(f"Predicted next day's closing price: ₹{next_day_prediction[0]:.2f}")
OUTPUT-
Predicted next day's closing price: ₹155.44
Applications of Stock Prediction using Linear Regression