
Comprehensive Overview of Common ML Techniques, Functions, and Terms


Below is a quick-revision summary of key machine learning techniques, their functions, and brief descriptions.

1. Decision Trees
1. DecisionTreeClassifier: Creates a tree-based model for classification.

o Parameters:

▪ criterion='gini' or 'entropy': Determines the split quality metric.

▪ max_depth: Limits the depth of the tree to prevent overfitting.

o Code usage:

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(criterion='entropy', max_depth=3)

model.fit(X_train, y_train)

2. plot_tree(): Visualizes the decision tree.

from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

plot_tree(model, feature_names=X.columns, class_names=['Class1', 'Class2'])
plt.show()

2. Random Forests
1. RandomForestClassifier: Constructs multiple decision trees and combines outputs
(ensemble learning).

o Parameters:

▪ n_estimators: Number of trees in the forest.

▪ max_features: Number of features considered at each split.

o Code usage:

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

2. Feature Importance:

model.feature_importances_
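
A minimal sketch for ranking features by importance, assuming X is a pandas DataFrame with named columns:

import pandas as pd

# Pair each importance with its column name and sort descending (assumes X is a DataFrame)
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))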

3. Support Vector Machines (SVM)


1. SVC: Implements support vector classification.

o Parameters:

▪ kernel='linear' or 'rbf': Specifies the kernel type.

▪ C: Regularization parameter.

from sklearn.svm import SVC

model = SVC(kernel='rbf', C=1.0)

model.fit(X_train, y_train)

2. Visualizing the Data (2D case; a decision-boundary sketch follows below):

import matplotlib.pyplot as plt


plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
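
To show the learned boundary as well, classify a grid of points and overlay the regions. A minimal sketch, assuming X has exactly two features and model is the fitted SVC above:

import numpy as np

# Build a grid spanning the data and predict the class of each grid point
xx, yy = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
                     np.linspace(X[:, 1].min(), X[:, 1].max(), 200))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.3, cmap='viridis')  # shaded decision regions
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')  # data points
plt.show()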

4. K-Nearest Neighbors (KNN)


1. KNeighborsClassifier: Classifies data points based on the nearest neighbors.

o Parameters:

▪ n_neighbors: Number of neighbors to consider.

from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=5)

model.fit(X_train, y_train)

5. Logistic Regression
1. LogisticRegression: Models binary classification problems using the sigmoid function.

o Parameters:

▪ penalty='l2': Regularization type.

▪ solver='lbfgs': Optimization algorithm.


from sklearn.linear_model import LogisticRegression

model = LogisticRegression()

model.fit(X_train, y_train)

2. Predict Probabilities:

model.predict_proba(X_test)

6. Naive Bayes
1. GaussianNB: Implements the Gaussian Naive Bayes algorithm for continuous data.

from sklearn.naive_bayes import GaussianNB

model = GaussianNB()

model.fit(X_train, y_train)

2. Multinomial Naive Bayes (for text classification):

from sklearn.naive_bayes import MultinomialNB

model = MultinomialNB()
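
A minimal text-classification sketch; docs (a list of raw strings) and labels y are assumed inputs, and CountVectorizer produces the word-count features MultinomialNB expects:

from sklearn.feature_extraction.text import CountVectorizer

# Convert raw text into word-count features, then fit the classifier above
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(docs)  # docs: list of strings (assumed)
model.fit(X_counts, y)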

7. K-Means Clustering
1. KMeans: Performs clustering by minimizing within-cluster variance.

o Parameters:

▪ n_clusters: Number of clusters.

from sklearn.cluster import KMeans

model = KMeans(n_clusters=3)
model.fit(X)

labels = model.predict(X)

2. Silhouette Score (evaluates cluster quality; higher is better):

from sklearn.metrics import silhouette_score

silhouette_score(X, labels)
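
The elbow method is another way to choose the number of clusters: fit KMeans for a range of k and plot the within-cluster variance (inertia_); the "bend" in the curve suggests a good k. A minimal sketch:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Record inertia (within-cluster sum of squares) for k = 1..10
inertias = [KMeans(n_clusters=k).fit(X).inertia_ for k in range(1, 11)]
plt.plot(range(1, 11), inertias, marker='o')
plt.xlabel('Number of clusters k')
plt.ylabel('Inertia')
plt.show()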

8. Principal Component Analysis (PCA)


1. PCA: Reduces dimensionality by projecting the data onto the directions of maximum variance (the principal components).

o Parameters:

▪ n_components: Number of dimensions to reduce to.

from sklearn.decomposition import PCA

pca = PCA(n_components=2)

X_reduced = pca.fit_transform(X)
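
To check how much information the reduced representation keeps:

print(pca.explained_variance_ratio_)  # fraction of total variance per component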

9. Neural Networks
Keras Example (Deep Learning)

1. Sequential: Creates a neural network as a stack of layers.

2. Dense: Fully connected layers.

3. Conv2D and MaxPooling2D: Convolution and pooling layers for CNNs.

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
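
A minimal training call for the model above, assuming X_train holds 64x64 RGB images and y_train holds one-hot labels for the 10 classes:

model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
model.evaluate(X_test, y_test)  # X_test/y_test from a prior split (assumed)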

10. Ensemble Techniques


1. AdaBoostClassifier: Boosts the performance of weak learners (e.g., decision stumps).

from sklearn.ensemble import AdaBoostClassifier

model = AdaBoostClassifier(n_estimators=50)

model.fit(X_train, y_train)

2. GradientBoostingClassifier: Sequentially corrects errors from previous models.

from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier()

model.fit(X_train, y_train)

11. Evaluation Functions


1. Metrics:

o accuracy_score: Measures overall correctness.

o precision_score: Fraction of predicted positives that are actually positive.

o recall_score: Fraction of actual positives that are correctly identified.

o f1_score: Harmonic mean of precision and recall.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
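
Typical usage on a fitted binary classifier (X_test and y_test from a prior train/test split are assumed):

y_pred = model.predict(X_test)
accuracy_score(y_test, y_pred)
precision_score(y_test, y_pred)  # for multiclass, pass average='macro' or similar
recall_score(y_test, y_pred)
f1_score(y_test, y_pred)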

2. Confusion Matrix:
from sklearn.metrics import confusion_matrix

confusion_matrix(y_test, y_pred)

3. ROC and AUC:

from sklearn.metrics import roc_curve, auc

y_pred_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob)

auc(fpr, tpr)

Keras

1. Overview: A high-level neural-networks API built on TensorFlow, designed for fast
experimentation.

2. Key Features:

o Easy-to-use API: Simple prototyping with Sequential and Functional APIs.

o Extensibility: Can define custom layers and loss functions.

o Pre-trained Models: Accessible through keras.applications.

3. Important Functions and Layers:

o Sequential(): Simplifies stacking of layers.

o Dense(units, activation): Fully connected layer.

o Conv2D(filters, kernel_size, activation): Convolutional layer for image processing.

o MaxPooling2D(pool_size): Down-sampling layer.

o Dropout(rate): Reduces overfitting by randomly setting a fraction rate of inputs to 0 during training.

o model.compile(optimizer, loss, metrics): Configures training settings.

o model.fit(X, y, epochs, batch_size): Trains the model.

o model.evaluate(X, y): Evaluates the trained model.

o model.predict(X): Predicts outcomes on new data.
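
Putting these pieces together, a minimal end-to-end sketch on toy data (shapes and hyperparameters are illustrative assumptions):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Toy data: 100 samples, 20 features, binary labels (illustrative)
X = np.random.rand(100, 20)
y = np.random.randint(0, 2, size=100)

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),
    Dropout(0.5),                    # randomly zero half the activations during training
    Dense(1, activation='sigmoid')   # binary output
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5, batch_size=16)
model.evaluate(X, y)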

PyTorch

1. Overview: A deep learning framework with a dynamic computation graph, making it highly
flexible and easy to debug.
2. Key Features:

o Dynamic Computation Graph: Builds graphs on-the-fly.

o Autograd: Automates gradient computation.

o Custom Models: Allows defining models by subclassing torch.nn.Module.

3. Important Modules and Functions:

o torch.nn: Contains building blocks like layers and loss functions.

▪ nn.Linear(in_features, out_features): Fully connected layer.

▪ nn.ReLU(): Applies ReLU activation.

▪ nn.Conv2d(in_channels, out_channels, kernel_size): Convolutional layer.

▪ nn.CrossEntropyLoss(): Computes loss for classification tasks.

o torch.optim: Optimizers for training.

▪ optim.SGD(params, lr): Stochastic Gradient Descent.

▪ optim.Adam(params, lr): Adam optimizer.

o torch.autograd.grad(): Computes gradients of outputs with respect to inputs; in training loops, loss.backward() is the usual entry point.

o model.train(): Enables training mode.

o model.eval(): Enables evaluation mode.
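
A minimal training-loop sketch combining these pieces on toy data (shapes and hyperparameters are illustrative assumptions):

import torch
import torch.nn as nn
import torch.optim as optim

# Toy data: 100 samples, 20 features, 3 classes (illustrative)
X = torch.rand(100, 20)
y = torch.randint(0, 3, (100,))

# Define a small fully connected network by subclassing nn.Module
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(20, 64)
        self.fc2 = nn.Linear(64, 3)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

model.train()  # training mode
for epoch in range(5):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(model(X), y)  # forward pass + loss
    loss.backward()                # autograd computes gradients
    optimizer.step()               # update parameters

model.eval()  # evaluation mode before inference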
