4_ML_Manual

The workshop focuses on gaining skills in various data classification methods using Python's Sklearn library. Participants are required to load or generate a DataFrame, perform exploratory data analysis, choose a classification method, and evaluate model performance through various metrics. The final report should be a PowerPoint presentation with specific requirements, including code and input data files.

Uploaded by

Makinator
Copyright
© © All Rights Reserved

Workshop №4

Data classification.

The purpose of the work: gaining skills in different classification methods.

Progress:

To prepare for working with DataFrames, you should use the materials from the lectures and
finish workshop #3. For data classification you can use the Sklearn library,
which contains a range of useful algorithms that can easily be implemented and
configured for classification and other machine learning tasks. You can
also use other Python libraries of your choice. ☺

1. Load the DataFrame that you used in workshop #3, or generate a
random DataFrame for a classification problem with your own settings (using
the make_classification() function).
2. Generate descriptive statistics. Group the DataFrame by one or more
columns of your choice. Create two count plots of your choice.
3. Choose a classification method (K-Nearest Neighbors, Support Vector
Machines, Decision Tree Classifiers, Naive Bayes, Linear Discriminant Analysis,
Logistic Regression and so on) and perform the necessary data preparation for this
model. Justify the choice of preprocessing steps.
4. Separate the dataset into feature columns and target column. Create the
testing and training splits (import train_test_split).
5. Perform an Exploratory Data Analysis (import pyplot, import seaborn).
6. Analyze the model settings. Create classification models with different
settings. For every model with a different setting:
6.1 Evaluate the model on the validation (testing) dataset.
6.2 Make a classification report (import ConfusionMatrixDisplay,
roc_curve, auc, classification_report, accuracy_score, confusion_matrix). Visualize
the confusion matrix. Plot the ROC curve.
7. Additional step. Compare the quality of the model at different settings.
Present the comparison as a bar plot. Print the value of the metric above
each bar.
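
The steps above can be sketched roughly as follows. This is only one possible outline, not a required solution. It assumes a randomly generated binary dataset, K-Nearest Neighbors as the chosen method, and n_neighbors as the setting being varied. The helper names (make_data, evaluate_settings, plot_comparison), the column names and all parameter values are hypothetical and chosen purely for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, auc, confusion_matrix, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_data(n_samples=500, seed=0):
    """Step 1: generate a random binary classification DataFrame."""
    X, y = make_classification(
        n_samples=n_samples, n_features=6, n_informative=4,
        n_redundant=1, random_state=seed,
    )
    df = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])
    df["target"] = y
    return df


def evaluate_settings(df, k_values=(1, 5, 15), seed=0):
    """Steps 4 and 6: split the data, then fit and score one model per setting."""
    X = df.drop(columns="target")
    y = df["target"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    results = {}
    for k in k_values:
        # KNN is distance-based, so the features are standardized first
        # (one example of a preprocessing step that can be justified in step 3).
        model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1
        fpr, tpr, _ = roc_curve(y_test, y_score)
        results[k] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "auc": auc(fpr, tpr),
            "confusion_matrix": confusion_matrix(y_test, y_pred),
        }
    return results


def plot_comparison(results, metric="accuracy"):
    """Step 7: bar plot of one metric per setting, with the value above each bar."""
    import matplotlib.pyplot as plt

    labels = [f"k={k}" for k in results]
    values = [results[k][metric] for k in results]
    fig, ax = plt.subplots()
    bars = ax.bar(labels, values)
    ax.bar_label(bars, fmt="%.3f")  # requires matplotlib 3.4 or newer
    ax.set_ylabel(metric)
    ax.set_title(f"KNN {metric} for different n_neighbors")
    return fig


if __name__ == "__main__":
    df = make_data()
    print(df.describe())  # step 2: descriptive statistics
    for k, scores in evaluate_settings(df).items():
        print(f"k={k}: accuracy={scores['accuracy']:.3f}, AUC={scores['auc']:.3f}")
```

Calling plot_comparison(evaluate_settings(df)) returns a Matplotlib figure that can be shown or saved. The count plots, the confusion matrix display and the ROC curve plot are left out of this sketch to keep it short; ConfusionMatrixDisplay and the fpr/tpr arrays from roc_curve can be used for those.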

Reporting
The report is a PowerPoint presentation with slides that should contain a SLIDE
NAME, optionally a short description, and charts with particular results (no more than 10
slides). Also, please add a file with the program code in *.ipynb format and the input
data (csv or xlsx format). Do not forget to sign the presentation on the first slide. The
file name must contain your last name and the workshop number. ☺
