How to choose Machine Learning algorithm.

Apr 1, 20201 like322 views

The document discusses choosing machine learning algorithms for classification problems. It recommends first visualizing the dataset using a pair plot to understand the data structure. If there is high overlap between classes, logistic regression and decision trees may not be suitable due to high error rates. For highly overlapped data, K-nearest neighbors (KNN) is recommended as it uses Euclidean distance to find similarities between data points based on their neighborhoods. Other options for highly overlapped data include random forests or deeper decision trees, but they increase computational costs. The key is to understand the dataset nature and properties before selecting an algorithm.

Machine Learning
Algorithms
Dilemma of choosing WHICH !?
February 23, 2020
Mala Deep Upadhaya

Machine Learning Algorithm
Slides 2 of 12
Choosing Machine Learning Algorithm

Assumption in the lecture
Slides 3 of 12
Choosing Machine Learning Algorithm
• We have a use case: Supervised Learning
system
• i.e. We are concerning with ML supervised
problem statement
• We have Dataset that may be Classification or
Regression problem set
Data set
Regression Classification
• Now, Which Algorithm to choose for the
Dataset?

Algorithm for Classification Problems
Slides 4 of 12
Classification
Logistic
Regression
KNN SVM
Extreme
Gradient
Boosting
Decision Tree
Random
Forest
Ensemble Learning Method : Model that
makes predictions based on a number of
different models
• Now, Which one of the above
algorithm to choose for the
Classification problem?
Choosing Machine Learning Algorithm

Algorithm Selection
Slides 5 of 12
Mostly - Blindly used
• Decision Tree
• Random Forest
• Logistic Regression
OR
Apply every Algorithm parallelly and check the accuracy to see which one is the best.
Task is
Time & Resource Consuming
Choosing Machine Learning Algorithm

Right way to choose Algorithm
Visualization of Data
• Library: Seaborn
• Function: Pairplot
Figure: Pairplot structure
Source: https://seaborn.pydata.org/generated/seaborn.pairplot.html
Slides 6 of 12
Choosing Machine Learning Algorithm

Right way to choose Algorithm
Choose Logistic Regression?
• High overlap of data
• So no straight line can be created
as error rate will be high
• Less accuracy
• Seen as non-linear classification
type problem
• No go with Logistic Regression
Algorithm
Slides 7 of 12
High Overlap of Data
Choosing Machine Learning Algorithm

Right way to choose Algorithm
Slides 8 of 12
• For Non-Linear classification
Choosing Machine Learning Algorithm

Right way to choose Algorithm
Slides 9 of 12
Choose Decision Tree?
• It is just multiple IF –ELSE
• Time for model train will be high in multiple overlap
Choosing Machine Learning Algorithm

Right way to choose Algorithm
Highly Overlap?
• Choose KNN
WHY?
• It uses the concept of Euclidian distance to find the similarities of the point where it
belongs to i.e. Based on neighborhood
Still not satisfied of using KNN then go with Random Forest or Decision Tree
but they will have deeper tree and increase the cost of operation
Choosing Machine Learning Algorithm
Highly Overlapped of Data
Slides 10 of 12

Summary
Slides 11 of 12Choosing Machine Learning Algorithm
Dilemma of Choosing ML Algorithm?
Know the nature of Dataset
IF it is supervised problem statement
Visualize the dataset with Pair plot
Dataset is highly overlap?
No use of Logical
Regression
No use Decision Tree as
cost of operation is high
Choose KNN

References
• https://www.youtube.com/watch?v=38SUUaMX5Rg https://slideplayer.com/slide/5219172/
• https://towardsdatascience.com/basic-ensemble-learning-random-forest-adaboost-gradient-boosting-step-by-step-explained-
95d49d1e2725
Slides 12 of 12Choosing Machine Learning Algorithm

Slide explaining the distinction between bagging and boosting while understanding the bias variance trade-off. Followed by some lesser known scope of supervised learning. understanding the effect of tree split metric in deciding feature importance. Then understanding the effect of threshold on classification accuracy. Additionally, how to adjust model threshold for classification in supervised learning. Note: Limitation of Accuracy metric (baseline accuracy), alternative metrics, their use case and their advantage and limitations were briefly discussed.

Random forest algorithmRashid Ansari

The document discusses the random forest algorithm. It introduces random forest as a supervised classification algorithm that builds multiple decision trees and merges them to provide a more accurate and stable prediction. It then provides an example pseudocode that randomly selects features to calculate the best split points to build decision trees, repeating the process to create a forest of trees. The document notes key advantages of random forest are that it avoids overfitting and can be used for both classification and regression tasks.

05 Clustering in Data MiningValerii Klymchuk

Ensemble learningHaris Jamil

Decision trees in Machine Learning Mohammad Junaid Khan

Presentation on K-Means ClusteringPabna University of Science & Technology

This presentation introduces clustering analysis and the k-means clustering technique. It defines clustering as an unsupervised method to segment data into groups with similar traits. The presentation outlines different clustering types (hard vs soft), techniques (partitioning, hierarchical, etc.), and describes the k-means algorithm in detail through multiple steps. It discusses requirements for clustering, provides examples of applications, and reviews advantages and disadvantages of k-means clustering.

Supervised and unsupervised learningParas Kohli

This document discusses and provides examples of supervised and unsupervised learning. Supervised learning involves using labeled training data to learn relationships between inputs and outputs and make predictions. An example is using data on patients' attributes to predict the likelihood of a heart attack. Unsupervised learning involves discovering hidden patterns in unlabeled data by grouping or clustering items with similar attributes, like grouping fruits by color without labels. The goal of supervised learning is to build models that can make predictions when new examples are presented.

Unit 2 unsupervised learning.pptxDr.Shweta

Unsupervised learning is a machine learning paradigm where the algorithm is trained on a dataset containing input data without explicit target values or labels. The primary goal of unsupervised learning is to discover patterns, structures, or relationships within the data without guidance from predefined categories or outcomes. It is a valuable approach for tasks where you want the algorithm to explore the inherent structure and characteristics of the data on its own.

Support vector machineMusa Hawamdah

This document discusses support vector machines (SVMs) for classification. It explains that SVMs find the optimal separating hyperplane that maximizes the margin between positive and negative examples. This is formulated as a convex optimization problem. Both primal and dual formulations are presented, with the dual having fewer variables that scale with the number of examples rather than dimensions. Methods for handling non-separable data using soft margins and kernels for nonlinear classification are also summarized. Popular kernel functions like polynomial and Gaussian kernels are mentioned.

Lecture 6: Ensemble Methods Marina Santini

Ensemble learning TechniquesBabu Priyavrat

This document provides an introduction to ensemble learning techniques. It defines ensemble learning as combining the predictions of multiple machine learning models. The main ensemble methods described are bagging, boosting, and voting. Bagging involves training models on random subsets of data and combining results by majority vote. Boosting iteratively trains models to focus on misclassified examples from previous models. Voting simply averages the predictions of different model types. The document discusses how these techniques are implemented in scikit-learn and provides examples of decision tree bagging on the Iris dataset.

Ensemble methods in machine learningSANTHOSH RAJA M G

Ensemble methods combine multiple machine learning models to obtain better predictive performance than from any individual model. There are two main types of ensemble methods: sequential (e.g AdaBoost) where models are generated one after the other, and parallel (e.g Random Forest) where models are generated independently. Popular ensemble methods include bagging, boosting, and stacking. Bagging averages predictions from models trained on random samples of the data, while boosting focuses on correcting previous models' errors. Stacking trains a meta-model on predictions from other models to produce a final prediction.

CART – Classification & Regression TreesHemant Chetwani

This document discusses Classification and Regression Trees (CART), a data mining technique for classification and regression. CART builds decision trees by recursively splitting data into purer child nodes based on a split criterion, with the goal of minimizing heterogeneity. It describes the 8 step CART generation process: 1) testing all possible splits of variables, 2) evaluating splits using reduction in impurity, 3) selecting the best split, 4) repeating for all variables, 5) selecting the split with most reduction in impurity, 6) assigning classes, 7) repeating on child nodes, and 8) pruning trees to avoid overfitting.

K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...Simplilearn

This K-Means clustering algorithm presentation will take you through the machine learning introduction, types of clustering algorithms, k-means clustering, how does K-Means clustering work and at least explains K-Means clustering by taking a real life use case. This Machine Learning algorithm tutorial video is ideal for beginners to learn how K-Means clustering work. Below topics are covered in this K-Means Clustering Algorithm presentation: 1. Types of Machine Learning? 2. What is K-Means Clustering? 3. Applications of K-Means Clustering 4. Common distance measure 5. How does K-Means Clustering work? 6. K-Means Clustering Algorithm 7. Demo: k-Means Clustering 8. Use case: Color compression - - - - - - - - About Simplilearn Machine Learning course: A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars.This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning. - - - - - - - Why learn Machine Learning? Machine Learning is taking over the world- and with that, there is a growing need among companies for professionals to know the ins and outs of Machine Learning The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period. - - - - - - What skills will you learn from this Machine Learning course? By the end of this Machine Learning course, you will be able to: 1. Master the concepts of supervised, unsupervised and reinforcement learning concepts and modeling. 2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project. 3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning. 4. Understand the concepts and operation of support vector machines, kernel SVM, naive bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more. 5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems - - - - - - -

Decision Tree LearningMilind Gokhale

Machine Learning AlgorithmsDezyreAcademy

The document discusses machine learning algorithms and provides descriptions of the top 10 algorithms. It begins by explaining the types of machine learning algorithms: supervised, unsupervised, and reinforcement learning. It then provides brief overviews of some of the most commonly used algorithms, including Naive Bayes, K-means clustering, support vector machines, Apriori, and others. For each algorithm, it gives a short description and links to additional resources.

Support Vector Machine ppt presentationAyanaRukasar

Support vector machines (SVM) is a supervised machine learning algorithm used for both classification and regression problems. However, it is primarily used for classification. The goal of SVM is to create the best decision boundary, known as a hyperplane, that separates clusters of data points. It chooses extreme data points as support vectors to define the hyperplane. SVM is effective for problems that are not linearly separable by transforming them into higher dimensional spaces. It works well when there is a clear margin of separation between classes and is effective for high dimensional data. An example use case in Python is presented.

Machine Learning - Ensemble MethodsAndrew Ferlitsch

Introduction to Machine LearningShahar Cohen

This document provides an introduction to machine learning. It discusses how children learn through explanations from parents, examples, and reinforcement learning. It then defines machine learning as programs that improve in performance on tasks through experience processing. The document outlines typical machine learning tasks including supervised learning, unsupervised learning, and reinforcement learning. It provides examples of each type of learning and discusses evaluation methods for supervised learning models.

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini

Classification and regression trees (cart)Learnbay Datascience

Machine Learning and Real-World ApplicationsMachinePulse

This presentation was created by Ajay, Machine Learning Scientist at MachinePulse, to present at a Meetup on Jan. 30, 2015. These slides provide an overview of widely used machine learning algorithms. The slides conclude with examples of real world applications. Ajay Ramaseshan, is a Machine Learning Scientist at MachinePulse. He holds a Bachelors degree in Computer Science from NITK, Suratkhal and a Master in Machine Learning and Data Mining from Aalto University School of Science, Finland. He has extensive experience in the machine learning domain and has dealt with various real world problems.

Support Vector Machines ( SVM ) Mohammad Junaid Khan

Reinforcement learning 7313Slideshare

This document discusses reinforcement learning. It defines reinforcement learning as a learning method where an agent learns how to behave via interactions with an environment. The agent receives rewards or penalties based on its actions but is not told which actions are correct. Several reinforcement learning concepts and algorithms are covered, including model-based vs model-free approaches, passive vs active learning, temporal difference learning, adaptive dynamic programming, and exploration-exploitation tradeoffs. Generalization methods like function approximation and genetic algorithms are also briefly mentioned.

Dimensionality Reductionmrizwan969

This document discusses dimensionality reduction techniques for data mining. It begins with an introduction to dimensionality reduction and reasons for using it. These include dealing with high-dimensional data issues like the curse of dimensionality. It then covers major dimensionality reduction techniques of feature selection and feature extraction. Feature selection techniques discussed include search strategies, feature ranking, and evaluation measures. Feature extraction maps data to a lower-dimensional space. The document outlines applications of dimensionality reduction like text mining and gene expression analysis. It concludes with trends in the field.

Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Simplilearn

This presentation about hierarchical clustering will help you understand what is clustering, what is hierarchical clustering, how does hierarchical clustering work, what is distance measure, what is agglomerative clustering, what is divisive clustering and you will also see a demo on how to group states based on their sales using clustering method. Clustering is the method of dividing the objects into clusters which are similar between them and are dissimilar to the objects belonging to another cluster. It is used to find data clusters such that each cluster has the most closely matched data. Prototype-based clustering, hierarchical clustering, and density-based clustering are the three types of clustering algorithms. Lets us discuss hierarchical clustering in this video. In simple terms, Hierarchical clustering is separating data into different groups based on some measure of similarity. Below topics are explained in this "Hierarchical Clustering" presentation: 1. What is clustering? 2. What is hierarchical clustering 3. How hierarchical clustering works? 4. Distance measure 5. What is agglomerative clustering 6. What is divisive clustering 7. Demo: to group states based on their sales Why learn Machine Learning? Machine Learning is taking over the world- and with that, there is a growing need among companies for professionals to know the ins and outs of Machine Learning The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period. What skills will you learn from this Machine Learning course? By the end of this Machine Learning course, you will be able to: 1. Master the concepts of supervised, unsupervised and reinforcement learning concepts and modeling. 2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project. 3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning. 4. Understand the concepts and operation of support vector machines, kernel SVM, naive Bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more. 5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems We recommend this Machine Learning training course for the following professionals in particular: 1. Developers aspiring to be a data scientist or Machine Learning engineer 2. Information architects who want to gain expertise in Machine Learning algorithms 3. Analytics professionals who want to work in Machine Learning or artificial intelligence 4. Graduates looking to build a career in data science and Machine Learning Learn more at www.simplilearn.com

From decision trees to random forestsViet-Trung TRAN

This document discusses decision trees and random forests for classification problems. It explains that decision trees use a top-down approach to split a training dataset based on attribute values to build a model for classification. Random forests improve upon decision trees by growing many de-correlated trees on randomly sampled subsets of data and features, then aggregating their predictions, which helps avoid overfitting. The document provides examples of using decision trees to classify wine preferences, sports preferences, and weather conditions for sport activities based on attribute values.

Support vector machinezekeLabs Technologies

Support vector machines are a type of supervised machine learning algorithm used for classification and regression analysis. They work by mapping data to high-dimensional feature spaces to find optimal linear separations between classes. Key advantages are effectiveness in high dimensions, memory efficiency using support vectors, and versatility through kernel functions. Hyperparameters like kernel type, gamma, and C must be tuned for best performance. Common kernels include linear, polynomial, and radial basis function kernels.

random forest.pptxPriyadharshiniG41

Random forest is an ensemble learning technique that builds multiple decision trees and merges their predictions to improve accuracy. It works by constructing many decision trees during training, then outputting the class that is the mode of the classes of the individual trees. Random forest can handle both classification and regression problems. It performs well even with large, complex datasets and prevents overfitting. Some key advantages are that it is accurate, efficient even with large datasets, and handles missing data well.

Machine Learning and its Appplications--sudarmani rajagopal

More Related Content

What's hot (20)

Support vector machineMusa Hawamdah

Lecture 6: Ensemble Methods Marina Santini

Ensemble learning TechniquesBabu Priyavrat

Ensemble methods in machine learningSANTHOSH RAJA M G

CART – Classification & Regression TreesHemant Chetwani

K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...Simplilearn

Decision Tree LearningMilind Gokhale

Machine Learning AlgorithmsDezyreAcademy

Support Vector Machine ppt presentationAyanaRukasar

Machine Learning - Ensemble MethodsAndrew Ferlitsch

Introduction to Machine LearningShahar Cohen

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini

Classification and regression trees (cart)Learnbay Datascience

Machine Learning and Real-World ApplicationsMachinePulse

Support Vector Machines ( SVM ) Mohammad Junaid Khan

Reinforcement learning 7313Slideshare

Dimensionality Reductionmrizwan969

Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Simplilearn

From decision trees to random forestsViet-Trung TRAN

Support vector machinezekeLabs Technologies

Support vector machineMusa Hawamdah

Lecture 6: Ensemble Methods Marina Santini

Ensemble learning TechniquesBabu Priyavrat

Ensemble methods in machine learningSANTHOSH RAJA M G

CART – Classification & Regression TreesHemant Chetwani

K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...Simplilearn

Decision Tree LearningMilind Gokhale

Machine Learning AlgorithmsDezyreAcademy

Support Vector Machine ppt presentationAyanaRukasar

Machine Learning - Ensemble MethodsAndrew Ferlitsch

Introduction to Machine LearningShahar Cohen

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini

Classification and regression trees (cart)Learnbay Datascience

Machine Learning and Real-World ApplicationsMachinePulse

Support Vector Machines ( SVM ) Mohammad Junaid Khan

Reinforcement learning 7313Slideshare

Dimensionality Reductionmrizwan969

Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Simplilearn

From decision trees to random forestsViet-Trung TRAN

Support vector machinezekeLabs Technologies

Similar to How to choose Machine Learning algorithm. (20)

random forest.pptxPriyadharshiniG41

Machine Learning and its Appplications--sudarmani rajagopal

Random Forest Decision Tree.pptxRamakrishna Reddy Bijjam

Machine Learning Unit-5 Decesion Trees & Random Forest.pdfAdityaSoraut

Its all about Machine learning .Machine learning is a field of artificial intelligence (AI) that focuses on the development of algorithms and statistical models that enable computers to perform tasks without explicit programming instructions. Instead, these algorithms learn from data, identifying patterns, and making decisions or predictions based on that data. There are several types of machine learning approaches, including: Supervised Learning: In this approach, the algorithm learns from labeled data, where each example is paired with a label or outcome. The algorithm aims to learn a mapping from inputs to outputs, such as classifying emails as spam or not spam. Unsupervised Learning: Here, the algorithm learns from unlabeled data, seeking to find hidden patterns or structures within the data. Clustering algorithms, for instance, group similar data points together without any predefined labels. Semi-Supervised Learning: This approach combines elements of supervised and unsupervised learning, typically by using a small amount of labeled data along with a large amount of unlabeled data to improve learning accuracy. Reinforcement Learning: This paradigm involves an agent learning to make decisions by interacting with an environment. The agent receives feedback in the form of rewards or penalties, enabling it to learn the optimal behavior to maximize cumulative rewards over time.Machine learning algorithms can be applied to a wide range of tasks, including: Classification: Assigning inputs to one of several categories. For example, classifying whether an email is spam or not. Regression: Predicting a continuous value based on input features. For instance, predicting house prices based on features like square footage and location. Clustering: Grouping similar data points together based on their characteristics. Dimensionality Reduction: Reducing the number of input variables to simplify analysis and improve computational efficiency. Recommendation Systems: Predicting user preferences and suggesting items or actions accordingly. Natural Language Processing (NLP): Analyzing and generating human language text, enabling tasks like sentiment analysis, machine translation, and text summarization. Machine learning has numerous applications across various domains, including healthcare, finance, marketing, cybersecurity, and more. It continues to be an area of active research and

Support Vector machine(SVM) and Random Forestumarcybermind

This document discusses classification algorithms, beginning with an overview of support vector machines (SVMs). SVMs find a hyperplane that maximally separates classes in training data. Key parameters that control SVM performance are the kernel, gamma value, and C parameter. Applications of SVMs include face detection and gene classification. Random forests are also covered, which create decision trees on data samples and aggregate their predictions through voting. Random forests reduce overfitting and can handle large datasets accurately. The random forest algorithm and parameters like n_estimators and max_depth are explained.

Supervised machine learning algorithms(strengths and weaknesses)MonarchSaha

Macine learning algorithms - K means, KNNaiswaryasathwik

Data miningBehnaz Motavali

Amirkabir University of Technology Advanced Database Course Conference Presentation Review on Data Mining and its techniques. Supervisor: Dr. Bagheri November 2016 In English Presented in Persian دانشگاه صنعتی امیرکبیر (پلی تکنیک تهران) دانشکده مهندسی کامپیوتر و فناوری اطلاعات ارائه کنفرانس درس پایگاه داده پیشرفته داده کاوی و تکنیک های آن استاد: دکتر علیرضا باقری آذرماه 1395

Primer on major data mining algorithmsVikram Sankhala IIT, IIM, Ex IRS, FRM, Fin.Engr

This document provides an overview of major data mining algorithms, including supervised learning techniques like decision trees, random forests, support vector machines, naive Bayes, and logistic regression. Unsupervised techniques discussed include clustering algorithms like k-means and EM, as well as association rule learning using the Apriori algorithm. Application areas and advantages/disadvantages of each technique are described. Libraries for implementing these algorithms in Python and R are also listed.

Machine Learning Interview Questions and AnswersSatyam Jaiswal

Introduction to Machine Learning Key Concepts for Beginners.pptxAssignment World

Machine Learning (ML) is a branch of artificial intelligence that enables computers to learn from data and make predictions or decisions without being explicitly programmed. This infographic explores key ML concepts, including supervised and unsupervised learning, algorithms like regression and classification, and essential steps in model building. Whether you're a beginner or looking to refine your understanding, this guide simplifies complex topics, making ML more accessible for students and professionals alike.

Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya

https://github.com/telecombcn-dl/dlmm-2017-dcu Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.

04 Classification in Data MiningValerii Klymchuk

20211229120253D6323_PERT 06_ Ensemble Learning.pptxRaflyRizky2

ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. An ensemble is itself a supervised learning algorithm, because it can be trained and then used to make predictions. The trained ensemble, therefore, represents a single hypothesis. This hypothesis, however, is not necessarily contained within the hypothesis space of the models from which it is built.

Machine learning - session 3Luis Borbon

Machine learning with scikitlearnPratap Dangeti

Machine learning algorithms can adapt and learn from experience. The three main machine learning methods are supervised learning (using labeled training data), unsupervised learning (using unlabeled data), and semi-supervised learning (using some labeled and some unlabeled data). Supervised learning includes classification and regression tasks, while unsupervised learning includes cluster analysis.

Intro to machine learningAkshay Kanchan

Decision Tree in Machine LearningTutort Academy

UNIT 2 HILLclimbling 19geyebshshsb .pptxxilep87615

Parametric and Nonparametric.pptxSivapriyaS12

random forest.pptxPriyadharshiniG41

Machine Learning and its Appplications--sudarmani rajagopal

Random Forest Decision Tree.pptxRamakrishna Reddy Bijjam

Machine Learning Unit-5 Decesion Trees & Random Forest.pdfAdityaSoraut

Support Vector machine(SVM) and Random Forestumarcybermind

Supervised machine learning algorithms(strengths and weaknesses)MonarchSaha

Macine learning algorithms - K means, KNNaiswaryasathwik

Data miningBehnaz Motavali

Primer on major data mining algorithmsVikram Sankhala IIT, IIM, Ex IRS, FRM, Fin.Engr

Machine Learning Interview Questions and AnswersSatyam Jaiswal

Introduction to Machine Learning Key Concepts for Beginners.pptxAssignment World

Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya

04 Classification in Data MiningValerii Klymchuk

20211229120253D6323_PERT 06_ Ensemble Learning.pptxRaflyRizky2

Machine learning - session 3Luis Borbon

Machine learning with scikitlearnPratap Dangeti

Intro to machine learningAkshay Kanchan

Decision Tree in Machine LearningTutort Academy

UNIT 2 HILLclimbling 19geyebshshsb .pptxxilep87615

Parametric and Nonparametric.pptxSivapriyaS12

Recently uploaded (20)

Classification_in_Machinee_Learning.pptxwencyjorda88

Data Analytics Overview and its applicationsJanmejayaMishra7

Geometry maths presentation for begginerszrjacob283

Developing Security Orchestration, Automation, and Response ApplicationsVICTOR MAESTRE RAMIREZ

Minions Want to eat presentacion muy lindaCarlaAndradesSoler1

CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...ThanushsaranS

Customer Segmentation using K-Means clusteringIngrid Nyakerario

This project demonstrates the application of machine learning—specifically K-Means Clustering—to segment customers based on behavioral and demographic data. The objective is to identify distinct customer groups to enable targeted marketing strategies and personalized customer engagement. The presentation walks through: Data preprocessing and exploratory data analysis (EDA) Feature scaling and dimensionality reduction K-Means clustering and silhouette analysis Insights and business recommendations from each customer segment This work showcases practical data science skills applied to a real-world business problem, using Python and visualization tools to generate actionable insights for decision-makers.

Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncegiver630

Telangana State, India’s newest state that was carved from the erstwhile state of Andhra Pradesh in 2014 has launched the Water Grid Scheme named as ‘Mission Bhagiratha (MB)’ to seek a permanent and sustainable solution to the drinking water problem in the state. MB is designed to provide potable drinking water to every household in their premises through piped water supply (PWS) by 2018. The vision of the project is to ensure safe and sustainable piped drinking water supply from surface water sources

GenAI for Quant Analytics: survey-analytics.aiInspirient

1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdfSimran112433

4. Multivariable statistics_Using Stata_2025.pdfaxonneurologycenter1

Digilocker under workingProcess Flow.pptxsatnamsadguru491

Cleaned_Lecture 6666666_Simulation_I.pdfalcinialbob1234

VKS-Python-FIe Handling text CSV Binary.pptxVinod Srivastava

Data Science Courses in India iim skillsdharnathakur29

This comprehensive Data Science course is designed to equip learners with the essential skills and knowledge required to analyze, interpret, and visualize complex data. Covering both theoretical concepts and practical applications, the course introduces tools and techniques used in the data science field, such as Python programming, data wrangling, statistical analysis, machine learning, and data visualization.

Deloitte Analytics - Applying Process Mining in an audit contextProcess mining Evangelist

Mieke Jans is a Manager at Deloitte Analytics Belgium. She learned about process mining from her PhD supervisor while she was collaborating with a large SAP-using company for her dissertation. Mieke extended her research topic to investigate the data availability of process mining data in SAP and the new analysis possibilities that emerge from it. It took her 8-9 months to find the right data and prepare it for her process mining analysis. She needed insights from both process owners and IT experts. For example, one person knew exactly how the procurement process took place at the front end of SAP, and another person helped her with the structure of the SAP-tables. She then combined the knowledge of these different persons.

Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...gmuir1066

定制学历(美国Purdue毕业证)普渡大学电子版毕业证Taqyea

2025年新版美国毕业证普渡大学文凭【q微1954292140】办理普渡大学毕业证(Purdue毕业证书)国外学位认证/毕业证购买【q微1954292140】普渡大学offer/学位证、留信官方学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作【q微1954292140】Buy Purdue University Diploma购买美国毕业证，购买英国毕业证，购买澳洲毕业证，购买加拿大毕业证，以及德国毕业证，购买法国毕业证（q微1954292140）购买荷兰毕业证、购买瑞士毕业证、购买日本毕业证、购买韩国毕业证、购买新西兰毕业证、购买新加坡毕业证、购买西班牙毕业证、购买马来西亚毕业证等。包括了本科毕业证，硕士毕业证。主营项目： 1、真实教育部国外学历学位认证《美国毕业文凭证书快速办理普渡大学国外本科offer在线制作》【q微1954292140】《论文没过普渡大学正式成绩单》，教育部存档，教育部留服网站100%可查. 2、办理Purdue毕业证，改成绩单《Purdue毕业证明办理普渡大学制作成绩单》【Q/WeChat：1954292140】Buy Purdue University Certificates《正式成绩单论文没过》，普渡大学Offer、在读证明、学生卡、信封、证明信等全套材料，从防伪到印刷，从水印到钢印烫金，高精仿度跟学校原版100%相同. 3、真实使馆认证（即留学人员回国证明），使馆存档可通过大使馆查询确认. 4、留信网认证，国家专业人才认证中心颁发入库证书，留信网存档可查. 《普渡大学成绩单制作案例美国毕业证书办理Purdue2025年新版毕业证书》【q微1954292140】学位证1:1完美还原海外各大学毕业材料上的工艺：水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠。文字图案浮雕、激光镭射、紫外荧光、温感、复印防伪等防伪工艺。高仿真还原美国文凭证书和外壳，定制美国普渡大学成绩单和信封。毕业证办理需要多久拿到？Purdue毕业证【q微1954292140】办理美国普渡大学毕业证(Purdue毕业证书)【q微1954292140】文凭办理普渡大学offer/学位证成绩单定制、留信官方学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作。帮你解决普渡大学学历学位认证难题。美国文凭普渡大学成绩单，Purdue毕业证【q微1954292140】办理美国普渡大学毕业证(Purdue毕业证书)【q微1954292140】专业定制国外成绩单普渡大学offer/学位证成绩单温感光标、留信官方学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作。帮你解决普渡大学学历学位认证难题。美国文凭购买，美国文凭定制，美国文凭补办。专业在线定制美国大学文凭，定做美国本科文凭，【q微1954292140】复制美国Purdue University completion letter。在线快速补办美国本科毕业证、硕士文凭证书，购买美国学位证、普渡大学Offer，美国大学文凭在线购买。【q微1954292140】帮您解决在美国普渡大学未毕业难题（Purdue University）文凭购买、毕业证购买、大学文凭购买、大学毕业证购买、买文凭、日韩文凭、英国大学文凭、美国大学文凭、澳洲大学文凭、加拿大大学文凭（q微1954292140）新加坡大学文凭、新西兰大学文凭、爱尔兰文凭、西班牙文凭、德国文凭、教育部认证，买毕业证，毕业证购买，买大学文凭，购买日韩毕业证、英国大学毕业证、美国大学毕业证、澳洲大学毕业证、加拿大大学毕业证（q微1954292140）新加坡大学毕业证、新西兰大学毕业证、爱尔兰毕业证、西班牙毕业证、德国毕业证，回国证明，留信网认证，留信认证办理，学历认证。从而完成就业。普渡大学毕业证办理，普渡大学文凭办理，普渡大学成绩单办理和真实留信认证、留服认证、普渡大学学历认证。学院文凭定制，普渡大学原版文凭补办，扫描件文凭定做，100%文凭复刻。特殊原因导致无法毕业，也可以联系我们帮您办理相关材料：１：在普渡大学挂科了，不想读了，成绩不理想怎么办？？？ 2：打算回国了，找工作的时候，需要提供认证《Purdue成绩单购买办理普渡大学毕业证书范本》【Q/WeChat：1954292140】Buy Purdue University Diploma《正式成绩单论文没过》有文凭却得不到认证。又该怎么办？？？美国毕业证购买，美国文凭购买，

ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptxpankaj6188303

Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Abodahab

Classification_in_Machinee_Learning.pptxwencyjorda88

Data Analytics Overview and its applicationsJanmejayaMishra7

Geometry maths presentation for begginerszrjacob283

Developing Security Orchestration, Automation, and Response ApplicationsVICTOR MAESTRE RAMIREZ

Minions Want to eat presentacion muy lindaCarlaAndradesSoler1

CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...ThanushsaranS

Customer Segmentation using K-Means clusteringIngrid Nyakerario

Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncegiver630

GenAI for Quant Analytics: survey-analytics.aiInspirient

1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdfSimran112433

4. Multivariable statistics_Using Stata_2025.pdfaxonneurologycenter1

Digilocker under workingProcess Flow.pptxsatnamsadguru491

Cleaned_Lecture 6666666_Simulation_I.pdfalcinialbob1234

VKS-Python-FIe Handling text CSV Binary.pptxVinod Srivastava

Data Science Courses in India iim skillsdharnathakur29

Deloitte Analytics - Applying Process Mining in an audit contextProcess mining Evangelist

Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...gmuir1066

定制学历(美国Purdue毕业证)普渡大学电子版毕业证Taqyea

ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptxpankaj6188303

Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Abodahab

How to choose Machine Learning algorithm.

1. Machine Learning Algorithms Dilemma of choosing WHICH !? February 23, 2020 Mala Deep Upadhaya

2. Machine Learning Algorithm Slides 2 of 12 Choosing Machine Learning Algorithm

3. Assumption in the lecture Slides 3 of 12 Choosing Machine Learning Algorithm • We have a use case: Supervised Learning system • i.e. We are concerning with ML supervised problem statement • We have Dataset that may be Classification or Regression problem set Data set Regression Classification • Now, Which Algorithm to choose for the Dataset?

4. Algorithm for Classification Problems Slides 4 of 12 Classification Logistic Regression KNN SVM Extreme Gradient Boosting Decision Tree Random Forest Ensemble Learning Method : Model that makes predictions based on a number of different models • Now, Which one of the above algorithm to choose for the Classification problem? Choosing Machine Learning Algorithm

5. Algorithm Selection Slides 5 of 12 Mostly - Blindly used • Decision Tree • Random Forest • Logistic Regression OR Apply every Algorithm parallelly and check the accuracy to see which one is the best. Task is Time & Resource Consuming Choosing Machine Learning Algorithm

6. Right way to choose Algorithm Visualization of Data • Library: Seaborn • Function: Pairplot Figure: Pairplot structure Source: https://seaborn.pydata.org/generated/seaborn.pairplot.html Slides 6 of 12 Choosing Machine Learning Algorithm

7. Right way to choose Algorithm Choose Logistic Regression? • High overlap of data • So no straight line can be created as error rate will be high • Less accuracy • Seen as non-linear classification type problem • No go with Logistic Regression Algorithm Slides 7 of 12 High Overlap of Data Choosing Machine Learning Algorithm

8. Right way to choose Algorithm Slides 8 of 12 • For Non-Linear classification Choosing Machine Learning Algorithm

9. Right way to choose Algorithm Slides 9 of 12 Choose Decision Tree? • It is just multiple IF –ELSE • Time for model train will be high in multiple overlap Choosing Machine Learning Algorithm

10. Right way to choose Algorithm Highly Overlap? • Choose KNN WHY? • It uses the concept of Euclidian distance to find the similarities of the point where it belongs to i.e. Based on neighborhood Still not satisfied of using KNN then go with Random Forest or Decision Tree but they will have deeper tree and increase the cost of operation Choosing Machine Learning Algorithm Highly Overlapped of Data Slides 10 of 12

11. Summary Slides 11 of 12Choosing Machine Learning Algorithm Dilemma of Choosing ML Algorithm? Know the nature of Dataset IF it is supervised problem statement Visualize the dataset with Pair plot Dataset is highly overlap? No use of Logical Regression No use Decision Tree as cost of operation is high Choose KNN

12. References • https://www.youtube.com/watch?v=38SUUaMX5Rg https://slideplayer.com/slide/5219172/ • https://towardsdatascience.com/basic-ensemble-learning-random-forest-adaboost-gradient-boosting-step-by-step-explained- 95d49d1e2725 Slides 12 of 12Choosing Machine Learning Algorithm

How to choose Machine Learning algorithm.

Recommended

More Related Content

What's hot (20)

Similar to How to choose Machine Learning algorithm. (20)

Recently uploaded (20)

How to choose Machine Learning algorithm.