textbook ML_removed_removed_removed_removed

The document provides an overview of machine learning, its relationship with artificial intelligence, data science, and statistics, and outlines the different types of machine learning including supervised, unsupervised, semi-supervised, and reinforcement learning. It explains the distinctions between these learning types, the significance of data labeling, and the applications of various algorithms. Additionally, it highlights the challenges faced in the field of machine learning.

Uploaded by

shohilcg
Copyright
© © All Rights Reserved
Introduction to Machine Learning

1.3 MACHINE LEARNING IN RELATION TO OTHER FIELDS

Machine learning primarily uses concepts from Artificial Intelligence, Data Science, and Statistics. It is the result of combining ideas from diverse fields.

1.3.1 Machine Learning and Artificial Intelligence

Machine learning is an important branch of AI, which is a much broader subject. The aim of AI is to develop intelligent agents. An agent can be a robot, a human, or any autonomous system. Initially, the idea of AI was ambitious: to develop intelligent systems like human beings. The focus was on logic and logical inference. The field has seen many ups and downs, and the down periods are called AI winters. The resurgence of AI happened due to the development of data-driven systems, whose aim is to find relations and regularities present in the data. Machine learning is the subbranch of AI whose aim is to extract patterns for prediction. It is a broad field that includes learning from examples and other areas like reinforcement learning. The relationship between AI and machine learning is shown in Figure 1.3. The model can take an unknown instance and generate results.

[Figure 1.3: Relationship of AI with Machine Learning (labels: Artificial intelligence, Machine learning, Deep learning)]

Deep learning is a subbranch of machine learning. In deep learning, the models are constructed using neural network technology. Neural networks are based on models of the human neuron. Many neurons form a network, connected through activation functions that trigger further neurons to perform tasks.

1.3.2 Machine Learning, Data Science, Data Mining, and Data Analytics

Data science is an 'umbrella' term that encompasses many fields. Machine learning starts with data. Therefore, data science and machine learning are interlinked. Machine learning is a branch of data science. Data science deals with the gathering of data for analysis.
It is a broad field that includes other fields.

Big Data
Data science is concerned with the collection of data. Big data is a field of data science that deals with data having the following characteristics:
1. Volume: Huge amounts of data are generated by big companies like Facebook, Twitter, and YouTube.
2. Variety: Data is available in a variety of forms, such as images and videos, and in different formats.
3. Velocity: It refers to the speed at which the data is generated and processed.
Big data is used by many machine learning algorithms for applications such as language translation and image recognition. Big data influences the growth of subjects like deep learning. Deep learning is a branch of machine learning that deals with constructing models using neural networks.

Data Mining
Data mining has its origin in business. Just as mining the earth yields precious resources, it is often believed that unearthing the data produces hidden information that would otherwise have eluded the attention of the management. Nowadays, many consider data mining and machine learning to be the same. There is no difference between these fields except that data mining aims to extract the hidden patterns that are present in the data, whereas machine learning aims to use them for prediction.

Data Analytics
Another branch of data science is data analytics. It aims to extract useful knowledge from crude data. There are different types of analytics. Predictive data analytics is used for making predictions. Machine learning is closely related to this branch of analytics and shares almost all algorithms.

Pattern Recognition
It is an engineering field. It uses machine learning algorithms to extract the features for pattern analysis and pattern classification. One can view pattern recognition as a specific application of machine learning.

These relations are summarized in Figure 1.4.
[Figure 1.4: Relationship of Machine Learning with Other Major Fields (labels: Data science, Data mining, Data analytics, Machine learning, Pattern recognition, Big data)]

1.3.3 Machine Learning and Statistics

Statistics is a branch of mathematics that has a solid theoretical foundation regarding statistical learning. Like machine learning (ML), it can learn from data. But the difference between statistics and ML is that statistical methods look for regularity in data, called patterns. Initially, statistics sets a hypothesis and performs experiments to verify and validate the hypothesis in order to find relationships among data.

Statistics requires knowledge of the statistical procedures and the guidance of a good statistician. It is mathematics intensive, and models are often complicated equations that involve many assumptions. Statistical methods are developed in relation to the data being analysed. In addition, statistical methods are coherent and rigorous. Statistics has strong theoretical foundations and interpretations that require strong statistical knowledge. Machine learning, comparatively, makes fewer assumptions and requires less statistical knowledge. But it often requires interaction with various tools to automate the process of learning.

Nevertheless, there is a school of thought that machine learning is just the latest version of 'old Statistics' and hence this relationship should be recognized.

1.4 TYPES OF MACHINE LEARNING

What does the word 'learn' mean? Learning, like adaptation, occurs as the result of interaction of the program with its environment. It can be compared with the interaction between a teacher and a student. There are four types of machine learning as shown in Figure 1.5.
[Figure 1.5: Types of Machine Learning (Supervised learning: classification, regression; Unsupervised learning: cluster analysis, association mining, dimension reduction; Semi-supervised learning; Reinforcement learning)]

Before discussing the types of learning, it is necessary to discuss data.

Labelled and Unlabelled Data
Data is a raw fact. Normally, data is represented in the form of a table. A row of data can also be referred to as a data point, sample, or example. Each row of the table represents a data point. Features are attributes or characteristics of an object. Normally, the columns of the table are attributes. Out of all attributes, one attribute is important and is called a label. The label is the feature that we aim to predict. Thus, there are two types of data: labelled and unlabelled.

Labelled Data
To illustrate labelled data, let us take one example dataset called the Iris flower dataset or Fisher's Iris dataset. The dataset has 150 samples of Iris, 50 from each class, with four attributes: the length and width of sepals and petals. The target variable is called class. There are three classes: Iris setosa, Iris virginica, and Iris versicolor.

The partial data of the Iris dataset is shown in Table 1.1.

Table 1.1: Iris Flower Dataset
S.No. | Length of Sepal | Width of Sepal | Length of Petal | Width of Petal | Class
1. | 5.5 | 4.2 | 1.4 | 0.2 | Setosa
2. | 7.0 | 3.2 | 4.7 | 1.4 | Versicolor
3. | 7.3 | 2.9 | 6.3 | 1.8 | Virginica

A dataset need not always be numbers. It can be images or video frames. Deep neural networks can handle images with labels. In Figure 1.6, the deep neural network takes images of dogs and cats with labels for classification.

[Figure 1.6: (a) Labelled Dataset (b) Unlabelled Dataset (inputs: images of dogs and cats; labels: dog, cat)]

In unlabelled data, there are no labels in the dataset.

1.4.1 Supervised Learning

Supervised algorithms use a labelled dataset. As the name suggests, there is a supervisor or teacher component in supervised learning. A supervisor provides labelled data so that the model is constructed, and gives test data for checking the model.
In supervised learning algorithms, learning takes place in two stages. In layman's terms, during the first stage, the teacher communicates the information that the student is supposed to master. The student receives the information and understands it. During this stage, the teacher has no knowledge of whether the information is grasped by the student.

This leads to the second stage of learning. The teacher then asks the student a set of questions to find out how much information has been grasped by the student. Based on these questions, the student is tested, and the teacher informs the student about his assessment. This kind of learning is typically called supervised learning.

Supervised learning has two methods:
1. Classification
2. Regression

Classification
Classification is a supervised learning method. The input attributes of the classification algorithms are called independent variables. The target attribute is called the label or dependent variable. The relationship between the input and target variables is represented in the form of a structure which is called a classification model. So, the focus of classification is to predict the 'label', which is in a discrete form (a value from a finite set of values). An example is shown in Figure 1.7, where a classification algorithm takes a set of labelled images, such as dogs and cats, to construct a model that can later be used to classify unknown test image data.

[Figure 1.7: An Example Classification System (labelled data -> classification algorithm -> classification model; new test data -> classification model -> "Label is Cat")]

In classification, learning takes place in two stages. During the first stage, called the training stage, the learning algorithm takes a labelled dataset and starts learning. After the training samples are processed, the model is generated. In the second stage, the constructed model is tested with a test or unknown sample, which is assigned a label.
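The two stages can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the book's method: it assumes a 1-nearest-neighbour rule chosen for brevity, it uses the three rows of Table 1.1 as a deliberately tiny training set, and the names `train`, `classify`, and the unlabelled query sample are hypothetical.

```python
import math

# Labelled training data: (sepal length, sepal width, petal length, petal width) -> class.
# Values follow the three rows of Table 1.1; a real training set would be far larger.
TRAINING_SET = [
    ((5.5, 4.2, 1.4, 0.2), "Setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Versicolor"),
    ((7.3, 2.9, 6.3, 1.8), "Virginica"),
]

def train(labelled_data):
    """Stage 1 (training): for 1-nearest-neighbour the 'model' is just the stored samples."""
    return list(labelled_data)

def classify(model, sample):
    """Stage 2 (testing): return the label of the closest stored sample (Euclidean distance)."""
    _, nearest_label = min(model, key=lambda pair: math.dist(pair[0], sample))
    return nearest_label

model = train(TRAINING_SET)
unknown = (6.3, 2.9, 5.6, 1.8)  # hypothetical unlabelled test sample
predicted_label = classify(model, unknown)
print(predicted_label)  # prints "Virginica"
```

With only one stored sample per class the decision is fragile (the query is almost as close to the Versicolor row as to the Virginica row), which is one reason practical classifiers are trained on many labelled samples.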
This is the classification process, as illustrated in Figure 1.7. Initially, the classification learning algorithm learns with the collection of labelled data and constructs the model. Then, a test case is selected, and the model assigns a label.

Similarly, in the case of the Iris dataset, if the test sample is given as (6.3, 2.9, 5.6, 1.8, ?), the classification model will generate the label for it. This is called classification. One example of classification is image recognition, which includes classification of diseases like cancer, classification of plants, etc.

Classification models can be categorized based on the implementation technology, such as decision trees, probabilistic methods, distance measures, and soft computing methods. Classification models can also be classified as generative models and discriminative models. Generative models deal with the process of data generation and its distribution. Probabilistic models are examples of generative models. Discriminative models do not care about the generation of data. Instead, they simply concentrate on classifying the given data.

Some of the key algorithms of classification are:
* Decision Tree
* Random Forest
* Support Vector Machines
* Naive Bayes
* Artificial Neural Network and Deep Learning networks like CNN

Regression Models
Regression models, unlike classification algorithms, predict continuous variables like price. In other words, the prediction is a number. A fitted regression model is shown in Figure 1.8 for a dataset that represents weeks as input x and product sales as y.

[Figure 1.8: A Regression Model of the Form y = ax + b (x-axis: week data (x); y-axis: product sales data (y); regression line y = 0.66x + 0.54)]

The regression model takes input x and generates a model in the form of a fitted line of the form y = f(x). Here, x is the independent variable that may be one or more attributes and y is the dependent variable.
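As a small worked example, the fitted line quoted in Figure 1.8 can be evaluated directly. The sketch below assumes only the coefficients 0.66 and 0.54 from the figure; the function name `predict_sales` and the choice of weeks 5 and 8 as inputs are hypothetical, made for illustration.

```python
# Coefficients of the regression line from Figure 1.8: y = 0.66x + 0.54
SLOPE = 0.66
INTERCEPT = 0.54

def predict_sales(week):
    """Return the predicted product sales y for a given week x."""
    return SLOPE * week + INTERCEPT

for week in (5, 8):
    # Substitute x into y = 0.66x + 0.54 and report the result to two decimals.
    print(f"week {week}: predicted sales = {predict_sales(week):.2f}")
```

Substituting by hand gives the same values: 0.66 × 5 + 0.54 = 3.84 and 0.66 × 8 + 0.54 = 5.82.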
In Figure 1.8, linear regression takes the training set and tries to fit it with a line: product sales = 0.66 × Week + 0.54. Here, 0.66 and 0.54 are regression coefficients that are learnt from data. The advantage of this model is that a prediction for product sales (y) can be made for unknown week data (x). For example, the prediction for the unknown eighth week can be made by substituting x as 8 in the regression formula to get y.

One of the most important regression algorithms is linear regression, which is explained in the next section.

Both regression and classification models are supervised algorithms. Both have a supervisor, and the concepts of training and testing are applicable to both. What is the difference between classification and regression models? The main difference is that regression models predict continuous variables such as product price, while classification concentrates on assigning labels such as class.

1.4.2 Unsupervised Learning

The second kind of learning is by self-instruction. As the name suggests, there is no supervisor or teacher component. In the absence of a supervisor or teacher, self-instruction is the most common kind of learning process. This process of self-instruction is based on the concept of trial and error.

Here, the program is supplied with objects, but no labels are defined. The algorithm itself observes the examples and recognizes patterns based on the principles of grouping. Grouping is done in such a way that similar objects form the same group.

Cluster analysis and dimensionality reduction algorithms are examples of unsupervised algorithms.

Cluster Analysis
Cluster analysis is an example of unsupervised learning. It aims to group objects into disjoint clusters or groups. Cluster analysis clusters objects based on their attributes.
All the data objects of a partition are similar in some aspect and vary significantly from the data objects in the other partitions.

Some examples of clustering processes are segmentation of a region of interest in an image, detection of abnormal growth in a medical image, and determining clusters of signatures in a gene database.

An example of a clustering scheme is shown in Figure 1.9, where the clustering algorithm takes a set of dog and cat images and groups them into two clusters: dogs and cats. It can be observed that the samples belonging to a cluster are similar, and samples differ radically across clusters.

[Figure 1.9: An Example Clustering Scheme (unlabelled data -> Cluster 1, Cluster 2)]

Some of the key clustering algorithms are:
* k-means algorithm
* Hierarchical algorithms

Dimensionality Reduction
Dimensionality reduction algorithms are examples of unsupervised algorithms. They take higher-dimensional data as input and output the data in a lower dimension by taking advantage of the variance of the data. It is the task of reducing the dataset to a few features without losing generality.

The differences between supervised and unsupervised learning are listed in Table 1.2.

Table 1.2: Differences between Supervised and Unsupervised Learning
S.No. | Supervised Learning | Unsupervised Learning
1. | There is a supervisor component | No supervisor component
2. | Uses labelled data | Uses unlabelled data
3. | Assigns categories or labels | Performs a grouping process such that similar objects will be in one cluster

1.4.3 Semi-supervised Learning

There are circumstances where the dataset has a huge collection of unlabelled data and some labelled data. Labelling is a costly process and difficult for humans to perform. Semi-supervised algorithms use unlabelled data by assigning a pseudo-label. Then, the labelled and pseudo-labelled datasets can be combined.

1.4.4 Reinforcement Learning

Reinforcement learning mimics human beings.
Just as human beings use ears and eyes to perceive the world and take actions, reinforcement learning allows the agent to interact with the environment to get rewards. The agent can be a human, animal, robot, or any independent program. The rewards enable the agent to gain experience. The agent aims to maximize the reward. The reward can be positive or negative (punishment). When the rewards are more, the behavior gets reinforced and learning becomes possible.

Consider the following example of a grid game as shown in Figure 1.10.

[Figure 1.10: A Grid Game (tile types: Block, Goal, Danger)]

In this grid game, the gray tile indicates the danger, black is a block, and the tile with diagonal lines is the goal. The aim is to start, say from the bottom-left grid, and use the actions left, right, top, and bottom to reach the goal state.

To solve this sort of problem, there is no data. The agent interacts with the environment to get experience. In the above case, the agent tries to create a model by simulating many paths and finding rewarding paths. This experience helps in constructing a model.

In summary, compared to supervised learning, there is no supervisor or labelled dataset. Many sequential decisions need to be taken to reach the final decision. Therefore, reinforcement algorithms are reward-based, goal-oriented algorithms.

1.5 CHALLENGES OF MACHINE LEARNING

What are the challenges of machine learning? Let us discuss them now.

Problems that can be Dealt with Machine Learning
Computers are better than humans at performing tasks like computation. For example, while calculating the square root of large numbers, an average human may blink, but computers can display the result in seconds. Computers can play games like chess and Go, and even beat professional players of those games.

However, humans are better than computers in many aspects like recognition.
But deep learning systems challenge human beings in this aspect as well. Machines can recognize human faces in a second. Still, there are tasks where humans are better, as machine learning systems still require quality data for model construction. The quality of a learning system depends on the quality of data. This is a challenge. Some of the challenges are listed below:

1. Problems - Machine learning can deal with 'well-posed' problems where specifications are complete and available. Computers cannot solve 'ill-posed' problems.
Consider one simple example (shown in Table 1.3):

[Table 1.3: An Example]

Can a model for this test data be multiplication? That is, y = x1 × x2. Well! It is true! But it is equally true that other functions of x1 and x2 fit the same data. In all, there are three functions that fit the data. This means that the problem is ill-posed. To solve this problem, one needs more examples to check the model. Puzzles and games that do not have sufficient specification may become ill-posed problems, and scientific computation has many ill-posed problems.

2. Huge data - This is a primary requirement of machine learning. Availability of quality data is a challenge. Quality data means the data should be large and should not have problems such as missing data or incorrect data.

3. High computation power - With the availability of Big Data, the computational resource requirement has also increased. Systems with a Graphics Processing Unit (GPU) or even a Tensor Processing Unit (TPU) are required to execute machine learning algorithms. Also, machine learning tasks have become complex and hence time complexity has increased, and that can be solved only with high computing power.

4. Complexity of the algorithms - The selection of algorithms, describing the algorithms, application of algorithms to solve machine learning tasks, and comparison of algorithms have become necessary for machine learning professionals and data scientists now.
Algorithms have become a big topic of discussion, and it is a challenge for machine learning professionals to design, select, and evaluate optimal algorithms.

5. Bias/Variance - Bias is the error that arises when a model is too simple to capture the regularities in the data, and variance is the error that arises when a model is overly sensitive to the particular training data. Reducing one tends to increase the other, which leads to a problem called the bias/variance tradeoff. A model that fits the training data correctly but fails for test data, that is, lacks generalization, is said to be overfitting. The reverse problem is called underfitting, where the model is too simple and fails even for the training data. Overfitting and underfitting are great challenges for machine learning algorithms.

1.6 MACHINE LEARNING PROCESS

The emerging process model for data mining solutions for business organizations is CRISP-DM. Since machine learning is like data mining, except for the aim, this process can be used for machine learning. CRISP-DM stands for Cross-Industry Standard Process for Data Mining. This process involves six steps. The steps are shown in Figure 1.11.

[Figure 1.11: A Machine Learning/Data Mining Process (Understand the business, Understand the data, Data preprocessing, Modelling, Model evaluation, Model deployment)]

1. Understanding the business - This step involves understanding the objectives and requirements of the business organization. Generally, a single data mining algorithm is enough for giving the solution. This step also involves the formulation of the problem statement for the data mining process.

2. Understanding the data - It involves steps like data collection, study of the characteristics of the data, formulation of hypotheses, and matching of patterns to the selected hypothesis.

3. Preparation of data - This step involves producing the final dataset by cleaning the raw data and preparing the data for the data mining process. Missing values may cause problems during both the training and testing phases.
Missing data forces classifiers to produce inaccurate results. This is a perennial problem for classification models. Hence, suitable strategies should be adopted to handle the missing data.

4. Modelling - This step applies a data mining algorithm to the data to obtain a model or pattern.

5. Evaluate - This step involves the evaluation of the data mining results using statistical analysis and visualization methods. The performance of the classifier is determined by evaluating the accuracy of the classifier. The process of classification is a fuzzy issue. For example, classification of emails requires extensive domain knowledge and domain experts. Hence, performance of the classifier is very crucial.

6. Deployment - This step involves the deployment of the results of the data mining algorithm to improve the existing process or for a new situation.

1.7 MACHINE LEARNING APPLICATIONS

Machine learning technologies are used widely now in different domains. Machine learning applications are everywhere! One encounters many machine learning applications in day-to-day life.

Some applications are listed below:

1. Sentiment analysis - This is an application of natural language processing (NLP) where the words of documents are converted to sentiments like happy, sad, and angry, which are captured by emoticons effectively. For movie reviews or product reviews, five stars or one star are automatically attached using sentiment analysis programs.

2. Recommendation systems - These are systems that make personalized purchases possible. For example, Amazon recommends related books or books bought by people who have the same taste as you, and Netflix suggests shows or related movies of your taste. The recommendation systems are based on machine learning.

3. Voice assistants - Products like Amazon Alexa, Microsoft Cortana, Apple Siri, and Google Assistant are all examples of voice assistants.
They take speech commands and perform tasks. These chatbots are the result of machine learning technologies.

4. Technologies like Google Maps and those used by Uber are all examples of machine learning which offer to locate and navigate the shortest paths to reduce time.

The machine learning applications are enormous. The following Table 1.4 summarizes some of the machine learning applications.

Table 1.4: Applications' Survey Table
S.No. | Domain | Applications
1. | Business | Predicting the bankruptcy of a business firm
2. | Banking | Prediction of bank loan defaulters and detecting credit card frauds
3. | Image Processing | Image search engines, object identification, image classification, and generating synthetic images
4. | Audio/Voice | Chatbots like Alexa and Microsoft Cortana; developing chatbots for customer support, speech to text, and text to voice
5. | Telecommunication | Trend analysis and identification of bogus calls, fraudulent calls and their callers, churn analysis
6. | Marketing | Retail sales analysis, market basket analysis, product performance analysis, market segmentation analysis, and study of travel patterns of customers for marketing tours
7. | Games | Game programs for Chess, Go, and Atari video games
8. | Natural Language Translation | Google Translate, text summarization, and sentiment analysis
9. | Web Analysis and Services | Identification of access patterns, detection of e-mail spam and viruses, personalized web services, search engines like Google, detection of promotion of user websites, and finding loyalty of users after web page layout modification
10. | Medicine | Prediction of diseases, such as cancer or diabetes, given disease symptoms; prediction of effectiveness of the treatment using patient history; chatbots that interact with patients, such as IBM Watson, which uses machine learning technologies
11. | Multimedia and Security | Face recognition/identification, biometric projects like identification of a person from a large image or video database, and applications involving multimedia retrieval
12. | Scientific Domain | Discovery of new galaxies, identification of groups of houses based on house type/geographical location, identification of earthquake epicenters, and identification of similar land use

Summary

1. Machine learning can enable top management of an organization to extract knowledge from the data stored in various archives to facilitate decision making.
2. Machine learning is an important subbranch of Artificial Intelligence (AI).
3. A model is an explicit description of patterns within the data.
4. A model can be a formula, procedure, or representation that can generate data decisions.
5. Humans predict by remembering the past, then formulating a strategy and making a prediction. In the same manner, computers can predict by following the process.
6. Machine learning is an important branch of AI. AI is a much broader subject. The aim of AI is to develop intelligent agents. An agent can be a robot, a human, or another autonomous system.
7. Deep learning is a branch of machine learning. The difference between machine learning and deep learning is that models are constructed using neural network technology in deep learning. Neural networks are models constructed based on models of the human neuron.
8. Data science deals with the gathering of data for analysis. It is a broad field that includes other fields.
9. Data analytics aims to extract useful knowledge from crude data. There are many types of analytics. Predictive data analytics is an area that is dedicated to making predictions. Machine learning is closely related to this branch of analytics and shares almost all algorithms.
10. There are two types of data: labelled data and unlabelled data.
The data with a label is called labelled data and the data without a label is called unlabelled data.
11. Supervised algorithms use a labelled dataset. As the name suggests, there is a supervisor or teacher component in supervised learning. A supervisor provides the labelled data so that the model is constructed, and gives test data for checking the model.
12. Classification is a supervised learning method. The input attributes of the classification algorithms are called independent variables. The target attribute is called the label or dependent variable. The relationship between the input and target variables is represented in the form of a structure which is called a classification model.
13. Cluster analysis is an example of unsupervised learning. It aims to assemble objects into disjoint clusters or groups.
14. Semi-supervised algorithms assign a pseudo-label to unlabelled data.
15. Reinforcement learning allows the agent to interact with the environment to get rewards. The agent can be a human, animal, robot, or any independent program. The rewards enable the agent to gain experience.
16. The emerging process model for data mining solutions for business organizations is CRISP-DM. This stands for Cross-Industry Standard Process for Data Mining.
17. Machine learning technologies are used widely now in different domains.

Machine Learning - A branch of AI that is concerned with making machines learn automatically without being explicitly programmed.
Data - A raw fact.
Model - An explicit description of patterns in data.
Experience - A collection of knowledge and heuristics in humans, and historical training data in the case of machines.
Predictive Modelling - A technique of developing models and making predictions on unseen data.
Deep Learning - A branch of machine learning that deals with constructing models using neural networks.
Data Science - A field of study that encompasses everything from the capturing of data to its analysis, covering all stages of data management.
Data Analytics - A field of study that deals with the analysis of data.
Big Data - A study of data that has the characteristics of volume, variety, and velocity.
Pattern Recognition - A field of study that analyses patterns using machine learning algorithms.
Statistics - A branch of mathematics that deals with learning from data using statistical methods.
Hypothesis - An initial assumption of an experiment.
Learning - Adaptation to the environment that happens because of the interaction of an agent with the environment.
Label - A target attribute.
Labelled Data - Data that is associated with a label.
Unlabelled Data - Data without labels.
Supervised Learning - A type of machine learning that uses labelled data and learns with the help of a supervisor or teacher component.
Classification Program - A supervised learning method that takes an unknown input and assigns a label to it. In simple words, it finds the category or class of the input attributes.
Regression Analysis - A supervised method that predicts continuous variables based on the input variables.
Unsupervised Learning - A type of machine learning that uses unlabelled data and groups the objects into clusters using a trial and error approach.
Cluster Analysis - A type of unsupervised approach that groups the objects based on attributes so that similar objects or data points form a cluster.
Semi-supervised Learning - A type of machine learning that uses limited labelled data and large unlabelled data. It first labels the unlabelled data using the labelled data and combines the two for learning purposes.
Reinforcement Learning - A type of machine learning in which an agent learns from the rewards it receives by interacting with the environment.
Well-posed Problem - A problem that has well-defined specifications. Otherwise, the problem is called ill-posed.
Bias/Variance - The inability
of the machine learning algorithm to predict correctly due to lack of generalization is called bias. Variance is the error of the model for training data. These lead to problems called overfitting and underfitting.
Model Deployment - A method of deploying machine learning algorithms to improve the existing business processes for a new situation.

Short Questions
1. Why is machine learning needed for business organizations?
2. List out the factors that drive the popularity of machine learning.
3. What is a model?
4. Distinguish between the terms: Data, Information, Knowledge, and Intelligence.
5. How is machine learning linked to AI, Data Science, and Statistics?
6. List out the types of machine learning.
7. List out the differences between a model and a pattern.
8. Patterns are local and a model is global for the entire dataset. Justify.
9. Are classification and clustering the same or different? Justify.
10. List out the differences between labelled and unlabelled data.
11. Point out the differences between supervised and unsupervised learning.
12. What are the differences between classification and regression?
13. What is semi-supervised learning?
14. List out the differences between reinforcement learning and supervised learning.
15. List out important classification and clustering algorithms.
16. List out at least five major applications of machine learning.

Long Questions
1. Explain in detail the machine learning process model.
2. List out and briefly explain the classification algorithms.
3. List out and briefly explain the unsupervised algorithms.

Numerical Problems and Activities
1. Let us assume a regression algorithm generates a model y = 0.54 + 0.66x for data pertaining to the weekly sales of a product. Here, x is the week and y is the product sales. Find the prediction for the 5th and 8th weeks.
2. Give two examples of patterns and models.
3. Survey and find out at least five latest applications of machine learning.
4. Survey and list out at least five products that use machine learning.

Across
The initial assumption of the experiment is called a ______.
A study that deals with the analysis of data is called ______.
A domain of study that covers all the aspects of data management is called ______ science.
Data is a ______ fact.
CRISP-DM is a ______ model.
Pattern recognition is used for identifying patterns in ______ and videos.
Unsupervised learning uses ______ data.
Reinforcement learning uses feedback from the environment for learning. (True/False)
Amazon Alexa is a ______ assistant.
Classification is an example of ______ learning.
______ data has characteristics such as volume, variety and velocity.

Down
A problem that has a well-posed specification can be solved using machine learning algorithms. (Yes/No)
Cluster analysis is an example of ______ learning.
Learning from data is the aim of statistics. (Yes/No)
Predictive models can predict based on ______ data.
Regression can predict ______ variables.
Learning is ______ to the environment.
Bias and variance cause overfitting and ______ of the model.
Lack of generalization in machine learning happens because of bias. (Yes/No)
Supervised learning uses ______ data.
Machine learning is learning without being explicitly programmed.
Model is a description of ______.
A semi-supervised algorithm assigns a pseudo-label for unlabelled data. (True/False)
Machine learning using neural networks forms a domain called artificial neural networks and ______ learning.
The set {minimum, Q1, median, Q3, maximum} is known as the five-point summary. Box plots are suitable for continuous variables and a nominal variable. Box plots can be used to illustrate data distributions and summaries of data. They are the popular way of plotting five-number summaries. A box plot is also known as a box and whisker plot. The box contains the bulk of the data, which lies between the first and third quartiles. The line inside the box indicates location, mostly the median of the data. If the median is not equidistant from the two ends of the box, then the data is skewed. The whiskers that project from the ends of the box indicate the spread of the tails and the maximum and minimum of the data values.

Example: Find the 5-point summary of the list {13, 11, 2, 3, 4, 8, 9}.
Solution: Sorted, the list is {2, 3, 4, 8, 9, 11, 13}. The minimum is 2 and the maximum is 13. Q1, Q2 and Q3 are 3, 8 and 11, respectively. Hence, the 5-point summary is {2, 3, 8, 11, 13}, that is, {minimum, Q1, median, Q3, maximum}.
Box plots are useful for describing the 5-point summary. The box plot for the set is given in Figure 2.7.

[Figure 2.7: Box Plot for English Marks (y-axis: Marks scored; x-axis: English)]

2.5.4 Shape
Skewness and kurtosis (called moments) indicate the symmetry/asymmetry and peak location of the dataset.

Skewness
The measures of direction and degree of symmetry are called measures of third order. Ideally, skewness should be zero, as in an ideal normal distribution. More often, the given dataset may not have perfect symmetry (consider the following Figure 2.8).
[Figure 2.8: (a) Positive Skewed and (b) Negative Skewed Data]

The dataset may also have either very high values or extremely low values. If a few values are far higher than the rest, the distribution has a long right tail and is said to be skewed to the right. On the other hand, if a few values are far lower than the rest, it is said to be skewed towards the left. If the tail is longer on the right-hand side and the hump is on the left-hand side, it is called positive skew. Otherwise, it is called negative skew. The given dataset may also have a symmetric distribution of data. The implication of skew is that there is a greater chance of outliers in the dataset. This affects the mean and median. Hence, this may affect the performance of the data mining algorithm. A perfect symmetry means the skewness is zero. Generally, in negative skew, the median is greater than the mean, and in positive skew, the mean is greater than the median.

The relationship between skew and the relative size of the mean and median can be summarized by a convenient numerical skew index known as the Pearson 2 skewness coefficient:

$$\frac{3 \times (\mu - \text{median})}{\sigma}$$

Also, the following measure is more commonly used to measure skewness. Let $x_1, x_2, \ldots, x_N$ be a set of N values or observations; then the skewness can be given as:

$$\frac{1}{N} \times \sum_{i=1}^{N} \frac{(x_i - \mu)^3}{\sigma^3} \tag{2.13}$$

Here, $\mu$ is the population mean and $\sigma$ is the population standard deviation of the univariate data. Sometimes, for bias correction, N − 1 is used instead of N.

Kurtosis
Kurtosis also indicates the peaks of data. If the data has a high peak, then it indicates higher kurtosis, and vice versa. Kurtosis is the measure of whether the data is heavy tailed or light tailed relative to the normal distribution. It can be observed that the normal distribution has a bell-shaped curve with no long tails. Data with low kurtosis tends to have light tails. The implication is that there is no outlier data. Let $x_1, x_2, \ldots, x_N$ be a set of N values or observations.
Then, kurtosis is measured using the formula given below:

$$\frac{\sum_{i=1}^{N} (x_i - \bar{x})^4 / N}{\sigma^4} \tag{2.14}$$

N − 1 may be used instead of N in the numerator of Eq. (2.14) for bias correction. Here, $\bar{x}$ and $\sigma$ are the mean and standard deviation of the univariate data, respectively.

Some of the other useful measures for finding the shape of the univariate dataset are mean absolute deviation (MAD) and coefficient of variation (CV).

Mean Absolute Deviation (MAD)
MAD is another dispersion measure and is robust to outliers. Normally, outlier points are detected by computing the deviation from the median and dividing it by MAD. Here, the absolute deviation between the data and the mean is taken. Thus, the absolute deviation is given as:

$$|x - \mu| \tag{2.15}$$

The sum of the absolute deviations is given as $\sum |x - \mu|$. Therefore, the mean absolute deviation is given as:

$$\frac{\sum |x - \mu|}{N} \tag{2.16}$$

Coefficient of Variation (CV)
Coefficient of variation is used to compare datasets with different units. CV is the ratio of the standard deviation to the mean, and %CV is the coefficient of variation expressed as a percentage.

2.5.5 Special Univariate Plots
The ideal way to check the shape of the dataset is a stem and leaf plot. A stem and leaf plot is a display that helps us to know the shape and distribution of the data. In this method, each value is split into a 'stem' and a 'leaf'. The last digit is usually the leaf, and the digits to the left of the leaf mostly form the stem. For example, the mark 45 is divided into stem 4 and leaf 5 in Figure 2.9. The stem and leaf plot for the English subject marks, say, {45, 60, 60, 80, 85}, is given in Figure 2.9.

Stem | Leaf
4    | 5
6    | 0 0
8    | 0 5
Figure 2.9: Stem and Leaf Plot for English Marks

It can be seen from Figure 2.9 that the first column is the stem and the second column is the leaf. For the given English marks, the two students with 60 marks are shown in the stem and leaf plot as stem 6 with two leaves of 0.
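As a rough check of the shape measures and the stem and leaf display above, the following Python sketch computes them for the English marks {45, 60, 60, 80, 85}. It assumes the population (divide-by-N) forms of Eqs. (2.13), (2.14) and (2.16) and a non-constant dataset (so that the standard deviation is not zero); the function names `shape_measures` and `stem_and_leaf` are illustrative and do not come from the text.

```python
import math
from collections import defaultdict
from statistics import median


def shape_measures(values):
    """Shape measures using population (divide-by-N) formulas.

    Assumption: no bias correction (N, not N - 1) and sigma > 0.
    """
    n = len(values)
    mu = sum(values) / n
    # Population standard deviation
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return {
        # Pearson 2 skewness coefficient: 3 * (mean - median) / sigma
        "pearson_skew": 3 * (mu - median(values)) / sigma,
        # Eq. (2.13): average cubed standardized deviation
        "skewness": sum((x - mu) ** 3 for x in values) / (n * sigma ** 3),
        # Eq. (2.14): average fourth-power standardized deviation
        "kurtosis": sum((x - mu) ** 4 for x in values) / (n * sigma ** 4),
        # Eq. (2.16): mean absolute deviation about the mean
        "mad": sum(abs(x - mu) for x in values) / n,
        # Coefficient of variation: sigma / mean
        "cv": sigma / mu,
    }


def stem_and_leaf(values):
    """Split non-negative integers into stem (leading digits) and leaf (last digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


if __name__ == "__main__":
    marks = [45, 60, 60, 80, 85]
    for name, value in shape_measures(marks).items():
        print(f"{name}: {value:.4f}")
    for stem, leaves in stem_and_leaf(marks).items():
        print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

For these marks the mean is 66 and the median is 60, so the Pearson coefficient comes out at about 1.23, while the moment-based skewness of Eq. (2.13) is close to zero (about −0.006). The two indices need not agree on a small sample, which is one reason to inspect a plot as well as the numbers.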
As discussed earlier, the ideal shape of the dataset is a bell-shaped curve. This corresponds to normality. Most of the statistical tests are designed only for normal distribution of data. A Q-Q plot can be used to assess the shape of the dataset. The Q-Q plot is a 2D scatter plot of univariate data against theoretical normal distribution data, or of two datasets, that is, the quantiles of the first and second datasets. The normal Q-Q plot for marks x = [13 11 2 3 4 8 9] is given below in Figure 2.10.

[Figure 2.10: Normal Q-Q Plot (title: QQ plot of sample data versus standard normal; x-axis: Standard normal quantiles; y-axis: Quantiles of input sample)]

Ideally, the points fall along the reference line (45 degrees) if the data follows a normal distribution. If the deviation is larger, then there is greater evidence that the dataset follows some different distribution, that is, other than the normal distribution. In such a case, careful analysis of the statistical investigations should be carried out before interpretation.
Thus, skewness, kurtosis, mean absolute deviation and coefficient of variation help in assessing univariate data.

2.6 BIVARIATE DATA AND MULTIVARIATE DATA
Bivariate data involves two variables. Bivariate data deals with causes of relationships. The aim is to find relationships among data. Consider the following Table 2.3, with data of the temperature in a shop and sales of sweaters.

Table 2.3: Temperature in a Shop and Sales Data
Temperature (in centigrade)    Sales of Sweaters
5                              200
10                             150
15                             140
20                             75
2…                             …
3…                             55
…                              20
