From the course: Microsoft Azure AI Essentials: Workloads and Machine Learning on Azure

Binary classification

- [Instructor] Binary classification predicts one of two outcomes, like yes or no, or valuable and not valuable. It's a supervised technique, meaning each training example must have an assigned label. Like regression, it follows an iterative process of training, validating, and evaluating. However, classification algorithms calculate probabilities for class assignment, not numeric values. For example, let's build a model to predict if a person will develop diabetes based on features like blood pressure, cholesterol, BMI, and smoking habits. We train the model using an algorithm that fits the data to a function, calculating the probability of diabetes as a value between 0 and 1. For instance, if the probability of diabetes is 0.7, then the probability of not having diabetes is 0.3. As with regression, there are many algorithms: logistic regression, decision trees, random forests, and support vector machines, among others. Logistic regression is popular for its simplicity. It uses a sigmoid (S-shaped) function whose output ranges from 0 to 1, and predictions are compared to a threshold, typically 0.5. Values at or above 0.5 are assigned the diabetes class, 1, and values below it the non-diabetes class, 0. Like regression, you reserve a random subset of the data to validate the model. To evaluate the model, you create a confusion matrix, a count of correct and incorrect predictions for each class. It might sound confusing, but let's simplify it. Assume our diabetes model gives the results in the following matrix. True negatives, TN, indicate our model correctly predicted non-diabetic cases. False positives, FP, are incorrect diabetes predictions. False negatives, FN, are incorrect non-diabetic predictions. True positives, TP, indicate our model correctly predicted diabetes. After creating the confusion matrix, we compute these metrics. Accuracy is the percentage of correct predictions, which is 83% in our case. Recall measures how well the model identifies actual diabetic cases; ours correctly identifies 75% of them. Precision measures how accurate the model's positive predictions are; our model shows a precision of 100%. Finally, the F1-score combines recall and precision, making it useful as a single performance metric. Our model's F1-score is 0.86. Another key metric is the area under the curve, AUC, which shows how well the model separates diabetic from non-diabetic cases compared to random guessing. Most software calculates this automatically. An AUC of 1 indicates a perfect model, while an AUC of 0.5 means the model is no better than random guessing. You should aim for an AUC between 0.5 and 1, as close to 1 as possible.
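To make the workflow above concrete, here is a minimal sketch in Python using scikit-learn. The course itself shows no code, so the library choice, the synthetic feature values, the dataset size, and the split ratio are all illustrative assumptions, not the course's implementation. The sketch trains a logistic regression classifier, holds out a random validation subset, applies the 0.5 threshold, and computes the confusion matrix and the metrics discussed in the video:

```python
# Sketch only: scikit-learn and the synthetic data below are assumptions,
# not from the course. Metric formulas: accuracy = (TP+TN)/total,
# recall = TP/(TP+FN), precision = TP/(TP+FP),
# F1 = 2 * (precision * recall) / (precision + recall).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    confusion_matrix, accuracy_score, recall_score,
    precision_score, f1_score, roc_auc_score,
)

# Synthetic stand-in features: blood pressure, cholesterol, BMI, smoker (0/1).
rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.normal(120, 15, n),   # blood pressure
    rng.normal(200, 30, n),   # cholesterol
    rng.normal(27, 5, n),     # BMI
    rng.integers(0, 2, n),    # smoking habit
])
# Synthetic label: higher risk factors -> more likely diabetic (class 1).
risk = (0.02 * (X[:, 0] - 120) + 0.01 * (X[:, 1] - 200)
        + 0.1 * (X[:, 2] - 27) + 0.8 * X[:, 3])
y = (risk + rng.normal(0, 1, n) > 0.5).astype(int)

# Reserve a random subset of the data for validation, as described above.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Logistic regression fits a sigmoid whose output is a probability in [0, 1].
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_val)[:, 1]   # P(diabetes) for each person

# Apply the threshold: probabilities >= 0.5 become class 1, below become 0.
pred = (proba >= 0.5).astype(int)

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_val, pred).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"Accuracy : {accuracy_score(y_val, pred):.2f}")
print(f"Recall   : {recall_score(y_val, pred):.2f}")
print(f"Precision: {precision_score(y_val, pred):.2f}")
print(f"F1-score : {f1_score(y_val, pred):.2f}")
# AUC is computed from the raw probabilities, not the thresholded classes.
print(f"AUC      : {roc_auc_score(y_val, proba):.2f}")
```

The matrix from the video isn't reproduced in this transcript, but as a hedged sanity check, one set of counts consistent with the quoted metrics would be TN=2, FP=0, FN=1, TP=3: accuracy = (3+2)/6 ≈ 83%, recall = 3/(3+1) = 75%, precision = 3/(3+0) = 100%, and F1 = 2 × (1.0 × 0.75)/(1.0 + 0.75) ≈ 0.86.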