Cross-Validation in Machine Learning

Cross-validation is a technique used to evaluate machine learning models on a limited data sample. It involves splitting the data into training and validation sets, training the model on the training set and evaluating it on the validation set. Common cross-validation methods described in the document include hold out validation, k-fold cross-validation, leave one out cross-validation, and stratified k-fold cross-validation. These methods are implemented in Python using scikit-learn to effectively test machine learning models.

Uploaded by

Priya dharshini.G

Cross-Validation in Machine Learning
• Cross-validation is a technique for checking how well a model performs by
training it on one subset of the input data and testing it on a previously unseen
subset. Put another way, it is a technique for checking how well a
statistical model generalizes to an independent dataset.
• In machine learning there is always a need to test the stability of the
model. We cannot judge a model only by how well it fits the training
dataset. For this purpose, we reserve a particular sample of the dataset
that is not part of the training dataset and test the model on that
sample before deployment. This complete process comes under
cross-validation, and it is somewhat different from a plain train-test split.
• Hence the basic steps of cross-validation are:
• Reserve a subset of the dataset as a validation set.
• Train the model using the training dataset.
• Evaluate model performance using the validation set. If the model
performs well on the validation set, proceed to the next step; otherwise,
check for issues.
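The three steps above can be sketched in plain Python. This is a minimal illustration, not the method used later in these slides: the toy data, the nearest-centroid "model" and the helper names (fit_centroids, predict, accuracy) are all assumptions made up for the example.

```python
import random

def fit_centroids(xs, ys):
    """Train: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def accuracy(centroids, xs, ys):
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(predict(centroids, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)

# Hypothetical toy data: class 0 clusters near 0.0, class 1 near 5.0.
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(50)]
data += [(rng.gauss(5.0, 1.0), 1) for _ in range(50)]
rng.shuffle(data)

# Step 1: reserve a subset of the dataset as a validation set.
train, valid = data[:80], data[80:]

# Step 2: train the model on the training set only.
model = fit_centroids([x for x, _ in train], [y for _, y in train])

# Step 3: evaluate on the validation set, which the model has not seen.
train_acc = accuracy(model, [x for x, _ in train], [y for _, y in train])
valid_acc = accuracy(model, [x for x, _ in valid], [y for _, y in valid])
print(f"train accuracy: {train_acc:.2f}, validation accuracy: {valid_acc:.2f}")
```

Comparing the two printed numbers is the check described in the last step: a validation accuracy far below the training accuracy suggests the model has not generalized.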
Key aspects of evaluating the quality of the model are –
• How accurate the model is
• How generalized the model is
• When we start building a model and train it with the ‘entire’ dataset, we can very well calculate its accuracy
on this training data set. But we cannot test how this model will behave with new data which is not present in
the training set, hence its generalization cannot be determined.
• Hence we need techniques to make use of the same data set for both training and testing of the models.
• In machine learning, Cross-Validation is the technique to evaluate how well the model has generalized and its
overall accuracy. For this purpose, it randomly samples data from the dataset to create training and testing
sets. There are multiple cross-validation approaches as follows –

• 1. Hold Out Approach
• 2. Leave One Out Cross-Validation
• 3. K-Fold Cross-Validation
• 4. Stratified K-Fold Cross-Validation
• 5. Repeated Random Train Test Split
• 1. Hold Out Approach
• In the hold-out approach, the dataset is split into a train set and a test set with random sampling.
The train set is used for training the model and the test set is used to test its accuracy on
unseen data. If the training and test accuracy are almost the same, the model is said to have
generalized well. It is common to use 80% of the data for training and the remaining 20% for testing.
• Advantages
• It is simple and easy to implement
• The execution time is less.
• Disadvantages
• If the dataset itself is small, setting aside a portion for testing reduces the robustness of
the model, because the remaining training sample may not be representative of the entire dataset.
• The evaluation metrics may vary due to the randomness of the split between the train and test
sets.
• Although an 80-20 train-test split is widely followed, there is no rule of thumb for the split, and
the results can vary based on how the split is done.
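To illustrate how much a single hold-out split depends on the random draw, the sketch below repeats an 80-20 split with different seeds and counts how many positive labels land in the test set. The label list and the seeds are hypothetical, and no model is trained; the point is only that the test set itself changes from split to split.

```python
import random

# Hypothetical small, imbalanced dataset: 5 positive and 15 negative labels.
labels = [1] * 5 + [0] * 15

positives_in_test = []
for seed in range(5):
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    # 80-20 hold-out split: the last 20% of the shuffled indices form the test set.
    cut = int(0.8 * len(indices))
    test_idx = indices[cut:]
    positives_in_test.append(sum(labels[i] for i in test_idx))

# One count per seed; each test set holds 4 samples.
print(positives_in_test)
```

Any metric computed on such a small test set inherits this variation, which is why a single hold-out score on a small dataset should be read with caution.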
• 2. Leave One Out Cross Validation (LOOCV)
• In this technique, if there are n observations in the dataset, only one observation is
reserved for testing, and the remaining n-1 data points are used for training. This is
repeated n times, so that each data point is used for testing exactly once.
Finally, the overall accuracy is calculated by averaging the accuracies of
all iterations.
• Advantage
• Since every data point is used for both training and testing, the overall accuracy estimate is
more reliable.
• It is very useful when the dataset is small.
• Disadvantage
• LOOCV is not practical when the number of observations n is huge. For example,
a dataset with 500,000 records requires training 500,000 models,
which is not really feasible.
• There is a huge computational and time cost associated with the LOOCV approach.
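The LOOCV loop described above can be sketched as follows. The eight-point dataset and the 1-nearest-neighbour rule are assumptions chosen to keep the example tiny; they are not part of the original slides.

```python
def nearest_neighbor_predict(train_points, x):
    """1-nearest-neighbour: return the label of the closest training point."""
    _, nearest_label = min(train_points, key=lambda p: abs(p[0] - x))
    return nearest_label

# Hypothetical toy dataset of (feature, label) pairs.
data = [(1.0, 0), (1.2, 0), (0.8, 0), (3.1, 1),
        (2.9, 1), (3.3, 1), (2.0, 0), (2.1, 1)]

scores = []
for i, (x, y) in enumerate(data):
    # Hold out observation i; "train" on the remaining n - 1 observations.
    train_points = data[:i] + data[i + 1:]
    scores.append(1.0 if nearest_neighbor_predict(train_points, x) == y else 0.0)

# Average the n per-observation scores.
loocv_accuracy = sum(scores) / len(scores)
print(len(scores), loocv_accuracy)  # prints: 8 0.75
```

The loop body runs once per observation, so the cost grows linearly with n times the cost of one training run, which is the disadvantage noted above.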
• 3. K-Fold Cross-Validation
• In the K-Fold Cross-Validation approach, the dataset is split into K folds. In the first iteration, the
first fold is reserved for testing and the model is trained on the data of the remaining K-1 folds.
• In the next iteration, the second fold is reserved for testing and the remaining folds are used for
training. This continues until the K-th iteration. The accuracies obtained in the iterations are
averaged to give the overall accuracy of the model.
• Advantages
• K-Fold cross-validation is useful when the dataset is small and cannot be split
into a train-test set (hold-out approach) without losing useful data for training.
• It gives a more reliable estimate of model performance, since every observation is used for both training and testing across the iterations.
• Disadvantages
• The major disadvantage of K-Fold Cross-Validation is that the training needs to be done K times,
and hence it consumes more time and resources.
• It is not recommended for sequential time series data, because the folds ignore the time order, so the model can be trained on observations that come after the ones it is tested on.
• When the dataset is imbalanced, K-fold cross-validation may not give good results. This is
because some folds may have just a few or no records for the minority class.
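The fold rotation described in this section can be sketched with plain index lists. The helper k_fold_indices is a hypothetical name introduced for this illustration; it builds contiguous, unshuffled folds and gives the first folds one extra sample when the data does not divide evenly.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(n_samples=10, k=3)
for i, test_idx in enumerate(folds):
    # Every fold except the current one is used for training.
    train_idx = [j for fold in folds if fold is not test_idx for j in fold]
    print(f"iteration {i + 1}: test={test_idx} train={train_idx}")
```

Each index appears in exactly one test fold, so after K iterations every observation has been used for testing once and for training K-1 times.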
• 4. Stratified K-Fold Cross-Validation
• Stratified K-fold cross-validation is useful when the data is imbalanced.
While sampling data into K-folds it makes sure that the distribution of all
classes in each fold is maintained. For example, if in the dataset 98% of
data belongs to class B and 2% to class A, the stratified sampling will
make sure each fold contains the two classes in the same ratio of 98% to
2%.
• Advantage
• Stratified K-fold cross-validation is recommended when the dataset is
imbalanced.
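A simplified way to see how stratification keeps class ratios is to deal the samples of each class into the folds in turn. The function stratified_folds below is a hypothetical sketch of that idea, not scikit-learn's actual algorithm, and it uses a 90/10 class ratio rather than the 98/2 ratio of the example above so that the counts stay small.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing each class out round-robin."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Hypothetical imbalanced labels: 90 samples of class "B", 10 of class "A".
labels = ["B"] * 90 + ["A"] * 10
folds = stratified_folds(labels, k=5)
for fold in folds:
    # Count how many samples of each class the fold received.
    print(Counter(labels[i] for i in fold))
```

With these inputs every fold receives 18 samples of class B and 2 of class A, the same 90/10 ratio as the full dataset.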
• 5. Repeated Random Test-Train Split
• Repeated random test-train split is a hybrid of traditional train-test
splitting and the k-fold cross-validation method. In this technique, we
create random splits of the data into the training-test set and then
repeat this process multiple times, just like the cross-validation
method.
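The difference from k-fold can be sketched with index lists only. The generator repeated_random_splits and its parameter names are hypothetical; the sketch reshuffles the indices before every split, so the test sets are drawn independently of each other.

```python
import random

def repeated_random_splits(n_samples, n_splits, test_fraction, seed):
    """Yield (train_idx, test_idx) pairs, reshuffling before every split."""
    rng = random.Random(seed)
    n_test = int(round(n_samples * test_fraction))
    for _ in range(n_splits):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield sorted(indices[n_test:]), sorted(indices[:n_test])

splits = list(repeated_random_splits(n_samples=10, n_splits=3,
                                     test_fraction=0.3, seed=0))
for train_idx, test_idx in splits:
    print("test:", test_idx)

# Unlike k-fold, an observation may appear in several test sets or in none.
times_tested = [sum(i in test for _, test in splits) for i in range(10)]
print(times_tested)
```

The number of splits and the test size can be chosen independently here, whereas in k-fold the test size is fixed by K.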
Examples of Cross-Validation in Sklearn Library
• About Dataset
• We will be using Parkinson’s disease dataset for all examples of cross-validation in the Sklearn library. The goal is to predict
whether or not a particular patient has Parkinson’s disease. We will be using the decision tree algorithm in all the examples.
• The dataset has 21 attributes and 195 rows. The various fields of the Parkinson’s Disease dataset are as follows –
• MDVP:Fo(Hz) – Average vocal fundamental frequency
• MDVP:Fhi(Hz) – Maximum vocal fundamental frequency
• MDVP:Flo(Hz) – Minimum vocal fundamental frequency
• MDVP:Jitter(%), MDVP:Jitter(Abs), MDVP:RAP, MDVP:PPQ, Jitter:DDP – Several measures of variation in fundamental frequency
• MDVP:Shimmer, MDVP:Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, MDVP:APQ, Shimmer:DDA – Several measures of
variation in amplitude
• NHR, HNR – Two measures of ratio of noise to tonal components in the voice
• status – Health status of the subject: (one) – Parkinson’s, (zero) – healthy
• RPDE, D2 – Two nonlinear dynamical complexity measures
• DFA – Signal fractal scaling exponent
• spread1, spread2, PPE – Three nonlinear measures of fundamental frequency variation
• Importing Necessary Libraries
• We first load the libraries required to build our model.

• import pandas as pd
• import numpy as np
• from sklearn.tree import DecisionTreeClassifier
• from sklearn.model_selection import train_test_split
• from sklearn.model_selection import KFold
• from sklearn.model_selection import StratifiedKFold
• Reading CSV Data into Pandas
• Next, we load the dataset from the CSV file into a pandas DataFrame
and check the top 5 rows.
• df=pd.read_csv("Parkinsson disease.csv")
• df.head()
• Data Preprocessing
• The “name” column is not going to add any value in training the model
and can be discarded, so we are dropping it below.
• df.drop(df.columns[0], axis = 1, inplace = True)
• Next, we will separate the feature and target matrix as shown below.
• #Independent And dependent features
• X=df.drop('status', axis=1)
• y=df['status']
Hold out Approach in Sklearn

• The hold-out approach can be applied by using the train_test_split function of
sklearn.model_selection.
• In the example below, we split the dataset so that the test data has a size of
30% and the train data a size of 70%. Fixing random_state makes the split
deterministic across runs.
• from sklearn.model_selection import train_test_split
• X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=4)
• model = DecisionTreeClassifier()
• model.fit(X_train, y_train)
• result = model.score(X_test, y_test)
• print(result)
• Out[38]:
• 0.7796610169491526
K-Fold Cross-Validation
• K-Fold Cross-Validation in Sklearn can be applied by using the KFold class together with the cross_val_score function
of sklearn.model_selection.
• In the example below, 10 folds are used, producing 10 accuracy scores from which we
calculate the mean score.
• In [40]:
• from sklearn.model_selection import cross_val_score
• model=DecisionTreeClassifier()
• kfold_validation=KFold(10)
• results=cross_val_score(model,X,y,cv=kfold_validation)
• print(results)
• print(np.mean(results))
• Out[40]:
• [0.7 0.8 0.8 0.8 0.8 0.78947368
• 0.84210526 1. 0.68421053 0.36842105]
• 0.758421052631579
Stratified K-fold Cross-Validation
• In Sklearn, stratified K-fold cross-validation can be applied by
using the StratifiedKFold class of sklearn.model_selection.
• In the example below, the dataset is divided into 5 splits or folds. It returns 5
accuracy scores from which we calculate the final mean score.
• from sklearn.model_selection import StratifiedKFold
• skfold=StratifiedKFold(n_splits=5)
• model=DecisionTreeClassifier()
• scores=cross_val_score(model,X,y,cv=skfold)
• print(scores)
• print(np.mean(scores))

• Out[41]:
• array([0.61538462, 0.79487179, 0.71794872, 0.74358974, 0.71794872])
• 0.717948717948718
Leave One Out Cross Validation (LOOCV)

• In Sklearn, Leave One Out Cross Validation (LOOCV) can be applied by using the LeaveOneOut class of sklearn.model_selection.
• from sklearn.model_selection import LeaveOneOut
• model=DecisionTreeClassifier()
• leave_validation=LeaveOneOut()
• results=cross_val_score(model,X,y,cv=leave_validation)
• results

• Out[22]:
• array([1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1.,
• 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1.,
• 1., 1., 1., 1., 1., 0., 0., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1.,
• 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
• 0., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., ...])
• print(np.mean(results))
• Out[44]:
• 0.8358974358974359
Repeated Random Test-Train Splits

• In Sklearn, repeated random test-train splits can be applied by using the ShuffleSplit class
of sklearn.model_selection.
• In [45]:
• from sklearn.model_selection import ShuffleSplit
• model=DecisionTreeClassifier()
• ssplit=ShuffleSplit(n_splits=10,test_size=0.30)
• results=cross_val_score(model,X,y,cv=ssplit)
• print(results)
• print(np.mean(results))
• Out[45]:
• array([0.79661017, 0.71186441, 0.79661017, 0.88135593, 0.72881356,
• 0.84745763, 0.83050847, 0.77966102, 0.83050847, 0.81355932])
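The output above lists the ten scores but not their mean. As a small sketch, the summary can be recomputed from the listed values with the standard library; the numbers below are copied from that output. Since ShuffleSplit was created without a random_state, a rerun of the original cell would likely give different scores.

```python
from statistics import mean, stdev

# Scores copied from the ShuffleSplit output listed above.
results = [0.79661017, 0.71186441, 0.79661017, 0.88135593, 0.72881356,
           0.84745763, 0.83050847, 0.77966102, 0.83050847, 0.81355932]

# The mean is the headline score; the standard deviation shows the spread.
print(f"mean accuracy: {mean(results):.4f}")  # prints: mean accuracy: 0.8017
print(f"standard deviation: {stdev(results):.4f}")
```

Reporting the spread alongside the mean shows how much the score moves from one random split to the next.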
