Machine Learning using biased data
Machine Learning Meetup Sydney
Arnaud de Myttenaere
Data Scientist @ OCTO Technology
27/06/2017
About me
Data Scientist Consultant, OCTO Technology.
Founder of Uchidata Text-Mining API
PhD in Mathematics at Paris 1 University
20+ Machine Learning challenges
Context
2015: Product classification challenge sponsored by Cdiscount (French
marketplace).
15M products
Text (title, description, brand)
Product price
5700 categories
15 000 € prize
Cdiscount product classification challenge
Data:
Id_Produit Categorie Description Titre Marque Prix
107448 10170 DKNY Montre Homme - DKNY Montre Homme - DKNY 58.68
14088553 3649 Mini four SEVERIN 2020 SEVERIN 56.83
14214236 1995 Flower magic 1KG Flower magic 1KG KB 9.8
481700 14194 Voie Express Chinois initiation AUCUNE 28.3
412300 14217 Susumu Shingu Les petits oiseaux AUCUNE 15.05
1010000 14349 Références sciences Calcul différentiel AUCUNE 28.22
Target: Categorie. Explanatory variables: Description, Titre, Marque, Prix.
Process:
Learn a model
Apply the model on a test file
Submit predictions on the platform
Evaluation process
The platform provides 2 different datasets:
Train data: used to learn a model
Test data: used to compute predictions and evaluate the model's accuracy.
Every day, we can submit up to 5 distinct prediction files to get feedback on our model's accuracy.
Evaluation process
The test data is actually divided into 2 parts:
30% to evaluate model accuracy during the challenge (public leaderboard)
70% for the final evaluation (private leaderboard)
→ overfitting the public leaderboard leads to poor accuracy during the final
evaluation.
Evaluation process
Model calibration:
1. Split the train data
2. Learn a model
3. Validate model and estimate accuracy
4. Compute and submit predictions
5. Get public score
If training and test data are not biased, the estimated accuracy is close to the public score; a minimal sketch of this calibration loop follows.
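A minimal sketch of the calibration loop with scikit-learn, on synthetic stand-in data (the real challenge used text features; the data, model, and split ratio here are purely illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the challenge data: features X, category labels y.
X, y = make_classification(n_samples=5000, n_classes=3, n_informative=8,
                           random_state=0)

# 1. Split the train data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# 2. Learn a model
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# 3. Validate the model and estimate accuracy
print("estimated accuracy:", accuracy_score(y_val, model.predict(X_val)))

# 4.-5. Compute predictions on the organizers' test file and submit them
#       to get the public score (platform-specific, not shown here).
```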
Sometimes cross-validation fails...
Cross-validation score: 90%
Leaderboard score: 58.9%
→ 2 possibilities:
You are overfitting OR The data is biased
Biased data
In this challenge, the test data was biased.
Why?
→ Organisers selected the same number of products in each category to build
the test set.
Category          | Phone bumper | Books  | Battery
% in train data   | 13.87%       | 6.53%  | 3.66%
% in test data    | 0.017%       | 0.017% | 0.017%
→ In practice, data is often biased due to the collection process, seasonality, ...
Bias correction?
→ Solution: data sampling.
Example: randomly select 100 products of each category in the training set to make it similar to the test set (a minimal pandas sketch follows).
→ Problem: data sampling does not allow us to use the whole training set.
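A minimal resampling sketch with pandas, under the assumption that the training set is a DataFrame with a Categorie column as in the data extract above (the toy table below is illustrative):

```python
import pandas as pd

# Toy stand-in for the training table (real columns: Id_Produit, Categorie, ...).
train = pd.DataFrame({
    "Categorie": ["10170"] * 500 + ["3649"] * 300 + ["1995"] * 150,
    "Prix": range(950),
})

# Keep up to 100 random products per category, so that class proportions
# become uniform and mimic the organizers' balanced test set.
balanced = (
    train.groupby("Categorie", group_keys=False)
         .apply(lambda g: g.sample(n=min(100, len(g)), random_state=0))
)
print(balanced["Categorie"].value_counts())  # 100 products per category
```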
A weighting strategy
Better solution: use weights.
Most Machine Learning libraries allow weighting observations.
Example: the weight option in xgboost (a minimal sketch follows).
→ How to mimic the test set?
Weight = 1 / Frequency
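A minimal sketch of this weighting in xgboost (the weight argument of xgboost.DMatrix is the real API; the synthetic data and booster parameters are illustrative):

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

# Imbalanced synthetic training data: class proportions 70% / 20% / 10%.
X, y = make_classification(n_samples=5000, n_classes=3, n_informative=8,
                           weights=[0.7, 0.2, 0.1], random_state=0)

# weight_i = 1 / frequency of the class of observation i: rare classes get
# large weights, which mimics a test set with uniform class proportions.
freq = np.bincount(y) / len(y)
w = 1.0 / freq[y]

dtrain = xgb.DMatrix(X, label=y, weight=w)
params = {"objective": "multi:softprob", "num_class": 3, "max_depth": 6}
booster = xgb.train(params, dtrain, num_boost_round=50)
```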
Results
Leaderboard score without weights: 58.9%
Leaderboard score with weights: 65.8%
My final score (aggregation of 3 weighted models): 66.9%.
→ The weighting strategy works very well... but why?
What is the theory behind weights?
Similar problem, simpler case: 2 categories.
[Scatter plots of weight vs. size: Train (90% purple, 10% green) and Test (50% purple, 50% green), with a red dot to classify.]
Objective: find the category of the red dot.
The proportions on the test dataset differ from those on the train dataset.
→ The model calibrated on the train dataset is not optimal on the test dataset.
→ The red dot is more likely to belong to the green group on the test dataset.
Problem: the relation between the target variable and the observations differs between the training and test sets.
Why? The class proportions are different on the train and test datasets.
But: knowing the category, the distribution of observations is the same on both the training and test datasets.
Formally:
→ P_train(Y|X) ≠ P_test(Y|X)
→ P_train(Y) ≠ P_test(Y)
→ P_train(X|Y) = P_test(X|Y)
Theoretical justification
For some weights, minimizing the average error on the weighted train set is
equivalent to minimizing the error on the test set.
Formally: if ℓ is a loss function and Y, X are random variables such that P_train(X|Y) = P_test(X|Y), then, denoting ω_i = p_test(Y_i) / p_train(Y_i), for every model g we have:
E_train[ω(Y) ℓ(Y, g(X))] = E_test[ℓ(Y, g(X))]
Consequence: The best model on the weighted training set is the best model
on the test set.
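For completeness, the identity follows in one step by conditioning on Y (written here for discrete Y; the inner conditional expectation is the same under both distributions because P_train(X|Y) = P_test(X|Y)):

```latex
\begin{aligned}
\mathbb{E}_{\mathrm{train}}\bigl[\omega(Y)\,\ell(Y, g(X))\bigr]
  &= \sum_{y} p_{\mathrm{train}}(y)\,
     \frac{p_{\mathrm{test}}(y)}{p_{\mathrm{train}}(y)}\,
     \mathbb{E}\bigl[\ell(y, g(X)) \mid Y = y\bigr] \\
  &= \sum_{y} p_{\mathrm{test}}(y)\,
     \mathbb{E}\bigl[\ell(y, g(X)) \mid Y = y\bigr]
   = \mathbb{E}_{\mathrm{test}}\bigl[\ell(Y, g(X))\bigr].
\end{aligned}
```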
Optimal weights
Optimal weights: ω_i = p_test(Y_i) / p_train(Y_i)
→ If a label is rare in the test set, its weight is small.
→ Weights can be used to mimic the test set.
Problem: the distribution of the target variable on the test set, p_test(Y_i), is unknown.
Suggested solution: estimate the distribution of the target variable iteratively (a numpy sketch of the weight computation follows).
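A minimal numpy sketch of the weight computation on the toy two-category example (p_test is taken as known here; when it is unknown, it is replaced by the iterative estimate of the next slide):

```python
import numpy as np

y_train = np.array([0] * 900 + [1] * 100)      # 90% purple, 10% green
p_train = np.bincount(y_train) / len(y_train)  # [0.9, 0.1]

p_test = np.array([0.5, 0.5])                  # known or estimated test distribution

w = p_test[y_train] / p_train[y_train]         # omega_i = p_test(Y_i) / p_train(Y_i)
# Labels over-represented in train get small weights (about 0.56 for purple),
# while labels rare in train get large ones (5.0 for green).
```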
An iterative approach
[Scatter plots of weight vs. size: Train (90% purple, 10% green) and Test (model: 61% purple, 39% green).]
→ The model calibrated on the training set gives an estimate of the distribution of the target variable on the test set: 61% purple, 39% green.
→ Use these predictions to estimate weights, update the model... and iterate, as sketched below.
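A minimal sketch of the iteration on toy two-category data (scikit-learn; the single "size" feature, the model choice, and the fixed 5 iterations are illustrative assumptions, not the deck's exact setup):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def draw(n_purple, n_green):
    """One 'size' feature per observation; the two classes overlap."""
    X = np.vstack([rng.normal(165, 8, (n_purple, 1)),
                   rng.normal(178, 8, (n_green, 1))])
    y = np.array([0] * n_purple + [1] * n_green)
    return X, y

X_tr, y_tr = draw(900, 100)   # train: 90% purple, 10% green
X_te, _ = draw(500, 500)      # test: 50% / 50%, labels hidden from the model

p_train = np.bincount(y_tr) / len(y_tr)
p_test = p_train.copy()       # initial guess: no distribution shift

for it in range(5):
    w = p_test[y_tr] / p_train[y_tr]                 # omega_i = p_test(Y_i) / p_train(Y_i)
    model = LogisticRegression().fit(X_tr, y_tr, sample_weight=w)
    p_test = model.predict_proba(X_te).mean(axis=0)  # re-estimate test label distribution
    print(f"iteration {it + 1}: estimated p_test = {np.round(p_test, 3)}")
```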
Results
Predictions obtained by the model after 5 iterations:
[Scatter plot of weight vs. size: Test (model: 50.8% purple, 49.2% green).]
Purple: 50.79%
Green: 49.21%
The final model is very well calibrated on the test set!
If the data is not biased, this strategy will converge in 1 iteration.
Conclusion
In practice, data is often biased due to the collection process, seasonality, ...
Resampling the data is a way to remove the bias
But weighting observations is a better strategy, since it allows using the whole dataset
Weights are easy to use in practice (weight option in almost every Machine Learning library)
There is a nice theory behind weights!
References
Shimodaira, H., "Improving predictive inference under covariate shift by weighting the log-likelihood function", Journal of Statistical Planning and Inference, 2000.
Sugiyama, M., et al., "Covariate shift adaptation by importance weighted cross validation", Journal of Machine Learning Research, 2007.
About Octo
http://blog.octo.com/en/
@OCTODownUnder
hr@octo.com.au