Machinelearning

This document outlines a final task to develop methods for predicting the stage of disease in patients using medical data, with an emphasis on estimating uncertainty in predictions. Two datasets are provided: 1) tabular data on 418 patients with cirrhosis to predict disease stage, and 2) 317 chest X-ray images to predict normal vs. COVID-19/pneumonia cases. The goal is to analyze and develop classification methods for prediction while considering computational complexity, precision in different feature spaces, and producing informative uncertainty estimates. Results must be summarized in a report explaining the analysis and proposed strategies, with citations of sources and a link to an implementation script shared on Google Colab.

Uploaded by

ayesha awan


1. INTRODUCTION
An accurate estimate of the uncertainty in machine learning predictions is
paramount to building reliable models. It allows us to make better-informed
decisions, to identify outliers and detect anomalies in the data, and to
interpret the results more easily. A major challenge for the deployment of
these systems in critical applications (such as medical diagnostics or
self-driving vehicles) is to identify when, and to what extent, the system may
get a prediction wrong. After all, an evaluation of uncertainty is built into
our own behaviour: for example, a human driver will slow down when faced with
a significant amount of uncertainty.
When we turn to regression problems, in certain families of models the task
can be accomplished easily through theoretical results. The most obvious
example in this respect is linear regression. In general, however, models
are much more complicated, and the complex interactions within the
algorithms make it almost hopeless, even in regression problems, to derive a
reasonable theoretical analysis. In classification problems, the output of
an algorithm is often a probability distribution over all possible classes,
assessing the likelihood of each class. The problem then appears to be
completely solved, but in fact the uncertainty has merely been moved onto the
values of the predicted probabilities.
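As a small illustration of the last point, a vector of class probabilities can be summarised by a single uncertainty score. The sketch below is only one possible choice, not a prescribed method: it computes the Shannon entropy of the predicted distribution and the margin between the two most likely classes. The function names and the two example probability vectors are made up for illustration.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def normalized_entropy(probs):
    """Entropy divided by log(K): 0 = fully confident, 1 = uniform over K classes."""
    return predictive_entropy(probs) / math.log(len(probs))

def top_two_margin(probs):
    """Gap between the two largest probabilities; a small gap means an ambiguous call."""
    first, second = sorted(probs, reverse=True)[:2]
    return first - second

# Hypothetical outputs of a 4-class classifier for two different patients.
confident = [0.97, 0.01, 0.01, 0.01]
ambiguous = [0.40, 0.35, 0.15, 0.10]

for name, probs in [("confident", confident), ("ambiguous", ambiguous)]:
    print(name, round(normalized_entropy(probs), 3), round(top_two_margin(probs), 3))
```

Both scores take the predicted probabilities at face value, so they are only as reliable as the probabilities themselves.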

2. GOALS
Your task is to explore possible ways of producing a measure or an estimate
of uncertainty in classification predictions. You are not restricted to using
a single classification method, and you can, in principle, develop different
assessments of uncertainty for different algorithms. When developing your
method(s), try to consider:
- the computational complexity of your method,
- the precision in different regions of the feature space,
- that the results should be as informative as possible.
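One possible way to obtain an uncertainty estimate that varies across the feature space is to refit a classifier on bootstrap resamples and look at how much the refitted models disagree at a given point. The sketch below is a toy illustration under strong assumptions: the 2-D data are invented, the classifier is a simple nearest-centroid rule chosen only to keep the code short, and the names `TRAIN` and `bootstrap_vote_fractions` are hypothetical. The cost grows linearly with the number of resamples, which is the kind of trade-off the first bullet refers to.

```python
import random

# Toy 2-D training set (made-up numbers, two well-separated classes).
TRAIN = {
    0: [(0, 0), (1, 0), (0, 1), (1, 1), (-1, 0), (0, -1)],
    1: [(4, 4), (5, 4), (4, 5), (5, 5), (3, 4), (4, 3)],
}

def centroid(points):
    """Coordinate-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest_centroid(x, centroids):
    """Label of the centroid closest to x in squared Euclidean distance."""
    def sq_dist(c):
        return (x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def bootstrap_vote_fractions(x, train, n_resamples=200, seed=0):
    """Fraction of bootstrap-refitted models voting for each class at point x."""
    rng = random.Random(seed)
    votes = {label: 0 for label in train}
    for _ in range(n_resamples):
        # Stratified bootstrap: resample each class with replacement,
        # so every resample still contains both classes.
        centroids = {
            label: centroid(rng.choices(points, k=len(points)))
            for label, points in train.items()
        }
        votes[nearest_centroid(x, centroids)] += 1
    return {label: count / n_resamples for label, count in votes.items()}

inside = bootstrap_vote_fractions((0, 0), TRAIN)    # deep inside class 0
boundary = bootstrap_vote_fractions((2, 2), TRAIN)  # between the two classes
print("inside:", inside)
print("boundary:", boundary)
```

A point deep inside one class receives unanimous votes, while a point between the classes receives split votes, so the vote fractions carry information about where in the feature space the prediction is fragile.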

3. DATA
The data are of medical interest, because this is the kind of setting in which
it is of greatest importance to have a clear and reliable assessment of
the uncertainty in the prediction.
The first dataset is purely tabular, and it has been taken from:
https://www.kaggle.com/datasets/fedesoriano/cirrhosis-prediction-dataset

FINAL TASK FOR ANALISI DEI DATI
Date: June 15, 2023.

The dataset contains data about 418 patients with biliary cirrhosis of the
liver. The preliminary goal is to analyse extensively methods that will provide
accurate predictions of the histologic stage of the disease. The main step is
then to develop, for one or more of the methods analysed, an assessment of the
uncertainty of the prediction.
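Before building an uncertainty assessment on top of predicted probabilities, it can help to check whether those probabilities are calibrated, that is, whether predictions made with confidence around 0.8 are correct about 80% of the time. The sketch below computes a binned expected calibration error; it is one common diagnostic among several, and the function name, the number of bins, and the example inputs are illustrative assumptions rather than part of the assignment.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: probability assigned to the predicted class, one per sample.
    correct:     True/False, whether the predicted class was the true one.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(mean_conf - accuracy)
    return ece

# Made-up example: four predictions at confidence 0.75, three of them correct.
well_calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])
# Made-up example: two fully confident predictions, only one correct.
overconfident = expected_calibration_error([1.0, 1.0], [True, False])
print(well_calibrated, overconfident)
```

With only 418 patients the per-bin estimates are noisy, so a small number of bins is a reasonable starting point.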
A second dataset contains images, and as such is more computationally
demanding. It can be found here:
https://www.kaggle.com/datasets/pranavraikokte/covid19-image-dataset

It contains 317 chest X-ray images of normal patients and of patients
with Covid-19 or viral pneumonia. The goal is the same as for the previous
dataset, namely to develop methods for prediction and an estimate of the
uncertainty in the prediction. This task is computationally more intensive, and
for this reason the analysis of this dataset is optional, for those who want to
have fun with a more computationally challenging problem. The dataset is
divided into training and test sets, but you can consider the whole data. Notice
that while convolutional neural networks can be considered a standard tool
in image classification, they require some computational power (see footnote 1),
and they might not be the only possibility available.
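Downscaling the pictures is one inexpensive way to reduce the computational load. As a minimal sketch, assuming a grayscale image stored as a list of rows of pixel intensities, the function below averages non-overlapping square blocks. The name `downscale` and the tiny example image are hypothetical; in practice an image library would normally do this step.

```python
def downscale(image, factor):
    """Average non-overlapping factor x factor blocks of a grayscale image.

    `image` is a list of equal-length rows of pixel intensities. Rows and
    columns that do not fill a complete block are dropped.
    """
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    result = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the pixels of the block whose top-left corner is
            # at (i * factor, j * factor).
            block = [
                image[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result

# Made-up 4 x 4 image reduced to 2 x 2.
tiny = [
    [0, 2, 10, 10],
    [2, 4, 10, 10],
    [8, 8, 1, 3],
    [8, 8, 5, 7],
]
small = downscale(tiny, 2)
print(small)
```

Reducing each side by a factor of k cuts the number of pixels by k squared, at the price of losing fine detail that may matter for diagnosis.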

4. SUBMISSION
The results of your own analysis and ideas must be summarised in a report
which explains how you planned to tackle the problem and the strategies you
tried in order to solve it. The emphasis is not on the performance of the
final method(s) proposed, but on the way you have dealt with the problem.
You are not only allowed but actually encouraged to read up on the subject.
In order to be complete and fair, you are required to cite all sources of research
material you have used (books, scientific papers, etc.).
This final assignment is a personal piece of work and must not be done in
groups. Discussions with colleagues or experts, although discouraged, should
be reported for fairness.
Your report can be uploaded on the e-learning website. The deadline is
August 5, 2023.
You should add, at the end of your report, the link to a script (R or Python)
containing the implementation of the final method(s) proposed, based on the
analysis developed. The script must be shared via a notebook on Google Colab.
Obviously the script must not contain any errors. Please add a link to the
notebook in your report.

It is not necessary (and in fact it is useless) for the script to contain the
entire analysis. The recommendation is that the output of your scripts be a
detailed account of your conclusions. Numbers without any explanation of
their meaning are not really helpful.

Footnote 1: If necessary, you may consider downscaling the pictures.
