
Probabilistic Modelling and Reasoning

— Introduction —

Michael Gutmann

Probabilistic Modelling and Reasoning (INFR11134)


School of Informatics, University of Edinburgh

Spring Semester 2020


Variability

▶ Variability is part of nature
▶ Human heights vary
▶ Men are typically taller than women, but height varies a lot



Variability

▶ Our handwriting is unique
▶ Variability leads to uncertainty: e.g. 1 vs 7 or 4 vs 9



Variability

▶ Variability leads to uncertainty
▶ Reading handwritten text in a foreign language



Example: Screening and diagnostic tests

▶ Early warning test for Alzheimer’s disease (Scharre, 2010, 2014)
▶ Detects “mild cognitive impairment”
▶ Takes 10–15 minutes
▶ Freely available
▶ Assume a 70-year-old man tests positive.
▶ Should he be concerned?

(Example from sagetest.osu.edu)



Accuracy of the test

▶ Sensitivity of 0.8 and specificity of 0.95 (Scharre, 2010)
▶ 80% correct for people with impairment

[Tree diagram: with impairment (x=1) → impairment detected (y=1) with probability 0.8; no impairment detected (y=0) with probability 0.2]



Accuracy of the test

▶ Sensitivity of 0.8 and specificity of 0.95 (Scharre, 2010)
▶ 95% correct for people w/o impairment

[Tree diagram: w/o impairment (x=0) → impairment detected (y=1) with probability 0.05; no impairment detected (y=0) with probability 0.95]



Variability implies uncertainty

▶ People of the same group do not all have the same test result
▶ The test outcome is subject to variability
▶ The data are noisy
▶ Variability leads to uncertainty
▶ Positive test ≡ true positive?
▶ Positive test ≡ false positive?
▶ What can we safely conclude from a positive test result?
▶ How should we analyse such ambiguous data?



Probabilistic approach

▶ The test outcomes y can be described with probabilities:

  sensitivity = 0.8  ⇔ P(y = 1|x = 1) = 0.8
                     ⇔ P(y = 0|x = 1) = 0.2
  specificity = 0.95 ⇔ P(y = 0|x = 0) = 0.95
                     ⇔ P(y = 1|x = 0) = 0.05

▶ P(y|x): model of the test, specified in terms of (conditional) probabilities
▶ x ∈ {0, 1}: quantity of interest (cognitive impairment or not)



Prior information
Among people like the patient, P(x = 1) = 5/45 ≈ 11% have a
cognitive impairment (plausible range: 3% – 22%, Geda, 2014)

[Figure: population diagram, split into those with impairment, p(x=1), and those without impairment, p(x=0)]



Probabilistic model

▶ Reality:
  ▶ properties/characteristics of the group of people like the patient
  ▶ properties/characteristics of the test
▶ Probabilistic model:
  ▶ P(x = 1)
  ▶ P(y = 1|x = 1) or P(y = 0|x = 1)
  ▶ P(y = 1|x = 0) or P(y = 0|x = 0)
  Fully specified by three numbers.
▶ A probabilistic model is an abstraction of reality that uses probability theory to quantify the chance of uncertain events.
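To make “three numbers” concrete, here is a minimal Python sketch (added for illustration, not part of the original slides; the variable names are ours). The prior, sensitivity, and specificity determine the full joint distribution P(x, y) via normalisation and the product rule:

    # Illustrative sketch: three numbers fully specify the model.
    p_x1 = 5 / 45           # P(x = 1), prior probability of impairment
    sensitivity = 0.8       # P(y = 1 | x = 1)
    specificity = 0.95      # P(y = 0 | x = 0)

    # Product rule gives all four joint probabilities P(x, y):
    joint = {
        (1, 1): sensitivity * p_x1,               # P(x=1, y=1) = 4/45
        (1, 0): (1 - sensitivity) * p_x1,         # P(x=1, y=0) = 1/45
        (0, 1): (1 - specificity) * (1 - p_x1),   # P(x=0, y=1) = 2/45
        (0, 0): specificity * (1 - p_x1),         # P(x=0, y=0) = 38/45
    }
    assert abs(sum(joint.values()) - 1.0) < 1e-12  # a valid pmf sums to one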



If we tested the whole population

[Figure: population diagram, split into those with impairment, p(x=1), and those without impairment, p(x=0)]
If we tested the whole population
Fraction of people who are impaired and have positive tests:
P(x = 1, y = 1) = P(y = 1|x = 1)P(x = 1) = 0.8 × 5/45 = 4/45 (product rule)

If we tested the whole population
Fraction of people who are not impaired but have positive tests:
P(x = 0, y = 1) = P(y = 1|x = 0)P(x = 0) = 0.05 × 40/45 = 2/45 (product rule)

If we tested the whole population
Fraction of people where the test is positive:
P(y = 1) = P(x = 1, y = 1) + P(x = 0, y = 1) = 4/45 + 2/45 = 6/45 (sum rule)

Putting everything together

▶ Among those with a positive test, the fraction with impairment:

  P(x = 1|y = 1) = P(y = 1|x = 1)P(x = 1) / P(y = 1) = (4/45)/(6/45) = 4/6 = 2/3

▶ Fraction without impairment:

  P(x = 0|y = 1) = P(y = 1|x = 0)P(x = 0) / P(y = 1) = (2/45)/(6/45) = 2/6 = 1/3
▶ These equations are examples of “Bayes’ rule”.
▶ The positive test increased the probability of cognitive impairment from 11% (the prior belief) to 67%; with a prior of 6%, it would increase from 6% to 51%.
▶ 51% ≈ coin flip
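The computation is small enough to check directly. A minimal Python sketch (illustrative, not from the slides) reproducing the posterior:

    # Posterior of impairment given a positive test, via Bayes' rule.
    p_x1 = 5 / 45           # P(x = 1)
    sensitivity = 0.8       # P(y = 1 | x = 1)
    specificity = 0.95      # P(y = 0 | x = 0)

    # Sum and product rules: P(y=1) = P(y=1|x=1)P(x=1) + P(y=1|x=0)P(x=0)
    p_y1 = sensitivity * p_x1 + (1 - specificity) * (1 - p_x1)

    # Bayes' rule: P(x=1|y=1) = P(y=1|x=1)P(x=1) / P(y=1)
    print(sensitivity * p_x1 / p_y1)    # 0.666..., i.e. 2/3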



Probabilistic reasoning

▶ Probabilistic reasoning ≡ probabilistic inference: computing the probability of an event that we have not observed or cannot observe from an event that we can observe
▶ Unobserved/uncertain event, e.g. cognitive impairment x = 1
▶ Observed event ≡ evidence ≡ data, e.g. test result y = 1
▶ “The prior”: probability of the uncertain event before having seen evidence, e.g. P(x = 1)
▶ “The posterior”: probability of the uncertain event after having seen evidence, e.g. P(x = 1|y = 1)
▶ The posterior is computed from the prior and the evidence via Bayes’ rule.



Key rules of probability
(1) Product rule:

  P(x = 1, y = 1) = P(y = 1|x = 1)P(x = 1)
                  = P(x = 1|y = 1)P(y = 1)

(2) Sum rule:

P(y = 1) = P(x = 1, y = 1) + P(x = 0, y = 1)

Bayes’ rule (conditioning) as a consequence of the product rule:

  P(x = 1|y = 1) = P(x = 1, y = 1) / P(y = 1) = P(y = 1|x = 1)P(x = 1) / P(y = 1)

Denominator from the sum rule, or from the sum and product rules combined:

  P(y = 1) = P(y = 1|x = 1)P(x = 1) + P(y = 1|x = 0)P(x = 0)



Key rules of probability

▶ The rules generalise to the case of multivariate random variables (discrete or continuous)
▶ Consider the joint probability density function (pdf) or probability mass function (pmf) of x, y: p(x, y)
(1) Product rule:

  p(x, y) = p(x|y)p(y)
          = p(y|x)p(x)

(2) Sum rule:

  p(y) = Σ_x p(x, y)     for discrete r.v.
  p(y) = ∫ p(x, y) dx    for continuous r.v.
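For discrete random variables, both rules become one-line array operations. A small numpy sketch (illustrative, not from the slides), reusing the screening-test joint pmf from earlier:

    import numpy as np

    # Joint pmf p(x, y) as a 2x2 array: rows index x, columns index y.
    p_xy = np.array([[38/45, 2/45],     # x = 0
                     [1/45,  4/45]])    # x = 1

    # Sum rule: marginalise x out to get p(y).
    p_y = p_xy.sum(axis=0)              # [39/45, 6/45]

    # Product rule rearranged: p(x|y) = p(x, y) / p(y).
    p_x_given_y = p_xy / p_y            # divides each column by p(y)

    print(p_x_given_y[1, 1])            # P(x=1|y=1) = 2/3, as before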



Probabilistic modelling and reasoning

▶ Probabilistic modelling:
  ▶ Identify the quantities that relate to the aspects of reality that you wish to capture with your model.
  ▶ Consider them to be random variables, e.g. x, y, z, with a joint pdf (pmf) p(x, y, z).
▶ Probabilistic reasoning:
  ▶ Assume you know that y ∈ E (measurement, evidence)
  ▶ Probabilistic reasoning about x then consists in computing p(x|y ∈ E) or related quantities, such as argmax_x p(x|y ∈ E) or posterior expectations of some function g of x, e.g.

    E[g(x) | y ∈ E] = ∫ g(u) p(u|y ∈ E) du



Solution via product and sum rule

Assume that all variables are discrete valued, that E = {y_o}, and that we know p(x, y, z). We would like to know p(x|y_o).

▶ Product rule: p(x|y_o) = p(x, y_o) / p(y_o)
▶ Sum rule: p(x, y_o) = Σ_z p(x, y_o, z)
▶ Sum rule: p(y_o) = Σ_x p(x, y_o) = Σ_{x,z} p(x, y_o, z)
▶ Result:

  p(x|y_o) = Σ_z p(x, y_o, z) / Σ_{x,z} p(x, y_o, z)
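For small discrete models, this recipe can be run directly by brute-force enumeration. A numpy sketch (illustrative; the joint pmf and array sizes are arbitrary placeholders):

    import numpy as np

    # Brute-force inference on a random joint pmf p(x, y, z),
    # stored as an array with p_xyz[i, j, k] = p(x=i, y=j, z=k).
    rng = np.random.default_rng(0)
    p_xyz = rng.random((3, 4, 5))
    p_xyz /= p_xyz.sum()                 # normalise to a valid pmf

    y_o = 2                              # the observed value of y

    # Sum rule: p(x, y_o) = sum_z p(x, y_o, z)
    p_x_yo = p_xyz[:, y_o, :].sum(axis=1)

    # Sum rule again for p(y_o), then the product rule:
    # p(x | y_o) = p(x, y_o) / p(y_o)
    posterior = p_x_yo / p_x_yo.sum()

    print(posterior.argmax())            # posterior mode, argmax_x p(x|y_o)
    print(np.arange(3) @ posterior)      # posterior expectation E[x | y_o]

The next slides explain why this brute-force enumeration breaks down when the variables are high dimensional.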



What we do in PMR

  p(x|y_o) = Σ_z p(x, y_o, z) / Σ_{x,z} p(x, y_o, z)

Assume that x, y, z are each d = 500 dimensional, and that each element of the vectors can take K = 10 values.

▶ Issue 1: To specify p(x, y, z), we need to specify K^{3d} − 1 = 10^{1500} − 1 non-negative numbers, which is impossible.
  Topic 1: Representation. What reasonably weak assumptions can we make to efficiently represent p(x, y, z)?



What we do in PMR
  p(x|y_o) = Σ_z p(x, y_o, z) / Σ_{x,z} p(x, y_o, z)

▶ Issue 2: The sum in the numerator goes over on the order of K^d = 10^{500} non-negative numbers, and the sum in the denominator over on the order of K^{2d} = 10^{1000}, which is impossible to compute.
  Topic 2: Exact inference. Can we further exploit the assumptions on p(x, y, z) to efficiently compute the posterior probability or derived quantities?
▶ Issue 3: Where do the non-negative numbers p(x, y, z) come from?
  Topic 3: Learning. How can we learn the numbers from data?
▶ Issue 4: For some models, exact inference and learning are too costly even after fully exploiting the assumptions made.
  Topic 4: Approximate inference and learning
