
§ Aleatoric Uncertainty (Statistical Uncertainty): Inherent variability due to randomness in data or environments.
§ Irreducible: cannot be reduced by collecting additional data.
§ Modeled using probability distributions.
§ Examples:
§ Sensor measurement noise.
§ Natural fluctuations in weather patterns.

§ Epistemic Uncertainty (Systematic Uncertainty): Arises from a lack of knowledge or information, or from misrepresentation by the model.
§ Sources of uncertainty:
§ Insufficient or poor-quality data to inform the model.
§ Inaccurate or oversimplified models.
§ Uncertain or estimated parameters within the model.
§ Incorrect assumptions made during model development.
§ Reducible: can be reduced by obtaining more data or improving the model.

§ Statistical Techniques
§ Confidence Intervals: Estimate the range of possible values for a parameter.
§ Prediction Intervals: Predict the range where unseen observations are likely to fall.

§ Probabilistic Modeling
§ Bayesian Methods: Incorporate prior knowledge and update beliefs with new data.
§ Monte Carlo Simulations: Use random sampling to model uncertainty propagation.

§ Ensemble Methods
§ Model Ensembles: Combine multiple models to capture a range of predictions.
§ Bootstrap Methods: Resample data to estimate the variability of predictions.

§ Sensitivity Analysis
§ Evaluate how changes in inputs affect outputs across the entire input space.
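As a minimal illustration of one of the techniques above, the bootstrap resamples the data and refits the model to estimate the variability of a prediction. The linear toy data, query point, and resample count here are assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise (assumed example).
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + rng.normal(0, 1.0, size=50)

def fit_line(x, y):
    # Least-squares fit; returns (slope, intercept).
    return np.polyfit(x, y, deg=1)

# Bootstrap: resample (x, y) pairs with replacement, refit, and
# collect the prediction at a query point to estimate its variability.
x_new = 5.0
preds = []
for _ in range(1000):
    idx = rng.integers(0, len(x), size=len(x))
    slope, intercept = fit_line(x[idx], y[idx])
    preds.append(slope * x_new + intercept)

preds = np.array(preds)
print(preds.mean(), preds.std())  # bootstrap mean and spread of the prediction
```

The spread of the bootstrap predictions serves as a direct, model-agnostic estimate of prediction uncertainty.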

The goal is to predict the value of the target given the features/input variables.
In a regression problem the target is a continuous variable.

Linear regression with one variable:

$\hat{y} = f_w(x_i)$

Hypothesis space: $f_w(x) = w^\top x + b$ (with one variable: $f_w(x) = w_1 x + b$)

Prediction error: $|f_w(x_i) - y_i|$

Optimization problem:

$\min_{w,b} \sum_{i=1}^{N} \frac{1}{2} \left( f_w(x_i) - y_i \right)^2$
We can solve linear regression analytically instead of using gradient descent (the normal equation):

$Xw = y \quad \Rightarrow \quad w = (X^\top X)^{-1} X^\top y$

Here row $i$ of $X$ is $x_i^\top$; $X$ is $m \times n$, $w$ is $n \times 1$, and $y$ is $m \times 1$. The matrix $(X^\top X)^{-1} X^\top$ is the pseudo-inverse of $X$, and $X^\top X$ is invertible if the columns of $X$ are linearly independent. For example, if we have features $x_1$ and $x_2 = 2x_1$, then $x_1$ and $x_2$ are not linearly independent.

Normal equation vs. gradient descent ($m$: number of data points, $n$: number of features, $k$: number of iterations):

§ Normal equation: $O(mn^2 + n^3)$; no learning rate; fails when $X^\top X$ is not invertible.
§ Gradient descent: $O(kmn)$; works better if we have many features (>10000).

In practice we use the SVD to compute $w = X^{\dagger} y$, which is more numerically stable than forming the pseudo-inverse $(X^\top X)^{-1} X^\top$ explicitly.
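The normal equation and the SVD-based solution can be compared directly in NumPy (`np.linalg.lstsq` uses an SVD internally). The toy design matrix and weights are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy design matrix (m x n) and targets (assumed example).
m, n = 100, 3
X = rng.normal(size=(m, n))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=m)

# Normal equation: solve (X^T X) w = X^T y.
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# SVD-based least squares; more stable when X^T X is ill-conditioned.
w_svd, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(w_normal, w_svd))  # both recover (approximately) true_w
```

For well-conditioned problems the two agree to machine precision; they diverge when columns of `X` are nearly collinear.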
The probabilistic view of linear regression assumes Gaussian noise:

$P(y_i; w) = \mathcal{N}(y_i \mid f_w(x_i), \sigma^2)$

$y = Xw + \epsilon$
$y = Xw + \epsilon$

$\hat{w} = (X^\top X)^{-1} X^\top y$

Substitute the regression model:

$\hat{w} = (X^\top X)^{-1} X^\top (Xw + \epsilon)$

Expand:

$= (X^\top X)^{-1} X^\top X w + (X^\top X)^{-1} X^\top \epsilon$

$= w + (X^\top X)^{-1} X^\top \epsilon$

$\mathrm{Var}(\hat{w}) = \mathrm{Var}\left( (X^\top X)^{-1} X^\top \epsilon \right)$

$= (X^\top X)^{-1} X^\top \mathrm{Var}(\epsilon)\, X (X^\top X)^{-1}$

$= \sigma^2 (X^\top X)^{-1} X^\top X (X^\top X)^{-1}$

$= \sigma^2 (X^\top X)^{-1}$
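The result $\mathrm{Var}(\hat{w}) = \sigma^2 (X^\top X)^{-1}$ can be checked by simulation: fix the design, redraw the noise many times, and compare the empirical covariance of $\hat{w}$ with the formula. The dimensions and noise level are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Fixed design; simulate many noise draws (assumed toy setup).
m, n = 50, 2
X = rng.normal(size=(m, n))
sigma = 0.5

XtX_inv = np.linalg.inv(X.T @ X)
theory = sigma**2 * XtX_inv

# Each row of W is the estimation error (w_hat - w) for one noise draw,
# since w_hat - w = (X^T X)^{-1} X^T eps.
E = rng.normal(scale=sigma, size=(20000, m))
W = E @ X @ XtX_inv

empirical = np.cov(W, rowvar=False)
print(np.max(np.abs(empirical - theory)))  # small discrepancy
```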
For a new observation $x_0$, the predicted mean is

$\hat{y}_0 = x_0^\top \hat{w}$

$\mathrm{Var}(\hat{y}_0) = \mathrm{Var}(x_0^\top \hat{w}) = x_0^\top \mathrm{Var}(\hat{w})\, x_0 = \sigma^2\, x_0^\top (X^\top X)^{-1} x_0$

Prediction:

$y_0 = x_0^\top \hat{w} + \epsilon_0$

$\mathrm{Var}(y_0) = \mathrm{Var}(\hat{y}_0) + \mathrm{Var}(\epsilon_0) = \sigma^2\, x_0^\top (X^\top X)^{-1} x_0 + \sigma^2 = \sigma^2 \left( 1 + x_0^\top (X^\top X)^{-1} x_0 \right)$
Critical values for the normal distribution:

§ Use t-distribution critical values when:
§ Sample size is small (n < 30).
§ Degrees of freedom are low.

§ Use z-values when:
§ Sample size is large (n ≥ 30).
§ Degrees of freedom are high.

$P(\mu - 1.96\,\sigma < Y < \mu + 1.96\,\sigma) \approx 95\%$
Standard error of prediction:

$\mathrm{SE}_{\mathrm{pred}}(y_0) = \hat{\sigma} \sqrt{1 + x_0^\top (X^\top X)^{-1} x_0}$

where $\hat{\sigma}$ is the sample estimate of the error standard deviation.

Prediction with confidence interval:

$\hat{y}_0 \pm t_{\alpha/2,\,n} \times \mathrm{SE}_{\mathrm{pred}}(y_0)$
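A prediction interval can be assembled from these pieces. This sketch uses the z critical value (the large-n case from the slides) via the standard library; the data-generating line and query point are assumptions:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)

# Fit a line and build a 95% prediction interval at a new point (assumed data).
m = 200
x = rng.uniform(0, 10, size=m)
X = np.column_stack([np.ones(m), x])           # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=m)

XtX_inv = np.linalg.inv(X.T @ X)
w_hat = XtX_inv @ X.T @ y
resid = y - X @ w_hat
sigma_hat = np.sqrt(resid @ resid / (m - 2))   # estimate of error std dev

x0 = np.array([1.0, 5.0])                      # new observation (intercept term first)
y0_hat = x0 @ w_hat
se_pred = sigma_hat * np.sqrt(1 + x0 @ XtX_inv @ x0)

z = NormalDist().inv_cdf(0.975)                # ≈ 1.96 for a 95% interval
lo, hi = y0_hat - z * se_pred, y0_hat + z * se_pred
print(lo, hi)
```

For small n, `z` would be replaced by the appropriate t critical value, as the slide notes.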

The regions that are farther away from the data have higher uncertainty. This is captured by the term $x_0^\top (X^\top X)^{-1} x_0$.

Posterior = likelihood × prior / evidence:

$p(w \mid y, X, \sigma^2) = \frac{p(y \mid w, X, \sigma^2)\, p(w)}{p(y \mid X, \sigma^2)}$

Both the likelihood and the prior are Gaussian → the posterior is Gaussian:

$p(w \mid X, y, \sigma^2) = \mathcal{N}(w \mid w_N, \Lambda_N^{-1})$

$w_N = \Lambda_N^{-1} \left( \Lambda_0 w_0 + \frac{1}{\sigma^2} X^\top y \right)$

$\Lambda_N = \Lambda_0 + \frac{1}{\sigma^2} X^\top X$

For models where conjugate priors are not available, analytical solutions may not be possible, and inference can be computationally intensive for large datasets.
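The Gaussian posterior update ($w_N$, $\Lambda_N$) is a few lines of linear algebra. The toy data and the identity prior precision below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Conjugate Bayesian linear regression update (assumed toy data).
m, n = 100, 2
X = rng.normal(size=(m, n))
w_true = np.array([0.5, -1.5])
sigma2 = 0.25
y = X @ w_true + rng.normal(scale=np.sqrt(sigma2), size=m)

# Prior: w ~ N(w0, Lambda0^{-1}); zero mean, unit precision (assumed).
w0 = np.zeros(n)
Lambda0 = np.eye(n)

# Posterior precision and mean, exactly as in the update equations.
LambdaN = Lambda0 + (X.T @ X) / sigma2
wN = np.linalg.solve(LambdaN, Lambda0 @ w0 + (X.T @ y) / sigma2)
print(wN)  # close to w_true when the data dominate the prior
```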

Posterior predictive distribution:

$p(y_0 \mid x_0, X, y, \sigma^2) = \int p(y_0 \mid x_0, w, \sigma^2)\, p(w \mid X, y, \sigma^2)\, dw$

Since both the likelihood and the posterior are Gaussian, the posterior predictive distribution is also Gaussian:

$p(y_0 \mid x_0, X, y, \sigma^2) = \mathcal{N}(y_0 \mid \mu_*, \sigma_*^2)$

$\mu_* = x_0^\top w_N$

$\sigma_*^2 = \sigma^2 + x_0^\top \Lambda_N^{-1} x_0$

where $\sigma^2$ is the observation noise (aleatoric uncertainty) and $x_0^\top \Lambda_N^{-1} x_0$ is the parameter uncertainty (epistemic uncertainty).
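Evaluating the predictive mean and variance is direct once the posterior is known. The numerical values for the posterior precision and mean below are assumed placeholders:

```python
import numpy as np

# Predictive mean and variance under a Gaussian posterior (assumed values).
sigma2 = 0.25                                    # observation noise (aleatoric)
LambdaN = np.array([[40.0, 2.0], [2.0, 30.0]])   # assumed posterior precision
wN = np.array([0.5, -1.5])                       # assumed posterior mean

x0 = np.array([1.0, 2.0])
mu_star = x0 @ wN
# sigma^2 + x0^T LambdaN^{-1} x0: noise plus parameter (epistemic) term.
var_star = sigma2 + x0 @ np.linalg.solve(LambdaN, x0)
print(mu_star, var_star)
```

Note that `np.linalg.solve` avoids explicitly inverting the precision matrix.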
1- Define the model.

2- Sample $w^{(t)}$ and $\sigma^{(t)}$ from the posterior.

3- Sample a prediction from the model:

$\hat{y}^{(t)} = X w^{(t)} + \epsilon^{(t)}, \quad \epsilon^{(t)} \sim \mathcal{N}\!\left(0, (\sigma^{(t)})^2\right)$

Estimate the prediction mean and standard deviation using the samples:

$\hat{y}_{\mathrm{mean}} = \frac{1}{T} \sum_{t=1}^{T} \hat{y}^{(t)}$

$\hat{y}_{\mathrm{std}} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( \hat{y}^{(t)} - \hat{y}_{\mathrm{mean}} \right)^2}$
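The steps above can be sketched directly. Since no real posterior is available here, the posterior samples of $(w, \sigma)$ are replaced by an assumed stand-in (independent Gaussians around fixed values):

```python
import numpy as np

rng = np.random.default_rng(5)

# Monte Carlo predictive sampling with a stand-in posterior (assumed).
T = 5000
X = np.array([[1.0, 2.0], [1.0, -1.0]])  # two query points

# Step 2: sample w^(t) and sigma^(t) "from the posterior" (assumed draws).
w_samples = rng.normal(loc=[0.5, -1.0], scale=0.05, size=(T, 2))
sigma_samples = np.abs(rng.normal(loc=0.3, scale=0.02, size=T))

# Step 3: propagate each draw through the model with fresh noise.
preds = np.empty((T, X.shape[0]))
for t in range(T):
    eps = rng.normal(scale=sigma_samples[t], size=X.shape[0])
    preds[t] = X @ w_samples[t] + eps

# Step 4: summarize the samples.
y_mean = preds.mean(axis=0)
y_std = preds.std(axis=0)
print(y_mean, y_std)
```

The sample standard deviation mixes both noise and parameter uncertainty, matching the decomposition of $\sigma_*^2$ above.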

§ The entropy H(X) of a discrete random variable X with distribution p(x) is:

$H(X) = -\sum_x p(x)\, \log p(x)$

Entropy measures the uncertainty inherent in the variable X.

§ The conditional entropy H(Y|X) quantifies the remaining uncertainty about Y given X:

$H(Y \mid X) = -\sum_x p(x) \sum_y p(y \mid x)\, \log p(y \mid x)$

§ The mutual information between X and Y is:

$I(X; Y) = H(Y) - H(Y \mid X)$
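These three quantities can be computed from a discrete joint distribution table. The 2×2 joint below (independent uniform variables) is an assumed toy example:

```python
import numpy as np

# Joint distribution p(x, y): rows are x, columns are y (assumed toy table).
p_xy = np.array([[0.25, 0.25],
                 [0.25, 0.25]])  # independent uniform X and Y

def entropy(p):
    # H(p) = -sum p log2 p, in bits; zero-probability entries contribute 0.
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_x = p_xy.sum(axis=1)  # marginal of X
p_y = p_xy.sum(axis=0)  # marginal of Y

# H(Y|X) = sum_x p(x) H(Y | X = x)
h_y_given_x = sum(p_x[i] * entropy(p_xy[i] / p_x[i]) for i in range(len(p_x)))

mi = entropy(p_y) - h_y_given_x  # I(X;Y) = H(Y) - H(Y|X)
print(mi)  # 0.0 for independent variables
```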

Posterior predictive distribution:

$p(Y \mid x^*, \mathcal{D}) = \int p(Y \mid x^*, \theta)\, p(\theta \mid \mathcal{D})\, d\theta$

Epistemic uncertainty: how much uncertainty about Y is left once the model parameters θ are known, i.e. uncertainty in the posterior distribution over θ:

$I(Y; \theta \mid \mathcal{D}, x^*) = H(Y \mid x^*, \mathcal{D}) - \mathbb{E}_{p(\theta \mid \mathcal{D})}\left[ H(Y \mid x^*, \theta) \right]$

High mutual information → knowing the model parameters significantly reduces uncertainty in the prediction → the uncertainty is in the model parameters (epistemic uncertainty).

Low mutual information → knowing the parameters does not significantly reduce uncertainty in the prediction → the uncertainty is primarily due to inherent randomness in the data (aleatoric uncertainty).

We can approximate the posterior using samples from MCMC or a variational distribution:

$p(Y \mid x^*, \mathcal{D}) \approx \frac{1}{S} \sum_{s=1}^{S} p(Y \mid x^*, \theta_s)$

$\mathbb{E}_{p(\theta \mid \mathcal{D})}\left[ H(Y \mid x^*, \theta) \right] \approx \frac{1}{S} \sum_{s=1}^{S} H(Y \mid x^*, \theta_s)$
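For a binary prediction, this mutual-information estimate reduces to "entropy of the average prediction minus average entropy of the per-sample predictions". The two sets of sampled predictive probabilities below are assumed stand-ins for posterior samples:

```python
import numpy as np

rng = np.random.default_rng(6)

def binary_entropy(p):
    # Entropy of a Bernoulli(p) prediction, in nats; clipped for stability.
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

# S predictive probabilities at x*, one per posterior sample (assumed data).
probs_disagree = rng.uniform(0.05, 0.95, size=1000)  # samples disagree
probs_agree = np.full(1000, 0.5)                     # all samples say 0.5

def mutual_information(probs):
    # I(Y; theta) ≈ H(mean prediction) - mean H(per-sample prediction)
    return binary_entropy(probs.mean()) - binary_entropy(probs).mean()

print(mutual_information(probs_disagree))  # > 0: epistemic uncertainty
print(mutual_information(probs_agree))     # ≈ 0: purely aleatoric
```

Both stand-ins have the same mean prediction (0.5), but only the disagreeing samples signal epistemic uncertainty.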
1
<latexit sha1_base64="UK879WXAOc7iKCTgGU4H/rMWXPM=">AAACenichVHBThsxEPVuC4TQ0tAeEZJFRAmCRrsoor0gIbhwpBIBpGyIvN7ZYOFdr+xZ1Mjaj+iv9caXcOGAN+QApBIj2X567409nokLKQwGwb3nf/i4sLjUWG6ufPq8+qW19vXCqFJz6HMllb6KmQEpcuijQAlXhQaWxRIu49uTWr+8A22Eys9xUsAwY+NcpIIzdNSo9bfoTOghDWmUicRtDG/i1P6p9mgUK5mYSeYOG8WArNpxxsiIccY68+J1hKp4ccHUnGrGbVjZkO5SuLY/3k+rqlGrHXSDadB5EM5Am8zibNT6FyWKlxnkyCUzZhAGBQ4t0yi4hKoZlQYKxm/ZGAYO5iwDM7TT1lV0yzEJTZV2K0c6ZV9mWJaZumDnrGs0b7Wa/J82KDH9NbQiL0qEnD8/lJaSoqL1HGgiNHCUEwcY18LVSvkNc/1CN62ma0L49svz4GK/Gx50D3732kfHs3Y0yDrZJB0Skp/kiJySM9InnDx4G953b9t79Df9HX/32ep7s5xv5FX4vSd1h8Fj</latexit>

>
p(y = 1 | x, ) = ( x) = >x
1+e

p(D | ) p( )
<latexit sha1_base64="0IeZImacoYXxaHSKiTKeIwqxjog=">AAACaHicbVFbS8MwFE7rbW5e6g0RX8KGsIGMVmT6Igz1wccJ7gLrGGmabmHphSQVRin+R9/8Ab74K0y3gu5yIMnHd75zcvLFiRgV0jS/NH1jc2t7p7BbLO3tHxwaR8cdEcYckzYOWch7DhKE0YC0JZWM9CJOkO8w0nUmT1m++064oGHwJqcRGfhoFFCPYiQVNTQ+oqrthMwVU18die0QiVJo+9RVG5JjjFjynNbgA7Q9jnCi5H90rlupr0H7Gq5rXEsXG9TSoVEx6+Ys4CqwclABebSGxqfthjj2SSAxQ0L0LTOSgwRxSTEjadGOBYkQnqAR6SsYIJ+IQTIzKoVXinGhF3K1Agln7P+KBPkiG1gpsyHFci4j1+X6sfTuBwkNoliSAM8v8mIGZQgz16FLOcGSTRVAmFM1K8RjpAyV6m+KygRr+cmroHNTtxr1xuttpfmY21EAl6AMqsACd6AJXkALtAEG31pJO9XOtB/d0M/1i7lU1/KaE7AQevkXR/a6bw==</latexit>

p( | D) =
p(D)
Logistic likelihood: No conjugate prior
N N h iyi h i1
<latexit sha1_base64="gdNPsS9uYzwq1pk1T2YNLNZZ8lA=">AAAC7XicnVJNaxQxGM5M/ajr11aPXoKLsAVdZkSql0JRD56kgtsWNrNDJpPZjU0mIXlHXML8By8eFPHq//HmvzGzHcR2PYgvhDy8b57n/UphpHCQJD+jeOvS5StXt68Nrt+4eev2cOfOkdONZXzKtNT2pKCOS1HzKQiQ/MRYTlUh+XFx+qKLH7/n1gldv4WV4Zmii1pUglEIrnwn2jJjoigsGZX+ZYuJEiUmhZalW6lweVJwoO0u3sfEWF3mXuyn7dy/brEZr3LREzqFovIf2lw8/Ec6kbyCGSZOLBQdb3LmBLQ5p7yLiRWLJWRzHzL/Vkjxo/9U6ZidUj4cJZNkbXgTpD0Yod4O8+EPUmrWKF4Dk9S5WZoYyDy1IJjk7YA0jhvKTumCzwKsqeIu8+tttfhB8JS40jacGvDa+yfDU+W6HsLLrmx3MdY5/xabNVA9y7yoTQO8ZmeJqkZi0LhbPS6F5QzkKgDKrAi1YrakljIIH2QQhpBebHkTHD2epHuTvTdPRgfP+3Fso3voPhqjFD1FB+gVOkRTxKJ30cfoc/Ql1vGn+Gv87expHPWcu+icxd9/AXwZ7YU=</latexit>

Y Y yi
> >
p(D | ) = p(yi | xi , ) = ( xi ) 1 ( xi )
i=1 i=1

<latexit sha1_base64="wOsgiJuxSpwM84pA3v9dpwuTloI=">AAACS3icbZBNS8NAEIY39avWr6hHL4tFqCAlEamCCEUvnqSirUJTymS7rUt3k7C7EUrI//PixZt/wosHRTy4aXuwHwPLvjwzw8y8fsSZ0o7zbuUWFpeWV/KrhbX1jc0te3unocJYElonIQ/low+KchbQumaa08dIUhA+pw9+/yrLPzxTqVgY3OtBRFsCegHrMgLaoLbtRyXPD3lHDYT5Es+nGtJDfIE9AfqJAE9u0jkV53iCiThtO0eT7I71BBh82LaLTtkZBp4V7lgU0ThqbfvN64QkFjTQhINSTdeJdCsBqRnhNC14saIRkD70aNPIAARVrWToRYoPDOngbijNCzQe0v8dCQiVrWgqswvVdC6D83LNWHfPWgkLoljTgIwGdWOOdYgzY3GHSUo0HxgBRDKzKyZPIIFoY3/BmOBOnzwrGsdlt1Ku3J4Uq5djO/JoD+2jEnLRKaqia1RDdUTQC/pAX+jberU+rR/rd1Sas8Y9u2gickt/vAe1Ag==</latexit>

Gaussian prior : p( ) = N ( ; µ0 , ⌃0 )
Laplace prior (enforces sparsity): p(\beta) = \prod_{j=1}^{d} \frac{\lambda}{2}\, e^{-\lambda |\beta_j|}
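Combining the logistic likelihood with the Gaussian prior gives an unnormalized log-posterior, which is the quantity the approximations below work with. A minimal NumPy sketch, assuming a zero-mean isotropic prior (\mu_0 = 0, \Sigma_0 = \sigma^2 I); the function name is illustrative:

```python
import numpy as np

def log_joint(beta, X, y, prior_var=1.0):
    """Unnormalized log-posterior: log p(D | beta) + log p(beta),
    for logistic regression with prior N(0, prior_var * I)."""
    logits = X @ beta
    # Bernoulli log-likelihood in a numerically stable form:
    # y_i * logits_i - log(1 + exp(logits_i))
    log_lik = np.sum(y * logits - np.logaddexp(0.0, logits))
    # Gaussian log-prior, up to an additive constant
    log_prior = -0.5 * np.sum(beta ** 2) / prior_var
    return log_lik + log_prior
```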
Posterior predictive distribution:
p(y^* = 1 \mid x^*, \mathcal{D}) = \int p(y^* = 1 \mid x^*, \beta)\, p(\beta \mid \mathcal{D})\, d\beta
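For a one-dimensional weight this integral can be approximated directly with a Riemann sum over a grid of parameter values; a sketch, where the grid and function name are illustrative assumptions:

```python
import numpy as np

def predictive_quadrature(x_star, log_post, grid):
    """Grid approximation of p(y*=1 | x*, D) = ∫ sigma(b * x*) p(b | D) db
    for a scalar weight b. `log_post` holds the unnormalized
    log-posterior evaluated at each grid point."""
    w = np.exp(log_post - log_post.max())
    w /= w.sum()                               # normalized posterior weights
    probs = 1.0 / (1.0 + np.exp(-grid * x_star))
    return float(np.sum(w * probs))
```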

Approximating the predictive distribution using variational inference:


<latexit sha1_base64="0/Y56QwqHQOIMT0sQYw0ck/UowI=">AAACgHicfVFdS8MwFE3r15xfUx99CQ5hypytiIogiPrgo4JTYZ3jNk01mDYxSWWj7Hf4v3zzxwim2x7UqReSnJxzD7n3JpScaeN57447MTk1PVOaLc/NLywuVZZXbrTIFKFNIrhQdyFoyllKm4YZTu+kopCEnN6GT2eFfvtClWYivTY9SdsJPKQsZgSMpTqVV1nr3W/hY+zjIGGR3cA8hnHe7d9v1Yc3Ajw/72/iAKRUoosDlhr8vy0UPNK9xB55EFIDhbuOn2t/CNE436lUvYY3CDwO/BGoolFcdipvQSRIltDUEA5at3xPmnYOyjDCab8cZJpKIE/wQFsWppBQ3c4HA+zjDctEOBbKLtvcgP3qyCHRRXk2s2hU/9QK8jetlZn4sJ2zVGaGpmT4UJxxbAQufgNHTFFieM8CIIrZWjF5BAXE2D8r2yH4P1seBze7DX+/sX+1Vz05HY2jhNbQOqohHx2gE3SBLlETEfThVJ26s+26bs3dcf1hquuMPKvoW7hHn53SwAI=</latexit>

Z
p(y ⇤ = 1 | x⇤ , D) ⇡ p(y ⇤ = 1 | x⇤ , ) q( ) d
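Once a variational distribution is fitted, the remaining integral is usually estimated by sampling from q(\beta). A sketch assuming a mean-field Gaussian q(\beta) = \mathcal{N}(m, \mathrm{diag}(s^2)); the variational parameters would come from an optimizer not shown here:

```python
import numpy as np

rng = np.random.default_rng(0)

def vi_predictive_prob(x_star, q_mean, q_std, n_samples=1000):
    """Monte Carlo estimate of ∫ sigma(beta^T x*) q(beta) d(beta)
    for a diagonal Gaussian variational posterior."""
    betas = rng.normal(q_mean, q_std, size=(n_samples, q_mean.size))
    probs = 1.0 / (1.0 + np.exp(-(betas @ x_star)))
    return float(probs.mean())
```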
Approximating the predictive distribution using MCMC:
p(y^* = 1 \mid x^*, \mathcal{D}) \approx \frac{1}{S} \sum_{s=1}^{S} p(y^* = 1 \mid x^*, \beta^{(s)}), \qquad \beta^{(s)} \sim p(\beta \mid \mathcal{D})
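Given S posterior draws \beta^{(s)} from any MCMC sampler, this estimator is a plain average of per-sample probabilities; a minimal sketch:

```python
import numpy as np

def predictive_prob(x_star, beta_samples):
    """Average sigma(beta^T x*) over the rows of `beta_samples`
    (one posterior draw beta^(s) per row)."""
    probs = 1.0 / (1.0 + np.exp(-(beta_samples @ x_star)))
    return float(probs.mean())
```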
Mutual information as epistemic uncertainty:
I(\beta; y^* \mid x^*, \mathcal{D}) \approx H(y^* \mid x^*, \mathcal{D}) - \frac{1}{S} \sum_{s=1}^{S} H(y^* \mid x^*, \beta^{(s)})

where

H(y^* \mid x^*, \beta^{(s)}) = - p(y^* = 1 \mid x^*, \beta^{(s)}) \log p(y^* = 1 \mid x^*, \beta^{(s)}) - p(y^* = 0 \mid x^*, \beta^{(s)}) \log p(y^* = 0 \mid x^*, \beta^{(s)})

H(y^* \mid x^*, \mathcal{D}) = - p(y^* = 1 \mid x^*, \mathcal{D}) \log p(y^* = 1 \mid x^*, \mathcal{D}) - p(y^* = 0 \mid x^*, \mathcal{D}) \log p(y^* = 0 \mid x^*, \mathcal{D})
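Both entropy terms follow from the same per-sample probabilities p_s = p(y^* = 1 \mid x^*, \beta^{(s)}): the predictive entropy uses their mean, while the second term averages their entropies. A sketch of this decomposition (function names are illustrative):

```python
import numpy as np

def binary_entropy(p):
    """H(p) in nats, clipped so that 0 * log 0 evaluates to 0."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def mutual_information(per_sample_probs):
    """I(beta; y* | x*, D) ≈ H(mean_s p_s) - mean_s H(p_s):
    large only when the posterior samples disagree."""
    return float(binary_entropy(per_sample_probs.mean())
                 - binary_entropy(per_sample_probs).mean())
```

When every sample agrees (even at p = 0.5), the mutual information is zero: the remaining uncertainty is aleatoric, not epistemic.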


<latexit sha1_base64="fJjF8bslupayImSFr+XevjpR47U=">AAACxniclVFbSwJBFJ7dbmY3q8dehiQwKdmNsF4EyR58NMgLuCqz46wOzs4uM7OVLEK/sbce+i+Nq1BpIB6Y4ZvvfOcy57gho1JZ1qdhbmxube+kdtN7+weHR5njk4YMIoFJHQcsEC0XScIoJ3VFFSOtUBDku4w03VFl6m++ECFpwJ/VOCQdHw049ShGSlO9zFc1N+7moePTvr6QGrpe/Dbp5q9mL4xY/Di5hCV4DcNEWYL2SrXDgsEa8p/U1nqpV8p7maxVsBKDy8CegyyYW62X+XD6AY58whVmSMq2bYWqEyOhKGZkknYiSUKER2hA2hpy5BPZiZM1TOCFZvrQC4Q+XMGE/R0RI1/Kse9q5bRHueibkv/52pHy7jsx5WGkCMezQl7EoArgdKewTwXBio01QFhQ3SvEQyQQVnrzaT0Ee/HLy6BxU7CLheLTbbb8MB9HCpyBc5ADNrgDZVAFNVAH2KgY1BCGNKsmNyPzdSY1jXnMKfhj5vs31nHVbg==</latexit>

162
[Figure: MLP with point estimation of the weights vs. MLP with a marginal distribution over the weights]
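The contrast between the two networks can be mimicked in a few lines: a point-estimate MLP returns a single probability, while averaging predictions over weights sampled around that estimate stands in for a marginal distribution over the weights. A toy sketch; the architecture and the Gaussian weight noise are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp_forward(x, W1, W2):
    """One-hidden-layer MLP: tanh hidden units, sigmoid output."""
    h = np.tanh(W1 @ x)
    return 1.0 / (1.0 + np.exp(-(W2 @ h)))

def marginal_prediction(x, W1, W2, noise_std=0.1, n_samples=200):
    """Average predictions over weights drawn around the point
    estimate, imitating a marginal distribution of the weights."""
    preds = [mlp_forward(x,
                         W1 + rng.normal(0.0, noise_std, W1.shape),
                         W2 + rng.normal(0.0, noise_std, W2.shape))
             for _ in range(n_samples)]
    return float(np.mean(preds))
```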
