Uncertainty Prediction
§ Aleatoric uncertainty: inherent randomness in data or environments.
§ Irreducible: cannot be reduced by collecting additional data.
§ Modeled using probability distributions.
§ Examples:
§ Sensor measurement noise.
§ Natural fluctuations in weather patterns.
§ Statistical Techniques
§ Confidence Intervals: Estimate the range of possible values for a parameter.
§ Prediction Intervals: Predict the range where unseen observations are likely to fall.
§ Probabilistic Modeling
§ Bayesian Methods: Incorporate prior knowledge and update beliefs with new data.
§ Monte Carlo Simulations: Use random sampling to model uncertainty propagation.
§ Ensemble Methods
§ Model Ensembles: Combine multiple models to capture a range of predictions.
§ Bootstrap Methods: Resample data to estimate the variability of predictions.
§ Sensitivity Analysis
§ Evaluate how changes in inputs affect outputs across the entire input space.
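The bootstrap idea above can be sketched in a few lines; this is a minimal illustration on synthetic data (the "sensor readings", sample sizes, and number of resamples are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "sensor" readings: true mean 5.0 with aleatoric noise.
data = rng.normal(loc=5.0, scale=2.0, size=200)

# Bootstrap: resample the data with replacement to estimate the
# variability of a statistic (here, the sample mean).
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(1000)
])

# 95% confidence interval taken from the bootstrap distribution.
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {data.mean():.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

The same resampling loop works for any statistic (median, regression coefficient, model prediction) by replacing `.mean()` with the quantity of interest.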
The goal is to predict the value of the target given the features/input variables.
In a regression problem the target is a continuous variable.

Linear regression with one variable: \hat{y} = f_w(x_i), with f_w(x) = w_1 x + b

Hypothesis space: f_w(x) = w^T x + b

Prediction error: |f_w(x_i) - y_i|

Optimization problem:

\min_{w,b} \; \frac{1}{2} \sum_{i=1}^{N} \left( f_w(x_i) - y_i \right)^2
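This objective can be minimized by gradient descent; a minimal sketch on synthetic one-variable data (the learning rate, iteration count, and true parameters are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 1.0 + rng.normal(0, 1, size=100)  # true w1 = 3, b = 1, plus noise

w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    err = w * x + b - y            # f_w(x_i) - y_i for every example
    w -= lr * (err * x).mean()     # gradient of the mean 1/2 squared error w.r.t. w
    b -= lr * err.mean()           # gradient w.r.t. b

print(f"w = {w:.2f}, b = {b:.2f}")  # should land close to 3 and 1
```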
For example, if we have features x_1 and x_2 = 2 x_1, then x_1 and x_2 are not linearly independent.

We can solve linear regression analytically instead of using gradient descent:

X w = y \;\Rightarrow\; w = (X^T X)^{-1} X^T y

where X is m \times n (row i of X is x_i^T), w is n \times 1, and y is m \times 1.
X^T X is invertible if the columns of X are linearly independent; (X^T X)^{-1} X^T is the pseudo-inverse of X.

Normal equation vs. gradient descent (m data points, n features, k iterations):
§ Normal equation: O(mn^2 + n^3); no learning rate; fails if X^T X is not invertible.
§ Gradient descent: O(kmn); works better if we have many features (> 10000).

In practice we compute the pseudo-inverse via the SVD, which is more stable than forming (X^T X)^{-1} directly: w = X^{+} y.
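A quick numerical check of the two solution routes with NumPy (synthetic data; `np.linalg.pinv` computes the SVD-based pseudo-inverse):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 50, 3
X = rng.normal(size=(m, n))                   # m data points, n features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=m)

# Normal equation: w = (X^T X)^{-1} X^T y (columns of X are independent here).
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# SVD-based pseudo-inverse: numerically more stable, and defined even
# when X^T X is singular.
w_pinv = np.linalg.pinv(X) @ y

print(np.allclose(w_normal, w_pinv))  # True: both routes agree here
```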
Probabilistic view:

p(y_i; w) = N\!\left( y_i \mid f_w(x_i), \sigma^2 \right)

y = X w + \epsilon
Substituting y = X w + \epsilon into the analytic solution:

\hat{w} = (X^T X)^{-1} X^T y = w + (X^T X)^{-1} X^T \epsilon

Var(\hat{w}) = \sigma^2 (X^T X)^{-1} X^T X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1}
Prediction at a new observation x_0:

predicted mean: \hat{y}_0 = x_0^T \hat{w}

prediction (a new observation): y_0 = x_0^T \hat{w} + \epsilon_0

Var(x_0^T \hat{w}) = x_0^T \, Var(\hat{w}) \, x_0 = \sigma^2 \, x_0^T (X^T X)^{-1} x_0
§ Use z-values when:
§ the sample size is large (n \geq 30), and
§ the degrees of freedom are high.

P\!\left( \mu \in \hat{y}_0 \pm z_{\alpha/2} \cdot SE \right) = 1 - \alpha
Standard error of prediction:

SE(\hat{y}_0) = \sqrt{ \sigma^2 \left( 1 + x_0^T (X^T X)^{-1} x_0 \right) }
The regions that are farther away from the data have higher uncertainty. This is captured by the term x_0^T (X^T X)^{-1} x_0.
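A small numerical illustration of this effect (synthetic design matrix; the noise variance `sigma2` and the query points are assumed values):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(30, 1))
X = np.hstack([np.ones((30, 1)), x])   # bias column + one feature
sigma2 = 0.25                          # assumed observation-noise variance
XtX_inv = np.linalg.inv(X.T @ X)

def pred_se(x0):
    # Standard error of prediction: sqrt(sigma^2 (1 + x0^T (X^T X)^{-1} x0))
    return np.sqrt(sigma2 * (1.0 + x0 @ XtX_inv @ x0))

near = pred_se(np.array([1.0, 0.0]))   # inside the range of the training inputs
far = pred_se(np.array([1.0, 5.0]))    # far outside the training inputs
print(near < far)  # True: uncertainty grows away from the data
```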
Bayesian linear regression: combining the likelihood with a Gaussian prior p(w) = N(w \mid m_0, \Lambda_0^{-1}) gives a Gaussian posterior:

p(w \mid X, y, \sigma^2) = N\!\left( w \mid w_N, \Lambda_N^{-1} \right)

\Lambda_N = \Lambda_0 + \frac{1}{\sigma^2} X^T X, \qquad w_N = \Lambda_N^{-1} \left( \Lambda_0 m_0 + \frac{1}{\sigma^2} X^T y \right)
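A sketch of this conjugate update with NumPy (synthetic data; the prior precision `alpha`, zero prior mean, and noise variance are assumed values):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([1.5, -0.5])
sigma2 = 0.5
y = X @ true_w + np.sqrt(sigma2) * rng.normal(size=100)

# Prior: w ~ N(m0, Lambda0^{-1}) with isotropic precision Lambda0 = alpha * I.
alpha = 1.0
Lambda0 = alpha * np.eye(2)
m0 = np.zeros(2)

# Posterior precision and mean (Gaussian-conjugate update).
LambdaN = Lambda0 + X.T @ X / sigma2
wN = np.linalg.solve(LambdaN, Lambda0 @ m0 + X.T @ y / sigma2)
print(wN)  # posterior mean, close to true_w
```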
Posterior predictive distribution:

p(y_0 \mid x_0, X, y, \sigma^2) = \int p(y_0 \mid x_0, w, \sigma^2) \, p(w \mid X, y, \sigma^2) \, dw

Since both the likelihood and the posterior are Gaussian, the posterior predictive distribution is also Gaussian:

p(y_0 \mid x_0, X, y, \sigma^2) = N\!\left( y_0 \mid \mu_*, \sigma_*^2 \right)

\mu_* = x_0^T w_N

\sigma_*^2 = \sigma^2 + x_0^T \Lambda_N^{-1} x_0

where \sigma^2 is the observation noise (aleatoric uncertainty) and x_0^T \Lambda_N^{-1} x_0 is the parameter uncertainty (epistemic uncertainty).
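This decomposition can be computed directly; a self-contained sketch (synthetic data; the noise variance and an identity prior precision are assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
sigma2 = 0.5
y = X @ np.array([1.5, -0.5]) + np.sqrt(sigma2) * rng.normal(size=100)

LambdaN = np.eye(2) + X.T @ X / sigma2           # posterior precision (prior = I)
wN = np.linalg.solve(LambdaN, X.T @ y / sigma2)  # posterior mean (prior mean = 0)

x0 = np.array([0.5, 0.5])
mu_star = x0 @ wN                                # predictive mean
epistemic = x0 @ np.linalg.solve(LambdaN, x0)    # x0^T Lambda_N^{-1} x0
var_star = sigma2 + epistemic                    # aleatoric + epistemic
print(f"mean = {mu_star:.2f}, var = {var_star:.3f} (epistemic part = {epistemic:.4f})")
```

Note how `var_star` can never drop below `sigma2`: more data shrinks the epistemic term but the observation noise remains.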
1- Define the model.
2- Generate T stochastic predictions \hat{y}^{(1)}, \dots, \hat{y}^{(T)} and aggregate:

\hat{y}_{\text{mean}} = \frac{1}{T} \sum_{t=1}^{T} \hat{y}^{(t)}

\hat{y}_{\text{std}} = \sqrt{ \frac{1}{T} \sum_{t=1}^{T} \left( \hat{y}^{(t)} - \hat{y}_{\text{mean}} \right)^2 }
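A minimal NumPy sketch of this aggregation (the T stochastic predictions are simulated here rather than produced by a real dropout network or ensemble):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for T stochastic forward passes (e.g. MC dropout or an ensemble):
# each "pass" returns the underlying prediction plus simulated model noise.
T = 100
preds = np.sin(2.0) + 0.1 * rng.normal(size=T)

y_mean = preds.mean()          # predictive mean over the T passes
y_std = preds.std()            # spread of the passes = uncertainty estimate
print(f"{y_mean:.2f} +/- {y_std:.2f}")
```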
§ The entropy H(X) of a discrete random variable X with distribution p(x) is:

H(X) = -\sum_{x} p(x) \log p(x)

§ The conditional entropy of Y given X:

H(Y \mid X) = -\sum_{x} p(x) \sum_{y} p(y \mid x) \log p(y \mid x)

§ The mutual information of X and Y:

I(X; Y) = H(Y) - H(Y \mid X)
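These definitions are easy to evaluate numerically; a sketch on an arbitrarily chosen joint distribution over two binary variables:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # convention: 0 log 0 = 0
    return -(p * np.log2(p)).sum()

# An arbitrary joint distribution p(x, y) (rows index x, columns index y).
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px = pxy.sum(axis=1)
py = pxy.sum(axis=0)

H_Y = entropy(py)                      # H(Y)
# H(Y | X) = sum_x p(x) * H(Y | X = x)
H_Y_given_X = sum(px[i] * entropy(pxy[i] / px[i]) for i in range(2))
I_XY = H_Y - H_Y_given_X               # mutual information
print(round(I_XY, 3))                  # 0.278 bits
```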
Posterior predictive distribution:

p(Y \mid x^*, D) = \int p(Y \mid x^*, \theta) \, p(\theta \mid D) \, d\theta

High mutual information -> knowing the model parameters significantly reduces uncertainty in the prediction.

Epistemic uncertainty: how much uncertainty is left about Y after knowing \theta, under the posterior distribution over \theta:

E_{p(\theta \mid D)}\!\left[ H(Y \mid x^*, \theta) \right] \approx \frac{1}{S} \sum_{s=1}^{S} H(Y \mid x^*, \theta_s)
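A sketch of this decomposition for a binary prediction (the posterior samples of p(y=1 | x*, theta_s) are simulated with a uniform draw here, purely for illustration):

```python
import numpy as np

def bern_entropy(p):
    # Entropy (in nats) of a Bernoulli(p) prediction, safe at p = 0 or 1.
    eps = 1e-12
    return -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))

rng = np.random.default_rng(0)
# S simulated posterior samples of the predicted probability p(y=1 | x*, theta_s).
probs = rng.uniform(0.1, 0.9, size=1000)

total = bern_entropy(probs.mean())       # H(Y | x*, D): entropy of the averaged prediction
aleatoric = bern_entropy(probs).mean()   # (1/S) sum_s H(Y | x*, theta_s)
epistemic = total - aleatoric            # mutual-information estimate
print(epistemic >= 0)  # True: entropy is concave, so the gap is non-negative
```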
Bayesian logistic regression:

p(y = 1 \mid x, \beta) = \sigma(\beta^T x) = \frac{1}{1 + e^{-\beta^T x}}

p(\beta \mid D) = \frac{p(D \mid \beta) \, p(\beta)}{p(D)}

Logistic likelihood: no conjugate prior.

p(D \mid \beta) = \prod_{i=1}^{N} p(y_i \mid x_i, \beta) = \prod_{i=1}^{N} \sigma(\beta^T x_i)^{y_i} \left[ 1 - \sigma(\beta^T x_i) \right]^{1 - y_i}

Gaussian prior: p(\beta) = N(\beta; \mu_0, \Sigma_0)

Laplace prior: p(\beta) = \prod_{j=1}^{d} \frac{\lambda}{2} e^{-\lambda |\beta_j|} \quad \text{(enforces sparsity)}
Posterior predictive distribution:

p(y^* = 1 \mid x^*, D) = \int p(y^* = 1 \mid x^*, \beta) \, p(\beta \mid D) \, d\beta

Approximating the posterior with q(\beta):

p(y^* = 1 \mid x^*, D) \approx \int p(y^* = 1 \mid x^*, \beta) \, q(\beta) \, d\beta

Approximating the predictive distribution using MCMC samples \beta^{(s)} \sim p(\beta \mid D):

p(y^* = 1 \mid x^*, D) \approx \frac{1}{S} \sum_{s=1}^{S} p(y^* = 1 \mid x^*, \beta^{(s)})

Mutual information as epistemic uncertainty:

I(\beta; y^* \mid x^*, D) \approx H(y^* \mid x^*, D) - \frac{1}{S} \sum_{s=1}^{S} H(y^* \mid x^*, \beta^{(s)})
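A Monte Carlo sketch of the predictive average (the posterior samples beta^(s) are simulated with a Gaussian here instead of coming from a real MCMC run; the location, scale, and query point are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Stand-in for S MCMC samples beta^(s) from the posterior p(beta | D).
S = 500
beta_samples = rng.normal(loc=[1.0, -2.0], scale=0.3, size=(S, 2))

x_star = np.array([0.5, 0.5])
p_samples = sigmoid(beta_samples @ x_star)   # p(y*=1 | x*, beta^(s)) per sample
p_pred = p_samples.mean()                    # MC estimate of p(y*=1 | x*, D)
print(f"p(y*=1 | x*, D) ~= {p_pred:.2f}")
```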
Figure: an MLP with point estimates of the weights vs. an MLP with a (marginal) distribution over each weight.