Transfer Learning
YouTube Playlist
Maziar Raissi
Assistant Professor
[email protected]
How transferable are features in deep neural networks?
[Figure: selffer and transfer networks, compared under a random split (similar tasks A & B) and a man-made/natural split (different tasks A & B)]
Yosinski, Jason, et al. "How transferable are features in deep neural networks?" Advances in neural information processing systems. 2014.
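The selffer and transfer constructions named above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the layer widths, the helper names `random_layer` and `build_network`, and the randomly generated "trained" base networks are all assumptions made for the example. The idea is to copy the first n layers from a base network, re-initialise the remaining layers, and either freeze the copied layers or let them fine-tune.

```python
import copy
import random

def random_layer(n_in, n_out, rng):
    """Randomly initialised weight matrix (list of rows); stands in for a real layer."""
    return [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]

def build_network(base_layers, n, sizes, rng, fine_tune):
    """Copy the first n layers from a base network and re-initialise the rest.

    Returns (layers, trainable), where trainable[i] says whether layer i
    would be updated while training on the target task.
    """
    layers, trainable = [], []
    for i in range(len(sizes) - 1):
        if i < n:
            layers.append(copy.deepcopy(base_layers[i]))  # transferred features
            trainable.append(fine_tune)                   # frozen unless fine-tuning
        else:
            layers.append(random_layer(sizes[i], sizes[i + 1], rng))
            trainable.append(True)                        # upper layers always train
    return layers, trainable

rng = random.Random(0)
sizes = [8, 6, 4, 3]  # hypothetical layer widths
# Stand-ins for networks already trained on task A and task B.
base_A = [random_layer(sizes[i], sizes[i + 1], rng) for i in range(3)]
base_B = [random_layer(sizes[i], sizes[i + 1], rng) for i in range(3)]

# Selffer: first 2 layers copied from B, then trained on B again.
selffer, selffer_mask = build_network(base_B, 2, sizes, rng, fine_tune=False)
# Transfer: first 2 layers copied from A, then trained on B.
transfer, transfer_mask = build_network(base_A, 2, sizes, rng, fine_tune=False)
# Transfer with fine-tuning: the copied layers are also updated on B.
transfer_ft, transfer_ft_mask = build_network(base_A, 2, sizes, rng, fine_tune=True)
```

Comparing the selffer and transfer variants at the same n is what separates the effect of splitting the network at layer n from the effect of the features being specific to the other task.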
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Donahue, Jeff, et al. "Decaf: A deep convolutional activation feature for generic visual recognition." International conference on machine learning. 2014.
CNN Features off-the-shelf: an Astounding Baseline for Recognition
Object retrieval (L2 distance)
Mean accuracy: mean of the confusion matrix diagonal
CNNaug: augment the training set by adding cropped and rotated samples
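The two evaluation notions above can be sketched in a few lines of plain Python. The sketch assumes the confusion matrix holds raw counts and is row-normalised before its diagonal is averaged (so each class contributes equally); the function names and the toy numbers are hypothetical.

```python
import math

def mean_accuracy(confusion):
    """Mean of the diagonal of the row-normalised confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    per_class = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total > 0:  # skip classes with no test samples
            per_class.append(row[i] / total)
    return sum(per_class) / len(per_class)

def retrieve(query, database):
    """Rank database indices by L2 distance to the query feature vector."""
    return sorted(range(len(database)), key=lambda i: math.dist(query, database[i]))

confusion = [[8, 2], [1, 1]]     # hypothetical 2-class counts
acc = mean_accuracy(confusion)   # per-class accuracies 0.8 and 0.5
ranking = retrieve([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
```

With unbalanced classes this mean accuracy differs from the overall fraction of correct predictions, which for the toy counts above would be 9/12.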
CUB 200-2011 Bird dataset
H3D Human Attributes dataset
Sharif Razavian, Ali, et al. "CNN features off-the-shelf: an astounding baseline for recognition." Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2014.
Return of the Devil in the Details: Delving Deep into Convolutional Nets
$I$ → image
$\phi$ → encoding function
$\phi(I) \in \mathbb{R}^d$ → vector image representation
CNN architectures: Fast, Medium, Slow
Shallow representation (IFV)
(Intra-normalisation)
Ranking hinge loss: $w_c^T \phi(I_{pos}) > w_c^T \phi(I_{neg}) + 1 - \xi$
Accumulated first-order differences: $u_k := \sum_{\{x_i : NN(x_i) = \mu_k\}} (x_i - \mu_k) \in \mathbb{R}^D$
Accumulated second-order differences: $v_k := \sum_{\{x_i : NN(x_i) = \mu_k\}} (x_i - \mu_k)^2 \in \mathbb{R}^D$
Here $NN(x_i)$ denotes the nearest neighbor of $x_i$ among the centers $\mu_k$.
$\phi_{FV}(I) = [u_1; v_1; \ldots; u_K; v_K] \in \mathbb{R}^{2KD}$
$\phi_{IFV}(I)$: $\phi_{FV}(I)$ → $\mathrm{sign}(\cdot)\sqrt{|\cdot|}$ → $\ell_2$ normalize
Performance of shallow representations can be significantly improved by adopting the data augmentation typically used in deep learning. In spite of this improvement, deep architectures still outperform the shallow methods by a large margin.
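The shallow encoding above (accumulated first- and second-order differences per center, stacked into a vector of length 2KD, then a signed square root and L2 normalisation) can be sketched as follows. This is a simplified hard-assignment version in plain Python under the slide's notation, not a full Fisher vector implementation; the function names, centers, and descriptors are hypothetical.

```python
import math

def nearest_center(x, centers):
    """Index k of the center mu_k closest to descriptor x (L2 distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))

def fisher_vector(descriptors, centers):
    """Hard-assignment FV: stack [u_1; v_1; ...; u_K; v_K], length 2*K*D."""
    K, D = len(centers), len(centers[0])
    u = [[0.0] * D for _ in range(K)]  # accumulated first-order differences
    v = [[0.0] * D for _ in range(K)]  # accumulated second-order differences
    for x in descriptors:
        k = nearest_center(x, centers)
        for d in range(D):
            diff = x[d] - centers[k][d]
            u[k][d] += diff
            v[k][d] += diff * diff
    fv = []
    for k in range(K):
        fv.extend(u[k])
        fv.extend(v[k])
    return fv

def improved_fisher_vector(fv):
    """Signed square root followed by L2 normalisation."""
    rooted = [math.copysign(math.sqrt(abs(t)), t) for t in fv]
    norm = math.sqrt(sum(t * t for t in rooted))
    return [t / norm for t in rooted] if norm > 0 else rooted

centers = [[0.0, 0.0], [10.0, 10.0]]                   # hypothetical K=2, D=2
descriptors = [[1.0, 0.0], [0.0, -2.0], [11.0, 10.0]]  # hypothetical local features
fv = fisher_vector(descriptors, centers)
ifv = improved_fisher_vector(fv)
```

The signed square root compresses large entries while keeping their sign, and the final L2 normalisation gives every image a unit-length representation.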
Deep representation (CNN) with pre-training
$\phi_{CNN}(I)$ → vector of activities of the penultimate layer
Deep representation (CNN) with pre-training and fine-tuning
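Using the penultimate-layer activities as the image representation can be sketched as below. The tiny fully connected network, its weights, and the helper names are hypothetical stand-ins for a pre-trained CNN; the point is only that the final classification layer is dropped and the activations feeding it are returned as the feature vector.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, t) for t in v]

def dense(weights, bias, x):
    """Affine layer: weights is a list of rows, one per output unit."""
    return [sum(w * t for w, t in zip(row, x)) + b for row, b in zip(weights, bias)]

def penultimate_features(layers, x):
    """Run every layer except the last (the classifier) and return the activations."""
    h = x
    for weights, bias in layers[:-1]:
        h = relu(dense(weights, bias, h))
    return h

# Hypothetical "pre-trained" network: 3 inputs -> 2 hidden units -> 2 class scores.
layers = [
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, -1.0]),  # hidden (penultimate) layer
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),            # final classification layer
]
features = penultimate_features(layers, [2.0, 1.0, 1.0])
```

With fine-tuning, the weights in `layers` would first be updated on the target dataset, and the same extraction step would then be applied.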
Chatfield, Ken, et al. "Return of the devil in the details: Delving deep into convolutional nets." arXiv preprint arXiv:1405.3531 (2014).
Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks
Inference
Label bias: ImageNet → husky dog, Australian terrier, etc.
P → bounding box of a patch
Patches are sampled at 8 different scales, with at least 50% overlap between neighboring patches
3. The patch overlaps with no more than one object
Oquab, Maxime, et al. "Learning and transferring mid-level image representations using convolutional neural networks." Proceedings of the IEEE conference on
computer vision and pattern recognition. 2014.
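The patch sampling described on this slide (a grid of patches at 8 scales with at least 50% overlap between neighbors, plus the rule that a patch may overlap with no more than one object) can be sketched as below. The scale factors, the image size, and the function names are assumptions for illustration; the slide states only the number of scales and the overlap.

```python
def sample_patches(width, height, scale_factors, min_overlap=0.5):
    """Square patches on a regular grid; neighbors overlap by at least min_overlap.

    Returns a list of (x0, y0, x1, y1) boxes.
    """
    patches = []
    for lam in scale_factors:
        side = int(min(width, height) / lam)              # patch side at this scale
        stride = max(1, int(side * (1.0 - min_overlap)))  # step of at most half a side
        for y0 in range(0, height - side + 1, stride):
            for x0 in range(0, width - side + 1, stride):
                patches.append((x0, y0, x0 + side, y0 + side))
    return patches

def intersects(p, b):
    """True if boxes p and b share positive area."""
    return min(p[2], b[2]) > max(p[0], b[0]) and min(p[3], b[3]) > max(p[1], b[1])

def overlaps_at_most_one_object(patch, object_boxes):
    """The patch may overlap with no more than one object bounding box."""
    return sum(intersects(patch, b) for b in object_boxes) <= 1

# Eight hypothetical scale factors; the slide only says that 8 scales are used.
scales = [1.0, 1.3, 1.6, 2.0, 2.4, 2.8, 3.2, 4.0]
patches = sample_patches(200, 100, scales)
```

Patches that fail `overlaps_at_most_one_object` would be dropped before a label is assigned; the remaining labeling conditions are not reproduced here.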
Questions?