Adversarial examples have long been a fascinating topic for many machine learning researchers: how can a tiny perturbation cause a neural network to change its output so drastically? While many explanations have been proposed over the years, they all appear to fall short. This paper attempts to comprehensively explain the existence of adversarial examples by proposing a new view of the classification landscape, called the Dimpled Manifold Model, which posits that any classifier aligns its decision boundary with the low-dimensional data manifold and only slightly bends ("dimples") around the data points. This potentially explains many phenomena around adversarial examples. Warning: In this video, I disagree. Remember that I’m not an authority, but simply give my own opinions.
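For viewers who want a concrete feel for the "tiny perturbation, large output change" effect, here is a minimal sketch of the fast gradient sign method (Goodfellow et al.) on a toy logistic-regression model. The weights and input are random stand-ins, not anything from the paper or my replication gist; the point is only that an epsilon-sized max-norm perturbation accumulates across every input dimension.

```python
import numpy as np

# Toy "classifier": logistic regression with a fixed random weight vector.
# (Illustrative stand-in only; the paper discusses deep networks.)
rng = np.random.default_rng(0)
w = rng.normal(size=100)   # stand-in for trained weights
x = rng.normal(size=100)   # a "clean" input
y = 1.0                    # its assumed true label

def predict(inp):
    # Sigmoid probability of class 1.
    return 1.0 / (1.0 + np.exp(-w @ inp))

# Gradient of the cross-entropy loss w.r.t. the input is (p - y) * w.
grad = (predict(x) - y) * w

# Fast gradient sign method: step each pixel by +/- eps along the loss gradient.
eps = 0.1
x_adv = x + eps * np.sign(grad)

# The logit shifts by eps * sum(|w_i|), which grows with dimension,
# so even a small eps can move the confidence substantially.
print(predict(x), predict(x_adv))
```

The adversarial input differs from the clean one by at most 0.1 per coordinate, yet the predicted probability for the true class drops sharply, because the inner product sums the per-dimension changes.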
0:00 - Intro & Overview
7:30 - The old mental image of Adversarial Examples
11:25 - The new Dimpled Manifold Hypothesis
22:55 - The Stretchy Feature Model
29:05 - Why do DNNs create Dimpled Manifolds?
38:30 - What can be explained with the new model?
1:00:40 - Experimental evidence for the Dimpled Manifold Model
1:10:25 - Is Goodfellow’s claim debunked?
1:13:00 - Conclusion & Comments
Paper: [2106.10151] The Dimpled Manifold Model of Adversarial Examples in Machine Learning
My replication code: https://gist.github.com/yk/de8d987c4e…
Goodfellow’s Talk: https://youtu.be/CIfsB_EYsVI?t=4280