Jelasity, Márk
(University of Szeged)
Adversarial examples in machine
learning
Since the publication of
the seminal paper by Szegedy et al. (2014), adversarial
examples for machine learning models have attracted intense interest. In a
nutshell, the problem Szegedy et al. discovered is
that machine learning models can be fooled very easily. That is, with little effort,
one can find examples very close to an original example (for example, changing
only one pixel of an image, or adding invisible noise, etc.)
that make the model output arbitrary labels for the given example: a panda is
recognized as a school bus, or an ostrich, or, in fact,
anything we can think of. This is quite alarming, since with the increasing
levels of automation, it is natural to require reliability and robustness from
AI solutions, yet we see that they are in fact extremely fragile. This
phenomenon also sheds light on the fact that the mechanisms machine learning models
use to classify examples are very different from those that humans
use and, most importantly, that we have very little understanding of (and thus
very little control over) these mechanisms. In this talk I will review the
problem and present some interesting approaches to explaining it and some promising
attempts at solving it.
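The kind of perturbation attack described above can be sketched with the fast gradient sign method (FGSM) of Goodfellow et al., one standard way to craft adversarial examples: nudge every input feature by a tiny amount in the direction that most increases the model's loss. The toy linear model and all names below are illustrative assumptions, not the speaker's setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, w, b, y, eps):
    """Perturb x by at most eps per feature in the loss-increasing direction."""
    p = sigmoid(w @ x + b)          # model's predicted probability of class 1
    grad = (p - y) * w              # gradient of cross-entropy loss w.r.t. x
    return x + eps * np.sign(grad)  # small step that maximally hurts the model

rng = np.random.default_rng(0)
w = rng.normal(size=100)            # weights of a toy logistic-regression "model"
b = 0.0
x = rng.normal(size=100)            # a clean input example
y = 1.0 if sigmoid(w @ x + b) > 0.5 else 0.0  # the model's own (correct) label

x_adv = fgsm(x, w, b, y, eps=0.25)
print("clean prediction:      ", sigmoid(w @ x + b))
print("adversarial prediction:", sigmoid(w @ x_adv + b))
print("max per-feature change:", np.max(np.abs(x_adv - x)))
```

Even though no coordinate of the input moves by more than 0.25, the many tiny aligned changes add up in the model's decision function and flip its output, which is exactly the fragility the abstract describes.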
The talk is held in Hungarian!
Date: Dec 3, Tuesday 4:15pm
Place: BME, Building „Q”, Room QBF13