Jelasity, Márk (University of Szeged)

Adversarial examples in machine learning

Since the publication of the seminal paper by Szegedy et al. (2014), adversarial examples for machine learning models have been a focus of intense interest. In a nutshell, the problem Szegedy et al. discovered is that machine learning models can be fooled very easily: with little effort, one can find examples very close to an original example (for instance, by changing only a single pixel of an image, or by adding imperceptible noise) that make the model output an arbitrary label. A panda is recognized as a school bus, or an ostrich, or, in fact, anything we can think of. This is quite alarming: with increasing levels of automation, it is natural to require reliability and robustness from AI solutions, yet we see that they are in fact extremely fragile. The phenomenon also sheds light on the fact that the mechanisms machine learning models use to classify examples are very different from those used by humans and, most importantly, that we have very little understanding of (and thus very little control over) these mechanisms. In this talk I will review the problem and present some interesting approaches to explaining it, as well as some promising attempts at solving it.
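As background for readers unfamiliar with how such perturbations are found, here is a minimal sketch in Python/PyTorch of the fast gradient sign method (Goodfellow et al., 2015), one standard way of crafting adversarial examples. The abstract does not name a specific attack, and the model, image, and true_label objects below are hypothetical placeholders rather than anything from the talk.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, true_label, epsilon=0.03):
        # `model` maps a batched image tensor to class logits;
        # `image` has pixel values in [0, 1]; all are placeholders.
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), true_label)
        loss.backward()
        # Move every pixel by +/- epsilon along the sign of the loss
        # gradient, i.e. in the direction that increases the loss most.
        adversarial = image + epsilon * image.grad.sign()
        return adversarial.clamp(0.0, 1.0).detach()  # keep pixels valid

Because each pixel moves by at most epsilon, the perturbation can be made visually imperceptible, yet it is often enough to flip the predicted class, which is exactly the fragility described above.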


The talk is held in Hungarian!


Date: Tuesday, Dec 3, 4:15 pm

Place: BME, Building "Q", Room QBF13

Homepage of the Seminar