Abstract
While deep neural network models have exhibited astonishing results across a variety of tasks in recent years, researchers have shown that these models can be easily fooled and attacked using adversarial examples. An adversarial example originates from a benign example that the machine learning model classifies correctly; an attacker then adds a small amount of noise, usually imperceptible to a human, deliberately crafted so that the classifier misclassifies the perturbed example. In this paper, we study the characteristics of adversarial noise and present approaches for increasing the robustness of models against adversarial example attacks.
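To make the notion of crafted adversarial noise concrete, the following is a minimal sketch of one standard attack, the Fast Gradient Sign Method (FGSM); it is offered purely as an illustration of how a small perturbation can be crafted, not as the specific attack or defense studied in this paper, and it assumes a hypothetical PyTorch classifier `model` with inputs in [0, 1].

```python
# Minimal FGSM sketch (illustrative only; assumes a PyTorch classifier `model`).
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y_true, epsilon=0.03):
    """Perturb a correctly classified input x by epsilon * sign(grad of loss w.r.t. x)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y_true)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to the valid pixel range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

With a small `epsilon`, the perturbed input typically looks unchanged to a human observer, yet the model's prediction can flip.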