Abstract
Explaining the inner workings of deep neural network models has received considerable attention in recent years. Researchers have developed various explanation techniques in an attempt to provide human-understandable explanations that justify a model's decision. In this work, we perform an in-depth study of one such explanation technique: explanation-by-example. For a given test input, explanation-by-example provides the nearest matching examples from the training data as representative examples of the model's decision boundary. To understand its performance relative to other state-of-the-art explanation methods, we perform a cross-analysis Amazon Mechanical Turk study in which participants compared explanation methods across applications spanning the image, text, audio, and sensory domains. Among the surveyed methods, explanation-by-example was preferred in all domains except text sentiment classification. Furthermore, our initial investigation indicates that explanation-by-example is also a good detector of adversarial inputs generated using both white-box and black-box attacks.
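The abstract describes explanation-by-example as retrieving the nearest matching training examples for a given test input. The sketch below illustrates one common way such a retrieval could be implemented: nearest neighbors by cosine similarity in a model's latent space. The `get_embedding` interface and the choice of cosine similarity are illustrative assumptions, not details taken from this paper.

```python
# A minimal sketch of explanation-by-example, assuming a model that exposes an
# intermediate embedding. The nearest training examples (by cosine similarity
# in that embedding space) are returned as the explanation for a test input.
import numpy as np

def embed(model, x):
    """Hypothetical helper: return the model's latent representation of x."""
    return model.get_embedding(x)  # assumed interface, not from the paper

def explain_by_example(model, x_test, X_train, k=5):
    """Return indices of the k training examples closest to x_test in latent space."""
    e_test = embed(model, x_test)                            # shape: (d,)
    E_train = np.stack([embed(model, x) for x in X_train])   # shape: (n, d)

    # Cosine similarity between the test embedding and every training embedding.
    sims = E_train @ e_test / (
        np.linalg.norm(E_train, axis=1) * np.linalg.norm(e_test) + 1e-12
    )
    # Indices of the k most similar training examples, most similar first.
    return np.argsort(-sims)[:k]
```

In practice, the returned indices would be mapped back to the raw training examples (images, sentences, audio clips, or sensor readings) and shown to the user as the explanation.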