Gradient Free Attacks on Multiple Modalities
Military / Coalition Issue
Coalitions are often characterized by ad-hoc teams with distributed learning requirements. In such a setting, distributed learning is needed for collaborative situational understanding. However, since the learning is performed in an ad-hoc way, the coalition’s decision may be compromised in multi-modalities by inference-time adversaries. This work will show the feasibility and the efficiency of inference time attacks, including images, text, and audio.
Core idea and key achievements
Nowadays, attacking a black-box AI system usually requires many queries of the system, which renders the attack algorithm impractical. Existing black-box approaches to generating adversarial examples typically require a significant number of queries, either for training a substitute network or performing gradient estimation. We introduce GenAttack, a gradient-free optimization technique that uses genetic algorithms for synthesizing adversarial examples in the black-box setting.
The key achievements are as follows:
- First, we show that the query efficiency of GenAttack leads to 237 times fewer queries than ZOO, one of the first black-box attack algorithms.
- Second, we demonstrate that efficient attack can be mounted on black-box models, indicating that further robustness of models needs to be attained in multiple domains, including images, text, and audio.
Implications for Defence
When coalitions perform learning in a distributed manner, it is possible that some parties may be compromised, and the machine learning model may be subverted during inference time. We show that black-box attacks can be mounted efficiently, and study these vulnerabilities can help us build more robust machine learning models.
Readiness & alternative Defence uses
Attack feasibility and demonstration of defence efficiency, level 3.
Resources and references
- GenAttack: Practical Black-box Attacks with Gradient-Free Optimization”, in ACM GECCO, 2019.
- “Did you hear that? adversarial examples against automatic speech recognition”, in NIPS 2017 Machine Deception workshop
- “Generating natural language adversarial examples”, EMNLP 2018
IBM US, UCLA,