Gradient Free Attacks on Multiple Modalities

Watch the video

Military / Coalition Issue

Coalitions are often characterized by ad-hoc teams with distributed learning requirements. In such a setting, distributed learning is needed for collaborative situational understanding. However, since the learning is performed in an ad-hoc way, the coalition’s decision may be compromised in multi-modalities by inference-time adversaries. This work will show the feasibility and the efficiency of inference time attacks, including images, text, and audio.

Core idea and key achievements

Nowadays, attacking a black-box AI system usually requires many queries of the system, which renders the attack algorithm impractical. Existing black-box approaches to generating adversarial examples typically require a significant number of queries, either for training a substitute network or performing gradient estimation. We introduce GenAttack, a gradient-free optimization technique that uses genetic algorithms for synthesizing adversarial examples in the black-box setting.

The key achievements are as follows:

  • First, we show that the query efficiency of GenAttack leads to 237 times fewer queries than ZOO, one of the first black-box attack algorithms.
  • Second, we demonstrate that efficient attack can be mounted on black-box models, indicating that further robustness of models needs to be attained in multiple domains, including images, text, and audio.

Implications for Defence

When coalitions perform learning in a distributed manner, it is possible that some parties may be compromised, and the machine learning model may be subverted during inference time. We show that black-box attacks can be mounted efficiently, and study these vulnerabilities can help us build more robust machine learning models.

Readiness & alternative Defence uses

Attack feasibility and demonstration of defence efficiency, level 3.

Resources and references

  • GenAttack: Practical Black-box Attacks with Gradient-Free Optimization”, in ACM GECCO, 2019.
  • “Did you hear that? adversarial examples against automatic speech recognition”, in NIPS 2017 Machine Deception workshop
  • “Generating natural language adversarial examples”, EMNLP 2018