Efficient Attacks Using Side-channels

Watch the video

Military / Coalition Issue

Explanations are increasingly required by the coalitions to build transparent and trustworthy AIs. Especially in safety-critical settings, coalition members may not rely on the AI systems if their predictions are not understandable. However, the explanations give rise a potential side channel which an adversarial user can leverage for attacking the models. In this work, we investigate in depth how one can leverage explanation information to attack neural networks. We also propose a differential privacy-based strategy to defend against these attacks.

Core idea and key achievements

Nowadays, attacking a black-box AI system usually requires many queries of the system which render the attack algorithm impractical. The core idea is to design attack algorithms to subvert systems by reducing the number of queries. Explanations happened to leak information such that the attacker can fool the system strategically.

The key achievements are as follows:

  • First, we exploit explanations as a side-channel that is available to an attacker, in addition to the model decision, and develop query-efficient black box attacks.
  • Second, we propose a differential privacy (DP) based provable defence to minimize the advantage that an adversary, with access to explanation, enjoys in gradient estimation.

Implications for Defence

Explanations are proposed to produce more transparent AI to increase human’s trust and reliance. However, the work exposes the potential risk of such explanations – they can be leveraged for more efficient attack. The proposed defence mechanism demonstrates that the system can reduces the adversarial advantage while maintaining the fidelity of explanations.

Readiness & alternative Defence uses

Attack feasibility and demonstration of defence efficiency, level 3.

Resources and references