
Interpreting Global Perturbation Robustness of Image Models using Axiomatic Spectral Importance Decomposition

Published: 18 Jul 2024, Last Modified: 17 Sept 2024. Accepted by TMLR. License: CC BY 4.0
Abstract: Perturbation robustness evaluates the vulnerabilities of models arising from a variety of perturbations, such as data corruptions and adversarial attacks. Understanding the mechanisms of perturbation robustness is critical for global interpretability. We present a model-agnostic, global mechanistic interpretability method to interpret the perturbation robustness of image models. This research is motivated by two key observations. First, previous global interpretability works, in tandem with robustness benchmarks, *e.g.* mean corruption error (mCE), are not designed to directly interpret the mechanisms of perturbation robustness within image models. Second, we notice that the spectral signal-to-noise ratios (SNR) of perturbed natural images decay exponentially with frequency. This power-law-like decay implies that low-frequency signals are generally more robust than high-frequency signals, yet high classification accuracy cannot be achieved by low-frequency signals alone. By applying Shapley value theory, our method axiomatically quantifies the predictive powers of robust features and non-robust features within an information theory framework. Our method, dubbed **I-ASIDE** (**I**mage **A**xiomatic **S**pectral **I**mportance **D**ecomposition **E**xplanation), provides a unique insight into model robustness mechanisms. We conduct extensive experiments over a variety of vision models pre-trained on ImageNet, including both convolutional neural networks (*e.g.* *AlexNet*, *VGG*, *GoogLeNet/Inception-v1*, *Inception-v3*, *ResNet*, *SqueezeNet*, *RegNet*, *MnasNet*, *MobileNet*, *EfficientNet*, *etc.*) and vision transformers (*e.g.* *ViT*, *Swin Transformer*, and *MaxViT*), to show that **I-ASIDE** can not only **measure** the perturbation robustness but also **provide interpretations** of its mechanisms.
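The idea of attributing predictive power to spectral bands with Shapley values can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked code for that). It assumes concentric radial frequency bands as the "players" and, in place of a model-based information-theoretic score, uses a stand-in value function (the fraction of image energy retained). All function names here (`radial_band_masks`, `keep_bands`, `shapley_values`, `energy_fraction`) are hypothetical.

```python
import itertools
import math

import numpy as np


def radial_band_masks(size, num_bands):
    """Split a centered size x size spectrum into concentric radial bands."""
    freqs = np.fft.fftshift(np.fft.fftfreq(size))
    fy, fx = np.meshgrid(freqs, freqs, indexing="ij")
    radius = np.sqrt(fx**2 + fy**2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    # Interior edges map every frequency to a band index in 0..num_bands-1.
    band_index = np.digitize(radius, edges[1:-1])
    return [band_index == b for b in range(num_bands)]


def keep_bands(image, masks, coalition):
    """Return the image reconstructed from only the bands in `coalition`."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    keep = np.zeros(image.shape, dtype=bool)
    for b in coalition:
        keep |= masks[b]
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * keep)))


def shapley_values(value_fn, num_players):
    """Exact Shapley values by enumerating all coalitions (small games only)."""
    players = range(num_players)
    cache = {
        s: value_fn(s)
        for r in range(num_players + 1)
        for s in itertools.combinations(players, r)
    }
    phi = np.zeros(num_players)
    for i in players:
        others = [p for p in players if p != i]
        for r in range(num_players):
            weight = (
                math.factorial(r)
                * math.factorial(num_players - r - 1)
                / math.factorial(num_players)
            )
            for s in itertools.combinations(others, r):
                with_i = tuple(sorted(s + (i,)))
                phi[i] += weight * (cache[with_i] - cache[s])
    return phi


# Toy setup: a random "image" and 4 radial bands (both are assumptions).
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
masks = radial_band_masks(16, num_bands=4)


def energy_fraction(coalition):
    # Stand-in characteristic function; a real study would score a model's
    # prediction on the band-filtered image instead.
    filtered = keep_bands(image, masks, coalition)
    return float(np.sum(filtered**2) / np.sum(image**2))


phi = shapley_values(energy_fraction, num_players=len(masks))
print("per-band Shapley contributions:", phi)
```

By the efficiency axiom, the per-band contributions sum to the value of keeping all bands minus the value of keeping none. Exact enumeration costs on the order of 2^M value-function evaluations for M bands, so this sketch is only practical for a small number of bands.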
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission URL: https://openreview.net/forum?id=D2qMFfYlYb
Changes Since Last Submission: De-anonymized camera-ready manuscript.
Code: https://github.com/aoibhinncrtai/xai_iaside_torch
Assigned Action Editor: ~Satoshi_Hara1
Submission Number: 2588