Robust multimodal models have outlier features and encode more concepts
- URL: http://arxiv.org/abs/2310.13040v1
- Date: Thu, 19 Oct 2023 17:59:12 GMT
- Title: Robust multimodal models have outlier features and encode more concepts
- Authors: Jonathan Crabbé, Pau Rodríguez, Vaishaal Shankar, Luca Zappella, Arno Blaas
- Abstract summary: We probe the representation spaces of 12 robust multimodal models with various backbones and pretraining sets.
We find two signatures of robustness in the representation spaces of these models.
These insights pave the way for future research in various fields, such as model pruning and mechanistic interpretability.
- Score: 14.555055710021715
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: What distinguishes robust models from non-robust ones? This question has
gained traction with the appearance of large-scale multimodal models, such as
CLIP. These models have demonstrated unprecedented robustness with respect to
natural distribution shifts. While it has been shown that such differences in
robustness can be traced back to differences in training data, so far it is not
known what that translates to in terms of what the model has learned. In this
work, we bridge this gap by probing the representation spaces of 12 robust
multimodal models with various backbones (ResNets and ViTs) and pretraining
sets (OpenAI, LAION-400M, LAION-2B, YFCC15M, CC12M and DataComp). We find two
signatures of robustness in the representation spaces of these models: (1)
Robust models exhibit outlier features characterized by their activations, with
some being several orders of magnitude above average. These outlier features
induce privileged directions in the model's representation space. We
demonstrate that these privileged directions explain most of the predictive
power of the model by pruning up to $80 \%$ of the least important
representation space directions without negative impacts on model accuracy and
robustness; (2) Robust models encode substantially more concepts in their
representation space. While this superposition of concepts allows robust models
to store much information, it also results in highly polysemantic features,
which makes their interpretation challenging. We discuss how these insights
pave the way for future research in various fields, such as model pruning and
mechanistic interpretability.
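To make the first signature concrete, the following is a minimal, self-contained sketch of the two steps the abstract describes: flagging outlier features by their average activation magnitude, then pruning low-importance representation directions. It runs on synthetic embeddings in place of real CLIP representations; the 10x-median detection rule and the use of per-coordinate variance as an importance score are illustrative assumptions, not the paper's exact protocol.

```python
# Illustrative sketch only: synthetic stand-in for CLIP image embeddings,
# with assumed heuristics for outlier detection and direction importance.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 4096, 512

# Embedding matrix (n_samples, n_features); a few coordinates are scaled up
# to mimic the outlier features reported in robust models.
Z = rng.normal(size=(n_samples, n_features))
outlier_idx = rng.choice(n_features, size=4, replace=False)
Z[:, outlier_idx] *= 100.0  # activations orders of magnitude above average

# (1) Outlier-feature detection: flag coordinates whose mean absolute
# activation is far above that of a typical coordinate.
mean_act = np.abs(Z).mean(axis=0)
detected = np.where(mean_act > 10.0 * np.median(mean_act))[0]  # assumed 10x rule
print("detected outlier features:", sorted(detected.tolist()))

# (2) Direction pruning: rank directions by an importance proxy
# (per-coordinate variance here) and zero out the bottom 80%.
importance = Z.var(axis=0)
keep = np.argsort(importance)[-int(0.2 * n_features):]
mask = np.zeros(n_features, dtype=bool)
mask[keep] = True
Z_pruned = Z * mask  # accuracy/robustness would be re-evaluated on these
print(f"kept {mask.sum()} of {n_features} directions")
```

In the paper's setting, Z would come from a pretrained image encoder, and the pruned representations would be re-scored on the downstream classification and distribution-shift benchmarks rather than merely counted.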
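The second signature can be sketched in the same spirit: a toy concept probe that counts how many concepts leave a decodable linear trace in the representation, and how many individual features respond to more than one concept (polysemanticity). The synthetic concept labels, the sparse mixing matrix, and the 0.3 correlation cutoff are assumptions for illustration, not the paper's probing method.

```python
# Illustrative sketch only: synthetic concepts superposed onto features,
# probed with simple feature-concept correlations.
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_features, n_concepts = 2048, 512, 64

# Binary concept labels and embeddings in which each concept leaves a
# linear trace on a few features shared with other concepts (superposition).
C = (rng.random((n_samples, n_concepts)) < 0.5).astype(float)
W = rng.normal(size=(n_concepts, n_features)) * (rng.random((n_concepts, n_features)) < 0.02)
Z = C @ W + 0.1 * rng.normal(size=(n_samples, n_features))

# Pearson correlation between every feature and every concept.
Zc = (Z - Z.mean(0)) / (Z.std(0) + 1e-8)
Cc = (C - C.mean(0)) / (C.std(0) + 1e-8)
corr = np.abs(Cc.T @ Zc) / n_samples  # shape (n_concepts, n_features)

# A concept counts as "encoded" if some feature tracks it strongly; a
# feature counts as "polysemantic" if it tracks several concepts.
THRESH = 0.3  # assumed cutoff
encoded = int((corr.max(axis=1) > THRESH).sum())
polysemantic = int(((corr > THRESH).sum(axis=0) >= 2).sum())
print(f"concepts encoded: {encoded}/{n_concepts}")
print(f"polysemantic features: {polysemantic}/{n_features}")
```

Under this toy setup, a model that packs more concepts into the same number of features shows both a higher encoded-concept count and more polysemantic features, mirroring the trade-off between capacity and interpretability that the abstract describes.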
Related papers
- ProtoS-ViT: Visual foundation models for sparse self-explainable classifications [0.6249768559720122]
This work demonstrates how frozen pre-trained ViT backbones can be effectively turned into prototypical models.
ProtoS-ViT surpasses existing prototypical models, showing strong performance in terms of accuracy, compactness, and explainability.
arXiv Detail & Related papers (2024-06-14T13:36:30Z)
- DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple and effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring no data or additional training, while still showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Rethinking Robustness of Model Attributions [24.317595434521504]
Prior work has shown that many attribution methods are fragile and has proposed improvements to either these methods or model training.
We observe two main causes of fragile attributions: first, existing robustness metrics over-penalize even reasonable local shifts in attribution; second, attributions can concentrate on a few pixel locations even when multiple regions of the image are important.
We propose simple ways to strengthen existing metrics and attribution methods by incorporating pixel locality into robustness metrics and diversity of pixel locations into attributions.
arXiv Detail & Related papers (2023-12-16T20:20:38Z)
- OtterHD: A High-Resolution Multi-modality Model [57.16481886807386]
OtterHD-8B is an innovative multimodal model engineered to interpret high-resolution visual inputs with granular precision.
Our study highlights the critical role of flexibility and high-resolution input capabilities in large multimodal models.
arXiv Detail & Related papers (2023-11-07T18:59:58Z)
- Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion [54.33764537135906]
VideoQA Transformer models demonstrate competitive performance on standard benchmarks.
Do these models capture the rich multimodal structures and dynamics from video and text jointly?
Are they achieving high scores by exploiting biases and spurious features?
arXiv Detail & Related papers (2023-06-15T06:45:46Z)
- Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that there is no single model that works best in all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines with more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning [35.25854322376364]
We show that different data modalities are embedded at arm's length in their shared representation in multi-modal models such as CLIP.
Contrastive learning keeps the different modalities separated by a certain distance, which is influenced by the temperature parameter in the loss function.
Our experiments further demonstrate that varying the modality gap distance can significantly improve the model's downstream zero-shot classification performance and fairness.
arXiv Detail & Related papers (2022-03-03T22:53:54Z)
- What shapes feature representations? Exploring datasets, architectures, and training [14.794135558227682]
In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not.
Which of these features a model comes to rely on, and why, are questions important for understanding the basis of models' decisions.
We study these questions using synthetic datasets in which the task-relevance of input features can be controlled directly.
arXiv Detail & Related papers (2020-06-22T17:02:25Z)