Interpreting and Controlling Vision Foundation Models via Text
Explanations
- URL: http://arxiv.org/abs/2310.10591v1
- Date: Mon, 16 Oct 2023 17:12:06 GMT
- Title: Interpreting and Controlling Vision Foundation Models via Text
Explanations
- Authors: Haozhe Chen, Junfeng Yang, Carl Vondrick, Chengzhi Mao
- Abstract summary: We present a framework for interpreting a vision transformer's latent tokens with natural language.
Our approach enables understanding of the model's visual reasoning procedure without needing additional model training or data collection.
- Score: 45.30541722925515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale pre-trained vision foundation models, such as CLIP, have become
de facto backbones for various vision tasks. However, due to their black-box
nature, understanding the underlying rules behind these models' predictions and
controlling model behaviors have remained open challenges. We present a
framework for interpreting a vision transformer's latent tokens with natural
language. Given a latent token, our framework retains its semantic information
up to the final layer using the transformer's local operations and retrieves the
closest text as an explanation. Our approach enables understanding of the
model's visual reasoning procedure without needing additional model training or data
collection. Based on the obtained interpretations, our framework allows for
model editing that controls model reasoning behaviors and improves model
robustness against biases and spurious correlations.
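As a concrete illustration of the retrieval step described above, the sketch below maps a single intermediate patch token from an off-the-shelf CLIP ViT into the joint embedding space and retrieves the nearest caption. This is only a minimal approximation, not the authors' implementation: it applies CLIP's final layer norm and visual projection directly to the latent token instead of propagating it through the remaining layers' local operations, and the checkpoint name, layer index, patch index, and caption list are all placeholder choices.

```python
# Minimal sketch (not the paper's implementation): interpret one intermediate
# ViT patch token by retrieving its nearest caption in CLIP's joint space.
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

captions = ["a dog", "a tree", "a car wheel", "a human face"]  # toy text corpus
image = Image.fromarray((np.random.rand(224, 224, 3) * 255).astype("uint8"))  # stand-in image

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    vision_out = model.vision_model(
        pixel_values=inputs["pixel_values"], output_hidden_states=True
    )
    # Pick one latent token: layer 6, patch index 10 (arbitrary placeholder choices).
    token = vision_out.hidden_states[6][:, 10, :]
    # Shortcut into the joint embedding space: final layer norm + visual projection.
    token_embed = model.visual_projection(model.vision_model.post_layernorm(token))
    text_embeds = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )
    sims = torch.nn.functional.cosine_similarity(token_embed, text_embeds)

print("closest text explanation:", captions[sims.argmax().item()])
```

In the paper's framework the token is instead carried through the remaining blocks' local, per-token operations before projection, and retrieval is performed over a much larger text corpus; the shortcut above only conveys the overall interpret-by-retrieval idea.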
Related papers
- Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework [2.8470354623829577]
We develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers.
We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts.
We find that model performance remains largely unaffected, while interpretability improves substantially.
arXiv Detail & Related papers (2024-10-08T14:22:40Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method, which combines concepts from Optimal Transport and Shapley Values, as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Self-supervised Interpretable Concept-based Models for Text Classification [9.340843984411137]
This paper proposes self-supervised Interpretable Concept Embedding Models (ICEMs).
We leverage the generalization abilities of Large Language Models to predict concept labels in a self-supervised way.
ICEMs can be trained in a self-supervised way, achieving performance similar to fully supervised concept-based models and end-to-end black-box ones.
arXiv Detail & Related papers (2024-06-20T14:04:53Z)
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks while significantly reducing its parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models learned to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompting capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene, or manipulating a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates [26.527311287924995]
We show that, in a controlled setup, influence tuning can help deconfound the model from spurious patterns in the data.
arXiv Detail & Related papers (2021-10-07T06:59:46Z)
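To make the instance-attribution idea in the last entry concrete, here is a heavily simplified sketch. It is not the paper's influence-tuning objective: it only computes TracIn-style gradient dot products between a probe example and each training example (single checkpoint, learning rate omitted), which is the attribution signal such methods build on. The toy linear model and random data are placeholders; the actual method would additionally use these attributions during training to demote spurious patterns.

```python
# Simplified sketch of instance attribution via gradient similarity
# (TracIn-style); placeholders only, not the influence-tuning objective itself.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 2)            # toy classifier
loss_fn = nn.CrossEntropyLoss()
x_train, y_train = torch.randn(4, 10), torch.tensor([0, 1, 0, 1])
x_probe, y_probe = torch.randn(1, 10), torch.tensor([0])

def grad_vector(x, y):
    """Flattened gradient of the loss on (x, y) w.r.t. all model parameters."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

g_probe = grad_vector(x_probe, y_probe)
# Attribution score: how strongly each training instance's gradient aligns
# with the probe example's gradient.
scores = torch.stack([grad_vector(x_train[i:i + 1], y_train[i:i + 1]) @ g_probe
                      for i in range(len(x_train))])
print(scores)  # large positive values = instances reinforcing the probe's pattern
```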