Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
- URL: http://arxiv.org/abs/2308.14179v1
- Date: Sun, 27 Aug 2023 18:46:47 GMT
- Title: Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
- Authors: Vedant Palit and Rohan Pandey and Aryaman Arora and Paul Pu Liang
- Abstract summary: We adapt a unimodal causal tracing tool to BLIP to enable the study of the neural mechanisms underlying image-conditioned text generation.
We release our BLIP causal tracing tool as open source to enable further experimentation in vision-language mechanistic interpretability.
- Score: 27.51318030253248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mechanistic interpretability seeks to understand the neural mechanisms that
enable specific behaviors in Large Language Models (LLMs) by leveraging
causality-based methods. While these approaches have identified neural circuits
that copy spans of text, capture factual knowledge, and more, they remain
unusable for multimodal models since adapting these tools to the
vision-language domain requires considerable architectural changes. In this
work, we adapt a unimodal causal tracing tool to BLIP to enable the study of
the neural mechanisms underlying image-conditioned text generation. We
demonstrate our approach on a visual question answering dataset, highlighting
the causal relevance of later layer representations for all tokens.
Furthermore, we release our BLIP causal tracing tool as open source to enable
further experimentation in vision-language mechanistic interpretability by the
community. Our code is available at
https://github.com/vedantpalit/Towards-Vision-Language-Mechanistic-Interpretability.
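For readers new to causal tracing, the sketch below illustrates the general restore-the-clean-state procedure that the paper adapts to image-conditioned generation. It is a minimal, framework-agnostic sketch, not the authors' released tool: the names `model`, `decoder_layers`, `clean_inputs`, `corrupt_inputs`, and `answer_id` are assumptions; the model is assumed to return HuggingFace-style outputs with a `.logits` field; hidden states are assumed to be batch-first `(batch, seq, dim)` tensors; and the clean and corrupted inputs are assumed to be token-aligned. Hooking BLIP's actual vision encoder and text decoder is what the released code handles.

```python
# Minimal causal-tracing sketch (restore-the-clean-state probe). NOT the
# authors' released tool: assumes a generic decoder whose transformer blocks
# are passed in as `decoder_layers`, a HuggingFace-style forward returning an
# object with `.logits`, and batch-first hidden states of shape (batch, seq, dim).
import torch

@torch.no_grad()
def causal_trace(model, decoder_layers, clean_inputs, corrupt_inputs, answer_id):
    """Return a (num_layers x seq_len) grid of restoration effects."""
    # 1) Clean run: cache each layer's hidden states via forward hooks.
    clean_states = {}

    def save_hook(idx):
        def hook(module, inputs, output):
            hidden = output[0] if isinstance(output, tuple) else output
            clean_states[idx] = hidden.detach().clone()
        return hook

    handles = [layer.register_forward_hook(save_hook(i))
               for i, layer in enumerate(decoder_layers)]
    model(**clean_inputs)
    for h in handles:
        h.remove()

    # 2) Corrupted run (e.g. Gaussian noise already added to the image or
    #    question embeddings when building `corrupt_inputs`): damaged baseline.
    corrupt_prob = model(**corrupt_inputs).logits[0, -1].softmax(-1)[answer_id]

    # 3) Restoration runs: repeat the corrupted forward pass, but patch the
    #    clean hidden state back in at one (layer, token) position at a time.
    seq_len = clean_states[0].shape[1]
    effects = torch.zeros(len(decoder_layers), seq_len)
    for li, layer in enumerate(decoder_layers):
        for ti in range(seq_len):
            def patch_hook(module, inputs, output, li=li, ti=ti):
                hidden = output[0] if isinstance(output, tuple) else output
                hidden[:, ti] = clean_states[li][:, ti]
                return (hidden, *output[1:]) if isinstance(output, tuple) else hidden

            h = layer.register_forward_hook(patch_hook)
            patched_prob = model(**corrupt_inputs).logits[0, -1].softmax(-1)[answer_id]
            h.remove()
            # Indirect effect: how much restoring this state recovers the answer.
            effects[li, ti] = (patched_prob - corrupt_prob).item()
    return effects
```

The resulting grid of indirect effects is the quantity that causal-tracing studies typically visualize as a heatmap over layers and token positions.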
Related papers
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI mitigate earlier limitations by providing improved explanations for Transformer models.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed at understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning [64.1316997189396]
We present a novel language-tied self-supervised learning framework, Hierarchical Language-tied Self-Supervision (HLSS) for histopathology images.
Our resulting model achieves state-of-the-art performance on two medical imaging benchmarks, the OpenSRH and TCGA datasets.
arXiv Detail & Related papers (2024-03-21T17:58:56Z)
- Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning [67.0609518552321]
We propose Machine Vision Therapy, which aims to rectify noisy predictions from vision models.
By fine-tuning on the denoised labels, model performance can be boosted in an unsupervised manner.
arXiv Detail & Related papers (2023-12-05T07:29:14Z)
- Sparse Autoencoders Find Highly Interpretable Features in Language Models [0.0]
Polysemanticity prevents us from identifying concise, human-understandable explanations for what neural networks are doing internally.
We use sparse autoencoders to reconstruct the internal activations of a language model.
Our method may serve as a foundation for future mechanistic interpretability work.
arXiv Detail & Related papers (2023-09-15T17:56:55Z)
- Neural Authorship Attribution: Stylometric Analysis on Large Language Models [16.63955074133222]
Large language models (LLMs) such as GPT-4, PaLM, and Llama have significantly propelled the generation of AI-crafted text.
With rising concerns about their potential misuse, there is a pressing need for AI-generated-text forensics.
arXiv Detail & Related papers (2023-08-14T17:46:52Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections [104.14624185375897]
mPLUG is a new vision-language foundation model for both cross-modal understanding and generation.
It achieves state-of-the-art results on a wide range of vision-language downstream tasks, such as image captioning, image-text retrieval, visual grounding and visual question answering.
arXiv Detail & Related papers (2022-05-24T11:52:06Z)
- AttViz: Online exploration of self-attention for transparent neural language modeling [7.574392147428978]
We propose AttViz, an online toolkit for exploration of self-attention: the real values associated with individual text tokens.
We show how existing deep learning pipelines can produce outputs suitable for AttViz, offering novel online visualizations of the attention heads and their aggregations with minimal effort.
arXiv Detail & Related papers (2020-05-12T12:21:40Z)
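As a concrete illustration of the per-token self-attention values such a toolkit visualizes, the snippet below extracts attention weights from an off-the-shelf Transformer via HuggingFace transformers; the model choice and the head/query aggregation are illustrative assumptions, not AttViz's prescribed input format.

```python
# Sketch: extracting per-token self-attention values with HuggingFace
# `transformers`. Model name and the aggregation over heads/queries are
# illustrative assumptions, not AttViz's required input format.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

text = "Causal tracing locates the hidden states that matter."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions: one (batch, heads, seq, seq) tensor per layer.
last_layer = outputs.attentions[-1][0]        # (heads, seq, seq)
received = last_layer.mean(dim=0).sum(dim=0)  # avg over heads, then total attention each token receives
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for tok, score in zip(tokens, received.tolist()):
    print(f"{tok}\t{score:.3f}")
```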