Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers
- URL: http://arxiv.org/abs/2311.07470v2
- Date: Tue, 11 Jun 2024 12:30:02 GMT
- Title: Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers
- Authors: Haowen Pan, Yixin Cao, Xiaozhi Wang, Xun Yang, Meng Wang
- Abstract summary: We propose a novel method to identify key neurons for interpretability.
Our method improves on prior work in both efficiency and applicability by removing the need for costly gradient computation.
Based on the identified neurons, we further design a multi-modal knowledge editing method that helps mitigate sensitive words and hallucinations.
- Score: 24.936419036304855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the internal mechanisms by which multi-modal large language models (LLMs) interpret different modalities and integrate cross-modal representations is becoming increasingly critical for continuous improvements in both academia and industry. In this paper, we propose a novel method to identify key neurons for interpretability -- how multi-modal LLMs bridge visual and textual concepts for captioning. Our method improves on prior work in both efficiency and applicability by removing the need for costly gradient computation. Based on the identified neurons, we further design a multi-modal knowledge editing method that helps mitigate sensitive words and hallucinations. We provide a theoretical assumption as the rationale for our design, and we conduct extensive quantitative and qualitative experiments for empirical evaluation. The results not only validate the effectiveness of our methods, but also offer insightful findings that highlight three key properties of multi-modal neurons: sensitivity, specificity, and causal effect, shedding light on future research.
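The gradient-free neuron-scoring idea in the abstract can be sketched in a few lines. Below is a minimal, hypothetical illustration (not the authors' released code): it scores FFN neurons by how strongly their activation-weighted output directions push the logit of a target caption token, then performs a simple "edit" by damping the output rows of the top-scoring neurons. The toy dimensions, random weights, and the exact scoring rule are assumptions made only for illustration.

```python
# Hedged sketch: gradient-free scoring of FFN neurons for a target token,
# followed by a simple edit that suppresses the top-scoring neurons.
# All shapes, weights, and the scoring rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ffn, vocab = 64, 256, 1000                     # toy dimensions (assumption)

W_out = rng.normal(size=(d_ffn, d_model)) / np.sqrt(d_ffn)        # FFN down-projection
W_unembed = rng.normal(size=(d_model, vocab)) / np.sqrt(d_model)  # LM head

def neuron_scores(ffn_activations, target_token_id):
    """Score each FFN neuron by how much its activation-weighted output
    direction raises the target token's logit -- forward pass only."""
    target_dir = W_unembed[:, target_token_id]   # (d_model,)
    per_neuron_logit = W_out @ target_dir        # (d_ffn,) logit contribution per unit activation
    return ffn_activations * per_neuron_logit    # (d_ffn,)

def edit_neurons(W, neuron_ids, scale=0.0):
    """Knowledge-editing step: damp (or zero) the selected neurons' output
    rows so their concept no longer reaches the residual stream."""
    W_edited = W.copy()
    W_edited[neuron_ids] *= scale
    return W_edited

# Toy forward state at one image-conditioned token position.
acts = np.maximum(rng.normal(size=d_ffn), 0.0)   # ReLU-like activations
target = 42                                      # hypothetical "dog" token id

scores = neuron_scores(acts, target)
top_neurons = np.argsort(-scores)[:5]            # candidate multi-modal neurons
print("top neurons:", top_neurons)

W_out_edited = edit_neurons(W_out, top_neurons, scale=0.0)
logit_before = acts @ W_out @ W_unembed[:, target]
logit_after = acts @ W_out_edited @ W_unembed[:, target]
print(f"target logit before edit: {logit_before:.3f}, after: {logit_after:.3f}")
```

Because the score uses only forward-pass activations and fixed weight matrices, no backward pass is needed, which is the efficiency gain over gradient-based attribution that the abstract refers to.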
Related papers
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities.
Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts.
Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z) - Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks.
We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture.
Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z) - What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models [50.97705264224828]
We propose Counterfactual Inception, a novel method that implants counterfactual thinking into Large Multi-modal Models.
We aim for the models to engage with and generate responses that reflect a broader understanding of the scene context.
Comprehensive analyses across various LMMs, including both open-source and proprietary models, corroborate that counterfactual thinking significantly reduces hallucination.
arXiv Detail & Related papers (2024-03-20T11:27:20Z) - CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic Decoding [14.484475792279671]
We propose a CLIP-guided Multi-sUbject visual neural information SEmantic Decoding (CLIP-MUSED) method.
Our method consists of a Transformer-based feature extractor to effectively model global neural representations.
It also incorporates learnable subject-specific tokens that facilitate the aggregation of multi-subject data.
arXiv Detail & Related papers (2024-02-14T07:41:48Z) - Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z) - Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
arXiv Detail & Related papers (2023-06-12T19:54:33Z) - Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features [9.783560855840602]
This paper presents a generic neural decoding method called BraVL that uses multimodal learning of brain-visual-linguistic features.
We focus on modeling the relationships between brain, visual and linguistic features via multimodal deep generative models.
In particular, our BraVL model can be trained under various semi-supervised scenarios to incorporate the visual and textual features obtained from the extra categories.
arXiv Detail & Related papers (2022-10-13T05:49:33Z) - Multimodal foundation models are better simulators of the human brain [65.10501322822881]
We present a newly-designed multimodal foundation model pre-trained on 15 million image-text pairs.
We find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones.
arXiv Detail & Related papers (2022-08-17T12:36:26Z) - Neural Dependency Coding inspired Multimodal Fusion [11.182263394122142]
Recent work in deep fusion models via neural networks has led to substantial improvements in areas like speech recognition, emotion recognition and analysis, captioning and image description.
Inspired by recent neuroscience ideas about multisensory integration and processing, we investigate the effect of synergy maximizing loss functions.
arXiv Detail & Related papers (2021-09-28T17:52:09Z)