Information Flow Routes: Automatically Interpreting Language Models at Scale
- URL: http://arxiv.org/abs/2403.00824v2
- Date: Tue, 16 Apr 2024 23:32:38 GMT
- Title: Information Flow Routes: Automatically Interpreting Language Models at Scale
- Authors: Javier Ferrando, Elena Voita
- Abstract summary: Information flows by routes inside the network via mechanisms implemented in the model.
We build these graphs in a top-down manner, for each prediction leaving only the most important nodes and edges.
We show that some model components can be specialized on domains such as coding or multilingual texts.
- Score: 9.156549818722581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information flows by routes inside the network via mechanisms implemented in the model. These routes can be represented as graphs where nodes correspond to token representations and edges to operations inside the network. We automatically build these graphs in a top-down manner, for each prediction leaving only the most important nodes and edges. In contrast to the existing workflows relying on activation patching, we do this through attribution: this allows us to efficiently uncover existing circuits with just a single forward pass. Additionally, the applicability of our method is far beyond patching: we do not need a human to carefully design prediction templates, and we can extract information flow routes for any prediction (not just the ones among the allowed templates). As a result, we can talk about model behavior in general, for specific types of predictions, or different domains. We experiment with Llama 2 and show that the role of some attention heads is overall important, e.g. previous token heads and subword merging heads. Next, we find similarities in Llama 2 behavior when handling tokens of the same part of speech. Finally, we show that some model components can be specialized on domains such as coding or multilingual texts.
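The following is a rough illustration of the top-down graph construction described in the abstract. It is a toy sketch, not the authors' released code: the node names, the importance values, and the threshold are hypothetical placeholders standing in for the attribution scores the paper computes in a single forward pass.

```python
# Toy sketch of top-down route extraction with attribution-style importance
# scores. Not the paper's implementation: node names, importance values,
# and the threshold tau are hypothetical placeholders.
from collections import deque

def extract_routes(incoming_edges, start_node, tau=0.05):
    """incoming_edges maps node -> list of (source_node, importance), where
    importance stands in for the normalized contribution of that edge.
    Builds the route graph top-down from start_node, keeping only edges
    whose importance is at least tau."""
    kept_nodes, kept_edges = {start_node}, set()
    queue = deque([start_node])
    while queue:
        node = queue.popleft()
        for src, imp in incoming_edges.get(node, []):
            if imp >= tau:                        # keep only important edges
                kept_edges.add((src, node, imp))
                if src not in kept_nodes:
                    kept_nodes.add(src)
                    queue.append(src)             # recurse into earlier nodes
    return kept_nodes, kept_edges

# Toy example: the prediction depends mostly on two paths through layer 1.
edges = {
    "prediction": [("L1.attn", 0.6), ("L1.ffn", 0.3), ("L0.resid", 0.02)],
    "L1.attn":    [("L0.resid", 0.9)],
    "L1.ffn":     [("L0.resid", 0.8)],
}
nodes, routes = extract_routes(edges, "prediction")
print(sorted(nodes))   # the direct, low-importance edge to L0.resid is pruned
print(sorted(routes))
```

In the actual method the per-edge importances come from attribution over a single forward pass rather than from hand-set values as in this toy example.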
Related papers
- Higher-Order DeepTrails: Unified Approach to *Trails [7.270112855088838]
Analyzing, understanding, and describing human behavior is advantageous in different settings, such as web browsing or traffic navigation.
We propose to analyze entire sequences using autoregressive language models, as they are traditionally used to model higher-order dependencies in sequences.
We show that our approach can be easily adapted to model different settings introduced in previous work, namely HypTrails, MixedTrails and even SubTrails.
arXiv Detail & Related papers (2023-10-06T06:54:11Z) - One for All: Towards Training One Graph Model for All Classification Tasks [61.656962278497225]
A unified model for various graph tasks remains underexplored, primarily due to the challenges unique to the graph learning domain.
We propose One for All (OFA), the first general framework that can use a single graph model to address the above challenges.
OFA performs well across different tasks, making it the first general-purpose, cross-domain classification model on graphs.
arXiv Detail & Related papers (2023-09-29T21:15:26Z) - Single Sequence Prediction over Reasoning Graphs for Multi-hop QA [8.442412179333205]
We propose a single-sequence prediction method over a local reasoning graph.
We use a graph neural network to encode this graph structure and fuse the resulting representations into the entity representations of the model.
Our experiments show significant improvements in answer exact-match/F1 scores and faithfulness of grounding in the reasoning path.
arXiv Detail & Related papers (2023-07-01T13:15:09Z) - Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation [38.09501948846373]
We introduce the topological adjacency relationship among the tokens, emphasizing their relative position information.
We incorporate these graphs into the pre-trained language model by adding graph neural network layers on top of the language model embeddings.
Experiments on two benchmark datasets show that LAGER significantly outperforms strong baselines under different few-shot settings.
arXiv Detail & Related papers (2023-05-24T07:34:33Z) - You Only Transfer What You Share: Intersection-Induced Graph Transfer Learning for Link Prediction [79.15394378571132]
We investigate a previously overlooked phenomenon: in many cases, a densely connected, complementary graph can be found for the original graph.
The denser graph may share nodes with the original graph, which offers a natural bridge for transferring selective, meaningful knowledge.
We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications in e-commerce or academic co-authorship predictions.
arXiv Detail & Related papers (2023-02-27T22:56:06Z) - Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z) - Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
We provide a benchmark pipeline for the evaluation of temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z) - X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries.
We present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP.
We propose to predict flattened intents and slots representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z) - Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
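The differentiable edge-masking idea in the last entry above can be sketched as follows. This is a hedged toy illustration, not that paper's implementation: the one-layer message-passing architecture, the sigmoid mask, the sparsity weight, and the training setup are assumptions made for the example.

```python
# Toy sketch of post-hoc differentiable edge masking for a GNN (assumed setup,
# not the paper's code). We learn one mask logit per edge so that the masked
# layer output stays close to the unmasked one while the mask is encouraged
# to be sparse; edges whose mask collapses toward 0 are candidates to drop.
import torch
import torch.nn as nn

class MaskedGraphLayer(nn.Module):
    """One message-passing layer whose per-edge messages are gated by a learned mask."""
    def __init__(self, dim, num_edges):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.mask_logits = nn.Parameter(torch.zeros(num_edges))  # one logit per edge

    def forward(self, x, edge_index, use_mask=True):
        src, dst = edge_index                        # each of shape (num_edges,)
        msg = self.lin(x[src])                       # message along each edge
        if use_mask:
            msg = torch.sigmoid(self.mask_logits).unsqueeze(-1) * msg
        out = torch.zeros_like(x)
        out.index_add_(0, dst, msg)                  # sum messages at destination nodes
        return torch.relu(out)

torch.manual_seed(0)
x = torch.randn(4, 8)                                # 4 nodes with 8-dim features
edge_index = (torch.tensor([0, 1, 2, 3, 0]), torch.tensor([1, 2, 3, 0, 2]))
layer = MaskedGraphLayer(8, num_edges=5)

with torch.no_grad():
    reference = layer(x, edge_index, use_mask=False)  # prediction with all edges kept

# Optimize only the mask: match the reference output while preferring sparse masks.
opt = torch.optim.Adam([layer.mask_logits], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    masked = layer(x, edge_index)
    mask = torch.sigmoid(layer.mask_logits)
    loss = (masked - reference).pow(2).mean() + 0.05 * mask.mean()  # 0.05 is an assumed weight
    loss.backward()
    opt.step()

print(torch.sigmoid(layer.mask_logits).detach())  # edges near 0 can be dropped
```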
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.