When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality
- URL: http://arxiv.org/abs/2402.13113v2
- Date: Sun, 2 Jun 2024 14:48:13 GMT
- Title: When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality
- Authors: Brielen Madureira, Patrick Kahardipraja, David Schlangen
- Abstract summary: Causal models are forced to output one interpretation and continue, whereas models that can revise may edit their previous output as the ambiguity is resolved.
In this work, we look at how restart-incremental Transformers build and update internal states, in an effort to shed light on what processes cause revisions not viable in autoregressive models.
- Score: 19.103130032967663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Incremental models that process sentences one token at a time will sometimes encounter points where more than one interpretation is possible. Causal models are forced to output one interpretation and continue, whereas models that can revise may edit their previous output as the ambiguity is resolved. In this work, we look at how restart-incremental Transformers build and update internal states, in an effort to shed light on what processes cause revisions not viable in autoregressive models. We propose an interpretable way to analyse the incremental states, showing that their sequential structure encodes information on the garden path effect and its resolution. Our method brings insights on various bidirectional encoders for contextualised meaning representation and dependency parsing, contributing to show their advantage over causal models when it comes to revisions.
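The contrast the abstract draws between causal output and restart-incremental revision can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' method: `toy_labeller` is a hypothetical rule-based stand-in for a bidirectional encoder, the label names are invented, and the garden-path sentence is a textbook example rather than data from the paper.

```python
from typing import Callable, List, Sequence

Labels = List[str]


def restart_incremental(tokens: Sequence[str],
                        labeller: Callable[[Sequence[str]], Labels]) -> List[Labels]:
    """Re-run the labeller from scratch on every prefix and keep each partial output."""
    return [labeller(tokens[:t]) for t in range(1, len(tokens) + 1)]


def revisions(outputs: List[Labels]) -> List[List[int]]:
    """For each step after the first, list the earlier positions whose label changed."""
    edits = []
    for prev, curr in zip(outputs, outputs[1:]):
        # zip stops at the shorter list, so only already-labelled positions are compared.
        edits.append([i for i, (a, b) in enumerate(zip(prev, curr)) if a != b])
    return edits


def toy_labeller(prefix: Sequence[str]) -> Labels:
    """Hypothetical stand-in for a bidirectional model (assumption, not a real parser).

    "raced" is read as the main verb until "fell" appears in the prefix,
    at which point it is reinterpreted as a participle.
    """
    has_fell = "fell" in prefix
    labels = []
    for tok in prefix:
        if tok == "raced":
            labels.append("PARTICIPLE" if has_fell else "MAIN_VERB")
        elif tok == "fell":
            labels.append("MAIN_VERB")
        else:
            labels.append("OTHER")
    return labels


sentence = "the horse raced past the barn fell".split()
outputs = restart_incremental(sentence, toy_labeller)
edits = revisions(outputs)

# A causal (monotonic) output keeps the label each token received when it first arrived.
causal_output = [step[-1] for step in outputs]
final_output = outputs[-1]
```

In this toy setting the only revision happens at the last step, where the label of "raced" (position 2) is edited once "fell" resolves the ambiguity. The causal output keeps the initial reading of "raced" because it cannot go back.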
Related papers
- Interpret the Internal States of Recommendation Model with Sparse Autoencoder [26.021277330699963]
RecSAE is an automatic, generalizable probing method for interpreting the internal states of Recommendation models.
We train an autoencoder with sparsity constraints to reconstruct internal activations of recommendation models.
We automated the construction of concept dictionaries based on the relationship between latent activations and input item sequences.
arXiv Detail & Related papers (2024-11-09T08:22:31Z) - How much do contextualized representations encode long-range context? [10.188367784207049]
We analyze contextual representations in neural autoregressive language models, emphasizing long-range contexts that span several thousand tokens.
Our methodology employs a perturbation setup and the metric Anisotropy-Calibrated Cosine Similarity to capture the degree of contextualization of long-range patterns from the perspective of representation geometry.
arXiv Detail & Related papers (2024-10-16T06:49:54Z) - Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework [2.8470354623829577]
We develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers.
We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts.
We find that the model performance remains mostly unaffected, while the model shows much improved interpretability.
arXiv Detail & Related papers (2024-10-08T14:22:40Z) - Corner-to-Center Long-range Context Model for Efficient Learned Image Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M) designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z) - Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors [3.4253416336476246]
We focus on the analysis of counterfactual, contrastive explanations.
We propose a new back translation-inspired evaluation methodology.
We show that by iteratively feeding the counterfactual to the explainer we can obtain valuable insights into the behaviour of both the predictor and the explainer models.
arXiv Detail & Related papers (2023-05-26T16:04:28Z) - What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary [68.77983831618685]
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and draw a connection between them and sparse retrieval.
arXiv Detail & Related papers (2022-12-20T16:03:25Z) - IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers [81.31885548824926]
The self-attention-based model, the transformer, has recently become the leading backbone in the field of computer vision.
We present an Interpretability-Aware REDundancy REDuction framework (IA-RED$^2$).
We include extensive experiments on both image and video tasks, where our method could deliver up to 1.4X speed-up.
arXiv Detail & Related papers (2021-06-23T18:29:23Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - VisBERT: Hidden-State Visualizations for Transformers [66.86452388524886]
We present VisBERT, a tool for visualizing the contextual token representations within BERT for the task of (multi-hop) Question Answering.
VisBERT enables users to get insights about the model's internal state and to explore its inference steps or potential shortcomings.
arXiv Detail & Related papers (2020-11-09T15:37:43Z) - Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning [79.48769764508006]
Generative language models (LMs) can be trained to condition only on the past context or to perform narrowly scoped text-infilling.
We propose DeLorean, a new unsupervised decoding algorithm that can flexibly incorporate both the past and future contexts.
We demonstrate that our approach is general and applicable to two nonmonotonic reasoning tasks: abductive text generation and counterfactual story revision.
arXiv Detail & Related papers (2020-10-12T17:58:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.