Backpack Language Models
- URL: http://arxiv.org/abs/2305.16765v1
- Date: Fri, 26 May 2023 09:26:23 GMT
- Title: Backpack Language Models
- Authors: John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang
- Abstract summary: We present Backpacks, a new neural architecture that marries strong modeling performance with an interface for interpretability and control.
We find that, after training, sense vectors specialize, each encoding a different aspect of a word.
We present simple algorithms that intervene on sense vectors to perform controllable text generation and debiasing.
- Score: 108.65930795825416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Backpacks: a new neural architecture that marries strong modeling
performance with an interface for interpretability and control. Backpacks learn
multiple non-contextual sense vectors for each word in a vocabulary, and
represent a word in a sequence as a context-dependent, non-negative linear
combination of sense vectors in this sequence. We find that, after training,
sense vectors specialize, each encoding a different aspect of a word. We can
interpret a sense vector by inspecting its (non-contextual, linear) projection
onto the output space, and intervene on these interpretable hooks to change the
model's behavior in predictable ways. We train a 170M-parameter Backpack
language model on OpenWebText, matching the loss of a GPT-2 small
(124M-parameter) Transformer. On lexical similarity evaluations, we find that
Backpack sense vectors outperform even a 6B-parameter Transformer LM's word
embeddings. Finally, we present simple algorithms that intervene on sense
vectors to perform controllable text generation and debiasing. For example, we
can edit the sense vocabulary to tend more towards a topic, or localize a
source of gender bias to a sense vector and globally suppress that sense.
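To make the word representation described in the abstract concrete, here is a minimal PyTorch sketch of the Backpack computation, under assumptions: the sizes, word index, and sense index are hypothetical, and a single attention-style layer stands in for the full Transformer contextualization network. It illustrates the description above, not the authors' released implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes for illustration only.
vocab_size, d, k, n = 1000, 64, 4, 8   # vocabulary, model dim, senses per word, sequence length

sense_emb = nn.Embedding(vocab_size, k * d)       # k non-contextual sense vectors per word
ctx_emb   = nn.Embedding(vocab_size, d)           # input to the contextualization stand-in
to_qk     = nn.Linear(d, 2 * k * d)               # attention-style stand-in for the Transformer
out_proj  = nn.Linear(d, vocab_size, bias=False)  # linear projection onto the output space

x = torch.randint(0, vocab_size, (1, n))          # one toy token sequence

# Sense vectors C(x_j)_l, shape (batch, n, k, d): they depend only on word identity.
C = sense_emb(x).reshape(1, n, k, d)

# Context-dependent, non-negative weights alpha: one attention-style map per sense,
# where the softmax over positions keeps the linear combination non-negative.
q, key = to_qk(ctx_emb(x)).split(k * d, dim=-1)
q   = q.reshape(1, n, k, d).transpose(1, 2)       # (batch, k, n, d)
key = key.reshape(1, n, k, d).transpose(1, 2)
alpha = F.softmax(q @ key.transpose(-1, -2) / math.sqrt(d), dim=-1)  # (batch, k, n, n)

# Word i's representation: o_i = sum over senses l and positions j of alpha[l, i, j] * C(x_j)_l.
o = torch.einsum('bkij,bjkd->bid', alpha, C)      # (batch, n, d)
logits = out_proj(o)                              # next-word logits

# Interpreting a sense: project one sense vector through the output matrix to see
# which words it promotes (non-contextual and linear, as in the abstract).
word_id, sense_l = 42, 0                          # hypothetical word and sense indices
sense_vecs = sense_emb(torch.tensor([word_id])).reshape(k, d)
print(out_proj(sense_vecs[sense_l]).topk(5).indices)

# Intervention sketch: globally suppress one sense by zeroing it in the
# non-contextual sense table, in the spirit of the debiasing example above.
with torch.no_grad():
    sense_emb.weight.reshape(vocab_size, k, d)[word_id, sense_l] *= 0.0
```

In the paper the weights come from a full Transformer over the sequence; the single attention-style layer here only shows where the non-negative weights and the non-contextual sense table enter the computation.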
Related papers
- Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic
Interpretability: A Case Study on Othello-GPT [59.245414547751636]
We propose a circuit discovery framework as an alternative to activation patching.
Our framework suffers less from out-of-distribution issues and is more efficient in terms of complexity.
We dig into a small transformer trained on the synthetic task of Othello and find a number of human-understandable fine-grained circuits inside it.
arXiv Detail & Related papers (2024-02-19T15:04:53Z) - Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z) - Character-level Chinese Backpack Language Models [19.329707412615047]
We train, evaluate, interpret, and control Backpack language models in character-tokenized Chinese.
We find that our (134M parameter) Chinese Backpack language model performs comparably to a (104M parameter) Transformer.
We find that complex multi-character meanings are often formed by using the same per-character sense weights consistently across context.
arXiv Detail & Related papers (2023-10-19T13:54:57Z) - Analyzing Transformer Dynamics as Movement through Embedding Space [0.0]
This paper explores how Transformer-based language models exhibit intelligent behaviors such as understanding natural language.
We propose framing Transformer dynamics as movement through embedding space.
arXiv Detail & Related papers (2023-08-21T17:21:23Z) - Adapting Language Models to Compress Contexts [71.98287002918941]
Transformer-based language models (LMs) are powerful and widely-applicable tools, but their usefulness is constrained by a finite context window.
We propose to adapt pre-trained LMs into AutoCompressors, which are capable of compressing long contexts into compact summary vectors.
We fine-tune OPT and Llama-2 models on sequences of up to 30,720 tokens and show that AutoCompressors can utilize long contexts to improve perplexity.
arXiv Detail & Related papers (2023-05-24T06:42:44Z) - Grounding and Distinguishing Conceptual Vocabulary Through Similarity
Learning in Embodied Simulations [4.507860128918788]
We present a novel method for using agent experiences gathered through an embodied simulation to ground contextualized word vectors to object representations.
We use similarity learning to make comparisons between different object types based on their properties when interacted with, and to extract common features pertaining to the objects' behavior.
arXiv Detail & Related papers (2023-05-23T04:22:00Z) - What learning algorithm is in-context learning? Investigations with
linear models [87.91612418166464]
We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly.
We show that trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression.
We present preliminary evidence that in-context learners share algorithmic features with these predictors.
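As a point of reference for the predictors mentioned above, here is a small PyTorch sketch that sets up a toy in-context linear regression problem and computes the ridge and exact least-squares predictions that trained in-context learners are reported to match; the dimensions, noise level, and regularization strength are hypothetical, and no trained in-context learner is included.

```python
import torch

# Toy in-context regression problem: the "context" is (x_i, y_i) pairs and the
# query is x_query. Sizes and noise are hypothetical choices for illustration.
d, n_context = 8, 16
w_true = torch.randn(d)
X = torch.randn(n_context, d)
y = X @ w_true + 0.1 * torch.randn(n_context)
x_query = torch.randn(d)

# Exact least-squares predictor.
w_ls = torch.linalg.lstsq(X, y.unsqueeze(-1)).solution.squeeze(-1)

# Ridge regression predictor with regularization strength lam.
lam = 0.1
w_ridge = torch.linalg.solve(X.T @ X + lam * torch.eye(d), X.T @ y)

print("least-squares prediction:", (x_query @ w_ls).item())
print("ridge prediction:        ", (x_query @ w_ridge).item())
```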
arXiv Detail & Related papers (2022-11-28T18:59:51Z) - Extracting Latent Steering Vectors from Pretrained Language Models [14.77762401765532]
We show that latent vectors can be extracted directly from language model decoders without fine-tuning.
Experiments show that there exist steering vectors, which, when added to the hidden states of the language model, generate a target sentence nearly perfectly.
We find that distances between steering vectors reflect sentence similarity when evaluated on a textual similarity benchmark.
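For intuition, here is a toy PyTorch sketch of the steering-vector idea described above: a frozen toy decoder, a learned vector added to its hidden states, and that vector optimized so the decoder reproduces a target sentence. The tiny GRU decoder, token ids, and hyperparameters are hypothetical stand-ins, not the paper's models or method details.

```python
import torch
import torch.nn as nn

# Toy decoder LM: embedding -> GRU -> vocab projection (a stand-in, not the paper's model).
vocab_size, emb_dim, hid_dim = 100, 32, 64
emb = nn.Embedding(vocab_size, emb_dim)
gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
out = nn.Linear(hid_dim, vocab_size)

target = torch.tensor([[5, 17, 42, 8, 3]])           # hypothetical target token ids

# Steering vector z, added to every hidden state; only z is optimized,
# the language model parameters stay frozen (they are not in the optimizer).
z = torch.zeros(1, 1, hid_dim, requires_grad=True)
opt = torch.optim.Adam([z], lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    opt.zero_grad()
    h, _ = gru(emb(target[:, :-1]))                  # hidden states for the prefix
    logits = out(h + z)                              # steer by adding z to hidden states
    loss = loss_fn(logits.reshape(-1, vocab_size), target[:, 1:].reshape(-1))
    loss.backward()
    opt.step()
```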
arXiv Detail & Related papers (2022-05-10T19:04:37Z) - WOVe: Incorporating Word Order in GloVe Word Embeddings [0.0]
Representing a word as a vector makes it easier for machine learning algorithms to understand text and extract information from it.
Word vector representations have been used in many applications, such as word synonym detection, word analogy, syntactic parsing, and many others.
arXiv Detail & Related papers (2021-05-18T15:28:20Z) - Visually Grounded Compound PCFGs [65.04669567781634]
Exploiting visual groundings for language understanding has recently been drawing much attention.
We study visually grounded grammar induction and learn a constituency parser from both unlabeled text and its visual captions.
arXiv Detail & Related papers (2020-09-25T19:07:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.