Revision Transformers: Instructing Language Models to Change their
Values
- URL: http://arxiv.org/abs/2210.10332v3
- Date: Tue, 25 Jul 2023 13:02:49 GMT
- Title: Revision Transformers: Instructing Language Models to Change their
Values
- Authors: Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian
Kersting
- Abstract summary: Current transformer language models (LM) are large-scale models with billions of parameters.
We propose the Revision Transformer (RiT) to facilitate easy model updating.
The specific combination of a large-scale pre-trained LM that inherently but also diffusely encodes world knowledge with a clear-structured revision engine makes it possible to update the model's knowledge with little effort and the help of user interaction.
- Score: 21.645935518842744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current transformer language models (LM) are large-scale models with billions
of parameters. They have been shown to provide high performance on a variety
of tasks but are also prone to shortcut learning and bias. Addressing such
incorrect model behavior via parameter adjustments is very costly. This is
particularly problematic for updating dynamic concepts, such as moral values,
which vary culturally or interpersonally. In this work, we question the current
common practice of storing all information in the model parameters and propose
the Revision Transformer (RiT) to facilitate easy model updating. The specific
combination of a large-scale pre-trained LM that inherently but also diffusely
encodes world knowledge with a clear-structured revision engine makes it
possible to update the model's knowledge with little effort and the help of
user interaction. We exemplify RiT on a moral dataset and simulate user
feedback demonstrating strong performance in model revision even with small
data. This way, users can easily design a model regarding their preferences,
paving the way for more transparent AI models.
Related papers
- MUSCLE: A Model Update Strategy for Compatible LLM Evolution [29.032461144831053]
Large Language Models (LLMs) are frequently updated due to data or architecture changes to improve their performance.
Users often build a mental model of the functionality and capabilities of a particular machine learning model they are interacting with.
We propose a training strategy to minimize the number of inconsistencies in model updates.
arXiv Detail & Related papers (2024-07-12T17:12:48Z)
- Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models [6.809572275782338]
We develop a unified signal propagation theory and provide formulae that govern the moments of the forward and backward signal through the transformer model.
Our framework can be used to understand and mitigate vanishing/exploding gradients, rank collapse, and instability associated with high attention scores.
arXiv Detail & Related papers (2024-03-14T17:59:14Z)
- Learning to Grow Pretrained Models for Efficient Transformer Training [72.20676008625641]
We learn to grow pretrained transformers, where we learn to linearly map the parameters of the smaller model to initialize the larger model.
Experiments across both language and vision transformers demonstrate that our learned Linear Growth Operator (LiGO) can save up to 50% computational cost of training from scratch.
arXiv Detail & Related papers (2023-03-02T05:21:18Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Revealing Secrets From Pre-trained Models [2.0249686991196123]
Transfer-learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and fine-tuned models have significantly high similarities in weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by the black-box victim model.
arXiv Detail & Related papers (2022-07-19T20:19:03Z)
- Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC).
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
- Re-parameterizing Your Optimizers rather than Architectures [119.08740698936633]
We propose a novel paradigm of incorporating model-specific prior knowledge into optimizers and using them to train generic (simple) models.
As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters.
We focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with the recent well-designed models.
arXiv Detail & Related papers (2022-05-30T16:55:59Z)
- Visformer: The Vision-friendly Transformer [105.52122194322592]
We propose a new architecture named Visformer, which is abbreviated from the 'Vision-friendly Transformer'.
With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy.
arXiv Detail & Related papers (2021-04-26T13:13:03Z)
- Modifying Memories in Transformer Models [71.48657481835767]
We propose a new task of explicitly modifying specific factual knowledge in Transformer models.
This task is useful in many scenarios, such as updating stale knowledge, protecting privacy, and eliminating unintended biases stored in the models.
arXiv Detail & Related papers (2020-12-01T09:39:13Z)
- Lifting Interpretability-Performance Trade-off via Automated Feature Engineering [5.802346990263708]
Complex black-box predictive models may have high performance, but lack of interpretability causes problems.
We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models.
arXiv Detail & Related papers (2020-02-11T09:16:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.