Self-Adapting Language Models
- URL: http://arxiv.org/abs/2506.10943v1
- Date: Thu, 12 Jun 2025 17:48:13 GMT
- Title: Self-Adapting Language Models
- Authors: Adam Zweiger, Jyothish Pari, Han Guo, Ekin Akyürek, Yoon Kim, Pulkit Agrawal
- Abstract summary: Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation.
- Score: 44.50247531174026
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a self-edit: a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates. Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation. To train the model to produce effective self-edits, we use a reinforcement learning loop with the downstream performance of the updated model as the reward signal. Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model's own generation to control its adaptation process. Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation. Our website and code are available at https://jyopari.github.io/posts/seal.
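The abstract describes a simple outer loop: the model writes a self-edit, SFT applies it as a persistent weight update, and the updated model's downstream score serves as the RL reward for the generation policy. The sketch below illustrates that loop; `generate_self_edit`, `apply_sft`, and `evaluate` are hypothetical stand-ins rather than the authors' API, and the final comparison merely stands in for the actual policy-gradient update.

```python
# Minimal sketch of SEAL's outer loop (not the authors' code).
import copy
import random

def generate_self_edit(model, context):
    """Hypothetical: the model writes its own finetuning data and directives."""
    return {"data": [f"restated: {context}"], "lr": 1e-5, "epochs": 1}

def apply_sft(model, self_edit):
    """Hypothetical: supervised finetuning on the self-edit -> updated weights."""
    updated = copy.deepcopy(model)
    updated["num_updates"] += self_edit["epochs"]
    return updated

def evaluate(model, task):
    """Hypothetical: downstream accuracy of the (updated) model."""
    return random.random()  # stand-in for a real benchmark score

model = {"num_updates": 0}
for context, task in [("new fact A", "qa_A"), ("new fact B", "qa_B")]:
    edit = generate_self_edit(model, context)  # model proposes its own update
    candidate = apply_sft(model, edit)         # persistent weight update via SFT
    reward = evaluate(candidate, task)         # downstream performance = RL reward
    # In SEAL this reward trains the generation policy via RL; keeping the
    # better model here is only a stand-in for that update step.
    if reward > evaluate(model, task):
        model = candidate
```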
Related papers
- Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach [65.6966065843227]
Iterative Reweight-then-Optimize (IRO) is a framework that performs RL-style alignment of a frozen base model without touching its parameters. At test time, the value functions are used to guide the base model generation via a search-based optimization process. Notably, users can apply IRO to align a model on their own dataset, similar to OpenAI's reinforcement fine-tuning (RFT).
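As a rough illustration of value-guided test-time search (a best-of-N re-ranking variant of our choosing; the paper's actual search procedure may differ), the frozen model proposes candidates and a learned value function selects among them:

```python
# Sketch of value-guided decoding over a frozen base model.
from typing import Callable, List
import random

def guided_generate(
    sample: Callable[[str], str],           # frozen base model: prompt -> candidate
    value_fn: Callable[[str, str], float],  # learned value: (prompt, text) -> score
    prompt: str,
    n_candidates: int = 8,
) -> str:
    """The frozen model proposes, the value function picks; alignment happens
    at test time without updating base-model parameters."""
    candidates: List[str] = [sample(prompt) for _ in range(n_candidates)]
    return max(candidates, key=lambda c: value_fn(prompt, c))

# Toy usage with stand-in functions:
print(guided_generate(
    sample=lambda p: p + " answer-" + str(random.randint(0, 99)),
    value_fn=lambda p, c: -abs(len(c) - 20),  # hypothetical scorer
    prompt="Q:",
))
```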
arXiv Detail & Related papers (2025-06-21T21:49:02Z)
- ChameleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters [3.729242965449096]
This paper introduces ChameleonLLM, a novel framework that enables inference-time adaptation of large language models. Unlike traditional fine-tuning approaches such as Low-Rank Adaptation (LoRA), our method dynamically generates adaptive modifications to the decoder weights. By intelligently grouping similar inputs and computing context-aware low-rank updates via a hyper-network, ChameleonLLM achieves significant performance gains.
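A minimal sketch of the idea, under our own assumptions (a batch centroid stands in for the inference-time cluster, and a small hyper-network emits a rank-r weight update; names are illustrative, not the paper's API):

```python
# Sketch of batch-aware low-rank adaptation via a hyper-network.
import torch
import torch.nn as nn

d_model, rank = 64, 4

class LowRankHyperNet(nn.Module):
    """Maps a cluster centroid to a rank-r update (A @ B) for one weight."""
    def __init__(self):
        super().__init__()
        self.to_a = nn.Linear(d_model, d_model * rank)
        self.to_b = nn.Linear(d_model, rank * d_model)

    def forward(self, centroid: torch.Tensor) -> torch.Tensor:
        a = self.to_a(centroid).view(d_model, rank)
        b = self.to_b(centroid).view(rank, d_model)
        return a @ b  # delta W with rank <= r

base_weight = torch.randn(d_model, d_model)   # frozen decoder weight
hyper = LowRankHyperNet()

x = torch.randn(16, d_model)                  # a batch of input embeddings
centroid = x.mean(dim=0)                      # stand-in for the batch's cluster centroid
adapted_weight = base_weight + hyper(centroid)  # context-aware weights at inference
y = x @ adapted_weight.T
print(y.shape)  # torch.Size([16, 64])
```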
arXiv Detail & Related papers (2025-02-06T18:57:06Z)
- Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains [114.76612918465948]
Large language models (LLMs) have achieved remarkable performance in recent years but are fundamentally limited by the underlying training data. We propose a complementary approach towards self-improvement where finetuning is applied to a multiagent society of language models.
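A toy sketch of the multiagent self-improvement loop (hypothetical helpers; the paper's interaction and finetuning protocol is richer): each agent is finetuned only on data it generated itself, so specializations stay diverse across rounds.

```python
# Sketch of finetuning a society of agents on their own generations.
import random

def generate(agent, prompt):
    """Hypothetical stand-in for an agent's reasoning chain."""
    return f"{agent['name']} says: {prompt} -> {random.randint(0, 9)}"

def finetune(agent, transcripts):
    """Hypothetical stand-in for a gradient update on the agent's own data."""
    agent["history"].extend(transcripts)
    return agent

agents = [{"name": f"agent{i}", "history": []} for i in range(3)]
for _round in range(2):
    for agent in agents:
        own_outputs = [generate(agent, p) for p in ("task1", "task2")]
        # Each model trains on its *own* chains, not a shared pool,
        # which keeps the society's behaviors diverse across rounds.
        finetune(agent, own_outputs)
print(agents[0]["history"][:2])
```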
arXiv Detail & Related papers (2025-01-10T04:35:46Z)
- How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario [72.02391485962127]
Speech Self-Supervised Learning (SSL) models achieve impressive performance on Automatic Speech Recognition (ASR). In low-resource language ASR, however, they encounter the domain mismatch problem between pre-trained and low-resource languages. We extend a conventional efficient fine-tuning scheme based on the adapter to handle these issues.
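The conventional adapter scheme the paper builds on is a small residual bottleneck inserted into a frozen encoder; only the adapter is trained on the unseen language. A generic PyTorch sketch (not the authors' exact architecture):

```python
# Generic bottleneck adapter for a frozen SSL speech encoder.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the pretrained representation.
        return h + self.up(self.act(self.down(h)))

frozen_layer_out = torch.randn(2, 50, 768)  # (batch, frames, dim) from the encoder
adapter = Adapter()
adapted = adapter(frozen_layer_out)
trainable = sum(p.numel() for p in adapter.parameters())
print(adapted.shape, f"{trainable} trainable params")  # tiny vs. the full model
```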
arXiv Detail & Related papers (2024-11-27T10:51:00Z)
- Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation [13.120801609024147]
Retrieval-augmented generation (RAG) has been shown to enhance the factuality of large language model (LLM) outputs. However, RAG inputs are more complex than most datasets used for training NLI models. We introduce Automatic Generative Domain Adaptation (Auto-GDA) to enable unsupervised domain adaptation.
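In outline (our simplification: the paper's discrete optimization over synthetic samples is reduced to a confidence threshold, and the helpers are stand-ins), a teacher model weakly labels LLM-augmented claim/evidence pairs, and only confident pairs are kept to finetune a small NLI verifier:

```python
# Sketch of teacher-filtered synthetic data generation for domain adaptation.
import random

def augment(evidence: str, claim: str) -> str:
    return claim + " (paraphrased)"  # stand-in for LLM-based augmentation

def teacher_entailment_prob(evidence: str, claim: str) -> float:
    return random.random()           # stand-in for a strong teacher model

seed_pairs = [("doc text A", "claim A"), ("doc text B", "claim B")]
synthetic, keep_threshold = [], 0.8
for evidence, claim in seed_pairs:
    for _ in range(4):  # generate several candidates per seed
        cand = augment(evidence, claim)
        p = teacher_entailment_prob(evidence, cand)  # weak label + quality score
        if p > keep_threshold or p < 1 - keep_threshold:  # keep confident labels only
            synthetic.append((evidence, cand, int(p > 0.5)))
# A small NLI model finetuned on `synthetic` can then verify grounding cheaply.
print(len(synthetic), "weakly labeled pairs")
```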
arXiv Detail & Related papers (2024-10-04T14:21:27Z)
- Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion [10.322832012497722]
Class-incremental learning is a challenging problem, where the goal is to train a model that can classify data from an increasing number of classes over time.
Vision-language pre-trained models such as CLIP demonstrate good generalization ability.
However, further adaptation to downstream tasks by simply fine-tuning the model leads to severe forgetting.
Most existing works with pre-trained models assume that the forgetting of old classes is uniform when the model acquires new knowledge.
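One way to act on that non-uniformity is parameter fusion; the sketch below is our own reading (a per-parameter convex combination of old and new weights, down-weighting parameters that drifted most), not necessarily the paper's exact rule:

```python
# Sketch of per-parameter fusion of old and newly finetuned weights.
import torch

def fuse(old: torch.Tensor, new: torch.Tensor, alpha: torch.Tensor) -> torch.Tensor:
    """alpha near 1 trusts the new task; near 0 protects old-class knowledge."""
    return alpha * new + (1.0 - alpha) * old

old_w = torch.randn(512, 512)                  # weights before the new classes
new_w = old_w + 0.1 * torch.randn(512, 512)    # weights after finetuning
# Non-uniform forgetting: parameters that moved a lot get smaller alpha,
# reflecting the observation that forgetting is not uniform across the model.
drift = (new_w - old_w).abs()
alpha = 1.0 - drift / (drift.max() + 1e-8)
fused = fuse(old_w, new_w, alpha)
print(fused.shape)
```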
arXiv Detail & Related papers (2024-07-19T09:20:33Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
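A compact sketch of MoE adapters on frozen features (our assumptions: soft routing over small bottleneck experts plus a residual connection; the paper additionally grows experts per task and uses its auto-selector to fall back to zero-shot CLIP):

```python
# Sketch of Mixture-of-Experts adapters over a frozen backbone.
import torch
import torch.nn as nn

class MoEAdapters(nn.Module):
    def __init__(self, d_model: int = 512, n_experts: int = 4, bottleneck: int = 32):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, bottleneck), nn.ReLU(),
                          nn.Linear(bottleneck, d_model))
            for _ in range(n_experts)
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        weights = self.router(h).softmax(dim=-1)                    # (batch, n_experts)
        expert_out = torch.stack([e(h) for e in self.experts], -1)  # (batch, d, n_experts)
        return h + (expert_out * weights.unsqueeze(1)).sum(dim=-1)  # residual mix

h = torch.randn(8, 512)        # stand-in for frozen CLIP features
print(MoEAdapters()(h).shape)  # torch.Size([8, 512])
# New experts can be appended per task; the router (and, in the paper, a
# distribution-aware selector) decides when to defer to zero-shot CLIP.
```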
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
- SELF: Self-Evolution with Language Feedback [68.6673019284853]
'SELF' (Self-Evolution with Language Feedback) is a novel approach to advance large language models.
It enables LLMs to self-improve through self-reflection, akin to human learning processes.
Our experiments in mathematics and general tasks demonstrate that SELF can enhance the capabilities of LLMs without human intervention.
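The self-improvement cycle reads as generate, critique in natural language, refine, then finetune on the refined outputs. A toy sketch with hypothetical helpers (the paper's prompting and training details are richer):

```python
# Sketch of a SELF-style self-evolution loop.
def generate(model, prompt):
    """Initial attempt."""
    return f"draft answer to {prompt}"

def self_feedback(model, prompt, answer):
    """The model critiques its own output in natural language."""
    return "be more precise"

def refine(model, prompt, answer, feedback):
    return answer + " [revised: " + feedback + "]"

def finetune(model, pairs):
    """Stand-in for SFT on self-refined data."""
    model["data"].extend(pairs)
    return model

model = {"data": []}
for prompt in ["math problem 1", "math problem 2"]:
    draft = generate(model, prompt)
    feedback = self_feedback(model, prompt, draft)     # language feedback
    improved = refine(model, prompt, draft, feedback)  # self-refinement
    model = finetune(model, [(prompt, improved)])      # evolve without humans
print(len(model["data"]))
```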
arXiv Detail & Related papers (2023-10-01T00:52:24Z)
- Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition [55.362258027878966]
We present momentum pseudo-labeling (MPL) as a simple yet effective strategy for semi-supervised speech recognition.
MPL consists of a pair of online and offline models that interact and learn from each other, inspired by the mean teacher method.
The experimental results demonstrate that MPL effectively improves over the base model and is scalable to different semi-supervised scenarios.
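The core of the mean-teacher-style interaction is an exponential moving average: the offline model tracks the online model's weights and supplies pseudo-labels for unlabeled speech. A minimal PyTorch sketch (tiny linear models and a cross-entropy loss stand in for full end-to-end ASR networks and their training objective):

```python
# Sketch of the momentum (EMA) update at the heart of MPL.
import torch
import torch.nn as nn

online = nn.Linear(80, 30)                    # stand-ins for full ASR models
offline = nn.Linear(80, 30)
offline.load_state_dict(online.state_dict())  # start from the same weights

@torch.no_grad()
def momentum_update(online: nn.Module, offline: nn.Module, m: float = 0.999):
    # EMA: offline <- m * offline + (1 - m) * online
    for p_on, p_off in zip(online.parameters(), offline.parameters()):
        p_off.mul_(m).add_(p_on, alpha=1.0 - m)

unlabeled = torch.randn(4, 80)
with torch.no_grad():
    pseudo = offline(unlabeled).argmax(dim=-1)  # offline model labels the batch
loss = nn.functional.cross_entropy(online(unlabeled), pseudo)
loss.backward()                  # online model trains on the pseudo-labels
momentum_update(online, offline) # offline model follows via momentum
print(loss.item())
```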
arXiv Detail & Related papers (2021-06-16T16:24:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.