Related papers: Words & Weights: Streamlining Multi-Turn Interactions via Co-Adaptation

URL: http://arxiv.org/abs/2603.01375v1
Date: Mon, 02 Mar 2026 02:16:20 GMT
Title: Words & Weights: Streamlining Multi-Turn Interactions via Co-Adaptation
Authors: Chenxing Wei, Hong Wang, Ying He, Zhongxiang Dai, Bo Jiang, F. Richard Yu, Yao Shu
Abstract summary: Test-time policy adaptation for multi-turn interactions (T2PAM) is essential for aligning Large Language Models (LLMs) with dynamic user needs during inference time. We propose ROSA2, a framework that reformulates interaction as a joint optimization problem over the heterogeneous space of Words and Weights.
Score: 55.938648534942665
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Test-time policy adaptation for multi-turn interactions (T2PAM) is essential for aligning Large Language Models (LLMs) with dynamic user needs during inference time. However, existing paradigms commonly treat test-time adaptation as a single-axis problem, either purely refining instructions (Prompt Engineering) or only adjusting weights (Test-Time Training), ignoring that interaction failures stem from a coupled mix of ambiguity and incapacity. We argue that these two optimization paths are not merely additive but synergistic: semantic clarity acts as a pre-conditioner for effective parameter updates. To this end, we propose ROSA2, a framework that reformulates interaction as a joint optimization problem over the heterogeneous space of Words and Weights. By mathematically decomposing the error signal, ROSA2 utilizes textual gradients to rectify intent ambiguity and parameter updates to bridge capability gaps. Theoretically, we prove that this co-adaptation strictly reduces the required parameter shift for convergence. Empirically, ROSA2 outperforms state-of-the-art baselines by 30% on MATH while reducing interaction turns by 40%, demonstrating that refining the context unlocks the true potential of parameter updates.

Related papers

Weight Updates as Activation Shifts: A Principled Framework for Steering [54.70188910511715]
Activation steering promises to be an extremely parameter-efficient form of adaptation, but its effectiveness depends on critical design choices. We establish a first-order equivalence between activation-space interventions and weight-space updates, deriving the conditions under which activation steering can replicate fine-tuning behavior. This equivalence yields a principled framework for steering design and identifies the post-block output as a theoretically-backed and highly expressive intervention site.
arXiv Detail & Related papers (2026-02-28T02:50:04Z)
Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting [92.57796055887995]
We introduce ECHO, a prompting framework that adapts hindsight experience replay from reinforcement learning for language model agents. ECHO generates optimized trajectories for alternative goals that could have been achieved during failed attempts. We evaluate ECHO on stateful versions of XMiniGrid, a text-based navigation and planning benchmark, and PeopleJoinQA, a collaborative information-gathering enterprise simulation.
arXiv Detail & Related papers (2025-10-11T18:11:09Z)
Test-Time Policy Adaptation for Enhanced Multi-Turn Interactions with LLMs [20.892283201423048]
We introduce Test-Time Policy Adaptation for Multi-Turn Interactions (T2PAM). We first propose a new paradigm, T2PAM, which utilizes user feedback as a reward signal to estimate a latent optimal policy aligned with user preferences. We then introduce Optimum-Referenced One-Step Adaptation (ROSA), a lightweight algorithm that operationalizes T2PAM.
arXiv Detail & Related papers (2025-09-27T07:46:15Z)
EmbedGrad: Gradient-Based Prompt Optimization in Embedding Space for Large Language Models [45.78656491861157]
We propose EmbedGrad, a framework that optimizes text prompt embeddings through gradient-based refinement. Our approach uniquely decouples training from deployment. Comprehensive evaluations across mathematical reasoning, sentiment analysis, and causal judgment tasks demonstrate EmbedGrad's effectiveness.
arXiv Detail & Related papers (2025-08-05T15:03:10Z)
Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation [14.086036250269613]
Adapting Vision-Language Models to new domains with few labeled samples is a challenge due to severe overfitting and computational constraints. In this paper, we propose a novel Sparse Optimization (SO) framework that dynamically adjusts very few parameters. Experiments on 11 diverse datasets show that SO achieves state-of-the-art few-shot adaptation performance while reducing memory overhead.
arXiv Detail & Related papers (2025-04-16T19:10:34Z)
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios. In the early route, intermediate outputs are consolidated via an anti-redundancy operation. In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision [52.80792724919329]
We introduce a novel framework named Adapter-X to improve fine-tuning in 2D image and 3D point cloud modalities. It is the first to outperform full fine-tuning in both 2D image and 3D point cloud modalities with significantly fewer parameters, i.e., only 0.20% and 1.88% of original trainable parameters for 2D and 3D classification tasks.
arXiv Detail & Related papers (2024-06-05T08:26:44Z)
Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition [16.647413058592125]
Multi-level implicit discourse relation recognition (MIDRR) aims at identifying hierarchical discourse relations among arguments. In this paper, we propose a prompt-based Efficient Multi-level IDRR (PEMI) framework to solve the above problems.
arXiv Detail & Related papers (2024-02-23T03:53:39Z)
Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose A graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges. Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning. A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
RatE: Relation-Adaptive Translating Embedding for Knowledge Graph Completion [51.64061146389754]
We propose a relation-adaptive translation function built upon a novel weighted product in complex space. We then present our Relation-adaptive translating Embedding (RatE) approach to score each graph triple.
arXiv Detail & Related papers (2020-10-10T01:30:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.