Experience Scaling: Post-Deployment Evolution For Large Language Models
- URL: http://arxiv.org/abs/2509.18771v1
- Date: Tue, 23 Sep 2025 08:04:58 GMT
- Title: Experience Scaling: Post-Deployment Evolution For Large Language Models
- Authors: Xingkun Yin, Kaibin Huang, Dong In Kim, Hongyang Du,
- Abstract summary: We propose experience scaling, a framework for continuous post-deployment evolution for large language models (LLMs)<n>We validate the framework in simulated real-world scenarios involving generalization to previously unseen but related tasks, repetitive queries, and over-saturated knowledge stores.<n>Results demonstrate that structured post-deployment learning can extend LLM capabilities beyond the limits of static human-generated data.
- Score: 44.48142891798125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scaling model size, training data, and compute power have driven advances in large language models (LLMs), but these approaches are reaching saturation as human-generated text is exhausted and further gains diminish. We propose experience scaling, a framework for continuous post-deployment evolution for LLMs through autonomous interaction with the environment and collaborative sharing of accumulated experience. The framework captures raw interactions, distills them into compact, reusable knowledge, and periodically refines stored content to preserve relevance and efficiency. We validate the framework in simulated real-world scenarios involving generalization to previously unseen but related tasks, repetitive queries, and over-saturated knowledge stores. Across all settings, experience scaling improves accuracy, sustains performance over time, and maintains gains when applied to novel situations. These results demonstrate that structured post-deployment learning can extend LLM capabilities beyond the limits of static human-generated data, offering a scalable path for continued intelligence progress.
Related papers
- Scaling In-Context Online Learning Capability of LLMs via Cross-Episode Meta-RL [28.82521610729606]
Large language models (LLMs) achieve strong performance when all task-relevant information is available upfront.<n>We introduce ORBIT, a multi-task, multi-episode meta-reinforcement learning framework that trains LLMs to learn from interaction in context.<n>After meta-training, a relatively small open-source model (Qwen3-14B) demonstrates substantially improved in-context online learning on entirely unseen environments.
arXiv Detail & Related papers (2026-02-03T23:53:05Z) - DACP: Domain-Adaptive Continual Pre-Training of Large Language Models for Phone Conversation Summarization [10.083326281775939]
Large language models (LLMs) have achieved impressive performance in text summarization.<n>Fine-tuning can improve summarization quality, but it typically relies on costly and scarce high-quality labeled data.<n>We explore continual pre-training as a scalable, self-supervised approach to adapt LLMs for downstream summarization tasks.
arXiv Detail & Related papers (2025-10-07T12:26:19Z) - TRAIL: Joint Inference and Refinement of Knowledge Graphs with Large Language Models [5.678291291711662]
TRAIL is a novel, unified framework for Thinking, Reasoning, And Incremental Learning.<n>It couples joint inference and dynamic KG refinement with large language models.<n>Extensive experiments on multiple benchmarks demonstrate that TRAIL outperforms existing KG-augmented and retrieval-augmented LLM baselines by 3% to 13%.
arXiv Detail & Related papers (2025-08-06T14:25:05Z) - Mind the Gap: Preserving and Compensating for the Modality Gap in CLIP-Based Continual Learning [11.50324946279326]
Contrastive Language-Image Pre-trained model (CLIP) exhibiting strong capabilities across various downstream tasks.<n>We analyze the variations in the modality gap during the fine-tuning of vision-language pre-trained models.<n>We propose a simple yet effective method, MG-CLIP, that improves CLIP's performance in class-incremental learning.
arXiv Detail & Related papers (2025-07-12T02:28:42Z) - Improving LLM Agent Planning with In-Context Learning via Atomic Fact Augmentation and Lookahead Search [48.348209577994865]
Large Language Models (LLMs) are increasingly capable but often require significant guidance or extensive interaction history to perform effectively in complex, interactive environments.<n>We introduce a novel LLM agent framework that enhances planning capabilities through in-context learning.<n>Our agent learns to extract task-critical atomic facts'' from its interaction trajectories.
arXiv Detail & Related papers (2025-06-10T18:36:31Z) - Scalable In-Context Q-Learning [68.9917436397079]
We propose textbfScalable textbfIn-textbfContext textbfQ-textbfLearning (textbfSICQL) to steer in-context reinforcement learning.<n>textbfSICQL harnesses dynamic programming and world modeling to steer ICRL toward efficient reward and task generalization.
arXiv Detail & Related papers (2025-06-02T04:21:56Z) - MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering [57.156093929365255]
Gym-style framework for systematically reinforcement learning, evaluating, and improving autonomous large language model (LLM) agents.<n>MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios.<n>Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-05-12T17:35:43Z) - Counterfactual experience augmented off-policy reinforcement learning [9.77739016575541]
CEA builds efficient inference model and enhances representativeness of learning data.<n>Uses variational autoencoders to model the dynamic patterns of state transitions.<n>Builds a complete counterfactual experience to alleviate the out-of-distribution problem of the learning data.
arXiv Detail & Related papers (2025-03-18T02:32:50Z) - LLM Post-Training: A Deep Dive into Reasoning Large Language Models [131.10969986056]
Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications.<n>Post-training methods enable LLMs to refine their knowledge, improve reasoning, enhance factual accuracy, and align more effectively with user intents and ethical considerations.
arXiv Detail & Related papers (2025-02-28T18:59:54Z) - Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.<n>This study demonstrates that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z) - CTP: Towards Vision-Language Continual Pretraining via Compatible
Momentum Contrast and Topology Preservation [128.00940554196976]
Vision-Language Continual Pretraining (VLCP) has shown impressive results on diverse downstream tasks by offline training on large-scale datasets.
To support the study of Vision-Language Continual Pretraining (VLCP), we first contribute a comprehensive and unified benchmark dataset P9D.
The data from each industry as an independent task supports continual learning and conforms to the real-world long-tail nature to simulate pretraining on web data.
arXiv Detail & Related papers (2023-08-14T13:53:18Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.