Related papers: Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning

URL: http://arxiv.org/abs/2601.03641v2
Date: Thu, 08 Jan 2026 08:36:44 GMT
Title: Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning
Authors: Zheng Wu, Xingyu Lou, Xinbei Ma, Yansi Li, Weiwen Liu, Weinan Zhang, Jun Wang, Zhuosheng Zhang
Abstract summary: Enabling Large Language Model (LLM)-based agents to continually learn new tasks without catastrophic forgetting remains a critical challenge. Agent-Dice is a parameter fusion framework based on directional consensus evaluation. Experiments on GUI agent and tool-use agent domains demonstrate that Agent-Dice exhibits outstanding continual learning performance.
Score: 41.461840578204956
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Model (LLM)-based agents significantly extend the utility of LLMs by interacting with dynamic environments. However, enabling agents to continually learn new tasks without catastrophic forgetting remains a critical challenge, known as the stability-plasticity dilemma. In this work, we argue that this dilemma fundamentally arises from the failure to explicitly distinguish between common knowledge shared across tasks and conflicting knowledge introduced by task-specific interference. To address this, we propose Agent-Dice, a parameter fusion framework based on directional consensus evaluation. Concretely, Agent-Dice disentangles knowledge updates through a two-stage process: geometric consensus filtering to prune conflicting gradients, and curvature-based importance weighting to amplify shared semantics. We provide a rigorous theoretical analysis that establishes the validity of the proposed fusion scheme and offers insight into the origins of the stability-plasticity dilemma. Extensive experiments on GUI agents and tool-use agent domains demonstrate that Agent-Dice exhibits outstanding continual learning performance with minimal computational overhead and parameter updates. The codes are available at https://github.com/Wuzheng02/Agent-Dice.
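The two-stage process named in the abstract (consensus filtering, then importance weighting) can be illustrated with a small sketch. This is not the authors' implementation, which is available in their repository. It assumes a simple sign-agreement rule for the consensus filter and a caller-supplied non-negative score (for example a squared gradient) as the curvature proxy. The function name `fuse_updates` and both input layouts are hypothetical.

```python
from typing import List, Sequence


def fuse_updates(
    deltas: Sequence[Sequence[float]],
    importances: Sequence[Sequence[float]],
    eps: float = 1e-12,
) -> List[float]:
    """Fuse per-task parameter updates coordinate by coordinate.

    deltas[t][i]      : update to parameter i proposed by task t (hypothetical input)
    importances[t][i] : non-negative curvature proxy for that update (assumption)
    """
    n_params = len(deltas[0])
    fused: List[float] = []
    for i in range(n_params):
        # Stage 1 (assumed form): the consensus direction is the sign of the summed update.
        total = sum(d[i] for d in deltas)
        sign = (total > 0) - (total < 0)
        # Keep only the task updates that point in the consensus direction.
        kept = [(d[i], w[i]) for d, w in zip(deltas, importances) if d[i] * sign > 0]
        # Stage 2 (assumed form): importance-weighted average of the surviving updates.
        weight_sum = sum(w for _, w in kept)
        if weight_sum <= eps:
            fused.append(0.0)  # no agreement or no importance mass: leave parameter unchanged
        else:
            fused.append(sum(v * w for v, w in kept) / weight_sum)
    return fused


# Two tasks, three parameters: agreement, conflict, and exact cancellation.
deltas = [[0.2, -0.1, 0.3], [0.4, 0.3, -0.3]]
importances = [[1.0, 1.0, 1.0], [3.0, 1.0, 1.0]]
fused = fuse_updates(deltas, importances)
```

In this toy input the first parameter receives a weighted average of both updates, the second keeps only the update that matches the majority direction, and the third is left at zero because the two updates cancel. The actual filtering criterion and weighting used by Agent-Dice may differ from these assumptions.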

Related papers

OMG-Agent: Toward Robust Missing Modality Generation with Decoupled Coarse-to-Fine Agentic Workflows [9.617220633655716]
We present Omni-Modality Generation Agent (OMG-Agent).
arXiv Detail & Related papers (2026-02-04T02:25:40Z)
Self-Consolidation for Self-Evolving Agents [51.94826934403236]
Large language model (LLM) agents operate as static systems, lacking the ability to evolve through lifelong interaction. We propose a novel self-evolving framework for LLM agents that introduces a complementary evolution mechanism.
arXiv Detail & Related papers (2026-02-02T11:16:07Z)
AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts [78.33143446024485]
We introduce AgentLongBench, which evaluates agents through simulated environment rollouts based on Lateral Thinking Puzzles. This framework generates rigorous interaction trajectories across knowledge-intensive and knowledge-free scenarios.
arXiv Detail & Related papers (2026-01-28T16:05:44Z)
Agent Drift: Quantifying Behavioral Degradation in Multi-Agent LLM Systems Over Extended Interactions [0.0]
Agent drift is the progressive degradation of agent behavior, decision quality, and inter-agent coherence over extended interaction sequences. We introduce the Agent Stability Index (ASI), a novel composite metric for quantifying drift across twelve dimensions. We show how unchecked agent drift can lead to substantial reductions in task completion accuracy and increased human intervention requirements.
arXiv Detail & Related papers (2026-01-07T18:37:26Z)
Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection [59.04089915447622]
ForenAgent is an interactive IFD framework that enables MLLMs to autonomously generate, execute, and refine Python-based low-level tools around the detection objective. Inspired by human reasoning, we design a dynamic reasoning loop comprising global perception, local focusing, iterative probing, and holistic adjudication. Experiments show that ForenAgent exhibits emergent tool-use competence and reflective reasoning on challenging IFD tasks.
arXiv Detail & Related papers (2025-12-18T08:38:44Z)
The Social Laboratory: A Psychometric Framework for Multi-Agent LLM Evaluation [0.16921396880325779]
We introduce a novel evaluation framework that uses multi-agent debate as a controlled "social laboratory". We show that assigned personas induce stable, measurable psychometric profiles, particularly in cognitive effort. This work provides a blueprint for a new class of dynamic, psychometrically grounded evaluation protocols.
arXiv Detail & Related papers (2025-10-01T07:10:28Z)
MAGIC-MASK: Multi-Agent Guided Inter-Agent Collaboration with Mask-Based Explainability for Reinforcement Learning [0.0]
We propose a mathematically grounded framework, MAGIC-MASK, that extends perturbation-based explanation to Multi-Agent Reinforcement Learning. Our method integrates Proximal Policy Optimization, adaptive epsilon-greedy exploration, and lightweight inter-agent collaboration. This collaboration enables each agent to perform saliency-guided masking and share reward-based insights with peers, reducing the time required for critical state discovery.
arXiv Detail & Related papers (2025-09-30T20:53:28Z)
GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments [56.007498767771075]
GSM-Agent is a novel benchmark for evaluating agentic reasoning in complex environments. We analyze agentic reasoning patterns by clustering the environment's document embeddings into nodes and mapping each tool call to its nearest node. We propose a tool-augmented test-time scaling method that improves LLMs' agentic reasoning performance by adding tools that encourage models to revisit.
arXiv Detail & Related papers (2025-09-26T07:24:37Z)
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [103.32591749156416]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL). This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z)
A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining [18.607974352313832]
This paper introduces a framework, Emergence Analysis based on Multi-Agent Intention (EAMI). EAMI enables dynamic and interpretable emergence analysis. Experiments validate EAMI in a complex online-to-offline (O2O) service system.
arXiv Detail & Related papers (2025-07-21T16:26:49Z)
Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [58.96953392466609]
We take an in-depth look at the causal awareness of modern representations of agent interactions. We show that recent representations are already partially resilient to perturbations of non-causal agents. We introduce a metric learning approach that regularizes latent representations with causal annotations.
arXiv Detail & Related papers (2023-12-07T18:57:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.