PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution
- URL: http://arxiv.org/abs/2601.10657v2
- Date: Fri, 16 Jan 2026 22:31:40 GMT
- Title: PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution
- Authors: Minghao Yan, Bo Peng, Benjamin Coleman, Ziqi Chen, Zhouhang Xie, Shuo Chen, Zhankui He, Noveen Sachdeva, Isabella Ye, Weili Wang, Chi Wang, Ed H. Chi, Fernando Pereira, Wang-Cheng Kang, Derek Zhiyuan Cheng, Beidou Wang
- Abstract summary: PACEvolve is a framework designed to robustly govern the agent's context and search dynamics. We demonstrate that PACEvolve provides a systematic path to consistent, long-horizon self-improvement.
- Score: 64.15555230987222
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have emerged as powerful operators for evolutionary search, yet the design of efficient search scaffolds remains ad hoc. While promising, current LLM-in-the-loop systems lack a systematic approach to managing the evolutionary process. We identify three distinct failure modes: Context Pollution, where experiment history biases future candidate generation; Mode Collapse, where agents stagnate in local minima due to a poor exploration-exploitation balance; and Weak Collaboration, where rigid crossover strategies fail to leverage parallel search trajectories effectively. To address these challenges, we introduce Progress-Aware Consistent Evolution (PACEvolve), a framework designed to robustly govern the agent's context and search dynamics. PACEvolve combines hierarchical context management (HCM) with pruning to address context pollution; momentum-based backtracking (MBB) to escape local minima; and a self-adaptive sampling policy that unifies backtracking and crossover for dynamic search coordination (CE), allowing agents to balance internal refinement with cross-trajectory collaboration. We demonstrate that PACEvolve provides a systematic path to consistent, long-horizon self-improvement, achieving state-of-the-art results on LLM-SR and KernelBench, while discovering solutions that surpass the record on Modded NanoGPT.
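The momentum-based backtracking idea from the abstract can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's actual algorithm: the function names (`propose`, `evaluate`), the exponential-moving-average formulation of progress, and the backtrack-to-best-ancestor rule are all hypothetical stand-ins for whatever PACEvolve implements.

```python
import random


def momentum_backtracking_search(propose, evaluate, steps=50,
                                 beta=0.7, threshold=0.0):
    """Hypothetical sketch of momentum-based backtracking (MBB).

    `propose(parent)` returns a mutated candidate (`parent` may be None);
    `evaluate(c)` returns a scalar score, higher is better. A trajectory
    tracks an EMA of per-step score improvements; when that "momentum"
    stalls, the search backtracks to the best candidate seen so far
    instead of continuing from a stagnating lineage.
    """
    current = propose(None)
    best, best_score = current, evaluate(current)
    score, momentum = best_score, 0.0
    for _ in range(steps):
        child = propose(current)
        child_score = evaluate(child)
        delta = child_score - score
        # Exponential moving average of recent progress.
        momentum = beta * momentum + (1 - beta) * delta
        if child_score > best_score:
            best, best_score = child, child_score
        if momentum < threshold and delta <= 0:
            # Progress has stalled: backtrack to the best ancestor.
            current, score = best, best_score
            momentum = 0.0
        else:
            current, score = child, child_score
    return best, best_score
```

For example, with a toy objective `evaluate = lambda x: -abs(x - 3.0)` and a random-walk `propose`, the loop keeps descendants of the best candidate alive while the EMA gate prunes lineages that drift away. The same gating signal could, in principle, also drive the crossover-versus-backtrack choice that the abstract's self-adaptive sampling policy describes.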
Related papers
- AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization [61.535567824938205]
We introduce AdaEvolve, a framework that reformulates LLM-driven evolution as a hierarchical adaptive optimization problem. AdaEvolve consistently outperforms the open-ended baselines across 185 different open-ended optimization problems.
arXiv Detail & Related papers (2026-02-23T18:45:31Z) - K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model [57.440609834690385]
Existing approaches treat Large Language Models (LLMs) as rapid code generators within evolutionary loops. We propose Search via Co-Evolving World Model and build K-Search on this method. We evaluate K-Search on diverse, complex FlashInfer kernels, including GQA, MLA, and MoE kernels.
arXiv Detail & Related papers (2026-02-22T11:06:22Z) - OR-Agent: Bridging Evolutionary Search and Structured Research for Automated Algorithm Discovery [10.217363774023033]
OR-Agent is a multi-agent research framework designed for automated exploration in rich experimental environments. We introduce an evolutionary-systematic mechanism that unifies evolutionary selection of research starting points, comprehensive research plan generation, and coordinated exploration within a research tree. We conduct experiments across classical optimization benchmarks, including traveling salesman, capacitated vehicle routing, bin packing, orienteering, and multiple knapsack problems, as well as a simulation-based cooperative driving scenario.
arXiv Detail & Related papers (2026-02-14T13:32:03Z) - Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration [49.9937230730202]
We propose Search-R2, a novel Actor-Refiner collaboration framework that enhances reasoning through targeted intervention. Our approach decomposes the generation process into an Actor, which produces initial reasoning trajectories, and a Refiner, which intervenes to correct them. We show that Search-R2 consistently outperforms strong RAG and RL-based baselines across model scales.
arXiv Detail & Related papers (2026-02-03T15:32:09Z) - PathWise: Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs [16.59846708454225]
We propose a novel multi-agent reasoning framework, referred to as Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs (PathWise). PathWise formulates a sequential decision process over an entailment graph serving as a compact, stateful memory of the search trajectory. Experiments across diverse COPs show that PathWise converges faster to better solutions, generalizes across different LLM backbones, and scales to larger problem sizes.
arXiv Detail & Related papers (2026-01-28T12:34:50Z) - Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search [56.78490647843876]
Agentic search has emerged as a promising paradigm for complex information seeking by enabling Large Language Models (LLMs) to interleave reasoning with tool use. We propose M-ASK, a framework that explicitly decouples agentic search into two complementary roles: Search Behavior Agents, which plan and execute search actions, and Knowledge Management Agents, which aggregate, filter, and maintain a compact internal context.
arXiv Detail & Related papers (2026-01-08T08:13:27Z) - LoongFlow: Directed Evolutionary Search via a Cognitive Plan-Execute-Summarize Paradigm [8.050281821865978]
LoongFlow is a self-evolving agent framework that achieves state-of-the-art solution quality with significantly reduced computational costs. Unlike "blind" mutation operators, LoongFlow integrates Large Language Models into a cognitive "Plan-Execute-Summarize" (PES) paradigm. To sustain long-term architectural coherence, we incorporate a hybrid evolutionary memory system.
arXiv Detail & Related papers (2025-12-30T08:39:28Z) - IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction [107.49922328855025]
IterResearch is a novel iterative deep-research paradigm that reformulates long-horizon research as a Markov Decision Process. It achieves substantial improvements over existing open-source agents, with an average gain of +14.5pp across six benchmarks. It also serves as an effective prompting strategy, improving frontier models by up to 19.2pp over ReAct on long-horizon tasks.
arXiv Detail & Related papers (2025-11-10T17:30:08Z) - Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails [103.05296856071931]
We identify the Alignment Tipping Process (ATP), a critical post-deployment risk unique to self-evolving Large Language Model (LLM) agents. ATP arises when continual interaction drives agents to abandon alignment constraints established during training in favor of reinforced, self-interested strategies. Our experiments show that alignment benefits erode rapidly under self-evolution, with initially aligned models converging toward unaligned states.
arXiv Detail & Related papers (2025-10-06T14:48:39Z) - Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm [60.36837655498119]
We propose TRACE, a Trajectory-based validate-by-Reproduce Agent-benchmark Complexity Evolution framework. This framework takes an original task from an existing benchmark and encourages agents to evolve it into a new task with higher difficulty. Experiments on the GAIA benchmark demonstrate that TRACE consistently enhances task complexity while improving the reliability of correctness.
arXiv Detail & Related papers (2025-10-01T01:52:52Z) - Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics [0.0]
Collective action problems, which require aligning individual incentives with collective goals, are classic examples of Ill-Structured Problems (ISPs). We present ECHO-MIMIC, a computational framework that converts this global complexity into a tractable, Well-Structured Problem (WSP) for each agent. By coupling algorithmic discovery with tailored communication, ECHO-MIMIC transforms the cognitive burden of collective action into a simple set of agent-level instructions.
arXiv Detail & Related papers (2025-09-24T08:26:56Z) - PILOC: A Pheromone Inverse Guidance Mechanism and Local-Communication Framework for Dynamic Target Search of Multi-Agent in Unknown Environments [11.626888857723067]
We propose PILOC, a framework that operates without global prior knowledge, leveraging local perception and communication. PILOC promotes decentralized cooperation through local communication, significantly reducing reliance on global channels. Results show that combining local communication with pheromone-based guidance significantly boosts search efficiency, adaptability, and system robustness.
arXiv Detail & Related papers (2025-07-10T02:10:18Z) - Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout [16.454305212398328]
We propose a goal-conditioned hierarchical reinforcement learning (HRL) framework named Guided Cooperation via Model-based Rollout (GCMR).
GCMR aims to bridge inter-layer information synchronization and cooperation by exploiting forward dynamics.
Experimental results demonstrate that incorporating the proposed GCMR framework with a disentangled variant of HIGL, namely ACLG, yields more stable and robust policy improvement.
arXiv Detail & Related papers (2023-09-24T00:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.