PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution
- URL: http://arxiv.org/abs/2601.10657v2
- Date: Fri, 16 Jan 2026 22:31:40 GMT
- Title: PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution
- Authors: Minghao Yan, Bo Peng, Benjamin Coleman, Ziqi Chen, Zhouhang Xie, Shuo Chen, Zhankui He, Noveen Sachdeva, Isabella Ye, Weili Wang, Chi Wang, Ed H. Chi, Fernando Pereira, Wang-Cheng Kang, Derek Zhiyuan Cheng, Beidou Wang
- Abstract summary: PACEvolve is a framework designed to robustly govern the agent's context and search dynamics. We demonstrate that PACEvolve provides a systematic path to consistent, long-horizon self-improvement.
- Score: 64.15555230987222
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have emerged as powerful operators for evolutionary search, yet the design of efficient search scaffolds remains ad hoc. While promising, current LLM-in-the-loop systems lack a systematic approach to managing the evolutionary process. We identify three distinct failure modes: Context Pollution, where experiment history biases future candidate generation; Mode Collapse, where agents stagnate in local minima due to a poor exploration-exploitation balance; and Weak Collaboration, where rigid crossover strategies fail to leverage parallel search trajectories effectively. To address these challenges, we introduce Progress-Aware Consistent Evolution (PACEvolve), a framework designed to robustly govern the agent's context and search dynamics. PACEvolve combines hierarchical context management (HCM) with pruning to address context pollution; momentum-based backtracking (MBB) to escape local minima; and a self-adaptive sampling policy that unifies backtracking and crossover for dynamic search coordination (CE), allowing agents to balance internal refinement with cross-trajectory collaboration. We demonstrate that PACEvolve provides a systematic path to consistent, long-horizon self-improvement, achieving state-of-the-art results on LLM-SR and KernelBench, while discovering solutions that surpass the record on Modded NanoGPT.
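The momentum-based backtracking idea from the abstract can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's actual algorithm: the function names (`propose`, `evaluate`), the exponential-moving-average formulation of progress, and the backtrack-to-best-ancestor rule are all hypothetical stand-ins for whatever PACEvolve implements.

```python
import random


def momentum_backtracking_search(propose, evaluate, steps=50,
                                 beta=0.7, threshold=0.0):
    """Hypothetical sketch of momentum-based backtracking (MBB).

    `propose(parent)` returns a mutated candidate (`parent` may be None);
    `evaluate(c)` returns a scalar score, higher is better. A trajectory
    tracks an EMA of per-step score improvements; when that "momentum"
    stalls, the search backtracks to the best candidate seen so far
    instead of continuing from a stagnating lineage.
    """
    current = propose(None)
    best, best_score = current, evaluate(current)
    score, momentum = best_score, 0.0
    for _ in range(steps):
        child = propose(current)
        child_score = evaluate(child)
        delta = child_score - score
        # Exponential moving average of recent progress.
        momentum = beta * momentum + (1 - beta) * delta
        if child_score > best_score:
            best, best_score = child, child_score
        if momentum < threshold and delta <= 0:
            # Progress has stalled: backtrack to the best ancestor.
            current, score = best, best_score
            momentum = 0.0
        else:
            current, score = child, child_score
    return best, best_score
```

For example, with a toy objective `evaluate = lambda x: -abs(x - 3.0)` and a random-walk `propose`, the loop keeps descendants of the best candidate alive while the EMA gate prunes lineages that drift away. The same gating signal could, in principle, also drive the crossover-versus-backtrack choice that the abstract's self-adaptive sampling policy describes.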
Related papers
- AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization [61.535567824938205]
We introduce AdaEvolve, a framework that reformulates LLM-driven evolution as a hierarchical adaptive optimization problem. AdaEvolve consistently outperforms the open-ended baselines across 185 different open-ended optimization problems.
arXiv Detail & Related papers (2026-02-23T18:45:31Z) - K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model [57.440609834690385]
Existing approaches treat Large Language Models (LLMs) as rapid code generators within evolutionary loops. We propose Search via Co-Evolving World Model and build K-Search on this method. We evaluate K-Search on diverse, complex FlashInfer kernels, including GQA, MLA, and MoE kernels.
arXiv Detail & Related papers (2026-02-22T11:06:22Z) - OR-Agent: Bridging Evolutionary Search and Structured Research for Automated Algorithm Discovery [10.217363774023033]
OR-Agent is a multi-agent research framework designed for automated exploration in rich experimental environments. We introduce an evolutionary-systematic mechanism that unifies evolutionary selection of research starting points, comprehensive research plan generation, and coordinated exploration within a research tree. We conduct experiments across classical optimization benchmarks, including traveling salesman, capacitated vehicle routing, bin packing, orienteering, and multiple knapsack problems, as well as a simulation-based cooperative driving scenario.
arXiv Detail & Related papers (2026-02-14T13:32:03Z) - Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration [49.9937230730202]
We propose Search-R2, a novel Actor-Refiner collaboration framework that enhances reasoning through targeted intervention. Our approach decomposes the generation process into an Actor, which produces initial reasoning trajectories, and a Refiner, which intervenes to correct them. We show that Search-R2 consistently outperforms strong RAG and RL-based baselines across model scales.
arXiv Detail & Related papers (2026-02-03T15:32:09Z) - PathWise: Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs [16.59846708454225]
We propose a novel multi-agent reasoning framework, referred to as Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs (PathWise). PathWise formulates a sequential decision process over an entailment graph serving as a compact, stateful memory of the search trajectory. Experiments across diverse COPs show that PathWise converges faster to better solutions, generalizes across different LLM backbones, and scales to larger problem sizes.
arXiv Detail & Related papers (2026-01-28T12:34:50Z) - Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search [56.78490647843876]
Agentic search has emerged as a promising paradigm for complex information seeking by enabling Large Language Models (LLMs) to interleave reasoning with tool use. We propose M-ASK, a framework that explicitly decouples agentic search into two complementary roles: Search Behavior Agents, which plan and execute search actions, and Knowledge Management Agents, which aggregate, filter, and maintain a compact internal context.
arXiv Detail & Related papers (2026-01-08T08:13:27Z) - LoongFlow: Directed Evolutionary Search via a Cognitive Plan-Execute-Summarize Paradigm [8.050281821865978]
LoongFlow is a self-evolving agent framework that achieves state-of-the-art solution quality with significantly reduced computational costs. Unlike "blind" mutation operators, LoongFlow integrates Large Language Models into a cognitive "Plan-Execute-Summarize" (PES) paradigm. To sustain long-term architectural coherence, we incorporate a hybrid evolutionary memory system.
arXiv Detail & Related papers (2025-12-30T08:39:28Z) - IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction [107.49922328855025]
IterResearch is a novel iterative deep-research paradigm that reformulates long-horizon research as a Markov Decision Process. It achieves substantial improvements over existing open-source agents, with an average gain of +14.5pp across six benchmarks. It also serves as an effective prompting strategy, improving frontier models by up to 19.2pp over ReAct on long-horizon tasks.
arXiv Detail & Related papers (2025-11-10T17:30:08Z) - Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails [103.05296856071931]
We identify the Alignment Tipping Process (ATP), a critical post-deployment risk unique to self-evolving Large Language Model (LLM) agents. ATP arises when continual interaction drives agents to abandon alignment constraints established during training in favor of reinforced, self-interested strategies. Our experiments show that alignment benefits erode rapidly under self-evolution, with initially aligned models converging toward unaligned states.
arXiv Detail & Related papers (2025-10-06T14:48:39Z) - Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm [60.36837655498119]
We propose TRACE, a Trajectory-based validate-by-Reproduce Agent-benchmark Complexity Evolution framework. This framework takes an original task from an existing benchmark and encourages agents to evolve it into a new task with higher difficulty. Experiments on the GAIA benchmark demonstrate that TRACE consistently enhances task complexity while improving the reliability of correctness.
arXiv Detail & Related papers (2025-10-01T01:52:52Z) - Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics [0.0]
Collective action problems, which require aligning individual incentives with collective goals, are classic examples of Ill-Structured Problems (ISPs). We present ECHO-MIMIC, a computational framework that converts this global complexity into a tractable, Well-Structured Problem (WSP) for each agent. By coupling algorithmic discovery with tailored communication, ECHO-MIMIC transforms the cognitive burden of collective action into a simple set of agent-level instructions.
arXiv Detail & Related papers (2025-09-24T08:26:56Z) - PILOC: A Pheromone Inverse Guidance Mechanism and Local-Communication Framework for Dynamic Target Search of Multi-Agent in Unknown Environments [11.626888857723067]
We propose PILOC, a framework that operates without global prior knowledge, leveraging local perception and communication. PILOC promotes decentralized cooperation through local communication, significantly reducing reliance on global channels. Results show that combining local communication with pheromone-based guidance significantly boosts search efficiency, adaptability, and system robustness.
arXiv Detail & Related papers (2025-07-10T02:10:18Z) - Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout [16.454305212398328]
We propose a goal-conditioned hierarchical reinforcement learning (HRL) framework named Guided Cooperation via Model-based Rollout (GCMR).
GCMR aims to bridge inter-layer information synchronization and cooperation by exploiting forward dynamics.
Experimental results demonstrate that incorporating the proposed GCMR framework with a disentangled variant of HIGL, namely ACLG, yields more stable and robust policy improvement.
arXiv Detail & Related papers (2023-09-24T00:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.