Evolving, Not Training: Zero-Shot Reasoning Segmentation via Evolutionary Prompting
- URL: http://arxiv.org/abs/2512.24702v1
- Date: Wed, 31 Dec 2025 08:10:03 GMT
- Title: Evolving, Not Training: Zero-Shot Reasoning Segmentation via Evolutionary Prompting
- Authors: Kai Ye, Xiaotong You, Jianghang Lin, Jiayi Ji, Pingyang Dai, Liujuan Cao,
- Abstract summary: We propose EVOL-SAM3, a zero-shot framework that reformulates reasoning segmentation as an inference-time evolutionary search process.<n>EVOL-SAM3 not only substantially outperforms static baselines but also significantly surpasses fully supervised state-of-the-art methods on the challenging ReasonSeg benchmark in a zero-shot setting.
- Score: 44.347846669388446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reasoning Segmentation requires models to interpret complex, context-dependent linguistic queries to achieve pixel-level localization. Current dominant approaches rely heavily on Supervised Fine-Tuning (SFT) or Reinforcement Learning (RL). However, SFT suffers from catastrophic forgetting and domain dependency, while RL is often hindered by training instability and rigid reliance on predefined reward functions. Although recent training-free methods circumvent these training burdens, they are fundamentally limited by a static inference paradigm. These methods typically rely on a single-pass "generate-then-segment" chain, which suffers from insufficient reasoning depth and lacks the capability to self-correct linguistic hallucinations or spatial misinterpretations. In this paper, we challenge these limitations and propose EVOL-SAM3, a novel zero-shot framework that reformulates reasoning segmentation as an inference-time evolutionary search process. Instead of relying on a fixed prompt, EVOL-SAM3 maintains a population of prompt hypotheses and iteratively refines them through a "Generate-Evaluate-Evolve" loop. We introduce a Visual Arena to assess prompt fitness via reference-free pairwise tournaments, and a Semantic Mutation operator to inject diversity and correct semantic errors. Furthermore, a Heterogeneous Arena module integrates geometric priors with semantic reasoning to ensure robust final selection. Extensive experiments demonstrate that EVOL-SAM3 not only substantially outperforms static baselines but also significantly surpasses fully supervised state-of-the-art methods on the challenging ReasonSeg benchmark in a zero-shot setting. The code is available at https://github.com/AHideoKuzeA/Evol-SAM3.
Related papers
- Learning to Evolve with Convergence Guarantee via Neural Unrolling [37.99564850768798]
We introduce Learning to Evolve (L2E), a unified bilevel meta-optimization framework.<n>L2E reformulates evolutionary search as a Neural Unrolling process grounded in Krasnosel'skii-Mann (KM) fixed-point theory.<n>Experiments demonstrate the scalability of L2E in high-dimensional spaces and its robust zero-shot generalization across synthetic and real-world control tasks.
arXiv Detail & Related papers (2025-12-12T10:46:25Z) - ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning [44.49803237328707]
ReVSeg executes reasoning as sequential decisions in the native interface of pretrained vision language models.<n>We employ reinforcement learning to optimize the multi-step reasoning chain, enabling the model to self-refine its decision quality from outcome-driven signals.
arXiv Detail & Related papers (2025-12-02T14:44:12Z) - Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs [78.18336140706471]
Sci-LLMs have emerged as a promising frontier for accelerating biological discovery.<n>Current strategies limit Sci-LLMs' reasoning capacity when processing raw biomolecular sequences.<n>We show that a more effective strategy is to provide Sci-LLMs with high-level structured context.
arXiv Detail & Related papers (2025-10-27T09:03:21Z) - Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails [103.05296856071931]
We identify the Alignment Tipping Process (ATP), a critical post-deployment risk unique to self-evolving Large Language Model (LLM) agents.<n>ATP arises when continual interaction drives agents to abandon alignment constraints established during training in favor of reinforced, self-interested strategies.<n>Our experiments show that alignment benefits erode rapidly under self-evolution, with initially aligned models converging toward unaligned states.
arXiv Detail & Related papers (2025-10-06T14:48:39Z) - Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation [74.75716642635484]
Large language models (LLMs) are increasingly trained with reinforcement learning from verifiable rewards (RLVR)<n>We propose EVOL-RL, a label-free framework that mirrors the evolutionary principle of balancing selection with variation.<n>EVOL-RL consistently outperforms the majority-only baseline.
arXiv Detail & Related papers (2025-09-18T17:50:04Z) - RL for Reasoning by Adaptively Revealing Rationales [36.50924054394857]
Supervised fine-tuning (SFT) relies on dense ground-truth labels, which become increasingly costly as sequence length grows.<n>We address this by adaptive backtracking (AdaBack), a per-sample curriculum learning algorithm that reveals only a partial prefix of the target output during training.<n>We show that our adaptive curriculum over partial answers reliably solves problems that are otherwise intractable.
arXiv Detail & Related papers (2025-06-22T17:46:14Z) - Scalable Reinforcement Post-Training Beyond Static Human Prompts: Evolving Alignment via Asymmetric Self-Play [52.3079697845254]
eva is the first method that allows language models to adaptively create training prompts in both offline and online RL post-training.<n>We show eva can create effective RL curricula and is robust across ablations.
arXiv Detail & Related papers (2024-10-31T08:15:32Z) - Diffusing States and Matching Scores: A New Framework for Imitation Learning [16.941612670582522]
Adversarial Imitation Learning is traditionally framed as a two-player zero-sum game between a learner and an adversarially chosen cost function.<n> diffusion models have emerged as a non-adversarial alternative to GANs that merely require training a score function via regression.<n>We show our approach outperforms both GAN-style imitation learning baselines and discriminator-free imitation learning baselines across various continuous control problems.
arXiv Detail & Related papers (2024-10-17T17:59:25Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Test-Time Training for Semantic Segmentation with Output Contrastive
Loss [12.535720010867538]
Deep learning-based segmentation models have achieved impressive performance on public benchmarks, but generalizing well to unseen environments remains a major challenge.
This paper introduces Contrastive Loss (OCL), known for its capability to learn robust and generalized representations, to stabilize the adaptation process.
Our method excels even when applied to models initially pre-trained using domain adaptation methods on test domain data, showcasing its resilience and adaptability.
arXiv Detail & Related papers (2023-11-14T03:13:47Z) - Over-parameterized Adversarial Training: An Analysis Overcoming the
Curse of Dimensionality [74.0084803220897]
Adversarial training is a popular method to give neural nets robustness against adversarial perturbations.
We show convergence to low robust training loss for emphpolynomial width instead of exponential, under natural assumptions and with the ReLU activation.
arXiv Detail & Related papers (2020-02-16T20:13:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.