Fugu-MT Paper Translation (Abstract): Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution

Paper summary: Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution

arxiv url: http://arxiv.org/abs/2604.07725v2
Date: Fri, 10 Apr 2026 17:54:26 GMT
Status: Translation complete
Last updated in system: 2026-04-13 13:51:27.756401
Title: Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution
Title (reference translation): Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution
Authors: Monishwaran Maheswaran, Leon Lakhani, Zhongzhu Zhou, Shijia Yang, Junxiong Wang, Coleman Hooper, Yuezhou Hu, Rishabh Tiwari, Jue Wang, Harman Singh, Qingyang Wu, Yuqing Jian, Ce Zhang, Kurt Keutzer, Tri Dao, Xiaoxia Wu, Ben Athiwaratkun, James Zou, Chenfeng Xu
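Abstract summary: We introduce Squeeze Evolve, a unified multi-model orchestration framework for verifier-free evolutionary inference. Our approach is guided by a simple principle: allocate model capability where it has the highest marginal utility.
Reference score (independently computed attention score): 81.46210789228296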
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We show that verifier-free evolution is bottlenecked by both diversity and efficiency: without external correction, repeated evolution accelerates collapse toward narrow modes, while the uniform use of a high-cost model wastes compute and quickly becomes economically impractical. We introduce Squeeze Evolve, a unified multi-model orchestration framework for verifier-free evolutionary inference. Our approach is guided by a simple principle: allocate model capability where it has the highest marginal utility. Stronger models are reserved for high-impact stages, while cheaper models handle the other stages at much lower costs. This principle addresses diversity and cost-efficiency jointly while remaining lightweight. Squeeze Evolve naturally supports open-source, closed-source, and mixed-model deployments. Across AIME 2025, HMMT 2025, LiveCodeBench V6, GPQA-Diamond, ARC-AGI-V2, and multimodal vision benchmarks, such as MMMU-Pro and BabyVision, Squeeze Evolve consistently improves the cost-capability frontier over single-model evolution and achieves new state-of-the-art results on several tasks. Empirically, Squeeze Evolve reduces API cost by up to $\sim$3$\times$ and increases fixed-budget serving throughput by up to $\sim$10$\times$. Moreover, on discovery tasks, Squeeze Evolve is the first verifier-free evolutionary method to match, and in some cases exceed, the performance of verifier-based evolutionary methods.
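Abstract (reference translation): Verifier-free evolution is limited by both diversity and efficiency: without external correction, repeated evolution accelerates collapse toward narrow modes, and using a high-cost model uniformly wastes compute and soon becomes economically impractical. We introduce Squeeze Evolve, a unified multi-model orchestration framework for verifier-free evolutionary inference. Our approach follows a simple principle: allocate model capability where its marginal utility is highest. Stronger models are reserved for high-impact stages, while cheaper models handle the remaining stages at much lower cost. This principle addresses diversity and cost-efficiency together while staying lightweight. Squeeze Evolve naturally supports open-source, closed-source, and mixed-model deployments. Across AIME 2025, HMMT 2025, LiveCodeBench V6, GPQA-Diamond, ARC-AGI-V2, and multimodal vision benchmarks such as MMMU-Pro and BabyVision, Squeeze Evolve consistently improves the cost-capability frontier over single-model evolution and achieves new state-of-the-art results on several tasks. Empirically, Squeeze Evolve reduces API cost by up to $\sim$3$\times$ and increases fixed-budget serving throughput by up to $\sim$10$\times$. Moreover, on discovery tasks, Squeeze Evolve is the first verifier-free evolutionary method to match, and in some cases exceed, the performance of verifier-based evolutionary methods.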
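The allocation principle stated in the abstract (stronger models for high-impact stages, cheaper models elsewhere) can be illustrated with a toy cost calculation. This is a minimal sketch under stated assumptions, not the paper's method: the stage names, the `impact` scores, the token counts, the prices in `PRICE`, and the `assign_models` helper are all hypothetical and chosen only to show how a mixed assignment can cost less than uniform use of the strong model.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    impact: float  # hypothetical estimate of how much a stronger model helps here
    tokens: int    # hypothetical number of tokens generated in this stage


# Hypothetical prices per 1,000 generated tokens (not from the paper).
PRICE = {"strong": 10.0, "cheap": 1.0}


def assign_models(stages, strong_slots):
    """Give the strong model to the `strong_slots` highest-impact stages."""
    ranked = sorted(stages, key=lambda s: s.impact, reverse=True)
    strong_names = {s.name for s in ranked[:strong_slots]}
    return {
        s.name: ("strong" if s.name in strong_names else "cheap")
        for s in stages
    }


def total_cost(stages, assignment):
    """Sum the per-stage cost under a given stage-to-model assignment."""
    return sum(s.tokens / 1000 * PRICE[assignment[s.name]] for s in stages)


# Three illustrative stages; the names and numbers are placeholders.
stages = [
    Stage("stage_a", impact=0.2, tokens=4000),
    Stage("stage_b", impact=0.9, tokens=2000),
    Stage("stage_c", impact=0.4, tokens=4000),
]

mixed = assign_models(stages, strong_slots=1)
uniform = {s.name: "strong" for s in stages}

mixed_cost = total_cost(stages, mixed)
uniform_cost = total_cost(stages, uniform)
print(mixed, mixed_cost, uniform_cost)
```

With these placeholder numbers only `stage_b` is routed to the strong model, so the mixed assignment costs 28.0 units against 100.0 for the uniform one. The ratio depends entirely on the assumed prices and token counts and says nothing about the savings reported in the abstract.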

Related papers