Fugu-MT Paper Translation (Summary): Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection

Paper summary: Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection

arxiv url: http://arxiv.org/abs/2605.14062v1
Date: Wed, 13 May 2026 19:35:49 GMT
Status: Translation complete
Last updated in system: 2026-05-15 21:45:34.480795
Title: Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection
Title (reference translation): Highly Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection
Authors: Anjir Ahmed Chowdhury, Syed Zawad, Feng Yan
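Abstract summary: Multi-Stage In-Flight Rejection (MSIFR) is a lightweight, training-free framework that terminates low-quality generation trajectories before they complete. It formalizes in-flight rejection as a sequential decision process and shows that non-trivial discard policies reduce expected token consumption.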
Reference score (independently computed attention measure): 3.1572670872557196
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While synthetic data generation with large language models (LLMs) is widely used in post-training pipelines, existing approaches typically generate full outputs before applying quality filters, leading to substantial token waste on samples that are ultimately discarded. To address this, we propose Multi-Stage In-Flight Rejection (MSIFR), a lightweight, training-free framework that detects and terminates low-quality generation trajectories at intermediate checkpoints before they reach full completion. MSIFR decomposes the generation process into sequential stages and applies fast rule-based validators to identify arithmetic inconsistencies, hallucination patterns, and formatting violations, enabling early rejection of faulty samples. We formalize in-flight rejection as a sequential decision process and show that any non-trivial discard policy reduces expected token consumption, with stage-wise savings increasing when rejection occurs earlier in the generation pipeline. We further demonstrate that conditional utility estimates form a martingale, ensuring that early, in-flight rejection does not bias the expected utility of retained samples. Across five instruction-tuned models and seven reasoning benchmarks, MSIFR reduces token consumption by 11%-77% as a standalone method, and up to 78.2% when combined with early-exit methods, while preserving or improving evaluation accuracy. These results confirm that MSIFR provides a practical mechanism for improving the efficiency of LLM-based synthetic data generation without additional training or architectural changes.
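Abstract (reference translation): Synthetic data generation with large language models (LLMs) is widely used in post-training pipelines, but existing approaches typically generate complete outputs before applying quality filters, which wastes a substantial number of tokens on samples that are ultimately discarded. To address this, we propose Multi-Stage In-Flight Rejection (MSIFR). MSIFR decomposes the generation process into sequential stages and applies fast rule-based validators to identify arithmetic inconsistencies, hallucination patterns, and formatting violations, enabling early rejection of faulty samples. We formalize in-flight rejection as a sequential decision process and show that non-trivial discard policies reduce expected token consumption. We further show that conditional utility estimates form a martingale, which ensures that early, in-flight rejection does not bias the expected utility of retained samples. Across five instruction-tuned models and seven reasoning benchmarks, MSIFR reduces token consumption by 11%-77%, and by up to 78.2% when combined with early-exit methods, while preserving or improving evaluation accuracy. These results confirm that MSIFR provides a practical mechanism for improving the efficiency of LLM-based synthetic data generation without additional training or architectural changes.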
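
The staged checking described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stage boundaries, the `generate_stage` callable, the whitespace-based token count, and both validators (`arithmetic_consistent`, `has_final_answer`) are hypothetical names and simplifications introduced here for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical validator type: True means the partial text still looks acceptable.
Validator = Callable[[str], bool]

_ARITH = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")
_OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def arithmetic_consistent(text: str) -> bool:
    """Toy check: every simple 'a op b = c' statement must be correct.
    Chained expressions such as '1 + 2 + 3 = 6' are not handled."""
    for a, op, b, c in _ARITH.findall(text):
        if _OPS[op](int(a), int(b)) != int(c):
            return False
    return True


def has_final_answer(text: str) -> bool:
    """Toy formatting check: the text must contain an 'Answer: <value>' marker."""
    return re.search(r"Answer:\s*\S+", text) is not None


@dataclass
class Result:
    text: str
    tokens_used: int
    rejected_at: Optional[int]  # stage index of rejection, or None if kept


def generate_with_rejection(
    generate_stage: Callable[[str, int], Tuple[str, int]],
    validators_per_stage: List[List[Validator]],
) -> Result:
    """Generate stage by stage; stop at the first checkpoint where a validator fails."""
    text, tokens = "", 0
    for stage, validators in enumerate(validators_per_stage):
        chunk, n_tokens = generate_stage(text, stage)
        text += chunk
        tokens += n_tokens
        if not all(check(text) for check in validators):
            return Result(text, tokens, stage)  # later stages are never generated
    return Result(text, tokens, None)


def scripted_generator(chunks: List[str]) -> Callable[[str, int], Tuple[str, int]]:
    """Stand-in for an LLM call: replays fixed chunks, counting whitespace tokens."""
    def generate_stage(prefix: str, stage: int) -> Tuple[str, int]:
        chunk = chunks[stage]
        return chunk, len(chunk.split())
    return generate_stage


validators = [
    [arithmetic_consistent],
    [arithmetic_consistent],
    [arithmetic_consistent, has_final_answer],
]
good = ["Step 1: 3 + 4 = 7. ", "Step 2: 7 * 2 = 14. ", "Answer: 14"]
bad = ["Step 1: 3 + 4 = 8. ", "Step 2: 8 * 2 = 16. ", "Answer: 16"]

kept = generate_with_rejection(scripted_generator(good), validators)
dropped = generate_with_rejection(scripted_generator(bad), validators)
```

In this toy setup the faulty sample is stopped at the first checkpoint, so only the tokens of its first stage are spent, while the valid sample runs through all three stages. A real pipeline would replace `scripted_generator` with calls to a model and would need validators suited to its own output format.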

Related papers