Fugu-MT Paper Translation (Abstract): Data Agent: Learning to Select Data via End-to-End Dynamic Optimization

Paper overview: Data Agent: Learning to Select Data via End-to-End Dynamic Optimization

arxiv url: http://arxiv.org/abs/2603.07433v1
Date: Sun, 08 Mar 2026 03:10:39 GMT
Status: Translation complete
System update date: 2026-03-10 15:13:14.588125
Title: Data Agent: Learning to Select Data via End-to-End Dynamic Optimization
Authors: Suorong Yang, Fangjian Su, Hai Gan, Ziqi Ye, Jie Li, Baile Xu, Furao Shen, Soujanya Poria
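Abstract summary: Data Agent formulates data selection as a training-aware sequential decision-making problem. It consistently accelerates training while preserving or improving performance. Its dataset-agnostic formulation and modular reward make it plug-and-play across tasks and scenarios.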
Reference score (independently computed attention score): 37.1771265765151
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dynamic data selection aims to accelerate training by prioritizing informative samples during online training. However, existing methods typically rely on task-specific handcrafted metrics or static/snapshot-based criteria to estimate sample importance, limiting scalability across learning paradigms and making it difficult to capture the evolving utility of data throughout training. To address this challenge, we propose Data Agent, an end-to-end dynamic data selection framework that formulates data selection as a training-aware sequential decision-making problem. The agent learns a sample-wise selection policy that co-evolves with model optimization, guided by a composite reward that integrates loss-based difficulty and confidence-based uncertainty signals. The reward signals capture complementary objectives of optimization impact and information gain, together with a tuning-free adaptive weighting mechanism that balances these signals over training. Extensive experiments across a wide range of datasets and architectures demonstrate that Data Agent consistently accelerates training while preserving or improving performance, e.g., reducing costs by over 50% on ImageNet-1k and MMLU with lossless performance. Moreover, its dataset-agnostic formulation and modular reward make it plug-and-play across tasks and scenarios, e.g., robustness to noisy datasets, highlighting its potential in real-world scenarios.
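The abstract names two signals, loss-based difficulty and confidence-based uncertainty, but gives no formulas. As a rough illustration only, the sketch below scores each sample in a batch by blending a cross-entropy difficulty term with a "one minus top probability" uncertainty term, then keeps the highest-scoring fraction. Everything here is an assumption made for illustration: the function names, the min-max normalization, the fixed mixing `weight`, and the top-k rule are hypothetical, and the sketch does not reproduce the paper's learned selection policy or its tuning-free adaptive weighting.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one sample's class logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def difficulty(probs, label):
    # Loss-based difficulty: cross-entropy of the true label (assumed form).
    return -math.log(max(probs[label], 1e-12))

def uncertainty(probs):
    # Confidence-based uncertainty: one minus the top probability (assumed form).
    return 1.0 - max(probs)

def min_max(values):
    # Rescale a batch of values to [0, 1]; a constant batch maps to all zeros.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def composite_scores(batch_logits, labels, weight):
    # Blend the two normalized signals. `weight` in [0, 1] is a hypothetical
    # fixed coefficient, standing in for the adaptive weighting in the abstract.
    probs = [softmax(row) for row in batch_logits]
    d = min_max([difficulty(p, y) for p, y in zip(probs, labels)])
    u = min_max([uncertainty(p) for p in probs])
    return [weight * di + (1.0 - weight) * ui for di, ui in zip(d, u)]

def select_top(scores, keep_ratio):
    # Keep the highest-scoring fraction of the batch; return sorted indices.
    k = max(1, int(round(keep_ratio * len(scores))))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

# Toy batch: sample 0 is confidently correct, sample 1 is near-uniform,
# sample 2 is confidently wrong, sample 3 is weakly correct.
batch_logits = [[4.0, 0.0, 0.0], [0.2, 0.1, 0.0], [0.0, 3.0, 0.0], [1.0, 1.2, 0.0]]
labels = [0, 2, 0, 1]
scores = composite_scores(batch_logits, labels, weight=0.5)
kept = select_top(scores, keep_ratio=0.5)
```

In this toy batch the confidently correct sample gets the lowest score on both signals and is dropped, which matches the intuition of skipping samples the model has already learned. A reward-driven agent as described in the abstract would learn the selection rule rather than hard-code it this way.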

