Fugu-MT paper translation (summary): IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback

Paper summary: IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback

arxiv url: http://arxiv.org/abs/2605.12693v1
Date: Tue, 12 May 2026 19:43:49 GMT
Status: translation complete
Last updated in system: 2026-05-14 23:30:27.651978
Title: IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback
Authors: Benjamin Amoh, Geoffrey G. Parker, Wesley Marrero
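Abstract summary: Decision-focused learning trains models end-to-end against downstream decision loss, but online settings suffer from delayed feedback. The authors identify staleness amplification, a failure mode unique to bilevel optimization under delay. They prove that any black-box delayed optimizer incurs an irreducible regret cost from inner-solver approximation error.
Reference score (attention score computed by this site): 0.0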
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Decision-focused learning trains predictive models end-to-end against downstream decision loss, but online settings suffer delayed feedback: outcomes may not arrive for many environment interactions. We identify \emph{staleness amplification}, a failure mode unique to bilevel optimization under delay, in which gradient staleness couples with inner-solver sensitivity to inflate regret beyond single-level delay theory. We prove that any black-box delayed optimizer incurs an irreducible regret cost from inner-solver approximation error, and that gradient staleness contributes a quadratically growing transport error without bilevel-aware correction. Our algorithm, \textbf{IGT-OMD}, applies Implicit Gradient Transport to hypergradients within Online Mirror Descent, re-evaluating stale gradients at the current parameters using stored inner solutions. This method reduces transport error from a quadratic to a linear dependence on delay and achieves the first sublinear regret bound for delayed bilevel optimization with queue-length-adaptive step sizes. Controlled experiments provide a \emph{mechanistic fingerprint}: transport benefit is exactly $0.0\%$ ($p=1.00$) at unit delay and grows monotonically to $9.5\%$ at fifty rounds ($p<0.001$), isolating the correction's effect. On Linear Quadratic Regulator, Warcraft shortest-path, and Sinkhorn optimal transport, IGT-OMD reduces decision loss by $17$--$55\%$ relative to single-level baselines, with phase transitions matching the theory.
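The abstract's central idea, re-evaluating a stale gradient at the current parameters instead of applying it as computed, can be illustrated with a toy sketch. This is not the paper's IGT-OMD algorithm. It assumes a scalar parameter, a closed-form inner problem, a Euclidean mirror map (so the mirror-descent step reduces to plain gradient descent), and a fixed step size rather than a queue-length-adaptive one. The names `inner_solve`, `hypergradient`, and `run` are hypothetical, and the toy stores each round's target and re-solves the inner problem, whereas the abstract describes reusing stored inner solutions.

```python
from collections import deque


def inner_solve(theta, a):
    """Toy inner problem with a closed form: argmin_z 0.5 * (z - a * theta) ** 2."""
    return a * theta


def hypergradient(theta, target, a):
    """Derivative in theta of the outer loss 0.5 * (z*(theta) - target) ** 2."""
    return a * (inner_solve(theta, a) - target)


def run(delay, transport, rounds=60, step=0.3, a=1.0, target=1.0):
    """Delayed online gradient descent on the toy bilevel loss.

    Feedback for round s becomes usable at the end of round s + delay - 1,
    so delay=1 means the feedback arrives before the parameter has moved.
    transport=False applies the gradient evaluated at the parameter that was
    played (stale); transport=True re-evaluates it at the current parameter.
    Returns the cumulative outer loss over all rounds.
    """
    theta = 0.0
    pending = deque()  # entries: (arrival_round, theta_played, target)
    total_loss = 0.0
    for t in range(rounds):
        z = inner_solve(theta, a)
        total_loss += 0.5 * (z - target) ** 2
        pending.append((t + delay - 1, theta, target))
        # Process every piece of feedback that has arrived by round t.
        while pending and pending[0][0] <= t:
            _, theta_played, y = pending.popleft()
            point = theta if transport else theta_played
            theta -= step * hypergradient(point, y, a)
    return total_loss


if __name__ == "__main__":
    for d in (1, 5):
        print(d, run(d, transport=False), run(d, transport=True))
```

In this toy, the two variants coincide at unit delay because the parameter has not moved when the feedback arrives, and they separate once the delay exceeds one round. That qualitatively mirrors the pattern the abstract reports; it says nothing about the paper's regret bounds or measured percentages.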

Related papers