Fugu-MT Paper Translation (Abstract): AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems

Paper abstract: AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems

arxiv url: http://arxiv.org/abs/2604.16804v1
Date: Sat, 18 Apr 2026 03:24:54 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.18092
Title: AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems
Title (reference translation): AutoOR: Scalable Post-training of LLMs to Automate Operations Research
Authors: Sumeet Ramesh Motwani, Chuan Du, Aleksander Petrov, Christopher Davis, Philip Torr, Antonio Papania-Davis, Weishi Yan
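Abstract summary: This paper describes AutoOR, a scalable synthetic data generation and reinforcement learning pipeline. AutoOR generates verified training data from standard optimization forms and uses solver execution feedback as the reward signal for RL post-training. We believe that methods such as AutoOR can significantly accelerate industrial decision-making with AI.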
Reference score (independently computed attention score): 54.593031581486116
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Optimization problems are central to decision-making in manufacturing, logistics, scheduling, and other industrial settings. Translating complicated descriptions of these problems into solver-ready formulations requires specialized operations research (OR) expertise, making it hard to scale. We present AutoOR, a scalable synthetic data generation and reinforcement learning pipeline that trains LLMs to autoformalize optimization problems specified in natural language across linear, mixed-integer, and non-linear categories. AutoOR generates verified training data from standard optimization forms and uses solver execution feedback as the reward signal for RL post-training. AutoOR applied to an 8B model achieves state-of-the-art or competitive results across six established OR benchmarks, matching significantly larger frontier models. For a non-linear problem class involving physical dynamics, where frontier models score near 0%, we introduce a curriculum RL strategy that bootstraps from limited initial training data to make this class tractable for post-training. We believe that methods such as AutoOR can significantly accelerate industrial decision-making with AI.
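Abstract (reference translation): Optimization problems are central to decision-making in manufacturing, logistics, scheduling, and other industrial settings. Translating complicated descriptions of these problems into solver-ready formulations requires specialized operations research (OR) expertise, which makes it hard to scale. This paper presents AutoOR, a scalable synthetic data generation and reinforcement learning pipeline that teaches LLMs to autoformalize optimization problems specified in natural language across linear, mixed-integer, and non-linear categories. AutoOR generates verified training data from standard optimization forms and uses solver execution feedback as the reward signal for RL post-training. Applied to an 8B model, AutoOR achieves state-of-the-art or competitive results on six established OR benchmarks. For a non-linear problem class involving physical dynamics, where frontier models score near 0%, the paper introduces a curriculum RL strategy that bootstraps from limited initial training data to make this class tractable for post-training. We believe that methods such as AutoOR can significantly accelerate industrial decision-making with AI.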
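
The abstract says solver execution feedback serves as the reward signal for RL post-training but gives no implementation details. The following is a minimal illustrative sketch of that general idea, not the authors' method: a candidate formulation is executed, and its optimum is compared with a verified reference value. Everything here is an assumption made for illustration, including the names `solve_small_ip` and `execution_reward`, the brute-force stand-in for a real solver, the binary 0/1 reward, and the toy problem.

```python
import itertools
import math


def solve_small_ip(objective, constraints, bounds):
    """Brute-force stand-in for a solver: maximize `objective` over an integer grid.

    Returns the best objective value, or None if no grid point is feasible.
    """
    best = None
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    for x in itertools.product(*ranges):
        if all(constraint(x) for constraint in constraints):
            value = objective(x)
            if best is None or value > best:
                best = value
    return best


def execution_reward(candidate, reference_optimum, tol=1e-6):
    """Hypothetical binary reward: 1.0 if the candidate runs and matches the reference."""
    try:
        value = solve_small_ip(**candidate)
    except Exception:
        return 0.0  # the formulation failed to execute
    if value is None:
        return 0.0  # the formulation is infeasible
    matches = math.isclose(value, reference_optimum, rel_tol=tol, abs_tol=tol)
    return 1.0 if matches else 0.0


# Toy problem: maximize 3x + 2y subject to x + y <= 4 and x + 3y <= 6,
# with integer x, y in [0, 4]. The verified reference optimum is 12 (x=4, y=0).
constraints = [lambda x: x[0] + x[1] <= 4, lambda x: x[0] + 3 * x[1] <= 6]
bounds = [(0, 4), (0, 4)]

correct = {"objective": lambda x: 3 * x[0] + 2 * x[1],
           "constraints": constraints, "bounds": bounds}
wrong_objective = {"objective": lambda x: 2 * x[0] + 3 * x[1],
                   "constraints": constraints, "bounds": bounds}
broken = {"objective": lambda x: x[2],  # indexes a variable that does not exist
          "constraints": constraints, "bounds": bounds}

rewards = [execution_reward(c, 12) for c in (correct, wrong_objective, broken)]
```

In this sketch only the correct formulation earns a reward; the candidate with the mistaken objective reaches a different optimum, and the broken one raises an error during execution. A practical pipeline would presumably call an actual LP, MIP, or NLP solver and might use a richer reward than a binary match.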


Related papers