Fugu-MT Paper Translation (Abstract): Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM

Paper overview: Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM

arxiv url: http://arxiv.org/abs/2510.11121v1
Date: Mon, 13 Oct 2025 08:08:58 GMT
Status: Translation complete
Last updated in system: 2025-10-14 18:06:30.259429
Title: Refining Hybrid Genetic Search for CVRP via Reinforcement Learning-Finetuned LLM
Title (reference translation): Hybrid Genetic Search for CVRP with a Reinforcement Learning-Finetuned LLM
Authors: Rongjie Zhu, Cong Zhang, Zhiguang Cao,
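Abstract summary: Large language models (LLMs) are increasingly used as automated heuristic designers for vehicle routing problems (VRPs). This work challenges the paradigm of relying on massive, general-purpose models by demonstrating that a small, specialized LLM, once fine-tuned, can generate components that surpass expert-crafted ones within advanced solvers. We propose RFTHGS, a new reinforcement learning framework that fine-tunes a small LLM to generate high-performance crossover operators.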
Reference score (independently calculated attention metric): 32.938753667649074
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: While large language models (LLMs) are increasingly used as automated heuristic designers for vehicle routing problems (VRPs), current state-of-the-art methods predominantly rely on prompting massive, general-purpose models like GPT-4. This work challenges that paradigm by demonstrating that a smaller, specialized LLM, when meticulously fine-tuned, can generate components that surpass expert-crafted heuristics within advanced solvers. We propose RFTHGS, a novel Reinforcement learning (RL) framework for Fine-Tuning a small LLM to generate high-performance crossover operators for the Hybrid Genetic Search (HGS) solver, applied to the Capacitated VRP (CVRP). Our method employs a multi-tiered, curriculum-based reward function that progressively guides the LLM to master generating first compilable, then executable, and finally, superior-performing operators that exceed human expert designs. This is coupled with an operator caching mechanism that discourages plagiarism and promotes diversity during training. Comprehensive experiments show that our fine-tuned LLM produces crossover operators which significantly outperform the expert-designed ones in HGS. The performance advantage remains consistent, generalizing from small-scale instances to large-scale problems with up to 1000 nodes. Furthermore, RFTHGS exceeds the performance of leading neuro-combinatorial baselines, prompt-based methods, and commercial LLMs such as GPT-4o and GPT-4o-mini.
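Abstract (reference translation): Large language models (LLMs) are increasingly used as automated heuristic designers for vehicle routing problems (VRPs), but current state-of-the-art methods rely mostly on prompting massive, general-purpose models such as GPT-4. This study challenges that paradigm by demonstrating that a small, specialized LLM, when carefully fine-tuned, can generate components that surpass expert-crafted heuristics within advanced solvers. The paper proposes RFTHGS, a new reinforcement learning (RL) framework that fine-tunes a small LLM to generate high-performance crossover operators for the Hybrid Genetic Search (HGS) solver, applied to the Capacitated VRP (CVRP). The method uses a multi-tiered, curriculum-based reward function that guides the LLM step by step to generate operators that are first compilable, then executable, and finally better performing than human expert designs. This is combined with an operator caching mechanism that discourages plagiarism and promotes diversity during training. Comprehensive experiments show that the fine-tuned LLM generates crossover operators that significantly outperform the expert-designed operators in HGS. The performance advantage remains consistent, generalizing from small-scale instances to large-scale problems with up to 1000 nodes. In addition, RFTHGS exceeds the performance of leading neuro-combinatorial baselines, prompt-based methods, and commercial LLMs such as GPT-4o and GPT-4o-mini.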
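
The abstract describes a reward that moves through tiers (compilable, then executable, then better than the expert baseline) together with a cache that discourages copied operators. The following is a minimal sketch of how such a scheme could be organized. It is not the authors' implementation: the names `OperatorCache` and `tiered_reward`, every numeric reward level, the whitespace-normalized hashing, and the relative-improvement bonus are assumptions made purely for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional, Set

# Illustrative reward levels only; the paper's actual values are not given in the abstract.
R_DUPLICATE = -1.0      # operator already seen in the cache
R_NO_COMPILE = 0.0      # generated code does not compile
R_COMPILES = 0.2        # compiles, but fails at run time
R_EXECUTES = 0.5        # runs, but does not beat the baseline
R_BEATS_BASELINE = 1.0  # runs and beats the baseline (plus a bonus)


@dataclass
class OperatorCache:
    """Hypothetical cache of previously generated operator sources."""
    seen: Set[str] = field(default_factory=set)

    def is_duplicate(self, source: str) -> bool:
        # Collapse whitespace so trivially reformatted copies map to the same key.
        normalized = " ".join(source.split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key in self.seen:
            return True
        self.seen.add(key)
        return False


def tiered_reward(
    compiles: bool,
    executes: bool,
    candidate_cost: Optional[float],
    baseline_cost: float,
    duplicate: bool,
) -> float:
    """Map the outcome of evaluating one generated operator to a scalar reward."""
    if duplicate:
        return R_DUPLICATE
    if not compiles:
        return R_NO_COMPILE
    if not executes or candidate_cost is None:
        return R_COMPILES
    if candidate_cost >= baseline_cost:
        return R_EXECUTES
    # Lower routing cost is better; reward the relative improvement over the baseline.
    improvement = (baseline_cost - candidate_cost) / baseline_cost
    return R_BEATS_BASELINE + improvement


if __name__ == "__main__":
    cache = OperatorCache()
    source = "void crossover() { /* generated code */ }"
    dup = cache.is_duplicate(source)
    print(tiered_reward(True, True, 95.0, 100.0, dup))
```

In this sketch the duplicate check runs first, so a copied operator is penalized regardless of how well it performs. Whether the paper orders the checks this way is not stated in the abstract.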
