Fugu-MT Paper Translation (Abstract): EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models

Paper overview: EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models

arxiv url: http://arxiv.org/abs/2605.11859v1
Date: Tue, 12 May 2026 09:43:21 GMT
Status: Translation complete
Last updated in system: 2026-05-13 21:48:56.763637
Title: EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models
Title (reference translation): EvoNav: Evolutionary Reward Function Design for Robot Navigation Using Large Language Models
Authors: Zhikai Zhao, Chuanbo Hua, Federico Berto, Zihan Ma, Kanghoon Lee, Jiachen Li, Jinkyoo Park
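Abstract summary: EvoNav is an evolutionary framework that automates the design of robot navigation reward functions with large language models (LLMs). Experimental results show that EvoNav produces more effective navigation policies than manually designed RL rewards and state-of-the-art reward design methods.
Reference score (independently computed attention score): 29.561015132869173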
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robot navigation is a crucial task with applications to social robots in dynamic human environments. While Reinforcement Learning (RL) has shown great promise for this problem, the policy quality is highly sensitive to the specification of reward functions. Hand-crafted rewards require substantial domain expertise and embed inductive biases that are difficult to audit or adapt, limiting their effectiveness and leading to suboptimal performance. In this paper, we propose EvoNav, an evolutionary framework that automates the design of robot navigation reward functions via large language models (LLMs). To overcome prohibitively costly policy training, EvoNav evaluates each candidate proposal from the LLM via a progressive three-stage warm-up-boost procedure. EvoNav advances from analytical proxies with low-cost surrogates, such as small datasets and analytic rules, to lightweight rollouts and, finally, to full policy training, enabling computationally efficient exploration under effective feedback. Experiment results show that EvoNav produces more effective navigation policies than manually designed RL rewards and state-of-the-art reward design methods.
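Abstract (reference translation): Robot navigation is an important task with applications to social robots in dynamic human environments. Reinforcement learning (RL) has shown great promise for this problem, but policy quality is highly sensitive to how the reward function is specified. Hand-crafted rewards require substantial domain expertise and embed inductive biases that are difficult to audit or adapt, which limits their effectiveness and leads to suboptimal performance. This paper proposes EvoNav, an evolutionary framework that automates the design of robot navigation reward functions using large language models (LLMs). To overcome the prohibitive cost of policy training, EvoNav evaluates each candidate proposal from the LLM through a progressive three-stage warm-up-boost procedure. EvoNav advances from analytical proxies with low-cost surrogates, such as small datasets and analytic rules, to lightweight rollouts and finally to full policy training, enabling computationally efficient exploration under effective feedback. Experimental results show that EvoNav produces more effective navigation policies than manually designed RL rewards and state-of-the-art reward design methods.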
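
The abstract describes evaluation that moves from cheap analytical proxies to lightweight rollouts and then to full policy training. One possible reading of that idea is a filtering funnel, sketched below in Python. This is an illustration only, not the authors' implementation: the abstract does not state how candidates are selected between stages, and the function name `staged_evaluation`, the top-k selection rule, and the toy score tables are all assumptions introduced here.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: a candidate is the source text of an LLM-proposed reward function.
Candidate = str
Scorer = Callable[[Candidate], float]


def staged_evaluation(
    candidates: Sequence[Candidate],
    stages: Sequence[Tuple[Scorer, int]],
) -> List[Tuple[Candidate, float]]:
    """Score candidates stage by stage, cheapest scorer first.

    Each stage is (scorer, keep): every surviving candidate is scored,
    and only the `keep` highest-scoring ones move on to the next stage.
    Returns the final survivors with their last-stage scores, best first.
    """
    survivors: List[Tuple[Candidate, float]] = [(c, 0.0) for c in candidates]
    for scorer, keep in stages:
        scored = [(c, scorer(c)) for c, _ in survivors]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        survivors = scored[:keep]
    return survivors


if __name__ == "__main__":
    # Toy lookup tables replace the real (assumed) evaluators:
    # analytic proxy -> lightweight rollout -> full policy training.
    proxy = {"r1": 0.1, "r2": 0.9, "r3": 0.5, "r4": 0.7}
    rollout = {"r2": 0.2, "r3": 0.6, "r4": 0.8}
    full = {"r3": 0.9, "r4": 0.4}
    best = staged_evaluation(
        ["r1", "r2", "r3", "r4"],
        [(proxy.__getitem__, 3), (rollout.__getitem__, 2), (full.__getitem__, 1)],
    )
    print(best)
```

In this sketch the most expensive scorer runs only on the candidates that survived the two cheaper stages, which is the source of the compute savings the abstract refers to.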
