Fugu-MT Paper Translation (Abstract): COVLM-RL: Critical Object-Oriented Reasoning for Autonomous Driving Using VLM-Guided Reinforcement Learning

Paper abstract: COVLM-RL: Critical Object-Oriented Reasoning for Autonomous Driving Using VLM-Guided Reinforcement Learning

arxiv url: http://arxiv.org/abs/2512.09349v1
Date: Wed, 10 Dec 2025 06:18:16 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:40.250314
Title: COVLM-RL: Critical Object-Oriented Reasoning for Autonomous Driving Using VLM-Guided Reinforcement Learning
Title (reference translation): COVLM-RL: Critical Object-Oriented Reasoning for Autonomous Driving Using VLM-Guided Reinforcement Learning
Authors: Lin Li, Yuxin Cai, Jianwu Fang, Jianru Xue, Chen Lv
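Abstract summary: We propose COVLM-RL, a novel end-to-end driving framework that integrates Critical Object-oriented (CO) reasoning with VLM-guided reinforcement learning (RL). Experiments conducted in the CARLA simulator show that COVLM-RL improves the success rate by 30% in trained driving environments.
Reference score (independently computed attention score): 55.83415345423854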
License: http://creativecommons.org/licenses/by/4.0/
Abstract: End-to-end autonomous driving frameworks face persistent challenges in generalization, training efficiency, and interpretability. While recent methods leverage Vision-Language Models (VLMs) through supervised learning on large-scale datasets to improve reasoning, they often lack robustness in novel scenarios. Conversely, reinforcement learning (RL)-based approaches enhance adaptability but remain data-inefficient and lack transparent decision-making. To address these limitations, we propose COVLM-RL, a novel end-to-end driving framework that integrates Critical Object-oriented (CO) reasoning with VLM-guided RL. Specifically, we design a Chain-of-Thought (CoT) prompting strategy that enables the VLM to reason over critical traffic elements and generate high-level semantic decisions, effectively transforming multi-view visual inputs into structured semantic decision priors. These priors reduce the input dimensionality and inject task-relevant knowledge into the RL loop, accelerating training and improving policy interpretability. However, bridging high-level semantic guidance with continuous low-level control remains non-trivial. To this end, we introduce a consistency loss that encourages alignment between the VLM's semantic plans and the RL agent's control outputs, enhancing interpretability and training stability. Experiments conducted in the CARLA simulator demonstrate that COVLM-RL significantly improves the success rate by 30% in trained driving environments and by 50% in previously unseen environments, highlighting its strong generalization capability.
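Abstract (reference translation): End-to-end autonomous driving frameworks face persistent challenges in generalization, training efficiency, and interpretability. Recent methods use Vision-Language Models (VLMs) through supervised learning on large-scale datasets to improve reasoning, but they often lack robustness in novel scenarios. Conversely, reinforcement learning (RL)-based approaches improve adaptability but are data-inefficient and lack transparent decision-making. The proposed COVLM-RL is a novel end-to-end driving framework that integrates Critical Object-oriented (CO) reasoning with VLM-guided RL. Specifically, a Chain-of-Thought (CoT) prompting strategy is designed so that the VLM reasons over critical traffic elements and generates high-level semantic decisions, effectively transforming multi-view visual inputs into structured semantic decision priors. These priors reduce the input dimensionality and inject task-relevant knowledge into the RL loop, accelerating training and improving policy interpretability. However, bridging high-level semantic guidance with continuous low-level control remains non-trivial. To this end, a consistency loss is introduced that encourages alignment between the VLM's semantic plans and the RL agent's control outputs, improving interpretability and training stability. Experiments conducted in the CARLA simulator show that COVLM-RL improves the success rate by 30% in trained driving environments and by 50% in previously unseen environments, highlighting its strong generalization capability.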
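The abstract does not give the form of the consistency loss, so the following is only a minimal sketch of the general idea, under stated assumptions: the VLM emits one discrete semantic decision, each decision maps to a nominal (steer, throttle) target, and the loss is the squared distance between the RL agent's action and that target. The decision labels, target values, function names, and the `weight` hyperparameter are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Hypothetical mapping from a VLM semantic decision to a nominal
# (steer, throttle) target. Labels and values are illustrative
# assumptions, not taken from the paper.
DECISION_TARGETS: Dict[str, Tuple[float, float]] = {
    "stop": (0.0, 0.0),
    "go_straight": (0.0, 0.6),
    "turn_left": (-0.5, 0.3),
    "turn_right": (0.5, 0.3),
}


def consistency_loss(decision: str, action: Tuple[float, float]) -> float:
    """Squared distance between the policy action and the decision's nominal target."""
    target = DECISION_TARGETS[decision]
    return sum((a - t) ** 2 for a, t in zip(action, target))


def total_loss(
    rl_loss: float,
    decision: str,
    action: Tuple[float, float],
    weight: float = 0.1,  # assumed trade-off hyperparameter
) -> float:
    """RL objective plus a weighted consistency penalty."""
    return rl_loss + weight * consistency_loss(decision, action)
```

In this sketch an action that matches the decision's nominal target incurs no penalty, while an action that contradicts it (for example steering right when the decision is `turn_left`) raises the total loss in proportion to `weight`.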

Related papers