Fugu-MT Paper Translation (Summary): CoReVLA: A Dual-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine

Paper summary: CoReVLA: A Dual-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine

arxiv url: http://arxiv.org/abs/2509.15968v1
Date: Fri, 19 Sep 2025 13:25:56 GMT
Status: Translation complete
Last updated in system: 2025-09-22 18:18:11.181028
Title: CoReVLA: A Dual-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine
Title (reference translation): CoReVLA: A Two-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine
Authors: Shiyu Fang, Yiming Cui, Haoyang Liang, Chen Lv, Peng Hang, Jian Sun
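Abstract summary: CoReVLA is a continual learning framework for autonomous driving. It improves performance in long-tail scenarios through a two-stage process of data collection and behavior refinement. CoReVLA achieves a Driving Score (DS) of 72.18 and a Success Rate (SR) of 50%, outperforming state-of-the-art methods by 7.96 DS and 15% SR in long-tail, safety-critical scenarios.
Reference score (attention score computed independently by this site): 73.74077186298523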
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Autonomous Driving (AD) systems have made notable progress, but their performance in long-tail, safety-critical scenarios remains limited. These rare cases contribute a disproportionate number of accidents. Vision-Language-Action (VLA) models have strong reasoning abilities and offer a potential solution, but their effectiveness is limited by the lack of high-quality data and inefficient learning in such conditions. To address these challenges, we propose CoReVLA, a continual learning end-to-end autonomous driving framework that improves the performance in long-tail scenarios through a dual-stage process of data Collection and behavior Refinement. First, the model is jointly fine-tuned on a mixture of open-source driving QA datasets, allowing it to acquire a foundational understanding of driving scenarios. Next, CoReVLA is deployed within the Cave Automatic Virtual Environment (CAVE) simulation platform, where driver takeover data is collected from real-time interactions. Each takeover indicates a long-tail scenario that CoReVLA fails to handle reliably. Finally, the model is refined via Direct Preference Optimization (DPO), allowing it to learn directly from human preferences and thereby avoid reward hacking caused by manually designed rewards. Extensive open-loop and closed-loop experiments demonstrate that the proposed CoReVLA model can accurately perceive driving scenarios and make appropriate decisions. On the Bench2Drive benchmark, CoReVLA achieves a Driving Score (DS) of 72.18 and a Success Rate (SR) of 50%, outperforming state-of-the-art methods by 7.96 DS and 15% SR under long-tail, safety-critical scenarios. Furthermore, case studies demonstrate the model's ability to continually improve its performance in similar failure-prone scenarios by leveraging past takeover experiences. All code and preprocessed datasets are available at: https://github.com/FanGShiYuu/CoReVLA
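Abstract (reference translation): Autonomous driving (AD) systems have made remarkable progress, but their performance in long-tail, safety-critical scenarios remains limited. These rare cases account for a disproportionate number of accidents. Vision-Language-Action (VLA) models have strong reasoning abilities and offer a potential solution, but their effectiveness is limited by the lack of high-quality data and by inefficient learning under such conditions. To address these challenges, we propose CoReVLA, a continual-learning end-to-end autonomous driving framework that improves performance in long-tail scenarios through a two-stage process of data collection and behavior refinement. First, the model is jointly fine-tuned on a mixture of open-source driving QA datasets so that it acquires a basic understanding of driving scenarios. Next, CoReVLA is deployed on the Cave Automatic Virtual Environment (CAVE) simulation platform, where driver takeover data is collected from real-time interactions. Each takeover indicates a long-tail scenario that CoReVLA cannot handle reliably. Finally, the model is refined with Direct Preference Optimization (DPO), which lets it learn directly from human preferences and avoid the reward hacking caused by manually designed rewards. Extensive open-loop and closed-loop experiments show that the proposed CoReVLA model can accurately perceive driving scenarios and make appropriate decisions. On the Bench2Drive benchmark, CoReVLA achieves a Driving Score (DS) of 72.18 and a Success Rate (SR) of 50%, outperforming state-of-the-art methods by 7.96 DS and 15% SR in long-tail, safety-critical scenarios. In addition, case studies show that the model can continually improve its performance in similar failure-prone scenarios by drawing on past takeover experiences. https://github.com/FanGShiYuu/CoReVLA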
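
The abstract names Direct Preference Optimization (DPO) as the refinement step but does not give the objective. As a rough illustration only, the sketch below computes the standard DPO loss for a single preference pair in plain Python. Treating the human takeover behavior as the "chosen" response and the model's own behavior as the "rejected" one is an assumption made here for illustration, as are the function names and the default beta = 0.1; none of this is taken from the CoReVLA code.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are sequence log-probabilities (floats) under the trainable
    policy and under a frozen reference model. Assumption for this sketch:
    "chosen" is the human takeover behavior and "rejected" is the behavior
    the model produced before the takeover.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)
```

When the policy matches the reference, the loss equals log 2; it drops below that as the policy shifts probability toward the chosen response relative to the reference.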

Related papers