Fugu-MT Paper Translation (Abstract): Bench2Drive-VL: Benchmarks for Closed-Loop Autonomous Driving with Vision-Language Models

Paper overview: Bench2Drive-VL: Benchmarks for Closed-Loop Autonomous Driving with Vision-Language Models

arxiv url: http://arxiv.org/abs/2604.01259v1
Date: Wed, 01 Apr 2026 11:38:46 GMT
Status: Translation complete
Last updated in system: 2026-04-03 14:21:09.57746
Title: Bench2Drive-VL: Benchmarks for Closed-Loop Autonomous Driving with Vision-Language Models
Title (reference translation): Bench2Drive-VL: Benchmarks for Closed-Loop Autonomous Driving Using Vision-Language Models
Authors: Xiaosong Jia, Yuqian Shao, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, Junchi Yan
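Abstract summary: In autonomous driving, closed-loop evaluation is widely recognized as a more reliable validation method than open-loop evaluation. This paper presents Bench2Drive-VL, which enables closed-loop evaluation for VLM-based driving.
Reference score (attention score computed independently by the site): 50.22099309218635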
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: With the rise of vision-language models (VLMs), their application to autonomous driving (VLM4AD) has gained significant attention. Meanwhile, in autonomous driving, closed-loop evaluation has become widely recognized as a more reliable validation method than open-loop evaluation, as it can evaluate the performance of the model under cumulative errors and out-of-distribution inputs. However, existing VLM4AD benchmarks evaluate the model's scene understanding ability in an open-loop setting, i.e., via static question-answer (QA) datasets. This kind of evaluation fails to assess a VLM's performance in out-of-distribution states that rarely appear in human-collected datasets. To this end, we present Bench2Drive-VL, an extension of Bench2Drive that brings closed-loop evaluation to VLM-based driving, which introduces: (1) DriveCommenter, a closed-loop generator that automatically generates diverse, behavior-grounded question-answer pairs for all driving situations in CARLA, including severe off-route and off-road deviations previously unassessable in simulation. (2) A unified protocol and interface that allows modern VLMs to be directly plugged into the Bench2Drive closed-loop environment to compare with traditional agents. (3) A flexible reasoning and control framework, supporting multi-format visual inputs and configurable graph-based chain-of-thought execution. (4) A complete development ecosystem. Together, these components form a comprehensive closed-loop benchmark for VLM4AD. All code and annotated datasets are open-sourced.
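Abstract (reference translation): With the rise of vision-language models (VLMs), their application to autonomous driving (VLM4AD) has attracted attention. Meanwhile, in autonomous driving, closed-loop evaluation is widely recognized as a more reliable validation method than open-loop evaluation, because it can assess model performance under cumulative errors and out-of-distribution inputs. However, existing VLM4AD benchmarks evaluate a model's scene understanding ability in an open-loop setting, through static question-answer (QA) datasets. This kind of evaluation fails to assess VLM performance in out-of-distribution states that rarely appear in human-collected datasets. Bench2Drive-VL is an extension of Bench2Drive that brings closed-loop evaluation to VLM-based driving. It introduces: (1) DriveCommenter, a closed-loop generator that automatically generates diverse, behavior-grounded question-answer pairs for all driving situations in CARLA, including severe off-route and off-road deviations that previously could not be assessed in simulation. (2) A unified protocol and interface that lets modern VLMs be connected directly to the Bench2Drive closed-loop environment and compared with traditional agents. (3) A flexible reasoning and control framework that supports multi-format visual inputs and graph-based chain-of-thought execution. (4) A complete development ecosystem. Together, these components form a comprehensive closed-loop benchmark for VLM4AD. All code and annotated datasets are open source.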
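The abstract's contrast between open-loop and closed-loop evaluation can be made concrete with a toy example. The sketch below is purely illustrative and is not taken from Bench2Drive-VL or CARLA: the one-dimensional lane-offset dynamics, the two policies, and all function names are assumptions chosen for simplicity. It shows how a policy with a small per-step action error can look nearly perfect when scored only on expert-visited states (open loop) yet drift far off the lane when it drives on its own predictions (closed loop).

```python
from typing import Callable, List

# Hypothetical toy setup: the state is the lateral offset from the lane center (m),
# and an action is a lateral correction applied in one step (m).
Policy = Callable[[float], float]


def expert_policy(offset: float) -> float:
    """Assumed expert: steers halfway back toward the lane center each step."""
    return -0.5 * offset


def biased_policy(offset: float) -> float:
    """Assumed learned policy: ignores the state and outputs a small constant bias,
    as if it had only ever seen centered (in-distribution) states."""
    return 0.02


def rollout(policy: Policy, steps: int, start: float = 0.0) -> List[float]:
    """Closed loop: the policy's own actions determine the states it sees next."""
    offsets = [start]
    for _ in range(steps):
        offsets.append(offsets[-1] + policy(offsets[-1]))
    return offsets


def open_loop_error(policy: Policy, expert: Policy, steps: int) -> float:
    """Open loop: compare actions only on the states visited by the expert."""
    states = rollout(expert, steps)[:-1]
    return sum(abs(policy(s) - expert(s)) for s in states) / len(states)


def closed_loop_max_offset(policy: Policy, steps: int) -> float:
    """Largest lateral deviation reached when the policy drives by itself."""
    return max(abs(o) for o in rollout(policy, steps))


if __name__ == "__main__":
    steps = 100
    print("open-loop mean action error:", open_loop_error(biased_policy, expert_policy, steps))
    print("closed-loop max offset:", closed_loop_max_offset(biased_policy, steps))
```

In this toy setting the open-loop score stays at the size of the per-step bias, while the closed-loop deviation grows with the number of steps, which is the cumulative-error effect the abstract refers to.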

