Fugu-MT Paper Translation (Abstract): The Urban Vision Hackathon Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic

Paper overview: The Urban Vision Hackathon Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic

arxiv url: http://arxiv.org/abs/2511.02563v1
Date: Tue, 04 Nov 2025 13:36:03 GMT
Status: Translation complete
Last updated in system: 2025-11-05 18:47:06.001174
Title: The Urban Vision Hackathon Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic
Title (reference translation): Urban Vision Hackathon Dataset and Models: Toward Image Annotation and Accurate Vision Models for Indian Traffic
Authors: Akash Sharma, Chinmay Mhatre, Sankalp Gawali, Ruthvik Bokkasam, Brij Kishore, Vishwajeet Pattanaik, Tarun Rambha, Abdul R. Pinjari, Vijay Kovvali, Anirban Chakraborty, Punit Rathore, Raghu Krishnapuram, Yogesh Simmhan
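Abstract summary: UVH-26 is the first public release by AIM@IISc of a large-scale dataset of annotated traffic-camera images from India. The dataset comprises 26,646 high-resolution (1080p) images sampled from 2800 Safe-City CCTV cameras in Bengaluru over a 4-week period. In total, 1.8 million bounding boxes were labeled across 14 vehicle classes specific to India.
Reference score (independently computed attention measure): 6.346576275272361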
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This report describes the UVH-26 dataset, the first public release by AIM@IISc of a large-scale dataset of annotated traffic-camera images from India. The dataset comprises 26,646 high-resolution (1080p) images sampled from 2800 Safe-City CCTV cameras in Bengaluru over a 4-week period, and subsequently annotated through a crowdsourced hackathon involving 565 college students from across India. In total, 1.8 million bounding boxes were labeled across 14 vehicle classes specific to India: Cycle, 2-Wheeler (Motorcycle), 3-Wheeler (Auto-rickshaw), LCV (Light Commercial Vehicles), Van, Tempo-traveller, Hatchback, Sedan, SUV, MUV, Mini-bus, Bus, Truck and Other. Of these, 283k-316k consensus ground-truth bounding boxes and labels were derived for distinct objects in the 26k images using the Majority Voting and STAPLE algorithms. Further, we train multiple contemporary detectors, including YOLO11-S/X, RT-DETR-S/X, and DAMO-YOLO-T/L, using these datasets, and report accuracy based on mAP50, mAP75 and mAP50:95. Models trained on UVH-26 achieve 8.4-31.5% improvements in mAP50:95 over equivalent baseline models trained on the COCO dataset, with RT-DETR-X showing the best performance at 0.67 (mAP50:95), compared with 0.40 for COCO-trained weights on the common classes (Car, Bus, and Truck). This demonstrates the benefits of domain-specific training data for Indian traffic scenarios. The release package provides the 26k images with consensus annotations based on Majority Voting (UVH-26-MV) and STAPLE (UVH-26-ST) and the 6 fine-tuned YOLO and DETR models on each of these datasets. By capturing the heterogeneity of Indian urban mobility directly from operational traffic-camera streams, UVH-26 addresses a critical gap in existing global benchmarks, and offers a foundation for advancing detection, classification, and deployment of intelligent transportation systems in emerging nations with complex traffic conditions.
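Abstract (reference translation): This report describes the UVH-26 dataset, the first public release by AIM@IISc of a large-scale dataset of annotated traffic-camera images from India. The dataset comprises 26,646 high-resolution (1080p) images sampled from 2800 CCTV cameras in Bengaluru over a 4-week period, subsequently annotated through a crowdsourced hackathon involving 565 college students from across India. In total, 1.8 million bounding boxes were labeled across 14 vehicle classes specific to India: Cycle, 2-Wheeler (Motorcycle), 3-Wheeler (Auto-rickshaw), LCV (Light Commercial Vehicles), Van, Tempo-Traveller, Hatchback, Sedan, SUV, MUV, Mini-bus, Bus, Truck, and Other. Of these, 283k-316k consensus ground-truth bounding boxes and labels were derived for distinct objects in the 26k images using the Majority Voting and STAPLE algorithms. Further, multiple contemporary detectors, including YOLO11-S/X, RT-DETR-S/X, and DAMO-YOLO-T/L, are trained on these datasets, and accuracy is reported based on mAP50, mAP75, and mAP50:95. Models trained on UVH-26 improve by 8.4-31.5% over equivalent baseline models trained on the COCO dataset. This demonstrates the benefits of domain-specific training data for Indian traffic scenarios. The release package provides the 26k images with consensus annotations based on Majority Voting (UVH-26-MV) and STAPLE (UVH-26-ST), along with the 6 fine-tuned YOLO and DETR models for these datasets. By capturing the heterogeneity of Indian urban mobility directly from operational traffic-camera streams, UVH-26 addresses a critical gap in existing global benchmarks and offers a foundation for advancing detection, classification, and deployment of intelligent transportation systems in emerging nations with complex traffic conditions.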

Related papers