A Fine-Grained Vehicle Detection (FGVD) Dataset for Unconstrained Roads
- URL: http://arxiv.org/abs/2212.14569v1
- Date: Fri, 30 Dec 2022 06:50:15 GMT
- Title: A Fine-Grained Vehicle Detection (FGVD) Dataset for Unconstrained Roads
- Authors: Prafful Kumar Khoba, Chirag Parikh, Rohit Saluja, Ravi Kiran
Sarvadevabhatla, C. V. Jawahar
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous fine-grained datasets mainly focus on classification and are often captured in a controlled setup, with the camera focused on the objects.
We introduce the first Fine-Grained Vehicle Detection (FGVD) dataset in the
wild, captured from a moving camera mounted on a car. It contains 5502 scene
images with 210 unique fine-grained labels of multiple vehicle types organized
in a three-level hierarchy. While previous classification datasets also include
makes for different kinds of cars, the FGVD dataset introduces new class labels
for categorizing two-wheelers, autorickshaws, and trucks. The FGVD dataset is
challenging as it has vehicles in complex traffic scenarios with intra-class
and inter-class variations in types, scale, pose, occlusion, and lighting
conditions. Current object detectors such as YOLOv5 and Faster R-CNN perform poorly on our dataset due to a lack of hierarchical modeling. Along with providing baseline results for existing object detectors on the FGVD dataset, we also present results for a combination of an existing detector and the recent Hierarchical Residual Network (HRN) classifier for the FGVD task.
Finally, we show that FGVD vehicle images are the most challenging to classify
among the fine-grained datasets.
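The two-stage approach described in the abstract (a coarse detector proposing vehicle boxes, followed by a hierarchical classifier assigning a three-level label to each box) can be sketched as below. This is a minimal, hypothetical sketch: the label names and the stub detector/classifier are illustrative only and are not taken from the dataset; the stubs stand in for, e.g., YOLOv5 and the HRN classifier.

```python
from typing import List, Tuple

# Three-level hierarchy: vehicle type -> subtype -> fine-grained label.
# These label names are illustrative, not from the FGVD dataset.
HIERARCHY = {
    "car": {"hatchback": ["maruti-swift"], "sedan": ["honda-city"]},
    "two-wheeler": {"scooter": ["activa"], "motorcycle": ["splendor"]},
}

Box = Tuple[int, int, int, int]  # x1, y1, x2, y2

def detect_vehicles(image) -> List[Box]:
    """Stub coarse detector: returns candidate vehicle boxes.
    A real pipeline would run, e.g., YOLOv5 here."""
    return [(10, 10, 100, 80)]  # placeholder box

def classify_hierarchical(image, box: Box) -> Tuple[str, str, str]:
    """Stub hierarchical classifier: predicts a coarse-to-fine label
    triple for one box. A real pipeline would crop the box and run
    a classifier such as HRN with one head per hierarchy level."""
    level1 = "car"                             # vehicle-type head
    level2 = next(iter(HIERARCHY[level1]))     # subtype head
    level3 = HIERARCHY[level1][level2][0]      # fine-grained head
    return level1, level2, level3

def fgvd_pipeline(image) -> List[Tuple[Box, Tuple[str, str, str]]]:
    """Detect vehicles, then refine each detection hierarchically."""
    return [(box, classify_hierarchical(image, box))
            for box in detect_vehicles(image)]
```

The separation mirrors the paper's baseline setup: detection and fine-grained classification are trained and run as distinct stages rather than as a single hierarchical detector.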
Related papers
- Argoverse 2: Next Generation Datasets for Self-Driving Perception and
Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z)
- Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data [19.63193201107591]
7DoF prediction of vehicles at an intersection is an important task for assessing potential conflicts between road users.
We develop a weakly supervised approach for fine-tuning 3D object detectors for traffic observation cameras.
Our method achieves vehicle 7DoF pose prediction accuracy on our dataset comparable to the top performing monocular 3D object detectors on autonomous vehicle datasets.
arXiv Detail & Related papers (2021-10-21T08:26:48Z)
- Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration [56.01625477187448]
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT).
It takes both 2D panorama images and 3D point clouds as input and then infers target trajectories using the multimodality data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- TICaM: A Time-of-flight In-car Cabin Monitoring Dataset [10.845284058153837]
TICaM is a Time-of-flight In-car Cabin Monitoring dataset for vehicle interior monitoring using a single wide-angle depth camera.
We record an exhaustive list of actions performed while driving and provide for them multi-modal labeled images.
In addition to the real recordings, we provide a synthetic dataset of in-car cabin images with the same multi-modality of images and annotations.
arXiv Detail & Related papers (2021-03-22T10:48:45Z)
- TJU-DHD: A Diverse High-Resolution Dataset for Object Detection [48.94731638729273]
Large-scale, rich-diversity, and high-resolution datasets play an important role in developing better object detection methods.
We build a diverse high-resolution dataset (called TJU-DHD).
The dataset contains 115,354 high-resolution images and 709,330 labeled objects with a large variance in scale and appearance.
arXiv Detail & Related papers (2020-11-18T09:32:24Z)
- Perceiving Traffic from Aerial Images [86.994032967469]
We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
arXiv Detail & Related papers (2020-09-16T11:37:43Z)
- Vehicle Detection of Multi-source Remote Sensing Data Using Active Fine-tuning Network [26.08837467340853]
The proposed Ms-AFt framework integrates transfer learning, segmentation, and active classification into a unified framework for auto-labeling and detection.
The proposed Ms-AFt employs a fine-tuning network to first generate a vehicle training set from an unlabeled dataset.
Extensive experimental results on two open ISPRS benchmark datasets demonstrate the superiority and effectiveness of the proposed Ms-AFt for vehicle detection.
arXiv Detail & Related papers (2020-07-16T17:46:46Z)
- EAGLE: Large-scale Vehicle Detection Dataset in Real-World Scenarios using Aerial Imagery [3.8902657229395894]
We introduce a large-scale dataset for multi-class vehicle detection with object orientation information in aerial imagery.
It features high-resolution aerial images covering a wide range of real-world conditions, varying in camera sensor, resolution, flight altitude, weather, illumination, haze, shadow, time, city, country, occlusion, and camera angle.
It contains 215,986 instances annotated with oriented bounding boxes defined by four points and orientation, making it by far the largest dataset to date in this task.
It also supports research on haze and shadow removal, as well as super-resolution and in-painting applications.
arXiv Detail & Related papers (2020-07-12T23:00:30Z)
- VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification [116.1587709521173]
We propose to build a large-scale vehicle dataset (called VehicleNet) by harnessing four public vehicle datasets.
We design a simple yet effective two-stage progressive approach to learning more robust visual representation from VehicleNet.
We achieve state-of-the-art accuracy of 86.07% mAP on the private test set of the AICity Challenge.
arXiv Detail & Related papers (2020-04-14T05:06:38Z)
- Towards Accurate Vehicle Behaviour Classification With Multi-Relational Graph Convolutional Networks [22.022759283770377]
We propose a pipeline for understanding vehicle behaviour from a monocular image sequence or video.
A temporal sequence of such encodings is fed to a recurrent network to label vehicle behaviours.
The proposed framework can classify a variety of vehicle behaviours with high fidelity on diverse datasets.
arXiv Detail & Related papers (2020-02-03T14:34:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.