Image-Based Vehicle Classification by Synergizing Features from
Supervised and Self-Supervised Learning Paradigms
- URL: http://arxiv.org/abs/2302.00648v1
- Date: Wed, 1 Feb 2023 18:22:23 GMT
- Title: Image-Based Vehicle Classification by Synergizing Features from
Supervised and Self-Supervised Learning Paradigms
- Authors: Shihan Ma and Jidong J. Yang
- Abstract summary: Two state-of-the-art self-supervised learning methods, DINO and data2vec, were evaluated and compared for their representation learning of vehicle images.
The representations learned from these self-supervised learning methods were combined with the wheel positional features for the vehicle classification task.
Our experiments show that the data2vec-distilled representations, which are consistent with our wheel masking strategy, outperformed the DINO counterpart.
- Score: 0.913755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a novel approach to leverage features learned from both
supervised and self-supervised paradigms, to improve image classification
tasks, specifically for vehicle classification. Two state-of-the-art
self-supervised learning methods, DINO and data2vec, were evaluated and
compared for their representation learning of vehicle images. The former
contrasts local and global views while the latter uses masked prediction on
multi-layered representations. In the latter case, supervised learning is
employed to finetune a pretrained YOLOR object detector for detecting vehicle
wheels, from which definitive wheel positional features are retrieved. The
representations learned from these self-supervised learning methods were
combined with the wheel positional features for the vehicle classification
task. In particular, a random wheel masking strategy was used to finetune
the previously learned representations in harmony with the wheel positional
features during the training of the classifier. Our experiments show that the
data2vec-distilled representations, which are consistent with our wheel masking
strategy, outperformed the DINO counterpart, resulting in a Top-1
classification accuracy of 97.2% on the 13 vehicle classes defined
by the Federal Highway Administration.
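The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how wheel positions are encoded or whether masking is applied to image regions or to the positional features; this sketch assumes normalized wheel x-centers in a fixed-length vector and masks those entries, purely for illustration. All names (MAX_WHEELS, wheel_position_vector, mask_wheels, fuse) and all values are hypothetical.

```python
import random

MAX_WHEELS = 6  # assumed upper bound on detected wheels; not from the paper


def wheel_position_vector(wheel_x_centers, image_width, max_wheels=MAX_WHEELS):
    """Turn detected wheel x-centers (pixels) into a fixed-length vector.

    Positions are normalized by image width and sorted left to right;
    unused slots are padded with 0.0. This encoding is an assumption.
    """
    xs = sorted(x / image_width for x in wheel_x_centers)[:max_wheels]
    return xs + [0.0] * (max_wheels - len(xs))


def mask_wheels(wheel_vec, mask_prob, rng):
    """Zero out each wheel slot independently with probability mask_prob."""
    return [0.0 if rng.random() < mask_prob else v for v in wheel_vec]


def fuse(image_embedding, wheel_vec):
    """Concatenate the image embedding with the wheel positional features."""
    return list(image_embedding) + list(wheel_vec)


rng = random.Random(0)
embedding = [0.12, -0.40, 0.88, 0.05]  # stand-in for a DINO or data2vec embedding
wheels = wheel_position_vector([120, 410, 470], image_width=640)

# Training-time input: wheel features randomly masked before fusion.
train_input = fuse(embedding, mask_wheels(wheels, mask_prob=0.3, rng=rng))
# Evaluation-time input: no masking (an assumption of this sketch).
eval_input = fuse(embedding, wheels)
```

In this sketch the fused vector would be passed to a classifier over the 13 FHWA classes; the random masking is meant to keep that classifier from relying only on the wheel features.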
Related papers
- Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning [13.613407983544427]
We introduce a robust model designed to withstand changes in camera position within the vehicle.
Our Driver Behavior Monitoring Network (DBMNet) relies on a lightweight backbone and integrates a disentanglement module.
Experiments conducted on the daytime and nighttime subsets of the 100-Driver dataset validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-11-20T10:27:12Z)
- Self-Supervised Representation Learning from Temporal Ordering of
Automated Driving Sequences [49.91741677556553]
We propose TempO, a temporal ordering pretext task for pre-training region-level feature representations for perception tasks.
We embed each frame by an unordered set of proposal feature vectors, a representation that is natural for object detection or tracking systems.
Extensive evaluations on the BDD100K, nuImages, and MOT17 datasets show that our TempO pre-training approach outperforms single-frame self-supervised learning methods.
arXiv Detail & Related papers (2023-02-17T18:18:27Z)
- Unsupervised Driving Event Discovery Based on Vehicle CAN-data [62.997667081978825]
This work presents a simultaneous clustering and segmentation approach for vehicle CAN-data that identifies common driving events in an unsupervised manner.
We evaluate our approach with a dataset of real Tesla Model 3 vehicle CAN-data and a two-hour driving session that we annotated with different driving events.
arXiv Detail & Related papers (2023-01-12T13:10:47Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of
Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Unsupervised Pretraining for Object Detection by Patch Reidentification [72.75287435882798]
Unsupervised representation learning achieves promising performance in pre-training representations for object detectors.
This work proposes a simple yet effective representation learning method for object detection, named patch re-identification (Re-ID).
Our method significantly outperforms its counterparts on COCO in all settings, such as different training iterations and data percentages.
arXiv Detail & Related papers (2021-03-08T15:13:59Z)
- Discovering Discriminative Geometric Features with Self-Supervised
Attention for Vehicle Re-Identification and Beyond [23.233398760777494]
We are the first to successfully learn discriminative geometric features for vehicle ReID based on self-supervised attention.
We implement an end-to-end trainable deep network architecture consisting of three branches.
We conduct comprehensive experiments on three benchmark datasets for vehicle ReID, i.e., VeRi-776, CityFlow-ReID, and VehicleID, and demonstrate our state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T04:43:56Z)
- Vehicle Attribute Recognition by Appearance: Computer Vision Methods for
Vehicle Type, Make and Model Classification [0.9645196221785693]
We survey a number of algorithms that identify vehicle properties ranging from the coarse-grained level (vehicle type) to the fine-grained level (vehicle make and model).
We discuss two alternative approaches for these tasks, including straightforward classification and a more flexible metric learning method.
We design a simulated real-world scenario for vehicle attribute recognition and present an experimental comparison of the two approaches.
arXiv Detail & Related papers (2020-06-29T21:33:06Z)
- Learning predictive representations in autonomous driving to improve
deep reinforcement learning [9.919972770800822]
Reinforcement learning using a novel predictive representation is applied to autonomous driving.
The novel predictive representation is learned by general value functions (GVFs) to provide out-of-policy, or counter-factual, predictions of future lane centeredness and road angle.
Experiments in both simulation and the real world demonstrate that predictive representations in reinforcement learning improve learning efficiency, smoothness of control, and generalization to roads that the agent was never shown during training.
arXiv Detail & Related papers (2020-06-26T17:17:47Z)
- VehicleNet: Learning Robust Visual Representation for Vehicle
Re-identification [116.1587709521173]
We propose to build a large-scale vehicle dataset (called VehicleNet) by harnessing four public vehicle datasets.
We design a simple yet effective two-stage progressive approach to learning more robust visual representation from VehicleNet.
We achieve the state-of-the-art accuracy of 86.07% mAP on the private test set of AICity Challenge.
arXiv Detail & Related papers (2020-04-14T05:06:38Z)
- Parsing-based View-aware Embedding Network for Vehicle Re-Identification [138.11983486734576]
We propose a parsing-based view-aware embedding network (PVEN) to achieve the view-aware feature alignment and enhancement for vehicle ReID.
The experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-04-10T13:06:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.