Self-aligned Spatial Feature Extraction Network for UAV Vehicle
Re-identification
- URL: http://arxiv.org/abs/2201.02836v1
- Date: Sat, 8 Jan 2022 14:25:54 GMT
- Title: Self-aligned Spatial Feature Extraction Network for UAV Vehicle
Re-identification
- Authors: Aihuan Yao, Jiahao Qi, Ping Zhong
- Abstract summary: Vehicles with the same color and type show extremely similar appearances from the UAV's perspective.
Recent works tend to extract distinguishing information via regional features and component features.
In order to extract efficient fine-grained features and avoid tedious annotating work, this letter develops an unsupervised self-aligned network.
- Score: 3.449626476434765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compared with existing vehicle re-identification (ReID) tasks conducted with
datasets collected by fixed surveillance cameras, vehicle ReID for unmanned
aerial vehicles (UAVs) is still under-explored and could be more challenging.
Vehicles with the same color and type show extremely similar appearances from
the UAV's perspective, so mining fine-grained characteristics becomes
necessary. Recent works tend to extract distinguishing information via regional
features and component features. The former requires input images to be
aligned, and the latter entails detailed annotations, both of which are
difficult to meet in UAV applications. In order to extract efficient
fine-grained features and avoid tedious annotation work, this letter develops
an unsupervised self-aligned network consisting of three branches. The network
introduces a self-alignment module that converts input images with variable
orientations to a uniform orientation, implemented under the constraint of a
triplet loss function designed with spatial features. On this basis, spatial
features, obtained by vertical and horizontal segmentation, and global features
are integrated to improve the representation ability in the embedding space.
Extensive experiments are conducted on the UAV-VeID dataset, and our method
achieves the best performance compared with recent ReID works.
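As a rough, unofficial illustration of the approach described in the abstract, the PyTorch-style sketch below pairs a learnable alignment step (approximated here by a head that regresses a rotation angle and resamples the feature map) with horizontal and vertical stripe pooling for the spatial branches and a global branch, plus a standard triplet loss applied to the spatial features. All module names, part counts, and the choice of backbone are assumptions for illustration, not the authors' released implementation.

```python
# Minimal PyTorch-style sketch of a three-branch, self-aligned ReID backbone.
# Names (SelfAlignedNet, angle head, part counts) are illustrative guesses,
# not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class SelfAlignedNet(nn.Module):
    def __init__(self, feat_dim=256, v_parts=2, h_parts=2):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        self.stem = nn.Sequential(*list(backbone.children())[:-2])  # -> (B, 2048, h, w)
        # Self-alignment head: regress a rotation angle and resample the
        # feature map so all vehicles share a canonical orientation.
        self.angle_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(2048, 1), nn.Tanh()
        )
        self.global_fc = nn.Linear(2048, feat_dim)
        self.v_fc = nn.Linear(2048 * v_parts, feat_dim)
        self.h_fc = nn.Linear(2048 * h_parts, feat_dim)
        self.v_parts, self.h_parts = v_parts, h_parts

    def _rotate(self, fmap, angle):
        # Build a per-sample 2x3 affine matrix from the predicted angle (radians).
        cos, sin = torch.cos(angle), torch.sin(angle)
        zeros = torch.zeros_like(cos)
        theta = torch.stack(
            [torch.cat([cos, -sin, zeros], dim=1),
             torch.cat([sin, cos, zeros], dim=1)], dim=1)  # (B, 2, 3)
        grid = F.affine_grid(theta, fmap.shape, align_corners=False)
        return F.grid_sample(fmap, grid, align_corners=False)

    def forward(self, x):
        fmap = self.stem(x)
        angle = self.angle_head(fmap) * 3.14159  # predicted angle in [-pi, pi]
        fmap = self._rotate(fmap, angle)
        # Global branch.
        g = F.adaptive_avg_pool2d(fmap, 1).flatten(1)
        # Spatial branches: horizontal and vertical stripe pooling.
        h = F.adaptive_avg_pool2d(fmap, (self.h_parts, 1)).flatten(1)
        v = F.adaptive_avg_pool2d(fmap, (1, self.v_parts)).flatten(1)
        return self.global_fc(g), self.h_fc(h), self.v_fc(v)

def spatial_triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss on the spatial features; one plausible reading
    of the 'triplet loss designed with spatial features' constraint."""
    d_ap = F.pairwise_distance(anchor, positive)
    d_an = F.pairwise_distance(anchor, negative)
    return F.relu(d_ap - d_an + margin).mean()
```

Note that in this reading the alignment head receives no orientation labels; it is trained only through the metric loss, which is one way the unsupervised self-alignment constraint could be realized.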
Related papers
- V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric Heterogenous Distillation Network [13.248981195106069]
We propose a multi-view vehicle-road cooperation perception system, vehicle-to-everything cooperative perception (V2X-AHD).
The V2X-AHD can effectively improve the accuracy of 3D object detection and reduce the number of network parameters, according to this study.
arXiv Detail & Related papers (2023-10-10T13:12:03Z)
- A Novel Dual-pooling Attention Module for UAV Vehicle Re-identification [7.9782462757515455]
Vehicle re-identification (Re-ID) involves identifying the same vehicle captured by other cameras, given a vehicle image.
Due to the high altitude of UAVs, the shooting angle of vehicle images sometimes approximates vertical, resulting in fewer local features for Re-ID.
This paper proposes a novel dual-pooling attention (DpA) module, which achieves the extraction and enhancement of locally important information about vehicles.
arXiv Detail & Related papers (2023-06-25T02:46:12Z)
- CROVIA: Seeing Drone Scenes from Car Perspective via Cross-View Adaptation [20.476683921252867]
We propose a novel Cross-View Adaptation (CROVIA) approach to adapt the knowledge learned from on-road vehicle views to UAV views.
First, a novel geometry-based constraint on cross-view adaptation is introduced, based on the geometric correlation between views.
Second, cross-view correlations from image space are effectively transferred to segmentation space without any requirement of paired on-road and UAV view data.
arXiv Detail & Related papers (2023-04-14T15:20:40Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- Unifying Voxel-based Representation with Transformer for 3D Object Detection [143.91910747605107]
We present a unified framework for multi-modality 3D object detection, named UVTR.
The proposed method aims to unify multi-modality representations in the voxel space for accurate and robust single- or cross-modality 3D detection.
UVTR achieves leading performance on the nuScenes test set, with 69.7%, 55.1%, and 71.1% NDS for LiDAR, camera, and multi-modality inputs, respectively.
arXiv Detail & Related papers (2022-06-01T17:02:40Z)
- Discriminative-Region Attention and Orthogonal-View Generation Model for Vehicle Re-Identification [7.5366501970852955]
Multiple challenges hamper the applications of vision-based vehicle Re-ID methods.
The proposed DRA model can automatically extract the discriminative region features, which can distinguish similar vehicles.
The OVG model can generate multi-view features based on the input view features to reduce the impact of viewpoint mismatches.
arXiv Detail & Related papers (2022-04-28T07:46:03Z)
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all monocular 3D object detectors on the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking [72.65494220685525]
We propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data.
We generate dynamic modality-aware filters with two independent networks; the visible and thermal filters are then used to conduct a dynamic convolutional operation on their corresponding input feature maps (a rough sketch of this step is given after this list).
To address issues caused by heavy occlusion, fast motion, and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism.
arXiv Detail & Related papers (2021-07-22T03:10:51Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- Parsing-based View-aware Embedding Network for Vehicle Re-Identification [138.11983486734576]
We propose a parsing-based view-aware embedding network (PVEN) to achieve the view-aware feature alignment and enhancement for vehicle ReID.
The experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-04-10T13:06:09Z)
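For the MFGNet entry above, the summary describes generating per-sample, modality-aware filters with two independent networks and applying them to the visible and thermal feature maps via dynamic convolution. The sketch below shows one plausible, heavily simplified reading of that filter-generation step in PyTorch; how the generators are conditioned (on the same or the opposite modality), the kernel size, and all module names are assumptions rather than the paper's actual design.

```python
# Hedged sketch of per-sample, modality-aware dynamic filtering in the spirit
# of the MFGNet summary: two independent generators predict convolution
# kernels that are applied to the visible and thermal feature maps.
# Shapes, kernel size, and module names are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicFilterBranch(nn.Module):
    def __init__(self, channels=256, k=3):
        super().__init__()
        self.channels, self.k = channels, k
        # Filter generator: pools the feature map and predicts one k x k
        # depthwise kernel per channel for this specific sample.
        self.generator = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels * k * k),
        )

    def forward(self, fmap):
        b, c, h, w = fmap.shape
        kernels = self.generator(fmap).view(b * c, 1, self.k, self.k)
        # Grouped-convolution trick: fold the batch into the channel axis so
        # each sample is convolved with its own generated depthwise filters.
        out = F.conv2d(fmap.reshape(1, b * c, h, w), kernels,
                       padding=self.k // 2, groups=b * c)
        return out.view(b, c, h, w)

class ModalityAwareFusion(nn.Module):
    """Two independent branches (visible / thermal) followed by a 1x1 fusion."""
    def __init__(self, channels=256):
        super().__init__()
        self.vis_branch = DynamicFilterBranch(channels)
        self.thr_branch = DynamicFilterBranch(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, vis_feat, thr_feat):
        vis = self.vis_branch(vis_feat)
        thr = self.thr_branch(thr_feat)
        return self.fuse(torch.cat([vis, thr], dim=1))
```

Here each branch conditions its filters on its own input; the joint local-global search and direction-aware attention mentioned in the summary are omitted from this sketch.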