Object Re-Identification from Point Clouds
- URL: http://arxiv.org/abs/2305.10210v3
- Date: Fri, 11 Aug 2023 20:09:56 GMT
- Title: Object Re-Identification from Point Clouds
- Authors: Benjamin Thérien, Chengjie Huang, Adrian Chow, Krzysztof Czarnecki
- Abstract summary: We provide the first large-scale study of object ReID from point clouds and establish its performance relative to image ReID.
To our knowledge, we are the first to study object re-identification from real point cloud observations.
- Score: 3.6308236424346694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object re-identification (ReID) from images plays a critical role in
application domains of image retrieval (surveillance, retail analytics, etc.)
and multi-object tracking (autonomous driving, robotics, etc.). However,
systems that additionally or exclusively perceive the world from depth sensors
are becoming more commonplace without any corresponding methods for object
ReID. In this work, we fill the gap by providing the first large-scale study of
object ReID from point clouds and establishing its performance relative to
image ReID. To enable such a study, we create two large-scale ReID datasets
with paired image and LiDAR observations and propose a lightweight matching
head that can be concatenated to any set or sequence processing backbone (e.g.,
PointNet or ViT), creating a family of comparable object ReID networks for both
modalities. Run in Siamese style, our proposed point cloud ReID networks can
make thousands of pairwise comparisons in real-time ($10$ Hz). Our findings
demonstrate that their performance increases with higher sensor resolution and
approaches that of image ReID when observations are sufficiently dense. Our
strongest network trained at the largest scale achieves ReID accuracy exceeding
$90\%$ for rigid objects and $85\%$ for deformable objects (without any
explicit skeleton normalization). To our knowledge, we are the first to study
object re-identification from real point cloud observations.
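The abstract does not spell out the architecture, but the Siamese setup it describes can be sketched roughly as below: a PointNet-style set encoder shared across both observations, followed by a lightweight matching head. All module names, layer sizes, and thresholds here are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """Minimal PointNet-style backbone: per-point MLP + max pooling."""
    def __init__(self, dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, dim), nn.ReLU())

    def forward(self, pts):                       # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values    # (B, dim)

class MatchingHead(nn.Module):
    """Lightweight head scoring whether two embeddings match."""
    def __init__(self, dim=128):
        super().__init__()
        self.cls = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, a, b):
        return self.cls(torch.cat([a, b], dim=-1)).squeeze(-1)

encoder, head = SetEncoder(), MatchingHead()
obs_a = torch.randn(1000, 256, 3)  # 1000 pairs of 256-point observations
obs_b = torch.randn(1000, 256, 3)
# Siamese usage: one shared encoder embeds both observations, so
# thousands of pairwise comparisons reduce to cheap head evaluations.
same_object = head(encoder(obs_a), encoder(obs_b)).sigmoid() > 0.5
```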
Related papers
- Towards Global Localization using Multi-Modal Object-Instance Re-Identification [23.764646800085977]
We propose a novel re-identification transformer architecture that integrates multimodal RGB and depth information.
We demonstrate improvements in ReID across scenes that are cluttered or have varying illumination conditions.
We also develop a ReID-based localization framework that enables accurate camera localization and pose identification across different viewpoints.
arXiv Detail & Related papers (2024-09-18T14:15:10Z)
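The entry above gives no fusion details; one plausible reading, sketched minimally below, concatenates RGB and depth patch tokens and encodes them jointly with a transformer. Every name and dimension is a hypothetical choice, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiModalReID(nn.Module):
    """Toy fusion: patchify RGB and depth, encode the joint sequence."""
    def __init__(self, dim=128, patch=16):
        super().__init__()
        self.rgb_proj = nn.Conv2d(3, dim, patch, stride=patch)
        self.depth_proj = nn.Conv2d(1, dim, patch, stride=patch)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, rgb, depth):      # (B, 3, H, W), (B, 1, H, W)
        toks = torch.cat([
            self.rgb_proj(rgb).flatten(2).transpose(1, 2),
            self.depth_proj(depth).flatten(2).transpose(1, 2),
        ], dim=1)
        return self.encoder(toks).mean(dim=1)  # (B, dim) ReID embedding

model = MultiModalReID()
emb = model(torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64))
```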
- PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study of the more challenging high-resolution salient object detection (HRSOD) task from both dataset and network-framework perspectives.
To compensate for the lack of HRSOD dataset, we thoughtfully collect a large-scale high resolution salient object detection dataset, called UHRSD.
All images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z)
- PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest [65.48057241587398]
PoIFusion is a framework that fuses information from RGB images and LiDAR point clouds at points of interest (PoIs).
Our approach maintains the view of each modality and obtains multi-modal features through computation-friendly projection operations.
We conducted extensive experiments on the nuScenes and Argoverse 2 datasets to evaluate our approach.
arXiv Detail & Related papers (2024-03-14T09:28:12Z)
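As a rough sketch of the fusion-at-points-of-interest idea from the entry above: 3D query points are projected into an image feature map and a feature is sampled per point. The projection matrix and all shapes are toy assumptions, not PoIFusion's actual configuration.

```python
import torch
import torch.nn.functional as F

def sample_image_features(feat, pois, cam, img_size):
    """Project 3D points of interest into an image feature map and
    bilinearly sample one feature per point; `cam` is an assumed
    3x4 projection matrix and all shapes are toy choices."""
    B, N, _ = pois.shape
    homo = torch.cat([pois, torch.ones(B, N, 1)], dim=-1)    # (B, N, 4)
    uvz = homo @ cam.transpose(-1, -2)                       # (B, N, 3)
    uv = uvz[..., :2] / uvz[..., 2:3].clamp(min=1e-5)        # pixel coords
    grid = 2 * uv / torch.tensor(img_size, dtype=torch.float) - 1
    out = F.grid_sample(feat, grid.unsqueeze(2), align_corners=False)
    return out.squeeze(-1).transpose(1, 2)                   # (B, N, C)

feat = torch.randn(1, 64, 48, 160)        # image feature map
pois = torch.rand(1, 100, 3) * 40         # 100 points of interest (m)
cam = torch.eye(3, 4).unsqueeze(0) * 100  # toy projection matrix
img_feats = sample_image_features(feat, pois, cam, (160, 48))
# BEV features could be gathered at the same PoIs and summed, keeping
# each modality in its native view.
```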
- PointOBB: Learning Oriented Object Detection via Single Point Supervision [55.88982271340328]
This paper proposes PointOBB, the first single-point-based oriented bounding box (OBB) generation method for oriented object detection.
PointOBB operates through the collaborative utilization of three distinctive views: an original view, a resized view, and a rotated/flipped (rot/flp) view.
Experimental results on the DIOR-R and DOTA-v1.0 datasets demonstrate that PointOBB achieves promising performance.
arXiv Detail & Related papers (2023-11-23T15:51:50Z)
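The three-view scheme in the entry above can be illustrated with a few lines of tensor code; the particular scale and angle below are placeholders, not the paper's settings.

```python
import torch
import torchvision.transforms.functional as TF

def make_views(image, scale=0.5, angle=90.0):
    """Build the three views PointOBB-style training consumes: the
    original, a resized copy, and a rotated/flipped (rot/flp) copy.
    The scale and angle here are placeholders, not the paper's values."""
    h, w = image.shape[-2:]
    resized = TF.resize(image, [int(h * scale), int(w * scale)],
                        antialias=True)
    rot_flp = TF.hflip(TF.rotate(image, angle))
    return image, resized, rot_flp

orig, rsz, rf = make_views(torch.rand(3, 256, 256))
# Consistency between predictions across these views supervises object
# scale and orientation from a single point label.
```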
- Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
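One simplified way to realize the adaptively rotated kernels described above is to resample the kernel weights on a rotated grid. In the paper the angle comes from a learned per-image routing function, which is stubbed out with a fixed value here.

```python
import math
import torch
import torch.nn.functional as F

def rotate_kernel(weight, angle):
    """Resample a conv kernel on a grid rotated by `angle` radians,
    one simplified way to realize an adaptively rotated convolution."""
    out_c, _, k, _ = weight.shape
    cos, sin = math.cos(angle), math.sin(angle)
    theta = torch.tensor([[[cos, -sin, 0.0], [sin, cos, 0.0]]])
    grid = F.affine_grid(theta, [1, 1, k, k], align_corners=False)
    return F.grid_sample(weight, grid.expand(out_c, k, k, 2),
                         align_corners=False)

x = torch.randn(1, 8, 32, 32)
weight = torch.randn(16, 8, 3, 3)
# In ARC the angle is predicted per image by a small routing network;
# a fixed angle stands in for it here.
y = F.conv2d(x, rotate_kernel(weight, 0.4), padding=1)
```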
- I2P-Rec: Recognizing Images on Large-scale Point Cloud Maps through Bird's Eye View Projections [18.7557037030769]
Place recognition is an important technique for autonomous cars to achieve full autonomy.
We propose the I2P-Rec method to solve the problem by transforming the cross-modal data into the same modality.
With only a small set of training data, I2P-Rec achieves Top-1% recall rates above 80% and 90% when localizing monocular and stereo images, respectively, on point cloud maps.
arXiv Detail & Related papers (2023-03-02T07:56:04Z)
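A minimal sketch of the shared-modality idea above: rasterize points into a bird's eye view height image, so point cloud submaps and depth-lifted image points become directly comparable. Ranges and resolution are illustrative, not the paper's.

```python
import numpy as np

def points_to_bev(points, x_range=(0, 51.2), y_range=(-25.6, 25.6),
                  res=0.4):
    """Rasterize a point cloud into a bird's eye view height image;
    ranges and resolution are illustrative, not the paper's."""
    W = int((x_range[1] - x_range[0]) / res)
    H = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((H, W), dtype=np.float32)
    xi = ((points[:, 0] - x_range[0]) / res).astype(int)
    yi = ((points[:, 1] - y_range[0]) / res).astype(int)
    keep = (xi >= 0) & (xi < W) & (yi >= 0) & (yi < H)
    np.maximum.at(bev, (yi[keep], xi[keep]), points[keep, 2])  # max height
    return bev

cloud = np.random.rand(5000, 3) * [50, 40, 3] - [0, 20, 1]
bev = points_to_bev(cloud)
# Depth-lifted image points get the same rasterization, so matching
# happens within a single BEV modality.
```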
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce HitNet, a novel network that refines low-resolution representations with high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
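The iterative feedback loop described above might be caricatured as follows: upsample the coarse features, fuse them with the high-resolution ones, and feed the result back, for a fixed number of steps. The module is a toy stand-in, not HitNet itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedbackRefiner(nn.Module):
    """Toy iterative feedback: coarse features are repeatedly fused
    with high-resolution features and fed back to the coarse branch."""
    def __init__(self, ch=32, steps=3):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)
        self.steps = steps

    def forward(self, low, high):        # (B,C,h,w), (B,C,H,W)
        for _ in range(self.steps):
            up = F.interpolate(low, size=high.shape[-2:],
                               mode='bilinear', align_corners=False)
            high = F.relu(self.fuse(torch.cat([up, high], dim=1)))
            low = F.adaptive_avg_pool2d(high, low.shape[-2:])  # feedback
        return high

refiner = FeedbackRefiner()
out = refiner(torch.randn(1, 32, 22, 22), torch.randn(1, 32, 88, 88))
```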
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which leads to improved layout fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited to multi-object images.
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
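SceneFID, per the entry above, scores object crops rather than whole images. A hedged sketch of the cropping step follows; the box format and crop size are assumptions.

```python
import torch
import torch.nn.functional as F

def object_crops(images, boxes, size=64):
    """Cut out and resize each object given its layout box (x0, y0,
    x1, y1); the box format and crop size are assumptions."""
    crops = []
    for img, (x0, y0, x1, y1) in zip(images, boxes):
        patch = img[:, y0:y1, x0:x1].unsqueeze(0)
        crops.append(F.interpolate(patch, size=(size, size),
                                   mode='bilinear', align_corners=False))
    return torch.cat(crops)

imgs = torch.rand(2, 3, 128, 128)
boxes = [(10, 10, 70, 90), (30, 5, 100, 60)]
crops = object_crops(imgs, boxes)
# SceneFID-style scoring would compute FID statistics over Inception
# features of these crops rather than of the full images.
```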
- 3D Object Detection From LiDAR Data Using Distance Dependent Feature Extraction [7.04185696830272]
This work proposes an improvement for 3D object detectors by taking into account the properties of LiDAR point clouds over distance.
Results show that training separate networks for close-range and long-range objects boosts performance for all KITTI benchmark difficulties.
arXiv Detail & Related papers (2020-03-02T13:16:35Z)
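The close-range/long-range split from the entry above reduces to routing points by radial distance before detection; the 25 m boundary and the two detector names below are assumed for illustration, not taken from the paper.

```python
import numpy as np

def split_by_range(points, boundary=25.0):
    """Route points to close-range or long-range processing by radial
    distance; the 25 m boundary is an assumption, not the paper's."""
    r = np.linalg.norm(points[:, :2], axis=1)
    return points[r <= boundary], points[r > boundary]

cloud = np.random.rand(8000, 3) * [80, 80, 4] - [40, 40, 2]
near, far = split_by_range(cloud)
# detections = close_net(near) + long_net(far), merged with NMS;
# close_net and long_net are hypothetical separately trained detectors.
```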