Pillar-Voxel Fusion Network for 3D Object Detection in Airborne Hyperspectral Point Clouds
- URL: http://arxiv.org/abs/2504.09506v1
- Date: Sun, 13 Apr 2025 10:13:48 GMT
- Title: Pillar-Voxel Fusion Network for 3D Object Detection in Airborne Hyperspectral Point Clouds
- Authors: Yanze Jiang, Yanfeng Gu, Xian Li
- Abstract summary: We propose PiV-AHPC, a 3D object detection network for airborne HPCs. We first develop a pillar-voxel dual-branch encoder, where the former captures spectral and vertical structural features from HPCs to overcome spectral distortion. A multi-level feature fusion mechanism is devised to enhance information interaction between the two branches.
- Score: 35.24778377226701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperspectral point clouds (HPCs) can simultaneously characterize 3D spatial and spectral information of ground objects, offering excellent 3D perception and target recognition capabilities. Current approaches for generating HPCs often involve fusion techniques with hyperspectral images and LiDAR point clouds, which inevitably lead to geometric-spectral distortions due to fusion errors and obstacle occlusions. These adverse effects limit their performance in downstream fine-grained tasks across multiple scenarios, particularly in airborne applications. To address these issues, we propose PiV-AHPC, a 3D object detection network for airborne HPCs. To the best of our knowledge, this is the first attempt at this HPCs task. Specifically, we first develop a pillar-voxel dual-branch encoder, where the former captures spectral and vertical structural features from HPCs to overcome spectral distortion, while the latter emphasizes extracting accurate 3D spatial features from point clouds. A multi-level feature fusion mechanism is devised to enhance information interaction between the two branches, achieving neighborhood feature alignment and channel-adaptive selection, thereby organically integrating heterogeneous features and mitigating geometric distortion. Extensive experiments on two airborne HPCs datasets demonstrate that PiV-AHPC possesses state-of-the-art detection performance and high generalization capability.
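As a rough illustration of the two representations the abstract contrasts, the sketch below groups the same points into voxels (cells in x, y and z) and into pillars (cells in x and y only, so each pillar keeps the full vertical column). It is not the PiV-AHPC implementation: the function names `group_points` and `pillar_summary`, the cell size, the toy points, and the hand-crafted pillar feature (height range plus mean spectrum) are all assumptions made for illustration, whereas the paper describes learned encoders and a fusion mechanism that are not reproduced here.

```python
from collections import defaultdict


def group_points(points, cell=1.0, use_height=True):
    """Group points into voxels (x, y, z cells) or pillars (x, y cells only).

    Each point is (x, y, z, spectrum), where spectrum is a list of band values.
    """
    groups = defaultdict(list)
    for x, y, z, spectrum in points:
        key = (int(x // cell), int(y // cell))
        if use_height:
            # Voxels also split the vertical axis; pillars do not.
            key += (int(z // cell),)
        groups[key].append((x, y, z, spectrum))
    return dict(groups)


def pillar_summary(pillar):
    """Hypothetical hand-crafted pillar feature: height range and mean spectrum."""
    zs = [p[2] for p in pillar]
    bands = list(zip(*(p[3] for p in pillar)))  # transpose to per-band tuples
    mean_spectrum = [sum(b) / len(b) for b in bands]
    return {"z_min": min(zs), "z_max": max(zs), "mean_spectrum": mean_spectrum}


# Toy hyperspectral points with two spectral bands each (made-up values).
points = [
    (0.2, 0.3, 0.1, [0.10, 0.30]),
    (0.4, 0.6, 2.5, [0.30, 0.50]),
    (1.5, 0.2, 0.4, [0.80, 0.20]),
]

voxels = group_points(points, cell=1.0, use_height=True)
pillars = group_points(points, cell=1.0, use_height=False)
summary = pillar_summary(pillars[(0, 0)])
```

In this toy input the first two points share a pillar but fall into different voxels, which is the sense in which a pillar aggregates vertical structure and spectra over a column while a voxel preserves finer 3D geometry.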
Related papers
- Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection [53.2590751089607]
Real-IAD D3 is a high-precision multimodal dataset that incorporates an additional pseudo-3D modality generated through photometric stereo.
We introduce an effective approach that integrates RGB, point cloud, and pseudo-3D depth information to leverage the complementary strengths of each modality.
Our experiments highlight the importance of these modalities in boosting detection robustness and overall IAD performance.
arXiv Detail & Related papers (2025-04-19T08:05:47Z)
- DiffFormer: a Differential Spatial-Spectral Transformer for Hyperspectral Image Classification [3.271106943956333]
Hyperspectral image classification (HSIC) has gained significant attention because of its potential in analyzing high-dimensional data with rich spectral and spatial information.
We propose the Differential Spatial-Spectral Transformer (DiffFormer) to address the inherent challenges of HSIC, such as spectral redundancy and spatial discontinuity.
Experiments on benchmark hyperspectral datasets demonstrate the superiority of DiffFormer in terms of classification accuracy, computational efficiency, and generalizability.
arXiv Detail & Related papers (2024-12-23T07:21:41Z)
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning [45.21394843710758]
High-resolution point cloud (HRPCD) anomaly detection (AD) plays a critical role in precision machining and high-end equipment manufacturing.
Although many 3D-AD methods have been proposed recently, they still cannot meet the requirements of the HRPCD-AD task.
We propose a novel group-level feature-based network, called Group3AD, which has a highly efficient representation ability.
arXiv Detail & Related papers (2024-08-08T17:24:03Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- PointHPS: Cascaded 3D Human Pose and Shape Estimation from Point Clouds [99.60575439926963]
We propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings.
PointHPS iteratively refines point features through a cascaded architecture.
Extensive experiments demonstrate that PointHPS, with its powerful point feature extraction and processing scheme, outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-08-28T11:10:14Z)
- OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection [51.153003057515754]
OPA-3D is a single-stage, end-to-end, Occlusion-Aware Pixel-Wise Aggregation network.
It jointly estimates dense scene depth with depth-bounding box residuals and object bounding boxes.
It outperforms state-of-the-art methods on the main Car category.
arXiv Detail & Related papers (2022-11-02T14:19:13Z)
- Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection [16.198358858773258]
Multi-modal 3D object detection has been an active research topic in autonomous driving.
It is non-trivial to explore the cross-modal feature fusion between sparse 3D points and dense 2D pixels.
Recent approaches either fuse the image features with the point cloud features that are projected onto the 2D image plane or combine the sparse point cloud with dense image pixels.
arXiv Detail & Related papers (2022-10-18T06:15:56Z)
- On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation [56.97699793236174]
We study two kinds of robust cross-view consistency in this paper.
We exploit the temporal coherence in both depth feature space and 3D voxel space for self-supervised monocular depth estimation.
Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-19T03:46:13Z)
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.