StreakNet-Arch: An Anti-scattering Network-based Architecture for Underwater Carrier LiDAR-Radar Imaging
- URL: http://arxiv.org/abs/2404.09158v4
- Date: Wed, 16 Jul 2025 04:00:55 GMT
- Title: StreakNet-Arch: An Anti-scattering Network-based Architecture for Underwater Carrier LiDAR-Radar Imaging
- Authors: Xuelong Li, Hongjun An, Haofei Zhao, Guangying Li, Bo Liu, Xing Wang, Guanghua Cheng, Guojun Wu, Zhe Sun
- Abstract summary: We introduce StreakNet-Arch, a real-time, end-to-end binary-classification framework based on our self-developed Underwater Carrier LiDAR-Radar (UCLR). Under controlled water tank validation conditions, StreakNet-Arch with Self-Attention or DBC-Attention outperforms traditional bandpass filtering. We validate our UCLR system in a South China Sea trial, reaching an error of 46 mm for a 3D target at 1,000 m depth and 20 m range.
- Score: 44.96583097079915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce StreakNet-Arch, a real-time, end-to-end binary-classification framework based on our self-developed Underwater Carrier LiDAR-Radar (UCLR) that embeds Self-Attention and our novel Double Branch Cross Attention (DBC-Attention) to enhance scatter suppression. Under controlled water tank validation conditions, StreakNet-Arch with Self-Attention or DBC-Attention outperforms traditional bandpass filtering and achieves higher $F_1$ scores than learning-based MP networks and CNNs at comparable model size and complexity. Real-time benchmarks on an NVIDIA RTX 3060 show a constant Average Imaging Time (54 to 84 ms) regardless of frame count, versus a linear increase (58 to 1,257 ms) for conventional methods. To facilitate further research, we contribute a publicly available streak-tube camera image dataset containing 2,695,168 real-world underwater 3D point cloud data points. More importantly, we validate our UCLR system in a South China Sea trial, reaching an error of 46 mm for a 3D target at 1,000 m depth and 20 m range. Source code and data are available at https://github.com/BestAnHongjun/StreakNet .
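The abstract casts imaging as end-to-end binary classification of streak-tube returns, with attention used to suppress scatter. Below is a minimal PyTorch sketch of that framing under stated assumptions: a generic double-branch cross-attention block in which two feature branches attend to each other, followed by a pooled two-class head. All names, dimensions, and branch semantics here are hypothetical; the authors' actual DBC-Attention and pipeline are defined in the linked repository.

```python
import torch
import torch.nn as nn

class DoubleBranchCrossAttention(nn.Module):
    """Hypothetical stand-in for DBC-Attention: each branch cross-attends
    to the other, with residual connections and layer norm."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn_a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_a = nn.LayerNorm(dim)
        self.norm_b = nn.LayerNorm(dim)

    def forward(self, a, b):
        a2, _ = self.attn_a(a, b, b)      # branch A queries branch B
        b2, _ = self.attn_b(b, a, a)      # branch B queries branch A
        return self.norm_a(a + a2), self.norm_b(b + b2)

class BinaryEchoClassifier(nn.Module):
    """End-to-end signal-vs-background classifier over time-sampled returns,
    mirroring the binary-classification framing of the abstract."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.embed = nn.Linear(1, dim)    # per-time-sample embedding
        self.dbc = DoubleBranchCrossAttention(dim)
        self.head = nn.Linear(dim, 2)     # logits: signal vs. scatter

    def forward(self, echo, template):
        a, b = self.dbc(self.embed(echo), self.embed(template))
        return self.head(a.mean(dim=1))   # pool over time, classify

# Dummy usage: batch of 8 echoes and reference templates, 512 samples each.
logits = BinaryEchoClassifier()(torch.randn(8, 512, 1), torch.randn(8, 512, 1))
```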
Related papers
- PAN: Pillars-Attention-Based Network for 3D Object Detection [3.3274570204477922]
This work presents a novel 3D object detection algorithm using cameras and radars in the bird's-eye view (BEV).
Our algorithm exploits the advantages of radar before fusing the features into a detection head.
A new backbone is introduced, which maps the radar pillar features into an embedded dimension.
arXiv Detail & Related papers (2025-09-19T12:40:49Z)
- Towards Foundational Models for Single-Chip Radar [49.896124982717716]
mmWave radars are compact, inexpensive, and durable sensors that work regardless of environmental conditions, such as weather and darkness.
This comes at the cost of poor angular resolution, especially for inexpensive single-chip radars, which are typically used in automotive and indoor sensing applications.
We train a foundational model for 4D single-chip radar, which can predict 3D occupancy and semantic segmentation with quality that is typically only possible with much higher resolution sensors.
arXiv Detail & Related papers (2025-09-15T22:06:17Z)
- Cross3DReg: Towards a Large-scale Real-world Cross-source Point Cloud Registration Benchmark [57.42211080221526]
Cross-source point cloud registration, which aims to align point cloud data from different sensors, is a fundamental task in 3D vision.
The lack of publicly available large-scale real-world datasets for training deep registration models, and the inherent differences in point clouds captured by multiple sensors, pose challenges.
We construct Cross3DReg, currently the largest real-world multi-modal cross-source point cloud registration dataset.
A visual-geometric attention guided matching module is proposed to enhance the consistency of cross-source point cloud features.
arXiv Detail & Related papers (2025-09-08T09:01:13Z)
- You Sense Only Once Beneath: Ultra-Light Real-Time Underwater Object Detection [2.5249064981269296]
We propose an Ultra-Light Real-Time Underwater Object Detection framework, You Sense Only Once Beneath (YSOOB).
Specifically, we utilize a Multi-Spectrum Wavelet (MSWE) to perform frequency-domain encoding on the input image, minimizing the semantic loss caused by underwater optical color distortion (see the sketch after this entry).
We also eliminate model redundancy through a simple yet effective channel compression and a reconstructed large kernel convolution (RLKC) to achieve a lightweight model.
arXiv Detail & Related papers (2025-04-22T08:26:35Z)
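For context on the frequency-domain encoding idea, here is a minimal NumPy sketch of a one-level 2D Haar wavelet transform, which splits an image into one low-frequency approximation and three detail bands. YSOOB's actual MSWE module is more elaborate; this decomposition is only an illustrative stand-in.

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """One-level 2D Haar transform (img must have even height and width):
    returns the low-frequency approximation plus three detail bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2    # row pairs: average
    d = (img[0::2, :] - img[1::2, :]) / 2    # row pairs: difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2       # approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2       # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2       # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2       # diagonal detail
    return ll, lh, hl, hh

bands = haar_dwt2(np.random.rand(64, 64))    # four (32, 32) bands
```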
- Image-Goal Navigation Using Refined Feature Guidance and Scene Graph Enhancement [28.716326030924474]
In this paper, we introduce a novel image-goal navigation approach, named RFSG.
Our focus lies in leveraging the fine-grained connections between goals, observations, and the environment within limited image data.
We propose a spatial-channel attention mechanism, enabling the network to learn the importance of multi-dimensional features when fusing the goal and observation features (see the sketch after this entry).
arXiv Detail & Related papers (2025-03-14T01:15:24Z)
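A minimal PyTorch sketch of one plausible spatial-channel attention fusion, assuming CBAM-style gating over a naive sum of the goal and observation features; the module name, gating order, and fusion rule are hypothetical, not RFSG's actual design.

```python
import torch
import torch.nn as nn

class SpatialChannelFusion(nn.Module):
    """Hypothetical sketch: weight fused goal/observation features along the
    channel axis, then along the spatial axes (CBAM-style gating)."""
    def __init__(self, channels: int = 64, reduction: int = 8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, goal: torch.Tensor, obs: torch.Tensor) -> torch.Tensor:
        x = goal + obs                        # naive fusion of the two streams
        x = x * self.channel_gate(x)          # which channels matter
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial_gate(pooled)  # where to look

fused = SpatialChannelFusion()(torch.rand(2, 64, 32, 32), torch.rand(2, 64, 32, 32))
```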
- RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection [68.99784784185019]
Poor lighting or adverse weather conditions degrade camera performance.
Radar suffers from noise and positional ambiguity.
We propose RobuRCDet, a robust object detection model in BEV.
arXiv Detail & Related papers (2025-02-18T17:17:38Z)
- Blind Underwater Image Restoration using Co-Operational Regressor Networks [15.853520058218042]
We propose a novel machine learning model, Co-Operational Regressor Networks (CoRe-Nets).
A CoRe-Net consists of two co-operating networks: the Apprentice Regressor (AR), responsible for image transformation, and the Master Regressor (MR), which evaluates the Peak Signal-to-Noise Ratio (PSNR) of the images generated by the AR and feeds it back to the AR (see the sketch after this entry).
Our results and the optimized PyTorch implementation of the proposed approach are now publicly shared on GitHub.
arXiv Detail & Related papers (2024-12-05T09:15:21Z)
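A heavily simplified PyTorch sketch of the co-operation loop described above, assuming the MR is a learned PSNR predictor whose score the AR ascends; the architectures, optimizer, and schedule are placeholders, and the MR's own supervised fitting to true PSNR values is omitted.

```python
import torch
import torch.nn as nn

# Apprentice Regressor (AR): restores the degraded image.
ar = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                   nn.Conv2d(16, 3, 3, padding=1))
# Master Regressor (MR): predicts the PSNR of a restored image.
# (Its own supervised fitting to true PSNR values is omitted here.)
mr = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                   nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

opt_ar = torch.optim.Adam(ar.parameters(), lr=1e-4)
degraded = torch.rand(4, 3, 64, 64)        # dummy underwater batch

opt_ar.zero_grad()
restored = ar(degraded)
predicted_psnr = mr(restored).mean()       # MR scores AR's output
(-predicted_psnr).backward()               # AR updated to raise the score
opt_ar.step()
```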
- V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection [64.93675471780209]
We present V2X-R, the first simulated V2X dataset incorporating LiDAR, camera, and 4D radar.
V2X-R contains 12,079 scenarios with 37,727 frames of LiDAR and 4D radar point clouds, 150,908 images, and 170,859 annotated 3D vehicle bounding boxes.
We propose a novel cooperative LiDAR-4D radar fusion pipeline for 3D object detection and implement it with various fusion strategies.
arXiv Detail & Related papers (2024-11-13T07:41:47Z)
- Sparse Multi-baseline SAR Cross-modal 3D Reconstruction of Vehicle Targets [5.6680936716261705]
We propose a Cross-Modal Reconstruction Network (CMR-Net), which integrates differentiable rendering and cross-modal supervision with optical images.
CMR-Net, trained solely on simulated data, demonstrates high-resolution reconstruction capabilities on both publicly available simulation datasets and real measured datasets.
arXiv Detail & Related papers (2024-06-06T15:18:59Z)
- Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gain across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space (see the sketch after this entry).
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
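A minimal sketch of one way to realize dynamic pseudo-label updating as described above: blend the current prediction into the stored pseudo-label so the supervision target, and hence the gradient, evolves during training. The momentum rule here is an assumption, not DGNet's published update.

```python
import torch

def update_pseudo_label(pseudo: torch.Tensor,
                        prediction: torch.Tensor,
                        momentum: float = 0.9) -> torch.Tensor:
    """Hypothetical EMA-style update: the pseudo-label drifts toward the
    network's current prediction, changing the target the loss sees."""
    return momentum * pseudo + (1.0 - momentum) * prediction.detach()

pseudo = torch.rand(1, 3, 64, 64)        # initial pseudo-label
prediction = torch.rand(1, 3, 64, 64)    # current network output
pseudo = update_pseudo_label(pseudo, prediction)
```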
- CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception [20.824179713013734]
We propose Camera Radar Net (CRN), a novel camera-radar fusion framework.
CRN generates a semantically rich and spatially accurate bird's-eye-view (BEV) feature map for various tasks.
CRN in a real-time setting operates at 20 FPS while achieving performance comparable to LiDAR detectors on nuScenes.
arXiv Detail & Related papers (2023-04-03T00:47:37Z)
- Ultra-low Power Deep Learning-based Monocular Relative Localization Onboard Nano-quadrotors [64.68349896377629]
This work presents a novel autonomous end-to-end system that addresses the monocular relative localization of two peer nano-drones through deep neural networks (DNNs).
To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated framework, including dataset augmentation, quantization, and system optimizations.
Experimental results show that our DNN can precisely localize a 10 cm target nano-drone using only low-resolution monochrome images, at distances of up to 2 m.
arXiv Detail & Related papers (2023-03-03T14:14:08Z)
- Unpaired Overwater Image Defogging Using Prior Map Guided CycleGAN [60.257791714663725]
We propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging images of overwater scenes.
The proposed method outperforms the state-of-the-art supervised, semi-supervised, and unsupervised defogging approaches.
arXiv Detail & Related papers (2022-12-23T03:00:28Z)
- K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions [9.705678194028895]
KAIST-Radar is a novel large-scale object detection dataset and benchmark.
It contains 35K frames of 4D Radar tensor (4DRT) data with power measurements along the Doppler, range, azimuth, and elevation dimensions.
We provide auxiliary measurements from carefully calibrated high-resolution Lidars, surround stereo cameras, and RTK-GPS.
arXiv Detail & Related papers (2022-06-16T13:39:21Z)
- Planetary UAV localization based on Multi-modal Registration with Pre-existing Digital Terrain Model [0.5156484100374058]
We propose a multi-modal registration based SLAM algorithm, which estimates the location of a planetary UAV using a nadir-view camera on the UAV.
To overcome the scale and appearance difference between on-board UAV images and the pre-installed digital terrain model, a theoretical model is proposed to prove that topographic features of the UAV image and the DEM can be correlated in the frequency domain via the cross power spectrum (see the sketch after this entry).
To test the robustness and effectiveness of the proposed localization algorithm, a new cross-source drone-based localization dataset for planetary exploration is proposed.
arXiv Detail & Related papers (2021-06-24T02:54:01Z)
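The frequency-domain correlation the abstract refers to is classic phase correlation. A minimal NumPy sketch, assuming a pure-translation model between the two inputs:

```python
import numpy as np

def phase_correlation(img_a: np.ndarray, img_b: np.ndarray):
    """Estimate the 2D translation between two images from the normalized
    cross power spectrum (classic phase correlation)."""
    cross = np.fft.fft2(img_a) * np.conj(np.fft.fft2(img_b))
    cross /= np.abs(cross) + 1e-12            # discard magnitude, keep phase
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = img_a.shape
    return (dy - h if dy > h // 2 else dy,    # wrap to signed shifts
            dx - w if dx > w // 2 else dx)

a = np.random.rand(128, 128)
b = np.roll(a, shift=(5, -9), axis=(0, 1))    # b is a shifted by (5, -9)
print(phase_correlation(b, a))                # -> (5, -9)
```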
- Wavelength-based Attributed Deep Neural Network for Underwater Image Restoration [9.378355457555319]
This paper shows that attributing the right receptive field size (context) based on the traversing range of each color channel may lead to a substantial performance gain (see the sketch after this entry).
As a second novelty, we have incorporated an attentive skip mechanism to adaptively refine the learned multi-contextual features.
The proposed framework, called Deep WaveNet, is optimized using the traditional pixel-wise and feature-based cost functions.
arXiv Detail & Related papers (2021-06-15T06:47:51Z)
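A minimal PyTorch sketch of the per-wavelength receptive field idea: one convolution branch per color channel, each with its own kernel size. The particular kernel-to-channel assignment below is illustrative only; Deep WaveNet's actual configuration is given in the paper.

```python
import torch
import torch.nn as nn

class WavelengthBranches(nn.Module):
    """Hypothetical sketch: a separate conv branch per color channel, each
    with a different kernel size (receptive field / context)."""
    def __init__(self, feats: int = 16):
        super().__init__()
        self.red   = nn.Conv2d(1, feats, kernel_size=3, padding=1)
        self.green = nn.Conv2d(1, feats, kernel_size=5, padding=2)
        self.blue  = nn.Conv2d(1, feats, kernel_size=7, padding=3)

    def forward(self, x):                     # x: (B, 3, H, W) RGB image
        r, g, b = x[:, 0:1], x[:, 1:2], x[:, 2:3]
        return torch.cat([self.red(r), self.green(g), self.blue(b)], dim=1)

features = WavelengthBranches()(torch.rand(2, 3, 64, 64))   # (2, 48, 64, 64)
```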
- Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network [80.67717076541956]
Under-Display Camera (UDC) systems provide a true bezel-less and notch-free viewing experience on smartphones.
In a typical UDC system, the pixel array attenuates and diffracts the incident light on the camera, resulting in significant image quality degradation.
In this work, we aim to analyze and tackle the aforementioned degradation problems.
arXiv Detail & Related papers (2021-04-19T18:41:45Z)
- Non-local Channel Aggregation Network for Single Image Rain Removal [3.7679182997120066]
We propose a non-local channel aggregation network (NCANet) to address the single image rain removal problem.
NCANet models 2D rainy images as sequences of vectors in three directions, namely the vertical, transverse, and channel directions.
Recurrently aggregating information from all three directions enables our model to capture long-range dependencies across both channels and spatial locations.
arXiv Detail & Related papers (2021-03-03T15:57:37Z)
- Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs).
A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships.
We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)
- Lightweight Single-Image Super-Resolution Network with Attentive Auxiliary Feature Learning [73.75457731689858]
We develop a computation-efficient yet accurate network based on the proposed attentive auxiliary features (A$^2$F) for SISR.
Experimental results on large-scale datasets demonstrate the effectiveness of the proposed model against state-of-the-art (SOTA) SR methods.
arXiv Detail & Related papers (2020-11-13T06:01:46Z)
- Radar+RGB Attentive Fusion for Robust Object Detection in Autonomous Vehicles [0.5801044612920815]
The proposed architecture aims to use radar signal data along with RGB camera images to form a robust detection network.
BIRANet yields 72.3/75.3% average AP/AR on the NuScenes dataset.
RANet gives 69.6/71.9% average AP/AR on the same dataset, which is acceptable performance.
arXiv Detail & Related papers (2020-08-31T14:27:02Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [62.34374949726333]
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs (see the sketch after this entry).
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
arXiv Detail & Related papers (2020-04-07T02:18:38Z)
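The depth-map-to-point-cloud conversion at the heart of PL is standard pinhole back-projection. A minimal NumPy sketch, assuming known camera intrinsics (fx, fy, cx, cy); the numbers in the usage line are placeholders.

```python
import numpy as np

def depth_to_pseudo_lidar(depth: np.ndarray,
                          fx: float, fy: float,
                          cx: float, cy: float) -> np.ndarray:
    """Back-project a depth map (H, W) into an (N, 3) point cloud with the
    pinhole model: pixel (u, v) with depth z maps to
    ((u - cx) * z / fx, (v - cy) * z / fy, z)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

points = depth_to_pseudo_lidar(np.random.rand(96, 320) * 80.0,
                               fx=720.0, fy=720.0, cx=160.0, cy=48.0)
```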
- RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation [0.0]
We train a deep neural network that performs both the image formation and image processing tasks, integrating the SAR processing pipeline.
Results show that our integrated pipeline can output accurately classified SAR imagery with image quality comparable to those formed using a traditional algorithm.
arXiv Detail & Related papers (2020-01-22T18:44:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.