SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater
Robots
- URL: http://arxiv.org/abs/2011.06252v2
- Date: Thu, 14 Apr 2022 15:51:39 GMT
- Title: SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater
Robots
- Authors: Md Jahidul Islam, Ruobing Wang and Junaed Sattar
- Abstract summary: This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots.
Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images.
- Score: 16.242924916178282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a holistic approach to saliency-guided visual attention
modeling (SVAM) for use by autonomous underwater robots. Our proposed model,
named SVAM-Net, integrates deep visual features at various scales and semantics
for effective salient object detection (SOD) in natural underwater images. The
SVAM-Net architecture is configured in a unique way to jointly accommodate
bottom-up and top-down learning within two separate branches of the network
while sharing the same encoding layers. We design dedicated spatial attention
modules (SAMs) along these learning pathways to exploit the coarse-level and
fine-level semantic features for SOD at four stages of abstraction. The
bottom-up branch performs a rough yet reasonably accurate saliency estimation
at a fast rate, whereas the deeper top-down branch incorporates a residual
refinement module (RRM) that provides fine-grained localization of the salient
objects. Extensive performance evaluation of SVAM-Net on benchmark datasets
clearly demonstrates its effectiveness for underwater SOD. We also validate its
generalization performance on data from several oceanic trials, which include
test images of diverse underwater scenes and waterbodies as well as images of
unseen natural objects. Moreover, we analyze its computational feasibility for
robotic deployments and demonstrate its utility in several important use cases
of visual attention modeling.
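As a reading aid, the snippet below is a minimal, illustrative PyTorch sketch of the layout the abstract describes: a shared encoder, spatial attention modules (SAMs) applied at four stages of abstraction, a fast bottom-up head that produces a coarse saliency estimate, and a deeper top-down branch whose output passes through a residual refinement module (RRM). Every channel width, kernel size, and module internal here is an assumption made for illustration; this is not the authors' SVAM-Net implementation.

```python
# Illustrative two-branch saliency model with a shared encoder (assumed layout,
# not the authors' code). Channel widths and module internals are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Assumed SAM: reweights features with a single-channel spatial attention map."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=7, padding=3)

    def forward(self, x):
        return x * torch.sigmoid(self.attn(x))


def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class SVAMNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        chs = [32, 64, 128, 256]  # illustrative channel widths
        self.encoder = nn.ModuleList(
            [conv_block(3 if i == 0 else chs[i - 1], chs[i]) for i in range(4)]
        )
        self.sams = nn.ModuleList([SpatialAttention(c) for c in chs])
        # Bottom-up head: coarse but fast saliency estimate from mid-level features.
        self.bottom_up_head = nn.Conv2d(chs[1], 1, kernel_size=1)
        # Top-down branch: decodes the deepest features, then refines residually (RRM).
        self.top_down_head = nn.Conv2d(chs[3], 1, kernel_size=1)
        self.rrm = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        size = x.shape[-2:]
        feats = []
        for stage, sam in zip(self.encoder, self.sams):
            x = sam(stage(x))  # attention applied at each abstraction stage
            feats.append(x)
        coarse = F.interpolate(self.bottom_up_head(feats[1]), size=size,
                               mode="bilinear", align_corners=False)
        fine = F.interpolate(self.top_down_head(feats[3]), size=size,
                             mode="bilinear", align_corners=False)
        fine = fine + self.rrm(fine)  # residual refinement of the top-down output
        return torch.sigmoid(coarse), torch.sigmoid(fine)


# Usage: two saliency maps, a fast coarse one and a refined one.
model = SVAMNetSketch().eval()
with torch.no_grad():
    coarse_map, fine_map = model(torch.rand(1, 3, 256, 256))
print(coarse_map.shape, fine_map.shape)  # both torch.Size([1, 1, 256, 256])
```

The two outputs mirror the split in the abstract: the bottom-up map is cheap to compute from mid-level features, while the top-down map is refined residually for finer localization of salient objects.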
Related papers
- UW-SDF: Exploiting Hybrid Geometric Priors for Neural SDF Reconstruction from Underwater Multi-view Monocular Images [63.32490897641344]
We propose a framework for reconstructing target objects from multi-view underwater images based on neural SDF.
We introduce hybrid geometric priors to optimize the reconstruction process, markedly enhancing the quality and efficiency of neural SDF reconstruction.
arXiv Detail & Related papers (2024-10-10T16:33:56Z) - FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation [65.01601309903971]
We introduce FAFA, a Frequency-Aware Flow-Aided self-supervised framework for 6D pose estimation of unmanned underwater vehicles (UUVs).
Our framework relies solely on the 3D model and RGB images, alleviating the need for real pose annotations or other-modality data such as depth.
We evaluate the effectiveness of FAFA on common underwater object pose benchmarks and showcase significant performance improvements compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-09-25T03:54:01Z) - On Vision Transformers for Classification Tasks in Side-Scan Sonar Imagery [0.0]
Side-scan sonar (SSS) imagery presents unique challenges in the classification of man-made objects on the seafloor.
This paper rigorously compares the performance of ViT models alongside commonly used CNN architectures for binary classification tasks in SSS imagery.
ViT-based models exhibit superior classification performance across F1-score, precision, recall, and accuracy metrics (a short sketch of how these metrics are computed appears after this list).
arXiv Detail & Related papers (2024-09-18T14:36:50Z) - Introducing VaDA: Novel Image Segmentation Model for Maritime Object Segmentation Using New Dataset [3.468621550644668]
The maritime shipping industry is undergoing rapid evolution driven by advancements in computer vision and artificial intelligence (AI).
However, object recognition in maritime environments faces challenges such as light reflection, interference, intense lighting, and varying weather conditions.
Existing AI recognition models and datasets have limited suitability for composing autonomous navigation systems.
arXiv Detail & Related papers (2024-07-12T05:48:53Z) - Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset [60.14089302022989]
Underwater vision tasks often suffer from low segmentation accuracy due to the complex underwater circumstances.
We construct the first large-scale underwater salient instance segmentation dataset (USIS10K).
We propose an Underwater Salient Instance Segmentation architecture based on the Segment Anything Model (USIS-SAM) specifically for the underwater domain.
arXiv Detail & Related papers (2024-06-10T06:17:33Z) - Semantic-aware Texture-Structure Feature Collaboration for Underwater
Image Enhancement [58.075720488942125]
Underwater image enhancement has become an attractive topic as a significant technology in marine engineering and aquatic robotics.
We develop an efficient and compact enhancement network in collaboration with a high-level semantic-aware pretrained model.
We also apply the proposed algorithm to the underwater salient object detection task to reveal the favorable semantic-aware ability for high-level vision tasks.
arXiv Detail & Related papers (2022-11-19T07:50:34Z) - Multi-Object Tracking with Deep Learning Ensemble for Unmanned Aerial
System Applications [0.0]
Multi-object tracking (MOT) is a crucial component of situational awareness in military defense applications.
We present a robust object tracking architecture aimed to accommodate for the noise in real-time situations.
We propose a kinematic prediction model, called Deep Extended Kalman Filter (DeepEKF), in which a sequence-to-sequence architecture is used to predict entity trajectories in latent space.
arXiv Detail & Related papers (2021-10-05T13:50:38Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z) - Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection
Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z) - Semantic Segmentation of Underwater Imagery: Dataset and Benchmark [13.456412091502527]
We present the first large-scale dataset for semantic analysis of Underwater IMagery (SUIM).
It contains over 1500 images with pixel annotations for eight object categories: fish (vertebrates), reefs (invertebrates), aquatic plants, wrecks/ruins, human divers, robots, and sea-floor.
We also present a benchmark evaluation of state-of-the-art semantic segmentation approaches based on standard performance metrics.
arXiv Detail & Related papers (2020-04-02T19:53:14Z)
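The side-scan sonar entry above compares ViT and CNN classifiers across F1-score, precision, recall, and accuracy. As a quick, self-contained reminder of how these four metrics relate to a binary confusion matrix, here is a small Python sketch; the labels and predictions are made-up illustrative values, not results from any of the papers listed.

```python
# Binary-classification metrics from a confusion matrix (illustrative values only).
def binary_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Example with made-up labels: tp=2, fp=1, fn=1, tn=2 -> all four metrics equal 2/3.
print(binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1]))
```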
This list was automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.