Uncertainty Guided Refinement for Fine-Grained Salient Object Detection
- URL: http://arxiv.org/abs/2504.09666v1
- Date: Sun, 13 Apr 2025 17:34:37 GMT
- Title: Uncertainty Guided Refinement for Fine-Grained Salient Object Detection
- Authors: Yao Yuan, Pan Gao, Qun Dai, Jie Qin, Wei Xiang
- Abstract summary: We introduce the uncertainty guidance learning approach to salient object detection (SOD). We design a novel Uncertainty Guided Refinement Attention Network (UGRAN). Experiments on seven benchmark datasets demonstrate the superiority of the proposed UGRAN over the state-of-the-art methodologies.
- Score: 35.6505339733396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, salient object detection (SOD) methods have achieved impressive performance. However, salient regions predicted by existing methods usually contain unsaturated regions and shadows, which limits the model for reliable fine-grained predictions. To address this, we introduce the uncertainty guidance learning approach to SOD, intended to enhance the model's perception of uncertain regions. Specifically, we design a novel Uncertainty Guided Refinement Attention Network (UGRAN), which incorporates three important components, i.e., the Multilevel Interaction Attention (MIA) module, the Scale Spatial-Consistent Attention (SSCA) module, and the Uncertainty Refinement Attention (URA) module. Unlike conventional methods dedicated to enhancing features, the proposed MIA facilitates the interaction and perception of multilevel features, leveraging the complementary characteristics among multilevel features. Then, through the proposed SSCA, the salient information across diverse scales within the aggregated features can be integrated more comprehensively and integrally. In the subsequent steps, we utilize the uncertainty map generated from the saliency prediction map to enhance the model's perception capability of uncertain regions, generating a highly-saturated fine-grained saliency prediction map. Additionally, we devise an adaptive dynamic partition (ADP) mechanism to minimize the computational overhead of the URA module and improve the utilization of uncertainty guidance. Experiments on seven benchmark datasets demonstrate the superiority of the proposed UGRAN over the state-of-the-art methodologies. Codes will be released at https://github.com/I2-Multimedia-Lab/UGRAN.
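The abstract says that an uncertainty map is generated from the saliency prediction map, but it does not give the formula. As a minimal sketch, assuming a per-pixel saliency probability p in [0, 1], one plausible choice is u = 1 - |2p - 1|, which is highest where the prediction is near 0.5 (unsaturated) and zero where it is fully confident. The function name `uncertainty_map` and this formula are illustrative assumptions, not the paper's confirmed definition.

```python
from typing import List


def uncertainty_map(saliency: List[List[float]]) -> List[List[float]]:
    """Convert a saliency probability map into an uncertainty map.

    Assumed formula (not taken from the paper): u = 1 - |2p - 1|.
    A pixel with p = 0.5 gets uncertainty 1; p = 0 or p = 1 gets 0.
    """
    result = []
    for row in saliency:
        out_row = []
        for p in row:
            # Clamp to [0, 1] so out-of-range predictions stay well defined.
            p = min(max(p, 0.0), 1.0)
            out_row.append(1.0 - abs(2.0 * p - 1.0))
        result.append(out_row)
    return result


# Toy 2x2 saliency map: confident background, ambiguous pixel,
# fairly confident foreground, fully confident foreground.
saliency = [[0.0, 0.5], [0.75, 1.0]]
uncertainty = uncertainty_map(saliency)
```

In this toy input the ambiguous pixel (0.5) receives the largest uncertainty, so a refinement stage guided by such a map would concentrate on exactly the unsaturated regions the abstract describes.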
Related papers
- IoUCert: Robustness Verification for Anchor-based Object Detectors [58.35703549470485]
We introduce IoUCert, a novel formal verification framework designed specifically to overcome these bottlenecks in anchor-based object detection architectures.
We show that our method enables the robustness verification of realistic, anchor-based models including SSD, YOLOv2, and YOLOv3 variants against various input perturbations.
arXiv Detail & Related papers (2026-03-03T14:36:46Z)
- Forward Consistency Learning with Gated Context Aggregation for Video Anomaly Detection [17.79982215633934]
Video anomaly detection (VAD) aims to measure deviations from normal patterns for various events in real-time surveillance systems.
Most existing VAD methods rely on large-scale models to pursue extreme accuracy, limiting their feasibility on resource-limited edge devices.
We introduce FoGA, a lightweight VAD model that performs Forward consistency learning with Gated context aggregation.
arXiv Detail & Related papers (2026-01-26T04:35:31Z)
- Scale-aware Adaptive Supervised Network with Limited Medical Annotations [17.42211316792232]
SASNet is a dual-branch architecture that leverages both low-level and high-level feature representations through novel scale-aware adaptive reweight mechanisms.
Our approach introduces three key methodological innovations, including the Scale-aware Adaptive Reweight strategy.
SASNet achieves superior performance with limited labeled data, surpassing state-of-the-art semi-supervised methods.
arXiv Detail & Related papers (2026-01-02T23:55:17Z)
- Adaptive Dual Uncertainty Optimization: Boosting Monocular 3D Object Detection under Test-Time Shifts [80.32933059529135]
Test-Time Adaptation (TTA) methods have emerged to adapt to target distributions during inference.
We propose Dual Uncertainty Optimization (DUO), the first TTA framework designed to jointly minimize both uncertainties for robust M3OD.
In parallel, we design a semantic-aware normal field constraint that preserves geometric coherence in regions with clear semantic cues.
arXiv Detail & Related papers (2025-08-28T07:09:21Z)
- Graph-Based Uncertainty Modeling and Multimodal Fusion for Salient Object Detection [12.743278093269325]
We propose a dynamic uncertainty propagation and multimodal collaborative reasoning network (DUP-MCRNet).
DUGC is designed to propagate uncertainty between layers through a sparse graph constructed based on spatial semantic distance.
MCF uses learnable modality gating weights to fuse the attention maps of RGB, depth, and edge features.
arXiv Detail & Related papers (2025-08-28T04:31:48Z)
- Beyond Fully Supervised Pixel Annotations: Scribble-Driven Weakly-Supervised Framework for Image Manipulation Localization [11.10178274806454]
We propose a form of weak supervision that improves the annotation efficiency and detection performance.
We re-annotated mainstream IML datasets with scribble labels and propose the first scribble-based IML dataset.
We employ self-supervised training with a structural consistency loss to encourage the model to produce consistent predictions.
arXiv Detail & Related papers (2025-07-17T11:45:27Z)
- Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods.
We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model.
We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z)
- UAHOI: Uncertainty-aware Robust Interaction Learning for HOI Detection [18.25576487115016]
This paper focuses on Human-Object Interaction (HOI) detection.
It addresses the challenge of identifying and understanding the interactions between humans and objects within a given image or video frame.
We propose a novel approach, UAHOI (Uncertainty-aware Robust Human-Object Interaction Learning).
arXiv Detail & Related papers (2024-08-14T10:06:39Z)
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- Inter- and intra-uncertainty based feature aggregation model for semi-supervised histopathology image segmentation [21.973620376753594]
Hierarchical prediction uncertainty within the student model (intra-uncertainty) and image prediction uncertainty (inter-uncertainty) have not been fully utilized by existing methods.
We propose a novel inter- and intra-uncertainty regularization method to measure and constrain both inter- and intra-inconsistencies in the teacher-student architecture.
We also propose a new two-stage network with pseudo-mask guided feature aggregation (PG-FANet) as the segmentation model.
arXiv Detail & Related papers (2024-03-19T14:32:21Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
- Spatial Attention-based Distribution Integration Network for Human Pose Estimation [0.8052382324386398]
We present the Spatial Attention-based Distribution Integration Network (SADI-NET) to improve the accuracy of localization.
Our network consists of three efficient models: the receptive fortified module (RFM), spatial fusion module (SFM), and distribution learning module (DLM).
Our model obtained a remarkable 92.10% accuracy on the MPII test dataset, demonstrating significant improvements over existing models and establishing state-of-the-art performance.
arXiv Detail & Related papers (2023-11-09T12:43:01Z)
- Mutual Information-calibrated Conformal Feature Fusion for Uncertainty-Aware Multimodal 3D Object Detection at the Edge [1.7898305876314982]
Three-dimensional (3D) object detection, a critical robotics operation, has seen significant advancements.
Our study integrates the principles of conformal inference with information theoretic measures to perform lightweight, Monte Carlo-free uncertainty estimation.
The framework demonstrates comparable or better performance in KITTI 3D object detection benchmarks to similar methods that are not uncertainty-aware.
arXiv Detail & Related papers (2023-09-18T09:02:44Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Boundary-semantic collaborative guidance network with dual-stream feedback mechanism for salient object detection in optical remote sensing imagery [22.21644705244091]
We propose boundary-semantic collaborative guidance network (BSCGNet) with dual-stream feedback mechanism.
BSCGNet exhibits distinct advantages in challenging scenarios and outperforms the 17 state-of-the-art (SOTA) approaches proposed in recent years.
arXiv Detail & Related papers (2023-03-06T03:36:06Z)
- Collaborative Uncertainty Benefits Multi-Agent Multi-Modal Trajectory Forecasting [61.02295959343446]
This work first proposes a novel concept, collaborative uncertainty (CU), which models the uncertainty resulting from interaction modules.
We build a general CU-aware regression framework with an original permutation-equivariant uncertainty estimator to do both tasks of regression and uncertainty estimation.
We apply the proposed framework to current SOTA multi-agent trajectory forecasting systems as a plugin module.
arXiv Detail & Related papers (2022-07-11T21:17:41Z)
- Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)
- CRAUM-Net: Contextual Recursive Attention with Uncertainty Modeling for Salient Object Detection [0.0]
We present a novel framework that integrates multi-scale context aggregation, advanced attention mechanisms, and an uncertainty-aware module for improved SOD performance.
Our Adaptive Cross-Scale Context Module effectively fuses features from multiple levels, leveraging Recursive Channel Spatial Attention and Convolutional Block Attention.
To train our network robustly, we employ a combination of boundary-sensitive and topology-preserving loss functions, including Boundary IoU, Focal Tversky, and Topological Saliency losses.
arXiv Detail & Related papers (2020-06-04T18:33:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.