Assisted Refinement Network Based on Channel Information Interaction for Camouflaged and Salient Object Detection
- URL: http://arxiv.org/abs/2512.11369v1
- Date: Fri, 12 Dec 2025 08:29:00 GMT
- Title: Assisted Refinement Network Based on Channel Information Interaction for Camouflaged and Salient Object Detection
- Authors: Kuan Wang, Yanjun Qin, Mengge Lu, Liejun Wang, Xiaoming Tao
- Abstract summary: Camouflaged Object Detection (COD) stands as a significant challenge in computer vision, dedicated to identifying and segmenting objects visually highly integrated with their backgrounds. Current mainstream methods have made progress in cross-layer feature fusion, but two critical issues persist during the decoding stage. The first is insufficient cross-channel information interaction within the same-layer features, limiting feature expressiveness. The second is the inability to effectively co-model boundary and region information, making it difficult to accurately reconstruct complete regions and sharp boundaries of objects.
- Score: 30.393796834241794
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Camouflaged Object Detection (COD) stands as a significant challenge in computer vision, dedicated to identifying and segmenting objects visually highly integrated with their backgrounds. Current mainstream methods have made progress in cross-layer feature fusion, but two critical issues persist during the decoding stage. The first is insufficient cross-channel information interaction within the same-layer features, limiting feature expressiveness. The second is the inability to effectively co-model boundary and region information, making it difficult to accurately reconstruct complete regions and sharp boundaries of objects. To address the first issue, we propose the Channel Information Interaction Module (CIIM), which introduces a horizontal-vertical integration mechanism in the channel dimension. This module performs feature reorganization and interaction across channels to effectively capture complementary cross-channel information. To address the second issue, we construct a collaborative decoding architecture guided by prior knowledge. This architecture generates boundary priors and object localization maps through Boundary Extraction (BE) and Region Extraction (RE) modules, then employs hybrid attention to collaboratively calibrate decoded features, effectively overcoming semantic ambiguity and imprecise boundaries. Additionally, the Multi-scale Enhancement (MSE) module enriches contextual feature representations. Extensive experiments on four COD benchmark datasets validate the effectiveness and state-of-the-art performance of the proposed model. We further transferred our model to the Salient Object Detection (SOD) task and demonstrated its adaptability across downstream tasks, including polyp segmentation, transparent object detection, and industrial and road defect detection. Code and experimental results are publicly available at: https://github.com/akuan1234/ARNet-v2.
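The abstract's "horizontal-vertical integration mechanism in the channel dimension" can be sketched roughly as follows. This is a minimal illustrative interpretation, not the paper's actual CIIM: the square-grid arrangement of channels, the mean-based mixing rule, and all names are assumptions made for the example.

```python
import numpy as np

def channel_hv_interaction(feat: np.ndarray, grid: int) -> np.ndarray:
    """Illustrative horizontal-vertical channel interaction.

    feat: a (C, H, W) feature map with C == grid * grid channels.
    Channels are viewed as a grid x grid lattice; each channel is
    augmented with the mean of its row group (horizontal path) and
    the mean of its column group (vertical path), then the three
    signals are fused by averaging.
    """
    C, H, W = feat.shape
    assert C == grid * grid, "channel count must form a square grid"
    lattice = feat.reshape(grid, grid, H, W)          # (rows, cols, H, W)
    horizontal = lattice.mean(axis=1, keepdims=True)  # mix across columns
    vertical = lattice.mean(axis=0, keepdims=True)    # mix across rows
    mixed = (lattice + horizontal + vertical) / 3.0   # broadcast and fuse
    return mixed.reshape(C, H, W)

x = np.random.rand(16, 8, 8)
y = channel_hv_interaction(x, grid=4)
print(y.shape)  # (16, 8, 8)
```

Because the horizontal and vertical paths are broadcast means of the same tensor, this particular fusion preserves the global mean of the feature map while letting every channel see complementary statistics from its row and column groups.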
Related papers
- CFMD: Dynamic Cross-layer Feature Fusion for Salient Object Detection [7.262250906929891]
Cross-layer feature pyramid networks (CFPNs) have achieved notable progress in multi-scale feature fusion and boundary detail preservation for salient object detection. To address these challenges, we propose CFMD, a novel cross-layer feature pyramid network that introduces two key innovations. First, we design a context-aware feature aggregation module (CFLMA), which incorporates the state-of-the-art Mamba architecture to construct a dynamic weight distribution mechanism. Second, we introduce an adaptive dynamic upsampling unit (CFLMD) that preserves spatial details during resolution recovery.
arXiv Detail & Related papers (2025-04-02T03:22:36Z)
- B2Net: Camouflaged Object Detection via Boundary Aware and Boundary Fusion [10.899493419708651]
We propose a novel network named B2Net to enhance the accuracy of obtained boundaries. We present a Residual Feature Enhanced Module (RFEM) with the goal of integrating more discriminative feature representations. After that, the Boundary Aware Module (BAM) is introduced to explore edge cues twice. Finally, we design the Cross-scale Boundary Fusion Module (CBFM), which integrates information across different scales in a top-down manner.
arXiv Detail & Related papers (2024-12-31T13:06:06Z)
- Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation [50.433911327489554]
The goal of referring remote sensing image segmentation (RRSIS) is to generate a pixel-level mask of the target object identified by the referring expression. To address the aforementioned challenges, a novel RRSIS framework is proposed, termed the cross-modal bidirectional interaction model (CroBIM). To further foster the research of RRSIS, we also construct RISBench, a new large-scale benchmark dataset comprising 52,472 image-language-label triplets.
arXiv Detail & Related papers (2024-10-11T08:28:04Z)
- Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain. Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage. Compared with currently existing models, our proposed method achieves competitive performance on three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Boundary-semantic collaborative guidance network with dual-stream feedback mechanism for salient object detection in optical remote sensing imagery [22.21644705244091]
We propose boundary-semantic collaborative guidance network (BSCGNet) with dual-stream feedback mechanism.
BSCGNet exhibits distinct advantages in challenging scenarios and outperforms the 17 state-of-the-art (SOTA) approaches proposed in recent years.
arXiv Detail & Related papers (2023-03-06T03:36:06Z)
- Feature Aggregation and Propagation Network for Camouflaged Object Detection [42.33180748293329]
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to intrinsic similarities between foreground objects and background surroundings.
We propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
arXiv Detail & Related papers (2022-12-02T05:54:28Z)
- Saliency Detection via Global Context Enhanced Feature Fusion and Edge Weighted Loss [6.112591965159383]
We propose a context fusion decoder network (CFDN) and near edge weighted loss (NEWLoss) function.
The CFDN creates an accurate saliency map by integrating global context information and thus suppressing the influence of the unnecessary spatial information.
NEWLoss accelerates learning of obscure boundaries without additional modules by generating weight maps on object boundaries.
arXiv Detail & Related papers (2021-10-13T08:04:55Z)
- Full-Duplex Strategy for Video Object Segmentation [141.43983376262815]
The Full-Duplex Strategy Network (FSNet) is a novel framework for video object segmentation (VOS).
Our FSNet performs cross-modal feature passing (i.e., transmission and receiving) simultaneously before the fusion decoding stage.
We show that our FSNet outperforms other state-of-the-art methods on both the VOS and video salient object detection tasks.
arXiv Detail & Related papers (2021-08-06T14:50:50Z)
- RGB-D Salient Object Detection with Cross-Modality Modulation and Selection [126.4462739820643]
We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly solves two challenging issues: 1) how to effectively integrate the complementary information from RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features.
arXiv Detail & Related papers (2020-07-14T14:22:50Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features of each layer contain both semantics and salient details from all other layers simultaneously, with reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.