A Revisit to the Decoder for Camouflaged Object Detection
- URL: http://arxiv.org/abs/2503.14035v1
- Date: Tue, 18 Mar 2025 08:51:50 GMT
- Title: A Revisit to the Decoder for Camouflaged Object Detection
- Authors: Seung Woo Ko, Joopyo Hong, Suyoung Kim, Seungjai Bang, Sungzoon Cho, Nojun Kwak, Hyung-Sin Kim, Joonseok Lee
- Abstract summary: Camouflaged object detection (COD) aims to generate a fine-grained segmentation map of camouflaged objects hidden in their background. We propose a novel architecture that augments the prevalent decoding strategy in COD with Enrich Decoder and Retouch Decoder.
- Score: 34.886607866949845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Camouflaged object detection (COD) aims to generate a fine-grained segmentation map of camouflaged objects hidden in their background. Because camouflaged objects blend into their surroundings, the decoder must be tailored to extract the features of camouflaged objects effectively and to generate their complex boundaries with extra care. In this paper, we propose a novel architecture, ENTO, that augments the prevalent decoding strategy in COD with an Enrich Decoder and a Retouch Decoder, which together help to generate a fine-grained segmentation map. Specifically, the Enrich Decoder amplifies the channels of features that are important for COD using channel-wise attention, while the Retouch Decoder further refines the segmentation maps by spatially attending to important pixels, such as boundary regions. With extensive experiments, we demonstrate that ENTO shows superior performance with various encoders, with the two novel components playing unique and mutually complementary roles.
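The two decoder ideas above map onto familiar attention primitives. The following is a minimal PyTorch sketch, assuming an SE-style channel attention for the Enrich Decoder and a single-conv spatial attention for the Retouch Decoder; module names, shapes, and attention designs are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: channel-wise attention (Enrich) then spatial attention (Retouch).
import torch
import torch.nn as nn

class EnrichBlock(nn.Module):
    """Amplify COD-relevant channels with SE-style channel attention (assumed form)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.mlp(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # re-weight channels by learned importance

class RetouchBlock(nn.Module):
    """Refine the map by attending to important pixels, e.g. boundaries (assumed form)."""
    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid())
        self.head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(x * self.attn(x))  # spatially re-weighted prediction logits

feat = torch.randn(2, 64, 44, 44)                 # a decoder feature map
coarse = RetouchBlock(64)(EnrichBlock(64)(feat))  # (2, 1, 44, 44) segmentation logits
```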
Related papers
- SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval [82.51117533271517]
Previous works typically only encode RGB videos to obtain high-level semantic features.
Existing RGB-based sign retrieval works suffer from the huge memory cost of dense visual data embedding in end-to-end training.
We propose a novel sign language representation framework called Semantically Enhanced Dual-Stream Encoder (SEDS).
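As a rough illustration of the dual-stream idea, here is a hedged PyTorch sketch in which one stream projects pre-extracted RGB features (sidestepping the memory cost of dense end-to-end embedding) and a second stream projects an auxiliary semantic signal such as keypoints; the auxiliary stream, dimensions, and concatenation-based fusion are assumptions, not SEDS's actual design.

```python
import torch
import torch.nn as nn

class DualStreamEncoder(nn.Module):
    def __init__(self, rgb_dim: int = 768, aux_dim: int = 256, out_dim: int = 512):
        super().__init__()
        self.rgb_proj = nn.Linear(rgb_dim, out_dim)  # pre-extracted RGB features
        self.aux_proj = nn.Linear(aux_dim, out_dim)  # e.g. pose/keypoint features (assumed)
        self.fuse = nn.Linear(2 * out_dim, out_dim)

    def forward(self, rgb_feats: torch.Tensor, aux_feats: torch.Tensor) -> torch.Tensor:
        # rgb_feats: (B, T, rgb_dim), aux_feats: (B, T, aux_dim)
        z = torch.cat([self.rgb_proj(rgb_feats), self.aux_proj(aux_feats)], dim=-1)
        return self.fuse(z).mean(dim=1)  # (B, out_dim) clip-level retrieval embedding

enc = DualStreamEncoder()
emb = enc(torch.randn(2, 16, 768), torch.randn(2, 16, 256))  # (2, 512)
```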
arXiv Detail & Related papers (2024-07-23T11:31:11Z)
- SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers [13.173384346104802]
Unsupervised object-centric learning aims to decompose scenes into interpretable object entities, termed slots.
Slot-based auto-encoders stand out as a prominent method for this task.
This work introduces two novel techniques, which distill superior slot-based attention masks from the decoder to the encoder.
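A minimal sketch of the distillation idea, assuming the decoder's slot-attention masks serve as a detached teacher distribution for the encoder's masks under a per-patch cross-entropy; the shapes and the loss form are illustrative guesses, not SPOT's exact objective.

```python
import torch
import torch.nn.functional as F

def mask_distillation_loss(enc_logits: torch.Tensor, dec_logits: torch.Tensor) -> torch.Tensor:
    # logits: (B, num_slots, num_patches); slots compete for each patch
    target = F.softmax(dec_logits, dim=1).detach()  # teacher: decoder masks
    log_pred = F.log_softmax(enc_logits, dim=1)     # student: encoder masks
    return -(target * log_pred).sum(dim=1).mean()   # per-patch cross-entropy

loss = mask_distillation_loss(torch.randn(4, 7, 196), torch.randn(4, 7, 196))
```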
arXiv Detail & Related papers (2023-12-01T15:20:58Z)
- More complex encoder is not all you need [0.882348769487259]
We introduce neU-Net (i.e., not complex encoder U-Net), which incorporates a novel Sub-pixel Convolution for upsampling to construct a powerful decoder.
Our model design achieves excellent results, surpassing other state-of-the-art methods on both the Synapse and ACDC datasets.
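Sub-pixel convolution itself is standard: a convolution expands the channels by r^2 and PixelShuffle rearranges them into an r-times-larger feature map. A minimal 2D PyTorch sketch follows; channel counts are illustrative, and neU-Net's own decoder may differ in detail.

```python
import torch
import torch.nn as nn

def subpixel_upsample(in_ch: int, out_ch: int, r: int = 2) -> nn.Sequential:
    """Sub-pixel convolution: learn r^2 times the channels, then shuffle into space."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * r * r, kernel_size=3, padding=1),
        nn.PixelShuffle(r),  # (B, out_ch*r^2, H, W) -> (B, out_ch, r*H, r*W)
    )

x = torch.randn(1, 256, 14, 14)
y = subpixel_upsample(256, 128)(x)  # torch.Size([1, 128, 28, 28])
```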
arXiv Detail & Related papers (2023-09-20T08:34:38Z)
- Camouflaged Object Detection with Feature Grafting and Distractor Aware [9.791590363932519]
We propose a novel Feature Grafting and Distractor Aware network (FDNet) to handle the Camouflaged Object Detection task.
Specifically, we use CNN and Transformer to encode multi-scale images in parallel.
A Distractor Aware Module is designed to explicitly model the two possible distractors in the COD task to refine the coarse camouflage map.
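A hedged sketch of the parallel-encoding idea: a CNN feature map and a Transformer feature map are spatially aligned and merged. The concatenation-based grafting below is a placeholder, not the paper's actual Feature Grafting module.

```python
import torch
import torch.nn as nn

class FeatureGraft(nn.Module):
    """Merge CNN and Transformer features of the same image (illustrative fusion)."""
    def __init__(self, cnn_ch: int, vit_ch: int, out_ch: int):
        super().__init__()
        self.fuse = nn.Conv2d(cnn_ch + vit_ch, out_ch, kernel_size=1)

    def forward(self, cnn_feat: torch.Tensor, vit_feat: torch.Tensor) -> torch.Tensor:
        # align spatial sizes, then merge the two feature hierarchies
        vit_feat = nn.functional.interpolate(
            vit_feat, size=cnn_feat.shape[-2:], mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([cnn_feat, vit_feat], dim=1))

graft = FeatureGraft(cnn_ch=512, vit_ch=384, out_ch=256)
out = graft(torch.randn(1, 512, 28, 28), torch.randn(1, 384, 14, 14))  # (1, 256, 28, 28)
```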
arXiv Detail & Related papers (2023-07-08T09:37:08Z)
- Referring Camouflaged Object Detection [97.90911862979355]
Ref-COD aims to segment specified camouflaged objects based on a small set of referring images with salient target objects.
We first assemble a large-scale dataset, called R2C7K, which consists of 7K images covering 64 object categories in real-world scenarios.
arXiv Detail & Related papers (2023-06-13T04:15:37Z)
- Towards Complex Backgrounds: A Unified Difference-Aware Decoder for Binary Segmentation [4.6932442139663015]
A new unified dual-branch decoder paradigm named the difference-aware decoder is proposed in this paper.
The difference-aware decoder imitates the human eye in three stages using the multi-level features output by the encoder.
The results demonstrate that the difference-aware decoder can achieve a higher accuracy than the other state-of-the-art binary segmentation methods.
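One plausible reading of a difference-aware, dual-branch fusion step, sketched in PyTorch: one branch keeps what adjacent encoder levels agree on, the other emphasizes where they differ, and both are merged. This is an illustrative guess at the mechanism, not the paper's exact three-stage design.

```python
import torch
import torch.nn as nn

class DifferenceAwareFusion(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.merge = nn.Conv2d(2 * ch, ch, kernel_size=3, padding=1)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        high = self.up(high)                          # match the lower level's size
        common = low * high                           # agreement between levels
        diff = torch.abs(low - high)                  # disagreement between levels
        return self.merge(torch.cat([common, diff], dim=1))

fuse = DifferenceAwareFusion(64)
y = fuse(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 28, 28))  # (1, 64, 56, 56)
```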
arXiv Detail & Related papers (2022-10-27T03:45:29Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Crosslink-Net: Double-branch Encoder Segmentation Network via Fusing Vertical and Horizontal Convolutions [58.71117402626524]
We present a novel double-branch encoder architecture for medical image segmentation.
Our architecture is inspired by two observations: 1) the discrimination of features learned via square convolutional kernels needs to be further improved, so we propose to utilize non-square vertical and horizontal convolutional kernels.
The experiments validate the effectiveness of our model on four datasets.
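The non-square kernel idea is easy to make concrete: run a vertical (k x 1) branch and a horizontal (1 x k) branch in parallel and combine them, so the encoder sees orientation-sensitive features. The kernel size and the additive fusion in this sketch are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VHConvBlock(nn.Module):
    """Parallel vertical and horizontal non-square convolutions (assumed fusion)."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 5):
        super().__init__()
        self.vertical = nn.Conv2d(in_ch, out_ch, (k, 1), padding=(k // 2, 0))
        self.horizontal = nn.Conv2d(in_ch, out_ch, (1, k), padding=(0, k // 2))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.vertical(x) + self.horizontal(x))

block = VHConvBlock(3, 32)
y = block(torch.randn(1, 3, 224, 224))  # (1, 32, 224, 224)
```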
arXiv Detail & Related papers (2021-07-24T02:58:32Z)
- Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation [56.44853893149365]
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers.
We propose a new architecture based on a decoder which uses a set of shallow networks for capturing more information content.
In order to further improve the architecture, we introduce a weight function that re-balances classes to increase the networks' attention to under-represented objects.
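A hedged sketch of such a class re-balancing weight function: weight each class inversely to its pixel frequency so under-represented objects contribute more to the loss. The paper's exact weight function may differ; inverse frequency is a common stand-in.

```python
import torch
import torch.nn.functional as F

def rebalanced_ce(logits: torch.Tensor, target: torch.Tensor,
                  num_classes: int, eps: float = 1e-6) -> torch.Tensor:
    # logits: (B, C, H, W), target: (B, H, W) with integer class indices
    freq = torch.bincount(target.flatten(), minlength=num_classes).float()
    weight = freq.sum() / (freq + eps)             # rare classes -> large weights
    weight = weight / weight.sum() * num_classes   # normalize weights around 1
    return F.cross_entropy(logits, target, weight=weight)

loss = rebalanced_ce(torch.randn(2, 5, 64, 64), torch.randint(0, 5, (2, 64, 64)), 5)
```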
arXiv Detail & Related papers (2020-07-19T18:44:34Z)