MFFN: Multi-view Feature Fusion Network for Camouflaged Object Detection
- URL: http://arxiv.org/abs/2210.06361v1
- Date: Wed, 12 Oct 2022 16:12:58 GMT
- Title: MFFN: Multi-view Feature Fusion Network for Camouflaged Object Detection
- Authors: Dehua Zheng, Xiaochen Zheng, Laurence T. Yang, Yuan Gao, Chenlu Zhu, and Yiheng Ruan
- Abstract summary: We propose a behavior-inspired framework, called Multi-view Feature Fusion Network (MFFN), which mimics the human behaviors of finding indistinct objects in images.
MFFN captures critical edge and semantic information by comparing and fusing extracted multi-view features.
Our method performs favorably against existing state-of-the-art methods when trained on the same data.
- Score: 10.04773536815808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research on camouflaged object detection (COD) aims to segment
highly concealed objects hidden in complex surroundings. Camouflaged objects are
often tiny and fuzzy, which makes them visually indistinguishable from the
background, and current single-view COD detectors are sensitive to background
distractors. The blurred boundaries and variable shapes of camouflaged objects
are therefore difficult to fully capture with a single-view detector. To
overcome these obstacles, we propose a behavior-inspired framework, called
Multi-view Feature Fusion Network (MFFN), which mimics the human behavior of
finding indistinct objects in images, i.e., observing them from multiple angles,
distances, and perspectives. Specifically, the key idea is to generate multiple
ways of observation (multi-view) by data augmentation and apply them as inputs.
MFFN captures critical edge and semantic information by comparing and fusing the
extracted multi-view features. In addition, MFFN exploits the dependence and
interaction between views with the designed hierarchical view and channel
integration modules. Furthermore, our method leverages the complementary
information between different views through a two-stage attention module called
Co-attention of Multi-view (CAMV), and we design a local-overall module called
Channel Fusion Unit (CFU) to explore the channel-wise contextual clues of
diverse feature maps in an iterative manner. The experimental results show that
our method performs favorably against existing state-of-the-art methods when
trained on the same data. The code will be available at
https://github.com/dwardzheng/MFFN_COD.
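To make the multi-view idea concrete, here is a minimal sketch, in PyTorch, of generating distance and angle "views" of one input through data augmentation. This is an illustration, not the authors' code: the function name `make_views`, the specific views, and their sizes are all assumptions.

```python
import torch
import torchvision.transforms.functional as TF

def make_views(image: torch.Tensor) -> dict[str, torch.Tensor]:
    """Generate multiple 'observations' of one image via augmentation.

    `image` is a (C, H, W) tensor. The specific views below (two
    distances, two angle/perspective changes) are illustrative
    assumptions, not the exact configuration used by MFFN.
    """
    _, h, w = image.shape
    views = {
        "original": image,
        # Distance views: rescale, then resize back so all views share
        # a common spatial size for later comparison and fusion.
        "near": TF.resize(TF.resize(image, [h * 2, w * 2]), [h, w]),
        "far": TF.resize(TF.resize(image, [h // 2, w // 2]), [h, w]),
        # Angle/perspective views: a flip and a small rotation.
        "flipped": TF.hflip(image),
        "rotated": TF.rotate(image, angle=15.0),
    }
    return views

# Each view would be passed through a shared backbone; the per-view
# feature maps are then compared and fused downstream.
views = make_views(torch.rand(3, 224, 224))
batch = torch.stack(list(views.values()))  # (num_views, 3, 224, 224)
```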
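The abstract describes CAMV and CFU only at a high level, so the skeleton below is speculative: a two-stage attention over stacked view features, followed by an iterative, squeeze-and-excitation-style channel gate. All class names, layer choices, and hyperparameters here are assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class CoAttentionMultiView(nn.Module):
    """Speculative sketch of CAMV: two attention stages over stacked
    per-view features. Stage count and head count are assumptions."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.stage1 = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.stage2 = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, view_feats: torch.Tensor) -> torch.Tensor:
        # view_feats: (B, V, C) pooled features, one row per view.
        x, _ = self.stage1(view_feats, view_feats, view_feats)
        x, _ = self.stage2(x, x, x)
        return x.mean(dim=1)  # fuse V views into one descriptor

class ChannelFusionUnit(nn.Module):
    """Speculative sketch of CFU: iteratively re-weight channels
    using a squeeze-and-excitation-style gate over global context."""
    def __init__(self, channels: int, iterations: int = 2):
        super().__init__()
        self.iterations = iterations
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, channels),
            nn.Sigmoid(),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) feature map.
        for _ in range(self.iterations):
            ctx = feat.mean(dim=(2, 3))                    # channel-wise context
            feat = feat * self.gate(ctx)[:, :, None, None]  # re-weight channels
        return feat
```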
Related papers
- SurANet: Surrounding-Aware Network for Concealed Object Detection via Highly-Efficient Interactive Contrastive Learning Strategy [55.570183323356964]
We propose a novel Surrounding-Aware Network, namely SurANet, for concealed object detection.
We enhance the semantics of feature maps using differential fusion of surrounding features to highlight concealed objects.
Next, a Surrounding-Aware Contrastive Loss is applied to identify the concealed object by contrastively learning the surrounding feature maps.
arXiv Detail & Related papers (2024-10-09T13:02:50Z)
- Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
- CoFiNet: Unveiling Camouflaged Objects with Multi-Scale Finesse [46.79770062391987]
We introduce a novel method for camouflaged object detection, named CoFiNet.
Our approach focuses on multi-scale feature fusion and extraction, with special attention to the model's segmentation effectiveness.
CoFiNet achieves state-of-the-art performance across all datasets.
arXiv Detail & Related papers (2024-02-03T17:24:55Z)
- ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) methods attempt to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and camouflaged objects, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
- Camouflaged Object Detection with Feature Grafting and Distractor Aware [9.791590363932519]
We propose a novel Feature Grafting and Distractor Aware network (FDNet) to handle the Camouflaged Object Detection task.
Specifically, we use CNN and Transformer to encode multi-scale images in parallel.
A Distractor Aware Module is designed to explicitly model the two possible distractors in the COD task to refine the coarse camouflage map.
arXiv Detail & Related papers (2023-07-08T09:37:08Z)
- Detector Guidance for Multi-Object Text-to-Image Generation [61.70018793720616]
Detector Guidance (DG) integrates a latent object detection model to separate different objects during the generation process.
Human evaluations demonstrate that DG provides an 8-22% advantage in preventing the amalgamation of conflicting concepts.
arXiv Detail & Related papers (2023-06-04T02:33:12Z)
- Feature Aggregation and Propagation Network for Camouflaged Object Detection [42.33180748293329]
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to intrinsic similarities between foreground objects and background surroundings.
We propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
arXiv Detail & Related papers (2022-12-02T05:54:28Z)
- Multi-modal Transformers Excel at Class-agnostic Object Detection [105.10403103027306]
We argue that existing methods lack a top-down supervision signal governed by human-understandable semantics.
We develop an efficient and flexible MViT architecture using multi-scale feature processing and deformable self-attention.
We show the significance of MViT proposals in a diverse range of applications.
arXiv Detail & Related papers (2021-11-22T18:59:29Z)
- Multiview Detection with Feature Perspective Transformation [59.34619548026885]
We propose a novel multiview detection system, MVDet.
We take an anchor-free approach to aggregate multiview information by projecting feature maps onto the ground plane.
Our entire model is end-to-end learnable and achieves 88.2% MODA on the standard Wildtrack dataset.
arXiv Detail & Related papers (2020-07-14T17:58:30Z)
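The MVDet summary above mentions projecting feature maps onto the ground plane. As a hedged sketch of that operation (not MVDet's actual code), the snippet below warps a feature map with a homography via `torch.nn.functional.grid_sample`; the function name, grid sizes, and the identity homography are placeholders standing in for real camera calibration.

```python
import torch
import torch.nn.functional as F

def project_to_ground(feat: torch.Tensor, H_inv: torch.Tensor,
                      out_hw: tuple[int, int]) -> torch.Tensor:
    """Warp a camera-view feature map onto a ground-plane grid.

    feat:   (B, C, Hf, Wf) feature map in image coordinates.
    H_inv:  (3, 3) homography mapping ground-plane pixel coordinates
            to feature-map pixel coordinates (placeholder here; in
            practice it comes from camera calibration).
    out_hw: (height, width) of the ground-plane grid.
    """
    b, _, hf, wf = feat.shape
    gh, gw = out_hw
    # Homogeneous coordinates of every ground-plane cell.
    ys, xs = torch.meshgrid(torch.arange(gh, dtype=torch.float32),
                            torch.arange(gw, dtype=torch.float32),
                            indexing="ij")
    ones = torch.ones_like(xs)
    ground = torch.stack([xs, ys, ones], dim=-1).reshape(-1, 3)  # (gh*gw, 3)
    # Map ground cells into feature-map pixel coordinates.
    img = ground @ H_inv.T
    img = img[:, :2] / img[:, 2:3].clamp(min=1e-6)
    # Normalize to [-1, 1], the coordinate range grid_sample expects.
    img[:, 0] = img[:, 0] / (wf - 1) * 2 - 1
    img[:, 1] = img[:, 1] / (hf - 1) * 2 - 1
    grid = img.reshape(1, gh, gw, 2).expand(b, -1, -1, -1)
    return F.grid_sample(feat, grid, align_corners=True)

# Identity homography as a stand-in for real calibration.
warped = project_to_ground(torch.rand(1, 64, 90, 160), torch.eye(3), (120, 360))
```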