MFFN: Multi-view Feature Fusion Network for Camouflaged Object Detection
- URL: http://arxiv.org/abs/2210.06361v1
- Date: Wed, 12 Oct 2022 16:12:58 GMT
- Title: MFFN: Multi-view Feature Fusion Network for Camouflaged Object Detection
- Authors: Dehua Zheng, Xiaochen Zheng, Laurence T. Yang, Yuan Gao, Chenlu Zhu, and Yiheng Ruan
- Abstract summary: We propose a behavior-inspired framework, called Multi-view Feature Fusion Network (MFFN), which mimics the human behaviors of finding indistinct objects in images.
MFFN captures critical edge and semantic information by comparing and fusing extracted multi-view features.
Our method performs favorably against existing state-of-the-art methods when trained on the same data.
- Score: 10.04773536815808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research on camouflaged object detection (COD) aims to
segment highly concealed objects hidden in complex surroundings. Tiny, fuzzy
camouflaged objects are visually nearly indistinguishable from their
backgrounds, and current single-view COD detectors are sensitive to background
distractors, so the blurred boundaries and variable shapes of camouflaged
objects are difficult to capture fully with a single-view detector. To
overcome these obstacles, we propose a behavior-inspired framework, called
Multi-view Feature Fusion Network (MFFN), which mimics the human behavior of
finding indistinct objects in images, i.e., observing them from multiple
angles, distances, and perspectives. Specifically, the key idea is to generate
multiple ways of observing the image (multi-view) via data augmentation and
apply them as inputs. MFFN
captures critical edge and semantic information by comparing and fusing
extracted multi-view features. In addition, MFFN exploits the dependence and
interaction between views through designed hierarchical view and channel
integration modules. Furthermore, our method leverages the complementary
information between different views through a two-stage attention module called
Co-attention of Multi-view (CAMV). We also design a local-overall module
called Channel Fusion Unit (CFU) that iteratively explores the channel-wise
contextual clues of diverse feature maps. Experimental results show that our
method performs favorably against existing state-of-the-art methods when
trained on the same data. The code will be available at
https://github.com/dwardzheng/MFFN_COD.
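The multi-view idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the "distance" view (2x average-pool downscale) and "angle" view (horizontal flip) are hypothetical stand-ins for the paper's augmentations, and the mean-based fusion is a toy substitute for the attention-based CAMV/CFU modules. It assumes grayscale numpy images.

```python
import numpy as np

def make_views(image: np.ndarray) -> dict:
    """Generate hypothetical multi-view inputs from one 2-D image:
    a 'distance' view (2x average-pool downscale) and an 'angle'
    view (horizontal flip). View names are illustrative only."""
    h, w = image.shape
    # distance view: crop to even size, then 2x2 average pooling
    d = image[: h // 2 * 2, : w // 2 * 2]
    distance = d.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # angle view: horizontal flip
    angle = image[:, ::-1]
    return {"original": image, "distance": distance, "angle": angle}

def fuse_features(feats: list) -> np.ndarray:
    """Toy stand-in for multi-view fusion: element-wise mean of
    same-shape feature maps (the paper instead compares and fuses
    views with attention-based CAMV and CFU modules)."""
    return np.mean(np.stack(feats), axis=0)
```

A downscaled view loses fine texture but keeps overall shape, while the flipped view changes the relative position of background distractors; comparing such views is what lets fusion recover edge cues a single view misses.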
Related papers
- CoFiNet: Unveiling Camouflaged Objects with Multi-Scale Finesse [46.79770062391987]
We introduce a novel method for camouflaged object detection, named CoFiNet.
Our approach focuses on multi-scale feature fusion and extraction, with special attention to the model's segmentation effectiveness.
CoFiNet achieves state-of-the-art performance across all datasets.
arXiv Detail & Related papers (2024-02-03T17:24:55Z) - ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) research attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and camouflaged objects, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z) - Camouflaged Object Detection with Feature Grafting and Distractor Aware [9.791590363932519]
We propose a novel Feature Grafting and Distractor Aware network (FDNet) to handle the Camouflaged Object Detection task.
Specifically, we use CNN and Transformer to encode multi-scale images in parallel.
A Distractor Aware Module is designed to explicitly model the two possible distractors in the COD task to refine the coarse camouflage map.
arXiv Detail & Related papers (2023-07-08T09:37:08Z) - Detector Guidance for Multi-Object Text-to-Image Generation [61.70018793720616]
Detector Guidance (DG) integrates a latent object detection model to separate different objects during the generation process.
Human evaluations demonstrate that DG provides an 8-22% advantage in preventing the amalgamation of conflicting concepts.
arXiv Detail & Related papers (2023-06-04T02:33:12Z) - CamoFormer: Masked Separable Attention for Camouflaged Object Detection [94.2870722866853]
We present a simple masked separable attention (MSA) for camouflaged object detection.
We first separate the multi-head self-attention into three parts, which are responsible for distinguishing the camouflaged objects from the background using different mask strategies.
We propose to capture high-resolution semantic representations progressively based on a simple top-down decoder with the proposed MSA to attain precise segmentation results.
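The head-splitting idea in the CamoFormer summary can be sketched roughly as follows. This is an illustrative toy, not CamoFormer's actual MSA: it splits attention heads into three groups whose keys are masked to the foreground, the background, and the full token set, where the foreground mask would in practice come from a predicted coarse camouflage map.

```python
import numpy as np

def masked_separable_attention(q, k, v, fg_mask):
    """Toy sketch: split attention heads into three groups that attend
    to foreground keys, background keys, and all keys. Shapes:
    q, k, v are (heads, tokens, dim); fg_mask is (tokens,) boolean.
    Illustrative only, not the paper's exact formulation."""
    heads = q.shape[0]
    g = heads // 3
    out = np.empty_like(q)
    masks = [fg_mask, ~fg_mask, np.ones_like(fg_mask)]  # fg / bg / full
    for i, m in enumerate(masks):
        sl = slice(i * g, (i + 1) * g) if i < 2 else slice(2 * g, heads)
        # scaled dot-product scores, then mask out disallowed keys
        scores = q[sl] @ k[sl].transpose(0, 2, 1) / np.sqrt(q.shape[-1])
        scores = np.where(m[None, None, :], scores, -1e9)
        w = np.exp(scores - scores.max(-1, keepdims=True))
        w /= w.sum(-1, keepdims=True)  # softmax over keys
        out[sl] = w @ v[sl]
    return out
```

Because each group only ever mixes values from its allowed region, the foreground and background representations stay separated until a later fusion step.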
arXiv Detail & Related papers (2022-12-10T10:03:27Z) - Feature Aggregation and Propagation Network for Camouflaged Object Detection [42.33180748293329]
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to intrinsic similarities between foreground objects and background surroundings.
We propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
arXiv Detail & Related papers (2022-12-02T05:54:28Z) - A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z) - Multi-modal Transformers Excel at Class-agnostic Object Detection [105.10403103027306]
We argue that existing methods lack a top-down supervision signal governed by human-understandable semantics.
We develop an efficient and flexible MViT architecture using multi-scale feature processing and deformable self-attention.
We show the significance of MViT proposals in a diverse range of applications.
arXiv Detail & Related papers (2021-11-22T18:59:29Z) - Towards Accurate Camouflaged Object Detection with Mixture Convolution and Interactive Fusion [45.45231015502287]
We propose a novel deep learning based COD approach, which integrates the large receptive field and effective feature fusion into a unified framework.
Our method detects camouflaged objects with an effective fusion strategy, which aggregates the rich context information from a large receptive field.
arXiv Detail & Related papers (2021-01-14T16:06:08Z) - Multiview Detection with Feature Perspective Transformation [59.34619548026885]
We propose a novel multiview detection system, MVDet.
We take an anchor-free approach to aggregate multiview information by projecting feature maps onto the ground plane.
Our entire model is end-to-end learnable and achieves 88.2% MODA on the standard Wildtrack dataset.
arXiv Detail & Related papers (2020-07-14T17:58:30Z)
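The ground-plane projection mentioned in the MVDet summary can be sketched as a homography warp of a per-camera feature map. This is a hedged illustration of the general idea, not MVDet's implementation: it assumes a known 3x3 matrix H mapping ground-plane coordinates to image coordinates and uses nearest-neighbor sampling.

```python
import numpy as np

def project_to_ground(feat, H, out_h, out_w):
    """Toy sketch: resample a camera-view feature map (fh, fw, c) onto
    an (out_h, out_w, c) ground-plane grid. H is a 3x3 homography that
    maps homogeneous ground-plane coords to image coords. Cells that
    project outside the image stay zero. Illustrative only."""
    fh, fw, c = feat.shape
    out = np.zeros((out_h, out_w, c), dtype=feat.dtype)
    ys, xs = np.meshgrid(np.arange(out_h), np.arange(out_w), indexing="ij")
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(out_h * out_w)])
    img = H @ pts  # homogeneous image coordinates
    u = np.round(img[0] / img[2]).astype(int)
    v = np.round(img[1] / img[2]).astype(int)
    ok = (u >= 0) & (u < fw) & (v >= 0) & (v < fh)
    out.reshape(-1, c)[ok] = feat[v[ok], u[ok]]  # nearest-neighbor sample
    return out
```

Once every camera's features live on the same ground-plane grid, an anchor-free detector can aggregate them by simple concatenation or summation, which is the multiview-fusion step the summary describes.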
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.