CE-FPN: Enhancing Channel Information for Object Detection
- URL: http://arxiv.org/abs/2103.10643v1
- Date: Fri, 19 Mar 2021 05:51:53 GMT
- Title: CE-FPN: Enhancing Channel Information for Object Detection
- Authors: Yihao Luo, Xiang Cao, Juntao Zhang, Xiang Cao, Jingjuan Guo, Haibo
Shen, Tianjiang Wang and Qi Feng
- Abstract summary: Feature pyramid network (FPN) has been an effective framework to extract multi-scale features in object detection.
We present a novel channel enhancement network (CE-FPN) with three simple yet effective modules to alleviate the information loss from channel reduction and the aliasing effects in fused feature maps.
Our experiments show that CE-FPN achieves competitive performance compared to state-of-the-art FPN-based detectors on the MS COCO benchmark.
- Score: 12.954675966833372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature pyramid network (FPN) has been an effective framework to extract
multi-scale features in object detection. However, current FPN-based methods
mostly suffer from the intrinsic flaw of channel reduction, which brings about
a loss of semantic information. Moreover, the miscellaneous fused feature maps
may cause serious aliasing effects. In this paper, we present a novel channel
enhancement feature pyramid network (CE-FPN) with three simple yet effective
modules to alleviate these problems. Specifically, inspired by sub-pixel
convolution, we propose a sub-pixel skip fusion method to perform both channel
enhancement and upsampling. Instead of the original 1x1 convolution and linear
upsampling, it mitigates the information loss due to channel reduction. Then we
propose a sub-pixel context enhancement module for extracting more feature
representations, which is superior to other context methods due to the
utilization of rich channel information by sub-pixel convolution. Furthermore,
a channel attention guided module is introduced to optimize the final
integrated features on each level, which alleviates the aliasing effect with
only a small computational overhead. Our experiments show that CE-FPN achieves
competitive performance compared to state-of-the-art FPN-based detectors on the
MS COCO benchmark.
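The abstract refers to sub-pixel convolution as the basis for combining channel enhancement with upsampling. As a rough illustration of that underlying rearrangement only (not the authors' implementation), the NumPy sketch below converts a channel-rich map of shape (C*r*r, H, W) into a map of shape (C, H*r, W*r). The function name `pixel_shuffle` and the channel-first layout are assumptions made for this example.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Hypothetical helper for illustration; it mirrors the usual
    sub-pixel (depth-to-space) layout where input channel
    c*r*r + i*r + j supplies output pixel (h*r + i, w*r + j).
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    # Split the channel axis into (C, r, r).
    x = x.reshape(c_out, r, r, h, w)
    # Interleave the two r-axes with the spatial axes: (C, H, r, W, r).
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge (H, r) and (W, r) into the upsampled spatial axes.
    return x.reshape(c_out, h * r, w * r)


if __name__ == "__main__":
    features = np.random.rand(256 * 4, 16, 16).astype(np.float32)
    upsampled = pixel_shuffle(features, r=2)
    print(upsampled.shape)  # spatial size doubles, channels shrink by r*r
```

Because the upsampled values are drawn directly from existing channels, no interpolation is involved; how CE-FPN wires this into its skip connections is described in the paper itself.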
Related papers
- Accurate and lightweight dehazing via multi-receptive-field non-local
network and novel contrastive regularization [9.90146712189936]
This paper presents a multi-receptive-field non-local network (MRFNLN) for image dehazing.
It is designed as a multi-stream feature attention block (MSFAB) and a cross non-local block (CNLB).
It outperforms recent state-of-the-art dehazing methods with fewer than 1.5 million parameters.
arXiv Detail & Related papers (2023-09-28T14:59:16Z)
- Improving Pixel-based MIM by Reducing Wasted Modeling Capability [77.99468514275185]
We propose a new method that explicitly utilizes low-level features from shallow layers to aid pixel reconstruction.
To the best of our knowledge, we are the first to systematically investigate multi-level feature fusion for isotropic architectures.
Our method yields significant performance gains, such as 1.2% on fine-tuning, 2.8% on linear probing, and 2.6% on semantic segmentation.
arXiv Detail & Related papers (2023-08-01T03:44:56Z)
- Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
- Spatially-Adaptive Feature Modulation for Efficient Image
Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
- Improved-Flow Warp Module for Remote Sensing Semantic Segmentation [9.505303195320023]
We propose a new module, called improved-flow warp module (IFWM), to adjust semantic feature maps across different scales for remote sensing semantic segmentation.
IFWM computes pixel offsets in a learnable way, which can alleviate the misalignment of multi-scale features.
We validate our method on several remote sensing datasets, and the results prove the effectiveness of our method.
arXiv Detail & Related papers (2022-05-09T10:15:18Z)
- TRACER: Extreme Attention Guided Salient Object Tracing Network [3.2434811678562676]
We propose TRACER, which detects salient objects with explicit edges by incorporating attention guided tracing modules.
A comparison with 13 existing methods reveals that TRACER achieves state-of-the-art performance on five benchmark datasets.
arXiv Detail & Related papers (2021-12-14T13:20:07Z)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation, which matters for many applications such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Feature Flow: In-network Feature Flow Estimation for Video Object
Detection [56.80974623192569]
Optical flow is widely used in computer vision tasks to provide pixel-level motion information.
A common approach is to forward optical flow to a neural network and fine-tune this network on the task dataset.
We propose a novel network (IFF-Net) with an In-network Feature Flow estimation module for video object detection.
arXiv Detail & Related papers (2020-09-21T07:55:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.