Fine-Grained Dynamic Head for Object Detection
- URL: http://arxiv.org/abs/2012.03519v1
- Date: Mon, 7 Dec 2020 08:16:32 GMT
- Title: Fine-Grained Dynamic Head for Object Detection
- Authors: Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Hongbin Sun, Jian Sun,
Nanning Zheng
- Abstract summary: We propose a fine-grained dynamic head to conditionally select a pixel-level combination of FPN features from different scales for each instance.
Experiments demonstrate the effectiveness and efficiency of the proposed method on several state-of-the-art detection benchmarks.
- Score: 68.70628757217939
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Feature Pyramid Network (FPN) presents a remarkable approach to alleviate
the scale variance in object representation by performing instance-level
assignments. Nevertheless, this strategy ignores the distinct characteristics
of different sub-regions in an instance. To this end, we propose a fine-grained
dynamic head to conditionally select a pixel-level combination of FPN features
from different scales for each instance, which further unlocks the capacity of
multi-scale feature representation. Moreover, we design a spatial gate with a
new activation function to reduce computational complexity dramatically through
spatially sparse convolutions. Extensive experiments demonstrate the
effectiveness and efficiency of the proposed method on several state-of-the-art
detection benchmarks. Code is available at
https://github.com/StevenGrove/DynamicHead.
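The pixel-level selection and spatial gating described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (see the repository above for that). Everything in it is an assumption made for illustration: the function names `gate_activation`, `nearest_resize`, and `fuse_pixelwise`, the clipped-tanh gate `max(0, tanh(x))` used as a stand-in for the "new activation function", the single-channel feature maps, and the nearest-neighbour resizing. The point it shows is that a gate which outputs exact zeros lets a sparse kernel skip those pixels entirely.

```python
import math


def gate_activation(x):
    """Hypothetical gate: exactly 0 for x <= 0, a value in (0, 1) otherwise."""
    return max(0.0, math.tanh(x))


def nearest_resize(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_pixelwise(levels, gate_logits, target):
    """Combine single-channel maps from several scales at one target scale.

    levels:      list of 2D lists, one per pyramid level (any resolution)
    gate_logits: list of 2D lists with the same shapes as `levels`
    target:      index of the level whose resolution the output uses
    Returns the fused map and the number of pixel contributions skipped.
    """
    out_h, out_w = len(levels[target]), len(levels[target][0])
    fused = [[0.0] * out_w for _ in range(out_h)]
    skipped = 0
    for feat, logits in zip(levels, gate_logits):
        feat_r = nearest_resize(feat, out_h, out_w)
        logit_r = nearest_resize(logits, out_h, out_w)
        for r in range(out_h):
            for c in range(out_w):
                g = gate_activation(logit_r[r][c])
                if g == 0.0:
                    skipped += 1  # a sparse kernel would do no work here
                    continue
                fused[r][c] += g * feat_r[r][c]
    return fused, skipped


# Two toy pyramid levels: a 2x2 map and a coarser 1x1 map.
levels = [[[1.0, 2.0], [3.0, 4.0]], [[10.0]]]
gate_logits = [[[5.0, -1.0], [0.0, 5.0]], [[-2.0]]]
fused, skipped = fuse_pixelwise(levels, gate_logits, target=0)
```

In this toy input the coarse level is gated off everywhere and the fine level is gated off at two of its four pixels, so only two of the eight pixel contributions are computed. In a real detector the gate inputs would be predicted from the features themselves rather than supplied by hand.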
Related papers
- DyFADet: Dynamic Feature Aggregation for Temporal Action Detection [70.37707797523723]
We build a novel dynamic feature aggregation (DFA) module that can adapt kernel weights and receptive fields at different timestamps.
Using DFA helps to develop a Dynamic TAD head (DyHead), which adaptively aggregates the multi-scale features with adjusted parameters.
DyFADet, a new dynamic TAD model, achieves promising performance on a series of challenging TAD benchmarks.
arXiv Detail & Related papers (2024-07-03T15:29:10Z)
- HPS-Det: Dynamic Sample Assignment with Hyper-Parameter Search for Object Detection [25.71039912705784]
We propose a novel dynamic sample assignment scheme based on hyper-parameter search.
Experiments demonstrate that the resulting HPS-Det brings improved performance over different object detection baselines.
arXiv Detail & Related papers (2022-07-23T15:13:57Z)
- Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
- Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)
- Towards Better Object Detection in Scale Variation with Adaptive Feature Selection [3.5352273012717044]
We propose a novel adaptive feature selection module (AFSM) to automatically learn the way to fuse multi-level representations in the channel dimension.
It significantly improves the performance of the detectors that have a feature pyramid structure.
A class-aware sampling mechanism (CASM) is proposed to tackle the class imbalance problem.
arXiv Detail & Related papers (2020-12-06T13:41:20Z)
- DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution [136.7261709896713]
We propose a data-driven approach that generates the appropriate convolution kernels to apply in response to the nature of the instances.
The proposed method achieves promising results on both ScanNetV2 and S3DIS.
It also improves inference speed by more than 25% over the current state-of-the-art.
arXiv Detail & Related papers (2020-11-26T14:56:57Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection [91.43066633305662]
The central question in RGB-D salient object detection (SOD) is how to better integrate and utilize cross-modal fusion information.
In this paper, we explore these issues from a new perspective.
We implement a more flexible and efficient form of multi-scale cross-modal feature processing.
arXiv Detail & Related papers (2020-07-13T07:59:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.