DSIC: Dynamic Sample-Individualized Connector for Multi-Scale Object
Detection
- URL: http://arxiv.org/abs/2011.07774v2
- Date: Thu, 25 Mar 2021 02:14:16 GMT
- Title: DSIC: Dynamic Sample-Individualized Connector for Multi-Scale Object
Detection
- Authors: Zekun Li, Yufan Liu, Bing Li, Weiming Hu
- Abstract summary: We propose a Dynamic Sample-Individualized Connector (DSIC) for multi-scale object detection.
The Intra-scale Selection Gate (ISG) adaptively extracts multi-level features from the backbone as the input of feature integration.
The Cross-scale Selection Gate (CSG) automatically activates informative data flow paths based on the multi-level features.
- Score: 33.61001547745264
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although object detection has reached a milestone thanks to the great success
of deep learning, scale variation remains a key challenge. Integrating
multi-level features has been proposed to alleviate this problem, as in the classic
Feature Pyramid Network (FPN) and its improvements. However, the specifically
designed feature integration modules of these methods may not have the optimal
architecture for feature fusion. Moreover, these models have fixed
architectures and data flow paths regardless of the input sample, so they cannot
adapt to each kind of data. To overcome the above
limitations, we propose a Dynamic Sample-Individualized Connector (DSIC) for
multi-scale object detection. It dynamically adjusts network connections to fit
different samples. In particular, DSIC consists of two components: Intra-scale
Selection Gate (ISG) and Cross-scale Selection Gate (CSG). ISG adaptively
extracts multi-level features from the backbone as the input of feature
integration. CSG automatically activates informative data flow paths based on
the multi-level features. Furthermore, these two components are both
plug-and-play and can be embedded in any backbone. Experimental results
demonstrate that the proposed method outperforms state-of-the-art methods.
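The abstract describes gates that decide, per input sample, which multi-level features and data flow paths are used. The following is only a toy sketch of that general idea, not the authors' ISG or CSG: the gate form (a sigmoid over each level's global average), the scalar `weights` and `biases`, the 0.5 `threshold`, and the function names `gate_values` and `fuse` are all assumptions made for illustration. Feature maps are assumed to be already resized to a common shape and flattened to plain lists.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_values(features, weights, biases):
    """Return one scalar gate per feature level (hypothetical gate form).

    Each level is pooled to its global average, then passed through a
    sigmoid with a per-level scalar weight and bias.
    """
    gates = []
    for feat, w, b in zip(features, weights, biases):
        pooled = sum(feat) / len(feat)  # global average pooling
        gates.append(sigmoid(w * pooled + b))
    return gates


def fuse(features, gates, threshold=0.5):
    """Sum the levels whose gate exceeds the threshold, scaled by the gate.

    Returns the fused feature and the indices of the active levels, so the
    set of active paths can differ from sample to sample.
    """
    fused = [0.0] * len(features[0])
    active = []
    for i, (feat, g) in enumerate(zip(features, gates)):
        if g > threshold:
            active.append(i)
            for j, value in enumerate(feat):
                fused[j] += g * value
    return fused, active


# Toy inputs: three feature levels per sample, two values per level.
weights = [1.0, 1.0, 1.0]  # assumed gate parameters, not learned here
biases = [0.0, 0.0, 0.0]
sample_a = [[2.0, 2.0], [-2.0, -2.0], [1.0, 1.0]]
sample_b = [[-1.0, -1.0], [3.0, 3.0], [-3.0, -3.0]]

for name, sample in (("a", sample_a), ("b", sample_b)):
    gates = gate_values(sample, weights, biases)
    fused, active = fuse(sample, gates)
    print(name, active)
```

With these toy inputs, sample `a` activates levels 0 and 2 while sample `b` activates only level 1, which illustrates how the same parameters can yield different connections for different samples. In a real detector the gate parameters would be learned and the gating would operate on full feature tensors.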
Related papers
- Multi-scale Feature Fusion with Point Pyramid for 3D Object Detection [18.41721888099563]
This paper proposes the Point Pyramid RCNN (POP-RCNN), a feature pyramid-based framework for 3D object detection on point clouds.
The proposed method can be applied to a variety of existing frameworks to increase feature richness, especially for long-distance detection.
arXiv Detail & Related papers (2024-09-06T20:13:14Z)
- DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion [82.2425759608975]
Infrared-visible object detection aims to achieve robust, even full-day, object detection by fusing the complementary information of infrared and visible images.
We propose a Dynamic Adaptive Multispectral Detection Transformer (DAMSDet) to address these two challenges.
Experiments on four public datasets demonstrate significant improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2024-03-01T07:03:27Z)
- Bi-directional Adapter for Multi-modal Tracking [67.01179868400229]
We propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter.
We develop a simple but effective light feature adapter to transfer modality-specific information from one modality to another.
Our model achieves superior tracking performance in comparison with both the full fine-tuning methods and the prompt learning-based methods.
arXiv Detail & Related papers (2023-12-17T05:27:31Z)
- Multistep feature aggregation framework for salient object detection [0.0]
We introduce a multistep feature aggregation framework for salient object detection.
It is composed of three modules: the Diverse Reception (DR) module, the Multiscale Interaction (MSI) module, and the Feature Enhancement (FE) module.
Experimental results on six benchmark datasets demonstrate that MSFA achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-11-12T16:13:16Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors [64.93963042395976]
Implicit Instance-Invariant Network (I3Net) is tailored for adapting one-stage detectors.
I3Net implicitly learns instance-invariant features via exploiting the natural characteristics of deep features in different layers.
Experiments reveal that I3Net exceeds the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2021-03-25T11:14:36Z)
- Learning to Generate Content-Aware Dynamic Detectors [62.74209921174237]
We introduce a new perspective of designing efficient detectors, which is automatically generating sample-adaptive model architecture.
We introduce a coarse-to-fine strategy tailored for object detection to guide the learning of dynamic routing.
Experiments on MS-COCO dataset demonstrate that CADDet achieves 1.8 higher mAP with 10% fewer FLOPs compared with vanilla routing.
arXiv Detail & Related papers (2020-12-08T08:05:20Z)
- RGBT Tracking via Multi-Adapter Network with Hierarchical Divergence Loss [37.99375824040946]
We propose a novel multi-adapter network to jointly perform modality-shared, modality-specific and instance-aware target representation learning.
Experiments on two RGBT tracking benchmark datasets demonstrate the outstanding performance of the proposed tracker.
arXiv Detail & Related papers (2020-11-14T01:50:46Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.