Hit-Detector: Hierarchical Trinity Architecture Search for Object
Detection
- URL: http://arxiv.org/abs/2003.11818v1
- Date: Thu, 26 Mar 2020 10:20:52 GMT
- Title: Hit-Detector: Hierarchical Trinity Architecture Search for Object
Detection
- Authors: Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu,
Xinghao Chen and Chang Xu
- Abstract summary: We propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components of an object detector.
We employ a novel scheme to automatically screen different sub search spaces for different components so as to perform the end-to-end search for each component efficiently.
Our searched architecture, namely Hit-Detector, achieves 41.4% mAP on COCO minival set with 27M parameters.
- Score: 67.84976857449263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Architecture Search (NAS) has achieved great success in image
classification tasks. Some recent works have managed to explore the automatic
design of an efficient backbone or feature fusion layer for object detection.
However, these methods search only one component of the object
detector while leaving the others manually designed. We find that the inconsistency
between the searched component and the manually designed ones can keep the
detector from reaching stronger performance. To this end, we propose a hierarchical
trinity search framework to simultaneously discover efficient architectures for
all components (i.e. backbone, neck, and head) of the object detector in an
end-to-end manner. In addition, we empirically reveal that different parts of
the detector prefer different operators. Motivated by this, we employ a novel
scheme to automatically screen different sub search spaces for different
components so as to perform the end-to-end search for each component on the
corresponding sub search space efficiently. Without bells and whistles, our
searched architecture, namely Hit-Detector, achieves 41.4% mAP on COCO minival
set with 27M parameters. Our implementation is available at
https://github.com/ggjy/HitDet.pytorch.
Related papers
- The Impact of Different Backbone Architecture on Autonomous Vehicle
Dataset [120.08736654413637]
The quality of the features extracted by the backbone architecture can have a significant impact on the overall detection performance.
Our study evaluates three well-known autonomous vehicle datasets, namely KITTI, NuScenes, and BDD, to compare the performance of different backbone architectures on object detection tasks.
arXiv Detail & Related papers (2023-09-15T17:32:15Z)
- RTMDet: An Empirical Study of Designing Real-Time Object Detectors [13.09100888887757]
We develop an efficient real-time object detector that exceeds the YOLO series and is easily extensible to many object recognition tasks.
Together with better training techniques, the resulting object detector, named RTMDet, achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU.
We hope the experimental results can provide new insights into designing versatile real-time object detectors for many object recognition tasks.
arXiv Detail & Related papers (2022-12-14T18:50:20Z)
- Multi-Objective Evolutionary for Object Detection Mobile Architectures
Search [21.14296703753317]
We propose a mobile object detection backbone network architecture search algorithm based on non-dominated sorting for NAS scenarios.
The proposed approach can search the backbone networks with different depths, widths, or expansion sizes via a technique of weight mapping.
Under similar computational complexity, the accuracy of the backbone network architecture we search for is 2.0% mAP higher than MobileDet.
arXiv Detail & Related papers (2022-11-05T00:28:49Z)
- Searching a High-Performance Feature Extractor for Text Recognition
Network [92.12492627169108]
We design a domain-specific search space by exploring principles for having good feature extractors.
As the space is huge and complexly structured, no existing NAS algorithms can be applied.
We propose a two-stage algorithm to effectively search in the space.
arXiv Detail & Related papers (2022-09-27T03:49:04Z)
- A Unified Transformer Framework for Group-based Segmentation:
Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection [59.21990697929617]
Humans tend to mine objects by learning from a group of images or several frames of video since we live in a dynamic world.
Previous approaches design different networks on similar tasks separately, and they are difficult to apply to each other.
We introduce a unified framework to tackle these issues, termed UFO (Unified Framework for Co-Object Segmentation).
arXiv Detail & Related papers (2022-03-09T13:35:19Z)
- Auto-Panoptic: Cooperative Multi-Component Architecture Search for
Panoptic Segmentation [144.50154657257605]
We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module.
Our searched architecture, namely Auto-Panoptic, achieves the new state-of-the-art on the challenging COCO and ADE20K benchmarks.
arXiv Detail & Related papers (2020-10-30T08:34:35Z)
- Representation Sharing for Fast Object Detector Search and Beyond [38.18583590914755]
We propose Fast And Diverse (FAD) to better explore the optimal configuration of receptive fields and convolution types in the sub-networks for one-stage detectors.
FAD achieves prominent improvements on two types of one-stage detectors with various backbones.
arXiv Detail & Related papers (2020-07-23T15:39:44Z)
- AutoSTR: Efficient Backbone Search for Scene Text Recognition [80.7290173000068]
Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes.
We propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.
Experiments demonstrate that, by searching data-dependent backbones, AutoSTR can outperform the state-of-the-art approaches on standard benchmarks.
arXiv Detail & Related papers (2020-03-14T06:51:04Z)
- Pixel-Semantic Revise of Position Learning A One-Stage Object Detector
with A Shared Encoder-Decoder [5.371825910267909]
We analyze that different methods detect objects adaptively.
Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information.
This work addresses that with an anchor-free detector that uses a shared encoder-decoder with an attention mechanism.
arXiv Detail & Related papers (2020-01-04T08:55:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.