Related papers: Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection

URL: http://arxiv.org/abs/2003.11818v1
Date: Thu, 26 Mar 2020 10:20:52 GMT
Title: Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
Authors: Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu, Xinghao Chen and Chang Xu
Abstract summary: We propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components of an object detector. We employ a novel scheme to automatically screen different sub search spaces for different components so as to perform the end-to-end search for each component efficiently. Our searched architecture, namely Hit-Detector, achieves 41.4% mAP on the COCO minival set with 27M parameters.
Score: 67.84976857449263
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural Architecture Search (NAS) has achieved great success in image classification task. Some recent works have managed to explore the automatic design of efficient backbone or feature fusion layer for object detection. However, these methods focus on searching only one certain component of object detector while leaving others manually designed. We identify the inconsistency between searched component and manually designed ones would withhold the detector of stronger performance. To this end, we propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components (i.e. backbone, neck, and head) of object detector in an end-to-end manner. In addition, we empirically reveal that different parts of the detector prefer different operators. Motivated by this, we employ a novel scheme to automatically screen different sub search spaces for different components so as to perform the end-to-end search for each component on the corresponding sub search space efficiently. Without bells and whistles, our searched architecture, namely Hit-Detector, achieves 41.4% mAP on COCO minival set with 27M parameters. Our implementation is available at https://github.com/ggjy/HitDet.pytorch.

Related papers

Toward Realistic Camouflaged Object Detection: Benchmarks and Method [11.279532701331647]
Camouflaged object detection (COD) primarily relies on semantic or instance segmentation methods. We propose a camouflage-aware feature refinement (CAFR) strategy to detect camouflaged objects. CAFR fully utilizes a clear perception of the current object within the prior knowledge of large models to assist detectors in deeply understanding the distinctions between background and foreground.
arXiv Detail & Related papers (2025-01-13T13:04:00Z)
The Impact of Different Backbone Architecture on Autonomous Vehicle Dataset [120.08736654413637]
The quality of the features extracted by the backbone architecture can have a significant impact on the overall detection performance. Our study evaluates three well-known autonomous vehicle datasets, namely KITTI, NuScenes, and BDD, to compare the performance of different backbone architectures on object detection tasks.
arXiv Detail & Related papers (2023-09-15T17:32:15Z)
RTMDet: An Empirical Study of Designing Real-Time Object Detectors [13.09100888887757]
We develop an efficient real-time object detector that exceeds the YOLO series and is easily extensible to many object recognition tasks. Together with better training techniques, the resulting object detector, named RTMDet, achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU. We hope the experimental results can provide new insights into designing versatile real-time object detectors for many object recognition tasks.
arXiv Detail & Related papers (2022-12-14T18:50:20Z)
Multi-Objective Evolutionary for Object Detection Mobile Architectures Search [21.14296703753317]
We propose a mobile object detection backbone network architecture search algorithm based on non-dominated sorting for NAS scenarios. The proposed approach can search the backbone networks with different depths, widths, or expansion sizes via a technique of weight mapping. Under similar computational complexity, the accuracy of the backbone network architecture we search for is 2.0% mAP higher than MobileDet.
arXiv Detail & Related papers (2022-11-05T00:28:49Z)
Searching a High-Performance Feature Extractor for Text Recognition Network [92.12492627169108]
We design a domain-specific search space by exploring principles for having good feature extractors. As the space is huge and complexly structured, no existing NAS algorithms can be applied. We propose a two-stage algorithm to effectively search in the space.
arXiv Detail & Related papers (2022-09-27T03:49:04Z)
A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection [59.21990697929617]
Humans tend to mine objects by learning from a group of images or several frames of video since we live in a dynamic world. Previous approaches design different networks on similar tasks separately, and they are difficult to apply to each other. We introduce a unified framework to tackle these issues, termed UFO (Unified Framework for Co-Object Segmentation).
arXiv Detail & Related papers (2022-03-09T13:35:19Z)
Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation [144.50154657257605]
We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module. Our searched architecture, namely Auto-Panoptic, achieves the new state-of-the-art on the challenging COCO and ADE20K benchmarks.
arXiv Detail & Related papers (2020-10-30T08:34:35Z)
Representation Sharing for Fast Object Detector Search and Beyond [38.18583590914755]
We propose Fast And Diverse (FAD) to better explore the optimal configuration of receptive fields and convolution types in the sub-networks for one-stage detectors. FAD achieves prominent improvements on two types of one-stage detectors with various backbones.
arXiv Detail & Related papers (2020-07-23T15:39:44Z)
AutoSTR: Efficient Backbone Search for Scene Text Recognition [80.7290173000068]
Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes. We propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance. Experiments demonstrate that, by searching data-dependent backbones, AutoSTR can outperform the state-of-the-art approaches on standard benchmarks.
arXiv Detail & Related papers (2020-03-14T06:51:04Z)
Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder [5.371825910267909]
We analyze how different methods detect objects adaptively. Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information. This work addresses that with an anchor-free detector built on a shared encoder-decoder with an attention mechanism.
arXiv Detail & Related papers (2020-01-04T08:55:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.