DetectoRS: Detecting Objects with Recursive Feature Pyramid and
Switchable Atrous Convolution
- URL: http://arxiv.org/abs/2006.02334v2
- Date: Mon, 30 Nov 2020 16:06:38 GMT
- Title: DetectoRS: Detecting Objects with Recursive Feature Pyramid and
Switchable Atrous Convolution
- Authors: Siyuan Qiao, Liang-Chieh Chen, Alan Yuille
- Abstract summary: We explore the mechanism of looking and thinking twice in the backbone design for object detection.
At the macro level, we propose Recursive Feature Pyramid, which incorporates extra feedback connections from Feature Pyramid Networks into the bottom-up backbone layers.
At the micro level, we propose Switchable Atrous Convolution, which convolves the features with different atrous rates and gathers the results using switch functions.
- Score: 27.67084901207291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many modern object detectors demonstrate outstanding performances by using
the mechanism of looking and thinking twice. In this paper, we explore this
mechanism in the backbone design for object detection. At the macro level, we
propose Recursive Feature Pyramid, which incorporates extra feedback
connections from Feature Pyramid Networks into the bottom-up backbone layers.
At the micro level, we propose Switchable Atrous Convolution, which convolves
the features with different atrous rates and gathers the results using switch
functions. Combining them results in DetectoRS, which significantly improves
the performances of object detection. On COCO test-dev, DetectoRS achieves
state-of-the-art 55.7% box AP for object detection, 48.5% mask AP for instance
segmentation, and 50.0% PQ for panoptic segmentation. The code is made publicly
available.
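To make the Switchable Atrous Convolution idea concrete, here is a minimal pure-Python sketch in one dimension. It is not the authors' implementation: the abstract only says that features are convolved with different atrous rates and the results are gathered using switch functions. The function names, the choice of two rates, the shared kernel, and the sigmoid switch computed from the input value are all assumptions made for illustration.

```python
import math


def atrous_conv1d(x, kernel, rate):
    """Same-size 1D atrous (dilated) convolution with zero padding."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps are spaced `rate` apart
            if 0 <= j < len(x):        # positions outside the input count as zero
                acc += w * x[j]
        out.append(acc)
    return out


def switchable_atrous_conv1d(x, kernel, rates=(1, 3),
                             switch_weight=1.0, switch_bias=0.0):
    """Blend two atrous convolutions per position with a soft switch.

    Hypothetical switch: a sigmoid of a scaled, shifted input value.
    A switch near 1 favors the first rate, near 0 the second rate.
    """
    small = atrous_conv1d(x, kernel, rates[0])
    large = atrous_conv1d(x, kernel, rates[1])
    out = []
    for xi, a, b in zip(x, small, large):
        s = 1.0 / (1.0 + math.exp(-(switch_weight * xi + switch_bias)))
        out.append(s * a + (1.0 - s) * b)
    return out


# Usage: an impulse input shows how each rate spreads the response.
impulse = [0, 0, 0, 1, 0, 0, 0]
blended = switchable_atrous_conv1d(impulse, [1, 1, 1], switch_weight=0.0)
```

With `switch_weight=0.0` and the default bias, the switch is exactly 0.5 everywhere, so the output is the plain average of the rate-1 and rate-3 responses. In a trained network the switch would be learned and would vary by location.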
Related papers
- DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM [81.75988648572347]
We present DetToolChain, a novel prompting paradigm to unleash the zero-shot object detection ability of multimodal large language models (MLLMs).
Our approach consists of a detection prompting toolkit inspired by high-precision detection priors and a new Chain-of-Thought to implement these prompts.
We show that GPT-4V with our DetToolChain improves state-of-the-art object detectors by +21.5% AP50 on MS Novel class set for open-vocabulary detection.
arXiv Detail & Related papers (2024-03-19T06:54:33Z)
- Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers [34.42710399235461]
Vision transformers have recently shown strong global context modeling capabilities in camouflaged object detection.
They suffer from two major limitations: less effective locality modeling and insufficient feature aggregation in decoders.
We propose a novel transformer-based Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode locality-enhanced neighboring transformer features.
arXiv Detail & Related papers (2023-03-26T20:50:58Z)
- Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
- Feature Aggregation and Propagation Network for Camouflaged Object Detection [42.33180748293329]
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to intrinsic similarities between foreground objects and background surroundings.
We propose a novel Feature Aggregation and propagation Network (FAP-Net) for camouflaged object detection.
arXiv Detail & Related papers (2022-12-02T05:54:28Z)
- A Tri-Layer Plugin to Improve Occluded Detection [100.99802831241583]
We propose a simple plugin module for the detection head of two-stage object detectors to improve the recall of partially occluded objects.
The module predicts a tri-layer of segmentation masks for the target object, the occluder and the occludee, and by doing so is able to better predict the mask of the target object.
We also establish a COCO evaluation dataset to measure the recall performance of partially occluded and separated objects.
arXiv Detail & Related papers (2022-10-18T17:59:51Z)
- Towards Accurate Pixel-wise Object Tracking by Attention Retrieval [50.06436600343181]
We propose an attention retrieval network (ARN) to perform soft spatial constraints on backbone features.
We set a new state-of-the-art on recent pixel-wise object tracking benchmark VOT 2020 while running at 40 fps.
arXiv Detail & Related papers (2020-08-06T16:25:23Z)
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation to un-ordered 2D point cloud representation.
Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
- Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection [67.84976857449263]
We propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components of an object detector.
We employ a novel scheme to automatically screen different sub search spaces for different components so as to perform the end-to-end search for each component efficiently.
Our searched architecture, namely Hit-Detector, achieves 41.4% mAP on COCO minival set with 27M parameters.
arXiv Detail & Related papers (2020-03-26T10:20:52Z)
- Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder [5.371825910267909]
We analyze that different methods detect objects adaptively.
Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information.
This work addresses that with an anchor-free detector that uses a shared encoder-decoder with an attention mechanism.
arXiv Detail & Related papers (2020-01-04T08:55:00Z)