Cascade-DETR: Delving into High-Quality Universal Object Detection
- URL: http://arxiv.org/abs/2307.11035v1
- Date: Thu, 20 Jul 2023 17:11:20 GMT
- Title: Cascade-DETR: Delving into High-Quality Universal Object Detection
- Authors: Mingqiao Ye, Lei Ke, Siyuan Li, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan and Fisher Yu
- Abstract summary: We introduce Cascade-DETR for high-quality universal object detection.
We propose the Cascade Attention layer, which explicitly integrates object-centric information into the detection decoder.
Lastly, we introduce a universal object detection benchmark, UDB10, that contains 10 datasets from diverse domains.
- Score: 99.62131881419143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object localization in general environments is a fundamental part of vision
systems. While dominating on the COCO benchmark, recent Transformer-based
detection methods are not competitive in diverse domains. Moreover, these
methods still struggle to very accurately estimate the object bounding boxes in
complex environments.
We introduce Cascade-DETR for high-quality universal object detection. We
jointly tackle the generalization to diverse domains and localization accuracy
by proposing the Cascade Attention layer, which explicitly integrates
object-centric information into the detection decoder by limiting the attention
to the previous box prediction. To further enhance accuracy, we also revisit
the scoring of queries. Instead of relying on classification scores, we predict
the expected IoU of the query, leading to substantially more well-calibrated
confidences. Lastly, we introduce a universal object detection benchmark,
UDB10, that contains 10 datasets from diverse domains. While also advancing the
state-of-the-art on COCO, Cascade-DETR substantially improves DETR-based
detectors on all datasets in UDB10, even by over 10 mAP in some cases. The
improvements under stringent quality requirements are even more pronounced. Our
code and models will be released at https://github.com/SysCV/cascade-detr.
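The two ideas in the abstract, limiting a query's attention to its previous box prediction and scoring queries by expected IoU, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). The function names `box_mask`, `box_limited_attention` and `iou`, the normalized (cx, cy, w, h) box format, and the fallback for an empty box are all assumptions made for illustration.

```python
import math

def box_mask(box, height, width):
    """Boolean height x width grid: True where a cell centre lies inside
    `box` = (cx, cy, w, h), given in normalized [0, 1] coordinates (assumed format)."""
    cx, cy, w, h = box
    x0, x1 = cx - w / 2, cx + w / 2
    y0, y1 = cy - h / 2, cy + h / 2
    mask = []
    for row in range(height):
        y = (row + 0.5) / height
        mask.append([x0 <= (col + 0.5) / width <= x1 and y0 <= y <= y1
                     for col in range(width)])
    return mask

def box_limited_attention(scores, mask):
    """Softmax over raw attention scores, restricted to cells where mask is True.
    Returns a flat, row-major list of attention weights that sums to 1."""
    flat_scores = [s for row in scores for s in row]
    flat_mask = [m for row in mask for m in row]
    if not any(flat_mask):
        # Assumed fallback: a box covering no cell attends everywhere.
        flat_mask = [True] * len(flat_mask)
    peak = max(s for s, m in zip(flat_scores, flat_mask) if m)
    exps = [math.exp(s - peak) if m else 0.0
            for s, m in zip(flat_scores, flat_mask)]
    total = sum(exps)
    return [e / total for e in exps]

def iou(a, b):
    """IoU of two boxes in (x0, y0, x1, y1) form; a plausible regression
    target for an IoU-based query score."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

# Usage: a centred box on a 4 x 4 feature grid with uniform raw scores.
mask = box_mask((0.5, 0.5, 0.5, 0.5), 4, 4)
weights = box_limited_attention([[0.0] * 4 for _ in range(4)], mask)
```

In this example only the four central cells fall inside the box, so each receives weight 0.25 and every cell outside the box receives 0. In a cascaded decoder, each layer would presumably rebuild the mask from the box predicted by the previous layer, so the attended region tightens as the box is refined.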
Related papers
- Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z)
- Semi-Supervised and Long-Tailed Object Detection with CascadeMatch [91.86787064083012]
We propose a novel pseudo-labeling-based detector called CascadeMatch.
Our detector features a cascade network architecture, which has multi-stage detection heads with progressive confidence thresholds.
We show that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection.
arXiv Detail & Related papers (2023-05-24T07:09:25Z)
- DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection [34.14603473160207]
This paper presents a DETR-based method for cross-domain weakly supervised object detection (CDWSOD).
We think DETR has strong potential for CDWSOD due to an insight: the encoder and the decoder in DETR are both based on the attention mechanism.
The aggregation results, i.e., image-level predictions, can naturally exploit the weak supervision for domain alignment.
arXiv Detail & Related papers (2023-04-14T12:16:42Z)
- Robust Object Detection via Instance-Level Temporal Cycle Confusion [89.1027433760578]
We study the effectiveness of auxiliary self-supervised tasks to improve the out-of-distribution generalization of object detectors.
Inspired by the principle of maximum entropy, we introduce a novel self-supervised task, instance-level temporal cycle confusion (CycConf).
For each object, the task is to find the most different object proposals in the adjacent frame in a video and then cycle back to itself for self-supervision.
arXiv Detail & Related papers (2021-04-16T21:35:08Z)
- VarifocalNet: An IoU-aware Dense Object Detector [11.580759212782812]
We learn an IoU-aware Classification Score (IACS) as a joint representation of object presence confidence and localization accuracy.
We show that dense object detectors can achieve a more accurate ranking of candidate detections based on the IACS.
We build an IoU-aware dense object detector based on the FCOS+ATSS architecture, that we call VarifocalNet or VFNet for short.
arXiv Detail & Related papers (2020-08-31T05:12:21Z)
- Dynamic Refinement Network for Oriented and Densely Packed Object Detection [75.29088991850958]
We present a dynamic refinement network that consists of two novel components, i.e., a feature selection module (FSM) and a dynamic refinement head (DRH).
Our FSM enables neurons to adjust receptive fields in accordance with the shapes and orientations of target objects, whereas the DRH empowers our model to refine the prediction dynamically in an object-aware manner.
We perform quantitative evaluations on several publicly available benchmarks including DOTA, HRSC2016, SKU110K, and our own SKU110K-R dataset.
arXiv Detail & Related papers (2020-05-20T11:35:50Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.