Generalized Focal Loss V2: Learning Reliable Localization Quality
Estimation for Dense Object Detection
- URL: http://arxiv.org/abs/2011.12885v1
- Date: Wed, 25 Nov 2020 17:06:37 GMT
- Title: Generalized Focal Loss V2: Learning Reliable Localization Quality
Estimation for Dense Object Detection
- Authors: Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang
- Abstract summary: GFLV2 (ResNet-101) achieves 46.2 AP at 14.6 FPS, surpassing the previous state-of-the-art ATSS baseline (43.6 AP at 14.6 FPS) by an absolute 2.6 AP on COCO test-dev.
Code will be available at https://github.com/implus/GFocalV2.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Localization Quality Estimation (LQE) is crucial and popular in the recent
advancement of dense object detectors since it can provide accurate ranking
scores that benefit the Non-Maximum Suppression processing and improve
detection performance. As a common practice, most existing methods predict LQE
scores through vanilla convolutional features shared with object classification
or bounding box regression. In this paper, we explore a completely novel and
different perspective to perform LQE -- based on the learned distributions of
the four parameters of the bounding box. The bounding box distributions,
introduced as the "General Distribution" in GFLV1, describe the
uncertainty of the predicted bounding boxes well. Such a property makes the
distribution statistics of a bounding box highly correlated to its real
localization quality. Specifically, a bounding box distribution with a sharp
peak usually corresponds to high localization quality, and vice versa. By
leveraging the close correlation between distribution statistics and the real
localization quality, we develop a considerably lightweight Distribution-Guided
Quality Predictor (DGQP) for reliable LQE based on GFLV1, thus producing GFLV2.
To the best of our knowledge, this is the first attempt in object detection to use a
highly relevant statistical representation to facilitate LQE. Extensive
experiments demonstrate the effectiveness of our method. Notably, GFLV2
(ResNet-101) achieves 46.2 AP at 14.6 FPS, surpassing the previous
state-of-the-art ATSS baseline (43.6 AP at 14.6 FPS) by absolute 2.6 AP on COCO
test-dev, without sacrificing efficiency in either training or
inference. Code will be available at https://github.com/implus/GFocalV2.
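As a concrete illustration of the DGQP idea described in the abstract, here is a minimal PyTorch sketch: simple statistics (Top-k values plus the mean) of the learned per-side distributions feed a tiny MLP that outputs a scalar quality factor, which then rescales the classification score used for NMS ranking. The exact k, hidden width, and bin count below are illustrative assumptions, not the paper's verified configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DGQP(nn.Module):
    """Distribution-Guided Quality Predictor (sketch, not the official code).

    Maps statistics of the learned "General Distribution" over the four box
    sides to a scalar localization-quality factor in [0, 1]. Hyperparameters
    (topk, hidden) are assumptions for illustration.
    """

    def __init__(self, reg_max=16, topk=4, hidden=64):
        super().__init__()
        self.topk = topk
        in_dim = 4 * (topk + 1)  # (top-k values + mean) for each of 4 sides
        self.fc = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, reg_logits):
        # reg_logits: (N, 4, reg_max + 1) logits of the discrete distribution
        # for each box side (left, top, right, bottom).
        prob = F.softmax(reg_logits, dim=-1)
        topk_vals, _ = prob.topk(self.topk, dim=-1)       # sharpness cues
        mean_val = prob.mean(dim=-1, keepdim=True)
        stats = torch.cat([topk_vals, mean_val], dim=-1)  # (N, 4, topk + 1)
        return self.fc(stats.flatten(1))                  # (N, 1) quality

# Usage: the quality factor rescales classification scores so that NMS
# ranks boxes by a joint classification-quality confidence.
logits = torch.randn(8, 4, 17)       # 8 locations, reg_max = 16
quality = DGQP()(logits)             # (8, 1), values in [0, 1]
cls_score = torch.rand(8, 80)        # e.g. 80 COCO classes
joint_score = cls_score * quality    # ranking score fed to NMS
```

A sharply peaked distribution concentrates most of its mass in its Top-k values, so the statistics above rise with sharpness, matching the correlation the abstract describes.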
Related papers
- Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning [44.34372470957298]
We propose a method for boosting the generalization ability of pre-trained vision-language models (VLMs).
The idea is realized by exploiting out-of-distribution (OOD) detection to predict whether a sample belongs to a base distribution or a novel distribution.
With the help of OOD detectors, the harmonic means of CoOp and ProGrad increase by 2.6 and 1.5 percentage points, respectively, over 11 recognition datasets in the base-to-novel setting (a minimal routing sketch under stated assumptions appears after this list).
arXiv Detail & Related papers (2024-03-31T08:28:42Z)
- Being Aware of Localization Accuracy By Generating Predicted-IoU-Guided Quality Scores [24.086202809990795]
We develop an elegant LQE branch that acquires a localization quality score guided by the predicted IoU.
A novel one-stage detector termed CLQ is proposed.
Experiments show that CLQ achieves state-of-the-art performance with 47.8 AP at a speed of 11.5 FPS (an IoU sketch follows this list).
arXiv Detail & Related papers (2023-09-23T05:27:59Z)
- Distribution-Aware Calibration for Object Detection with Noisy Bounding Boxes [58.2797274877934]
We propose DIStribution-aware CalibratiOn (DISCO) to model the spatial distribution of proposals for calibrating supervision signals.
Three distribution-aware techniques are developed to improve classification, localization, and interpretability.
arXiv Detail & Related papers (2023-08-23T09:20:05Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch (a plain MMD sketch appears after this list).
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Source-Free Progressive Graph Learning for Open-Set Domain Adaptation [44.63301903324783]
Open-set domain adaptation (OSDA) has gained considerable attention in many visual recognition tasks.
We propose a Progressive Graph Learning (PGL) framework that decomposes the target hypothesis space into the shared and unknown subspaces.
We also tackle a more realistic source-free open-set domain adaptation (SF-OSDA) setting that makes no assumption about the coexistence of source and target domains.
arXiv Detail & Related papers (2022-02-13T01:19:41Z)
- Achieving Statistical Optimality of Federated Learning: Beyond Stationary Points [19.891597817559038]
Federated Learning (FL) is a promising framework with great potential for preserving privacy and lowering the computation load on the cloud.
Recent work raised concerns about two popular FL methods: (1) their fixed points do not correspond to stationary points of the original optimization problem, and (2) the common model found might not generalize well locally.
We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates.
arXiv Detail & Related papers (2021-06-29T09:59:43Z)
- Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation [85.22775182688798]
This work proposes a novel, flexible, and accurate refinement module called Alpha-Refine.
It can significantly improve the base trackers' box estimation quality.
Experiments on TrackingNet, LaSOT, GOT-10K, and VOT 2020 benchmarks show that our approach significantly improves the base trackers' performance with little extra latency.
arXiv Detail & Related papers (2020-12-12T13:33:25Z)
- Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts.
In particular, the density ratio estimation reflects the closeness of a target (test) sample to the source (training) distribution.
We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z)
- Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection [85.53263670166304]
One-stage detectors basically formulate object detection as dense classification and localization.
A recent trend for one-stage detectors is to introduce an individual prediction branch that estimates localization quality.
This paper delves into the representations of the above three fundamental elements: quality estimation, classification, and localization (a sketch of decoding a box side from its learned distribution follows this list).
arXiv Detail & Related papers (2020-06-08T07:24:33Z)
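The sketches below unpack the more technical entries above; all are minimal Python illustrations under stated assumptions, not the papers' released code. First, for the OOD-based prompt-tuning entry: one plausible realization routes each sample to base- or novel-distribution predictions using an OOD score; the max-softmax score and the threshold tau are assumptions, not the paper's detector.

```python
import torch

def route_predictions(base_logits, novel_logits, tau=0.5):
    """Route each sample to base- or novel-model predictions (sketch).

    Uses 1 - max softmax probability of the base model as an OOD score;
    the actual detector and fusion rule in the paper may differ.
    """
    ood_score = 1.0 - base_logits.softmax(dim=-1).max(dim=-1).values  # (N,)
    use_novel = (ood_score > tau).unsqueeze(-1)                       # (N, 1)
    return torch.where(use_novel, novel_logits, base_logits)
```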
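For the predicted-IoU-guided CLQ entry, the supervision target is the IoU between a predicted box and its ground truth; a standard axis-aligned IoU in (x1, y1, x2, y2) format (the box format is an assumption):

```python
import torch

def box_iou(a, b):
    """Element-wise IoU of boxes a, b with shape (N, 4) as (x1, y1, x2, y2)."""
    lt = torch.max(a[:, :2], b[:, :2])       # intersection top-left
    rb = torch.min(a[:, 2:], b[:, 2:])       # intersection bottom-right
    wh = (rb - lt).clamp(min=0)              # zero width/height if disjoint
    inter = wh[:, 0] * wh[:, 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter + 1e-7)
```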
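For the Divide and Contrast entry, a plain RBF-kernel MMD between two feature sets; the memory bank is omitted and the kernel bandwidth is an assumption:

```python
import torch

def mmd_rbf(x, y, sigma=1.0):
    """Squared MMD between feature sets x (N, D) and y (M, D), RBF kernel."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)            # pairwise squared distances
        return torch.exp(-d2 / (2 * sigma ** 2))
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()
```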
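Finally, tying the GFLV1 entry back to the main abstract: the "General Distribution" models each box side as a discrete distribution decoded by its expectation, which is what makes the sharpness statistics in the DGQP sketch meaningful; the bin count is an assumption.

```python
import torch
import torch.nn.functional as F

def decode_side(logits, reg_max=16):
    """Decode one box side from its learned discrete distribution (sketch).

    The side offset is the expectation over reg_max + 1 bins; a sharp,
    low-entropy peak typically signals a well-localized prediction, the
    correlation GFLV2's quality predictor exploits.
    """
    prob = F.softmax(logits, dim=-1)                       # (N, reg_max + 1)
    bins = torch.arange(reg_max + 1, dtype=prob.dtype,
                        device=prob.device)                # bin centers
    return (prob * bins).sum(dim=-1)                       # expected offset
```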