Align and Distill: Unifying and Improving Domain Adaptive Object Detection
- URL: http://arxiv.org/abs/2403.12029v2
- Date: Sun, 25 Aug 2024 14:05:18 GMT
- Title: Align and Distill: Unifying and Improving Domain Adaptive Object Detection
- Authors: Justin Kay, Timm Haucke, Suzanne Stathatos, Siqi Deng, Erik Young, Pietro Perona, Sara Beery, Grant Van Horn,
- Abstract summary: Domain adaptive object detection (DAOD) methods have recently demonstrated strong results on addressing this challenge.
We identify systemic benchmarking pitfalls that call past results into question and hamper further progress.
We address these problems by introducing: A unified benchmarking and implementation framework, Align and Distill (ALDI)
A new benchmark dataset, CFC-DAOD, enabling evaluation on diverse real-world data, and (4) A new method, ALDI++, that achieves state-of-the-art results by a large margin.
- Score: 16.879446142847392
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object detectors often perform poorly on data that differs from their training set. Domain adaptive object detection (DAOD) methods have recently demonstrated strong results on addressing this challenge. Unfortunately, we identify systemic benchmarking pitfalls that call past results into question and hamper further progress: (a) Overestimation of performance due to underpowered baselines, (b) Inconsistent implementation practices preventing transparent comparisons of methods, and (c) Lack of generality due to outdated backbones and lack of diversity in benchmarks. We address these problems by introducing: (1) A unified benchmarking and implementation framework, Align and Distill (ALDI), enabling comparison of DAOD methods and supporting future development, (2) A fair and modern training and evaluation protocol for DAOD that addresses benchmarking pitfalls, (3) A new DAOD benchmark dataset, CFC-DAOD, enabling evaluation on diverse real-world data, and (4) A new method, ALDI++, that achieves state-of-the-art results by a large margin. ALDI++ outperforms the previous state-of-the-art by +3.5 AP50 on Cityscapes to Foggy Cityscapes, +5.7 AP50 on Sim10k to Cityscapes (where ours is the only method to outperform a fair baseline), and +0.6 AP50 on CFC Kenai to Channel. Our framework, dataset, and state-of-the-art method offer a critical reset for DAOD and provide a strong foundation for future research. Code and data are available: https://github.com/justinkay/aldi and https://github.com/visipedia/caltech-fish-counting.
Related papers
- Theoretically Achieving Continuous Representation of Oriented Bounding Boxes [64.15627958879053]
This paper endeavors to completely solve the issue of discontinuity in Oriented Bounding Box representation.
We propose a novel representation method called Continuous OBB (COBB) which can be readily integrated into existing detectors.
For fairness and transparency of experiments, we have developed a modularized benchmark based on the open-source deep learning framework Jittor's detection toolbox JDet for OOD evaluation.
arXiv Detail & Related papers (2024-02-29T09:27:40Z) - Towards Evaluating Transfer-based Attacks Systematically, Practically,
and Fairly [79.07074710460012]
adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z) - PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly
Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z) - Re-Evaluating LiDAR Scene Flow for Autonomous Driving [80.37947791534985]
Popular benchmarks for self-supervised LiDAR scene flow have unrealistic rates of dynamic motion, unrealistic correspondences, and unrealistic sampling patterns.
We evaluate a suite of top methods on a suite of real-world datasets.
We show that despite the emphasis placed on learning, most performance gains are caused by pre- and post-processing steps.
arXiv Detail & Related papers (2023-04-04T22:45:50Z) - SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object
Detection [27.02391566687007]
Domain adaptive object detection (DAD) aims to alleviate transfer performance degradation caused by the cross-domain discrepancy.
We propose a novel semi-supervised domain adaptive YOLO (SSDA-YOLO) based method to improve cross-domain detection performance.
arXiv Detail & Related papers (2022-11-04T01:50:13Z) - Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z) - SimROD: A Simple Adaptation Method for Robust Object Detection [8.307942341807152]
This paper presents a simple and effective unsupervised adaptation method for Robust Object Detection (SimROD)
Our method integrates a novel domain-centric augmentation method, a gradual self-labeling adaptation procedure, and a teacher-guided fine-tuning mechanism.
When applied to image corruptions and high-level cross-domain adaptation benchmarks, our method outperforms prior baselines.
arXiv Detail & Related papers (2021-07-28T14:28:32Z) - Disentangle Your Dense Object Detector [82.22771433419727]
Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding.
However, the current training pipeline for dense detectors is compromised to lots of conjunctions that may not hold.
We propose Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into the current state-of-the-art detectors.
arXiv Detail & Related papers (2021-07-07T00:52:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.