Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation
- URL: http://arxiv.org/abs/2411.02057v1
- Date: Mon, 04 Nov 2024 12:59:13 GMT
- Title: Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation
- Authors: Yan Li, Weiwei Guo, Xue Yang, Ning Liao, Shaofeng Zhang, Yi Yu, Wenxian Yu, Junchi Yan
- Abstract summary: We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD).
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
- Score: 58.37525311718006
- Abstract: In recent years, aerial object detection has been increasingly pivotal in various earth observation applications. However, current algorithms are limited to detecting a set of pre-defined object categories, demanding sufficient annotated training samples, and fail to detect novel object categories. In this paper, we put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD), which can detect objects beyond the training categories without the cost of collecting new labeled data. We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario, where objects often exhibit weak appearance features and arbitrary orientations. Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects. Additionally, the RemoteCLIP model is adopted as an omniscient teacher, which provides rich knowledge to enhance classification capabilities for novel categories. A dynamic label queue is devised to maintain high-quality pseudo-labels during training. By doing so, the proposed CastDet boosts not only novel object proposals but also classification. Furthermore, we extend our approach from horizontal OVAD to oriented OVAD with tailored algorithm designs to effectively manage bounding box representation and pseudo-label generation. Extensive experiments for both tasks on multiple existing aerial object detection datasets demonstrate the effectiveness of our approach. The code is available at https://github.com/lizzy8587/CastDet.
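The abstract mentions a dynamic label queue that maintains high-quality pseudo-labels during training but does not describe how it works. The following is only a minimal sketch of one plausible reading, not the authors' implementation: a bounded per-category store that keeps the highest-scoring pseudo-labels and evicts the weakest one when full. The names `PseudoLabel` and `LabelQueue`, the parameters `max_per_class` and `min_score`, and the category name `airport` are all hypothetical.

```python
import heapq
from collections import defaultdict
from typing import NamedTuple


class PseudoLabel(NamedTuple):
    score: float   # classifier confidence, assumed to lie in [0, 1]
    image_id: str  # hypothetical identifier of the source image
    box: tuple     # horizontal box as (x1, y1, x2, y2)


class LabelQueue:
    """Hypothetical per-category store keeping the top-scoring pseudo-labels."""

    def __init__(self, max_per_class: int = 3, min_score: float = 0.5):
        self.max_per_class = max_per_class
        self.min_score = min_score
        # category name -> min-heap of PseudoLabel, ordered by score first
        self._heaps = defaultdict(list)

    def push(self, category: str, label: PseudoLabel) -> bool:
        """Offer a pseudo-label; return True if it was stored."""
        if label.score < self.min_score:
            return False  # below the confidence threshold
        heap = self._heaps[category]
        if len(heap) < self.max_per_class:
            heapq.heappush(heap, label)
            return True
        if label.score > heap[0].score:
            # Queue is full: replace the current weakest entry.
            heapq.heapreplace(heap, label)
            return True
        return False

    def labels(self, category: str) -> list:
        """Return stored pseudo-labels for a category, best first."""
        return sorted(self._heaps.get(category, []), reverse=True)


if __name__ == "__main__":
    queue = LabelQueue(max_per_class=2, min_score=0.5)
    for score, image_id in [(0.6, "img1"), (0.4, "img2"), (0.9, "img3"), (0.7, "img4")]:
        queue.push("airport", PseudoLabel(score, image_id, (0, 0, 10, 10)))
    print([label.image_id for label in queue.labels("airport")])
```

A min-heap per category makes the weakest stored label cheap to find and replace, which is one simple way to let the queue's quality rise as training produces more confident predictions.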
Related papers
- Debiased Novel Category Discovering and Localization [40.02326438622898]
We focus on the challenging problem of Novel Class Discovery and Localization (NCDL).
We propose a Debiased Region Mining (DRM) approach that combines class-agnostic Region Proposal Network (RPN) and class-aware RPN.
We conduct extensive experiments on the NCDL benchmark, and the results demonstrate that the proposed DRM approach significantly outperforms previous methods.
arXiv Detail & Related papers (2024-02-29T03:09:16Z)
- Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning [13.667326007851674]
We propose CastDet, a CLIP-activated student-teacher open-vocabulary object detection framework.
Our approach boosts not only novel object proposals but also classification.
Experimental results demonstrate that CastDet achieves superior open-vocabulary detection performance.
arXiv Detail & Related papers (2023-11-20T10:26:04Z)
- Improved Region Proposal Network for Enhanced Few-Shot Object Detection [23.871860648919593]
Few-shot object detection (FSOD) methods have emerged as a solution to the limitations of classic object detection approaches.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during the FSOD training stage.
Our improved hierarchical sampling strategy for the region proposal network (RPN) also boosts the perception of the object detection model for large objects.
arXiv Detail & Related papers (2023-08-15T02:35:59Z)
- Identification of Novel Classes for Improving Few-Shot Object Detection [12.013345715187285]
Few-shot object detection (FSOD) methods offer a remedy by realizing robust object detection using only a few training samples per class.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during training to improve FSOD performance.
Our experimental results indicate that our method is effective and outperforms the existing state-of-the-art (SOTA) FSOD methods.
arXiv Detail & Related papers (2023-03-18T14:12:52Z)
- MUS-CDB: Mixed Uncertainty Sampling with Class Distribution Balancing for Active Annotation in Aerial Object Detection [40.94800050576902]
Recent aerial object detection models rely on a large amount of labeled training data.
Active learning effectively reduces the data labeling cost by selectively querying the informative and representative unlabelled samples.
We propose a novel active learning method for cost-effective aerial object detection.
arXiv Detail & Related papers (2022-12-06T07:50:00Z)
- Exploiting Unlabeled Data with Vision and Language Models for Object Detection [64.94365501586118]
Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets.
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images.
We demonstrate the value of the generated pseudo labels in two specific tasks, open-vocabulary detection and semi-supervised object detection.
arXiv Detail & Related papers (2022-07-18T21:47:15Z)
- Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning [60.64535309016623]
We propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector.
To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision.
We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network to detect novel classes without catastrophic forgetting.
arXiv Detail & Related papers (2022-05-09T05:08:08Z)
- Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z)
- Exploring Bottom-up and Top-down Cues with Attentive Learning for Webly Supervised Object Detection [76.9756607002489]
We propose a novel webly supervised object detection (WebSOD) method for novel classes.
Our proposed method combines bottom-up and top-down cues for novel class detection.
We demonstrate our proposed method on the PASCAL VOC dataset with three different novel/base splits.
arXiv Detail & Related papers (2020-03-22T03:11:24Z)
- Incremental Few-Shot Object Detection [96.02543873402813]
OpeN-ended Centre nEt (ONCE) is a detector for incrementally learning to detect novel class objects with few examples.
ONCE fully respects the incremental learning paradigm, with novel class registration requiring only a single forward pass of few-shot training samples.
arXiv Detail & Related papers (2020-03-10T12:56:59Z)