Tiny Object Detection with Single Point Supervision
- URL: http://arxiv.org/abs/2412.05837v1
- Date: Sun, 08 Dec 2024 07:13:17 GMT
- Title: Tiny Object Detection with Single Point Supervision
- Authors: Haoran Zhu, Chang Xu, Ruixiang Zhang, Fang Xu, Wen Yang, Haijian Zhang, Gui-Song Xia,
- Abstract summary: We propose Point Teacher--the first end-to-end point-supervised method for robust tiny object detection in aerial images.
To handle label noise from scale ambiguity and location shifts in point annotations, Point Teacher employs the teacher-student architecture.
In this framework, random masking of image regions facilitates regression learning, enabling the teacher to transform noisy point annotations into coarse pseudo boxes.
In the second phase, these coarse pseudo boxes are refined using dynamic multiple instance learning, which adaptively selects the most reliable instance.
- Score: 48.88814240556747
- License:
- Abstract: Tiny objects, with their limited spatial resolution, often resemble point-like distributions. As a result, bounding box prediction using point-level supervision emerges as a natural and cost-effective alternative to traditional box-level supervision. However, the small scale and lack of distinctive features of tiny objects make point annotations prone to noise, posing significant hurdles for model robustness. To tackle these challenges, we propose Point Teacher--the first end-to-end point-supervised method for robust tiny object detection in aerial images. To handle label noise from scale ambiguity and location shifts in point annotations, Point Teacher employs the teacher-student architecture and decouples the learning into a two-phase denoising process. In this framework, the teacher network progressively denoises the pseudo boxes derived from noisy point annotations, guiding the student network's learning. Specifically, in the first phase, random masking of image regions facilitates regression learning, enabling the teacher to transform noisy point annotations into coarse pseudo boxes. In the second phase, these coarse pseudo boxes are refined using dynamic multiple instance learning, which adaptively selects the most reliable instance from dynamically constructed proposal bags around the coarse pseudo boxes. Extensive experiments on three tiny object datasets (i.e., AI-TOD-v2, SODA-A, and TinyPerson) validate the proposed method's effectiveness and robustness against point location shifts. Notably, relying solely on point supervision, our Point Teacher already shows comparable performance with box-supervised learning methods. Codes and models will be made publicly available.
Related papers
- Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation [58.37525311718006]
We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD)
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
arXiv Detail & Related papers (2024-11-04T12:59:13Z) - Just a Hint: Point-Supervised Camouflaged Object Detection [4.38858748263547]
Camouflaged Object Detection (COD) demands models to expeditiously and accurately distinguish objects seamlessly in the environment.
We propose to fulfill this task with the help of only one point supervision.
Specifically, by swiftly clicking on each object, we first adaptively expand the original point-based annotation to a reasonable hint area.
Then, to avoid partial localization around discriminative parts, we propose an attention regulator to scatter model attention to the whole object.
arXiv Detail & Related papers (2024-08-20T12:17:25Z) - Extreme Point Supervised Instance Segmentation [28.191795758445352]
This paper introduces a novel approach to learning instance segmentation using extreme points, i.e., the topmost, leftmost, bottommost, and rightmost points, of each object.
These points are readily available in the modern bounding box annotation process while offering strong clues for precise segmentation.
Our model generates high-quality masks when a target object is separated into multiple parts, where previous box-supervised methods often fail.
arXiv Detail & Related papers (2024-05-31T09:37:39Z) - Semi-supervised 3D Object Detection with Proficient Teachers [114.54835359657707]
Dominated point cloud-based 3D object detectors in autonomous driving scenarios rely heavily on the huge amount of accurately labeled samples.
Pseudo-Labeling methodology is commonly used for SSL frameworks, however, the low-quality predictions from the teacher model have seriously limited its performance.
We propose a new Pseudo-Labeling framework for semi-supervised 3D object detection, by enhancing the teacher model to a proficient one with several necessary designs.
arXiv Detail & Related papers (2022-07-26T04:54:03Z) - Point-Teaching: Weakly Semi-Supervised Object Detection with Point
Annotations [81.02347863372364]
We present Point-Teaching, a weakly semi-supervised object detection framework.
Specifically, we propose a Hungarian-based point matching method to generate pseudo labels for point annotated images.
We propose a simple-yet-effective data augmentation, termed point-guided copy-paste, to reduce the impact of the unmatched points.
arXiv Detail & Related papers (2022-06-01T07:04:38Z) - From Keypoints to Object Landmarks via Self-Training Correspondence: A
novel approach to Unsupervised Landmark Discovery [37.78933209094847]
This paper proposes a novel paradigm for the unsupervised learning of object landmark detectors.
We validate our method on a variety of difficult datasets, including LS3D, BBCPose, Human3.6M and PennAction.
arXiv Detail & Related papers (2022-05-31T15:44:29Z) - Box2Seg: Learning Semantics of 3D Point Clouds with Box-Level
Supervision [65.19589997822155]
We introduce a neural architecture, termed Box2Seg, to learn point-level semantics of 3D point clouds with bounding box-level supervision.
We show that the proposed network can be trained with cheap, or even off-the-shelf bounding box-level annotations and subcloud-level tags.
arXiv Detail & Related papers (2022-01-09T09:07:48Z) - SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object
Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA)
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA shows to be effective in identifying valuable points related to foreground objects and improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.