Related papers: Point2RBox-v3: Self-Bootstrapping from Point Annotations via Integrated Pseudo-Label Refinement and Utilization

URL: http://arxiv.org/abs/2509.26281v2
Date: Wed, 08 Oct 2025 03:36:37 GMT
Title: Point2RBox-v3: Self-Bootstrapping from Point Annotations via Integrated Pseudo-Label Refinement and Utilization
Authors: Teng Zhang, Ziqian Fan, Mingxin Liu, Xin Zhang, Xudong Lu, Wentong Li, Yue Zhou, Yi Yu, Xiang Li, Junchi Yan, Xue Yang
Abstract summary: Point2RBox-v3 is the first model to employ dynamic pseudo labels for label assignment. Our solution gives competitive performance, especially in scenarios with large variations in object size or sparse object occurrences.
Score: 58.42853147118086
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Driven by the growing need for Oriented Object Detection (OOD), learning from point annotations under a weakly-supervised framework has emerged as a promising alternative to costly and laborious manual labeling. In this paper, we discuss two deficiencies in existing point-supervised methods: inefficient utilization and poor quality of pseudo labels. Therefore, we present Point2RBox-v3. At the core are two principles: 1) Progressive Label Assignment (PLA). It dynamically estimates instance sizes in a coarse yet intelligent manner at different stages of the training process, enabling the use of label assignment methods. 2) Prior-Guided Dynamic Mask Loss (PGDM-Loss). It is an enhancement of the Voronoi Watershed Loss from Point2RBox-v2 that addresses both the watershed algorithm's poor performance in sparse scenes and SAM's poor performance in dense scenes. To our knowledge, Point2RBox-v3 is the first model to employ dynamic pseudo labels for label assignment, and it combines the strengths of the SAM model with those of the watershed algorithm, achieving excellent performance in both sparse and dense scenes. Our solution gives competitive performance, especially in scenarios with large variations in object size or sparse object occurrences: 66.09%/56.86%/41.28%/46.40%/19.60%/45.96% on DOTA-v1.0/DOTA-v1.5/DOTA-v2.0/DIOR/STAR/RSAR.

Related papers

Contextual Range-View Projection for 3D LiDAR Point Clouds [1.529342790344802]
Range-view projection provides an efficient method for transforming 3D LiDAR point clouds into 2D range image representations. Existing approaches typically retain the point with the smallest depth (closest to the LiDAR). We introduce two mechanisms: Centerness-Aware Projection (CAP) and Class-Weighted-Aware Projection (CWAP).
arXiv Detail & Related papers (2026-01-26T09:30:43Z)
ALISE: Annotation-Free LiDAR Instance Segmentation for Autonomous Driving [9.361724251990154]
We introduce ALISE, a novel framework that performs LiDAR instance segmentation without any annotations. Our approach starts by employing Vision Foundation Models (VFMs), guided by text and images, to produce initial pseudo-labels. We then refine these labels through a dedicated manual-temporal voting module, which combines 2D and 3D semantics for both offline and online optimization. This comprehensive design results in significant performance gains, establishing a new state-of-the-art for unsupervised 3D instance segmentation.
arXiv Detail & Related papers (2025-10-07T10:15:18Z)
Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances [50.80161958767447]
We present Point2RBox-v2, an approach to explore the spatial layout among instances for learning point-supervised OOD. Our solution is elegant and lightweight, yet it is expected to give a competitive performance especially in densely packed scenes.
arXiv Detail & Related papers (2025-02-06T18:07:25Z)
Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision [81.60564776995682]
We present Point2RBox, an end-to-end solution for point-supervised object detection. Our method uses a lightweight paradigm, yet it achieves a competitive performance among point-supervised alternatives.
arXiv Detail & Related papers (2023-11-23T15:57:41Z)
P2RBox: Point Prompt Oriented Object Detection with SAM [28.96914721062631]
We introduce P2RBox, which employs point prompts to generate rotated box (RBox) annotations for oriented object detection. P2RBox incorporates two advanced guidance cues, Boundary Sensitive Mask guidance and Centrality guidance, which utilize spatial information to reduce granularity ambiguity. Compared to the state-of-the-art point-annotated generative method PointOBB, P2RBox outperforms by about 29% mAP on the DOTA-v1.0 dataset.
arXiv Detail & Related papers (2023-11-22T03:33:00Z)
FreePoint: Unsupervised Point Cloud Instance Segmentation [72.64540130803687]
We propose FreePoint, for underexplored unsupervised class-agnostic instance segmentation on point clouds. We represent point features by combining coordinates, colors, and self-supervised deep features. Based on the point features, we segment point clouds into coarse instance masks as pseudo labels, which are used to train a point cloud instance segmentation model.
arXiv Detail & Related papers (2023-05-11T16:56:26Z)
Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method RWSeg that only requires labeling one object with one point. With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information. Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z)
LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner. We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.