Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances
- URL: http://arxiv.org/abs/2502.04268v2
- Date: Fri, 07 Feb 2025 02:23:19 GMT
- Authors: Yi Yu, Botao Ren, Peiyuan Zhang, Mingxin Liu, Junwei Luo, Shaofeng Zhang, Feipeng Da, Junchi Yan, Xue Yang
- Abstract summary: We present Point2RBox-v2, an approach to explore the spatial layout among instances for learning point-supervised OOD.
Our solution is elegant and lightweight, yet it is expected to give a competitive performance especially in densely packed scenes.
- Abstract: With the rapidly increasing demand for oriented object detection (OOD), recent research involving weakly-supervised detectors for learning OOD from point annotations has gained great attention. In this paper, we rethink this challenging task setting with the layout among instances and present Point2RBox-v2. At the core are three principles: 1) Gaussian overlap loss. It learns an upper bound for each instance by treating objects as 2D Gaussian distributions and minimizing their overlap. 2) Voronoi watershed loss. It learns a lower bound for each instance through watershed on Voronoi tessellation. 3) Consistency loss. It learns the size/rotation variation between two output sets with respect to an input image and its augmented view. Supplemented by a few devised techniques, e.g. edge loss and copy-paste, the detector is further enhanced. To our best knowledge, Point2RBox-v2 is the first approach to explore the spatial layout among instances for learning point-supervised OOD. Our solution is elegant and lightweight, yet it is expected to give a competitive performance especially in densely packed scenes: 62.61%/86.15%/34.71% on DOTA/HRSC/FAIR1M. Code is available at https://github.com/VisionXLab/point2rbox-v2.
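The Gaussian overlap principle described above (treating each instance as a 2D Gaussian and penalizing pairwise overlap so predicted boxes cannot grow past their neighbors) can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification, not the authors' implementation: the function names are hypothetical, and the overlap measure here is the analytic integral of the product of two Gaussians, which is one common choice but may differ from the loss used in the paper.

```python
import numpy as np

def rbox_to_cov(w, h, theta):
    """Covariance of a 2D Gaussian representing an oriented box (w, h, theta)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([(w / 2) ** 2, (h / 2) ** 2])  # variances along the box axes
    return R @ S @ R.T

def gaussian_overlap(mu1, cov1, mu2, cov2):
    """Integral of the product of two 2D Gaussians:
    int N(x; mu1, cov1) N(x; mu2, cov2) dx = N(mu1; mu2, cov1 + cov2)."""
    cov = cov1 + cov2
    d = mu1 - mu2
    norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * d @ np.linalg.solve(cov, d))

def overlap_loss(points, sizes, angles):
    """Sum of pairwise overlaps between all instance Gaussians.

    points: (n, 2) annotated center points; sizes: n pairs (w, h);
    angles: n rotations in radians. Minimizing this pushes the
    predicted Gaussians (and hence boxes) apart, giving each
    instance an upper bound on its extent in dense layouts.
    """
    covs = [rbox_to_cov(w, h, t) for (w, h), t in zip(sizes, angles)]
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += gaussian_overlap(points[i], covs[i], points[j], covs[j])
    return total
```

As a sanity check, two nearby instances with the same predicted size produce a larger loss than two distant ones, so gradient descent on the predicted sizes/angles shrinks or rotates boxes that intrude on their neighbors.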
Related papers
- Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection [57.26265276035267]
Wholly-WOOD is a weakly-supervised OOD framework capable of wholly leveraging various labeling forms.
By only using HBox for training, our Wholly-WOOD achieves performance very close to that of the RBox-trained counterpart on remote sensing.
arXiv Detail & Related papers (2025-02-13T16:34:59Z) - FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection [10.655167287088368]
We propose a novel metric for comparing arbitrary shapes based on minimum point distance.
$FPDIoU$ loss has been applied to state-of-the-art rotated object detection.
arXiv Detail & Related papers (2024-05-16T09:44:00Z) - NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection [72.0098999512727]
NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by utilizing NeRF to enhance representation learning.
We present three corresponding solutions, including semantic enhancement, perspective-aware sampling, and ordinal depth supervision.
The resulting algorithm, NeRF-Det++, has exhibited appealing performance on the ScanNetV2 and ARKitScenes datasets.
arXiv Detail & Related papers (2024-02-22T11:48:06Z) - Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision [81.60564776995682]
We present Point2RBox, an end-to-end solution for point-supervised object detection.
Our method uses a lightweight paradigm, yet it achieves a competitive performance among point-supervised alternatives.
arXiv Detail & Related papers (2023-11-23T15:57:41Z) - SOOD: Towards Semi-Supervised Oriented Object Detection [57.05141794402972]
This paper proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD, built upon the mainstream pseudo-labeling framework.
Our experiments show that when trained with the two proposed losses, SOOD surpasses the state-of-the-art SSOD methods under various settings on the DOTA-v1.5 benchmark.
arXiv Detail & Related papers (2023-04-10T11:10:42Z) - H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection [63.66553556240689]
Oriented object detection emerges in many applications from aerial images to autonomous driving.
Many existing detection benchmarks are annotated with horizontal bounding boxes only, which are also less costly than fine-grained rotated boxes.
This paper proposes a simple yet effective oriented object detection approach called H2RBox.
arXiv Detail & Related papers (2022-10-13T05:12:45Z) - Depth Is All You Need for Monocular 3D Detection [29.403235118234747]
We propose to align depth representation with the target domain in an unsupervised fashion.
Our methods leverage commonly available LiDAR or RGB videos during training time to fine-tune the depth representation, which leads to improved 3D detectors.
arXiv Detail & Related papers (2022-10-05T18:12:30Z) - Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph [26.226885108862735]
Two-stage detectors have gained much popularity in 3D object detection.
Most two-stage 3D detectors utilize grid points, voxel grids, or sampled keypoints for RoI feature extraction in the second stage.
This paper addresses this problem from three aspects.
arXiv Detail & Related papers (2022-08-07T02:56:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.