Related papers: P2RBox: Point Prompt Oriented Object Detection with SAM

P2RBox: Point Prompt Oriented Object Detection with SAM

URL: http://arxiv.org/abs/2311.13128v2
Date: Thu, 23 May 2024 15:18:08 GMT
Title: P2RBox: Point Prompt Oriented Object Detection with SAM
Authors: Guangming Cao, Xuehui Yu, Wenwen Yu, Xumeng Han, Xue Yang, Guorong Li, Jianbin Jiao, Zhenjun Han,
Abstract summary: We introduce P2RBox, which employs point prompt to generate rotated box (RBox) annotation for oriented object detection. P2RBox incorporates two advanced guidance cues: Boundary Sensitive Mask guidance, and Centrality guidance, which utilize spatial information to reduce granularity ambiguity. Compared to the state-of-the-art point-annotated generative method PointOBB, P2RBox outperforms by about 29% mAP on DOTA-v1.0 dataset.
Score: 28.96914721062631
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Single-point annotation in oriented object detection of remote sensing scenarios is gaining increasing attention due to its cost-effectiveness. However, due to the granularity ambiguity of points, there is a significant performance gap between previous methods and those with fully supervision. In this study, we introduce P2RBox, which employs point prompt to generate rotated box (RBox) annotation for oriented object detection. P2RBox employs the SAM model to generate high-quality mask proposals. These proposals are then refined using the semantic and spatial information from annotation points. The best masks are converted into oriented boxes based on the feature directions suggested by the model. P2RBox incorporates two advanced guidance cues: Boundary Sensitive Mask guidance, which leverages semantic information, and Centrality guidance, which utilizes spatial information to reduce granularity ambiguity. This combination enhances detection capabilities significantly. To demonstrate the effectiveness of this method, enhancements based on the baseline were observed by integrating three different detectors. Furthermore, compared to the state-of-the-art point-annotated generative method PointOBB, P2RBox outperforms by about 29% mAP (62.43% vs 33.31%) on DOTA-v1.0 dataset, which provides possibilities for the practical application of point annotations.

Related papers

P2Object: Single Point Supervised Object Detection and Instance Segmentation [58.778288785355]
We introduce Point-to-Box Network (P2BNet), which constructs balanced textbftextitinstance-level proposal bags P2MNet can generate more precise bounding boxes and generalize to segmentation tasks. Our method largely surpasses the previous methods in terms of the mean average precision on COCO, VOC, and Cityscapes.
arXiv Detail & Related papers (2025-04-10T14:51:08Z)
Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision [81.60564776995682]
We present Point2RBox, an end-to-end solution for point-supervised object detection. Our method uses a lightweight paradigm, yet it achieves a competitive performance among point-supervised alternatives. In particular, our method uses a lightweight paradigm, yet it achieves a competitive performance among point-supervised alternatives.
arXiv Detail & Related papers (2023-11-23T15:57:41Z)
H2RBox: Horizonal Box Annotation is All You Need for Oriented Object Detection [63.66553556240689]
Oriented object detection emerges in many applications from aerial images to autonomous driving. Many existing detection benchmarks are annotated with horizontal bounding box only which is also less costive than fine-grained rotated box. This paper proposes a simple yet effective oriented object detection approach called H2RBox.
arXiv Detail & Related papers (2022-10-13T05:12:45Z)
Detecting Rotated Objects as Gaussian Distributions and Its 3-D Generalization [81.29406957201458]
Existing detection methods commonly use a parameterized bounding box (BBox) to model and detect (horizontal) objects. We argue that such a mechanism has fundamental limitations in building an effective regression loss for rotation detection. We propose to model the rotated objects as Gaussian distributions. We extend our approach from 2-D to 3-D with a tailored algorithm design to handle the heading estimation.
arXiv Detail & Related papers (2022-09-22T07:50:48Z)
Point-to-Box Network for Accurate Object Detection via Single Point Supervision [51.95993495703855]
We introduce a lightweight alternative to the off-the-shelf proposal (OTSP) method. P2BNet can construct an inter-objects balanced proposal bag by generating proposals in an anchor-like way. The code will be released at COCO.com/ucas-vg/P2BNet.
arXiv Detail & Related papers (2022-07-14T11:32:00Z)
SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA) Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling. In practice, SASA shows to be effective in identifying valuable points related to foreground objects and improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
Oriented RepPoints for Aerial Object Detection [10.818838437018682]
In this paper, we propose a novel approach to aerial object detection, named Oriented RepPoints. Specifically, we suggest to employ a set of adaptive points to capture the geometric and spatial information of the arbitrary-oriented objects. To facilitate the supervised learning, the oriented conversion function is proposed to explicitly map the adaptive point set into an oriented bounding box.
arXiv Detail & Related papers (2021-05-24T06:18:23Z)
SIENet: Spatial Information Enhancement Network for 3D Object Detection from Point Cloud [20.84329063509459]
LiDAR-based 3D object detection pushes forward an immense influence on autonomous vehicles. Due to the limitation of the intrinsic properties of LiDAR, fewer points are collected at the objects farther away from the sensor. To address the challenge, we propose a novel two-stage 3D object detection framework, named SIENet.
arXiv Detail & Related papers (2021-03-29T07:45:09Z)
Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection. The whole architecture facilitates two-stage fusion. Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.