DPPD: Deformable Polar Polygon Object Detection
- URL: http://arxiv.org/abs/2304.02250v1
- Date: Wed, 5 Apr 2023 06:43:41 GMT
- Title: DPPD: Deformable Polar Polygon Object Detection
- Authors: Yang Zheng, Oles Andrienko, Yonglei Zhao, Minwoo Park, Trung Pham
- Abstract summary: We develop a novel Deformable Polar Polygon Object Detection method (DPPD) to detect objects in polygon shapes.
DPPD has been demonstrated successfully in various object detection tasks for autonomous driving.
- Score: 3.9236649268347765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regular object detection methods output rectangle bounding boxes, which are
unable to accurately describe the actual object shapes. Instance segmentation
methods output pixel-level labels, which are computationally expensive for
real-time applications. Therefore, a polygon representation is needed to
achieve precise shape alignment, while retaining low computation cost. We
develop a novel Deformable Polar Polygon Object Detection method (DPPD) to
detect objects in polygon shapes. In particular, our network predicts, for each
object, a sparse set of flexible vertices to construct the polygon, where each
vertex is represented by a pair of angle and distance in the Polar coordinate
system. To enable training, both ground truth and predicted polygons are
densely resampled to have the same number of vertices with equally spaced
raypoints. The resampling operation is fully differentiable, allowing gradient
back-propagation. Sparse polygon prediction ensures high-speed runtime inference
while dense resampling allows the network to learn object shapes with high
precision. The polygon detection head is established on top of an anchor-free
and NMS-free network architecture. DPPD has been demonstrated successfully in
various object detection tasks for autonomous driving such as traffic-sign,
crosswalk, vehicle and pedestrian objects.
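The polar representation and the dense resampling step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `resample_polar_polygon` and `polar_to_cartesian` are hypothetical, and the sketch assumes the polygon is star-shaped around its center and that consecutive vertices are less than pi radians apart. Each dense raypoint is the intersection of an equally spaced ray with the polygon edge that brackets it. The closed-form expression is smooth in both the angles and the distances, which is the property that would let gradients pass through such a step in an autodiff framework.

```python
import numpy as np


def resample_polar_polygon(angles, dists, num_rays):
    """Resample a sparse polar polygon onto equally spaced rays.

    angles, dists: sparse vertices as (angle, distance) pairs measured
    from the object center. Assumes a star-shaped polygon whose
    consecutive vertices are less than pi radians apart.
    Returns the ray angles and the polygon distance along each ray.
    """
    angles = np.asarray(angles, dtype=float)
    dists = np.asarray(dists, dtype=float)

    # Sort vertices by angle so neighbours in the array share an edge.
    order = np.argsort(angles)
    theta, r = angles[order], dists[order]

    # Equally spaced ray angles in [0, 2*pi).
    phi = np.arange(num_rays) * (2.0 * np.pi / num_rays)

    # For each ray, find the edge (prv -> nxt) that brackets it, cyclically.
    nxt = np.searchsorted(theta, phi, side="right") % len(theta)
    prv = (nxt - 1) % len(theta)
    t1, r1, t2, r2 = theta[prv], r[prv], theta[nxt], r[nxt]

    # Ray/edge intersection in polar form; sin() absorbs the 2*pi wrap.
    num = r1 * r2 * np.sin(t2 - t1)
    den = r1 * np.sin(phi - t1) + r2 * np.sin(t2 - phi)
    return phi, num / den


def polar_to_cartesian(center, angles, dists):
    """Convert polar vertices around `center` to an (N, 2) array of x, y."""
    cx, cy = center
    return np.stack([cx + dists * np.cos(angles),
                     cy + dists * np.sin(angles)], axis=1)


if __name__ == "__main__":
    # A square with half-side 1: four corners at distance sqrt(2).
    sparse_angles = np.pi / 4 + np.arange(4) * np.pi / 2
    sparse_dists = np.full(4, np.sqrt(2.0))
    phi, dense = resample_polar_polygon(sparse_angles, sparse_dists, 8)
    print(np.round(dense, 4))
    print(np.round(polar_to_cartesian((0.0, 0.0), phi, dense), 4))
```

Under these assumptions, a training loss could compare the dense distances of a predicted polygon and a ground-truth polygon ray by ray, because both would be sampled at the same angles.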
Related papers
- Progressive Evolution from Single-Point to Polygon for Scene Text [79.29097971932529]
We introduce Point2Polygon, which can efficiently transform single-points into compact polygons.
Our method uses a coarse-to-fine process, starting with creating anchor points based on recognition confidence, then vertically and horizontally refining the polygon.
In training detectors with polygons generated by our method, we attained 86% of the accuracy relative to training with ground truth (GT). Additionally, the proposed Point2Polygon can be seamlessly integrated to empower single-point spotters to generate polygons.
arXiv Detail & Related papers (2023-12-21T12:08:27Z) - PolyBuilding: Polygon Transformer for End-to-End Building Extraction [9.196604757138825]
PolyBuilding predicts vector representation of buildings from remote sensing images.
Model learns the relations among them and encodes context information from the image to predict the final set of building polygons.
It also achieves a new state-of-the-art in terms of pixel-level coverage, instance-level precision and recall, and geometry-level properties.
arXiv Detail & Related papers (2022-11-03T04:53:17Z) - Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines with significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z) - PolarFormer: Multi-camera 3D Object Detection with Polar Transformers [93.49713023975727]
3D object detection in autonomous driving aims to reason "what" and "where" the objects of interest present in a 3D world.
Existing methods often adopt the canonical Cartesian coordinate system with perpendicular axes.
We propose a new Polar Transformer (PolarFormer) for more accurate 3D object detection in the bird's-eye-view (BEV) taking as input only multi-camera 2D images.
arXiv Detail & Related papers (2022-06-30T16:32:48Z) - RBGNet: Ray-based Grouping for 3D Object Detection [104.98776095895641]
We propose the RBGNet framework, a voting-based 3D detector for accurate 3D object detection from point clouds.
We propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays.
Our model achieves state-of-the-art 3D detection performance on ScanNet V2 and SUN RGB-D with remarkable performance gains.
arXiv Detail & Related papers (2022-04-05T14:42:57Z) - PolyWorld: Polygonal Building Extraction with Graph Neural Networks in
Satellite Images [10.661430927191205]
This paper introduces PolyWorld, a neural network that directly extracts building vertices from an image and connects them correctly to create precise polygons.
PolyWorld significantly outperforms the state-of-the-art in building polygonization.
arXiv Detail & Related papers (2021-11-30T15:23:17Z) - PolyNet: Polynomial Neural Network for 3D Shape Recognition with
PolyShape Representation [51.147664305955495]
3D shape representation and its processing have substantial effects on 3D shape recognition.
We propose a deep neural network-based method (PolyNet) and a specific polygon representation (PolyShape).
Our experiments demonstrate the strength and the advantages of PolyNet on both 3D shape classification and retrieval tasks.
arXiv Detail & Related papers (2021-10-15T06:45:59Z) - CenterPoly: real-time instance segmentation using bounding polygons [11.365829102707014]
We present a novel method, called CenterPoly, for real-time instance segmentation using bounding polygons.
We apply it to detect road users in dense urban environments, making it suitable for applications in intelligent transportation systems like automated vehicles.
Most of the network parameters are shared by the network heads, making it fast and lightweight enough to run at real-time speed.
arXiv Detail & Related papers (2021-08-19T21:31:30Z) - Polylidar3D -- Fast Polygon Extraction from 3D Data [0.0]
Flat surfaces captured by 3D point cloud processing are often used for localization and modeling.
We demonstrate its versatility and speed on rooftop mapping, road surface detection, and wall detection with RGBD cameras.
Results consistently show excellent accuracy.
arXiv Detail & Related papers (2020-07-23T15:22:43Z) - DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method achieves state-of-the-art results by 5% on object detection in ScanNet scenes, and it gets top results by 3.4% in the Waymo Open Dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.