MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images
- URL: http://arxiv.org/abs/2406.01356v1
- Date: Mon, 3 Jun 2024 14:20:34 GMT
- Title: MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images
- Authors: Ke-Lei Wang, Pin-Hsuan Chou, Young-Ching Chou, Chia-Jen Liu, Cheng-Kuan Lin, Yu-Chee Tseng
- Abstract summary: PolarMask is a unique model that represents an object by a Polar coordinate system.
PolarMask has two deficiencies: (i) an inability to represent concave objects and (ii) inefficiency in using ray regression.
We propose MP-PolarMask (Multi-Point PolarMask) by taking advantage of multiple Polar systems.
- Score: 4.977034524493568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While there are many models for instance segmentation, PolarMask stands out as a unique one that represents an object by a Polar coordinate system. With an anchor-box-free design and a single-stage framework that conducts detection and segmentation at one time, PolarMask has been shown to balance efficiency and accuracy. Hence, it can be easily connected with other downstream real-time applications. In this work, we observe that there are two deficiencies associated with PolarMask: (i) an inability to represent concave objects and (ii) inefficiency in using ray regression. We propose MP-PolarMask (Multi-Point PolarMask) by taking advantage of multiple Polar systems. The main idea is to extend from one main Polar system to four auxiliary Polar systems, making the model capable of representing more complicated convex-and-concave-mixed shapes. We validate MP-PolarMask on both general objects and food objects of the COCO dataset, and the results demonstrate significant improvements of 13.69% in AP_L and 7.23% in AP over PolarMask with 36 rays.
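The polar representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rays_to_contour` and `polygon_area` are hypothetical, and the sketch assumes rays are spaced at uniform angles around a single center (36 rays, matching the setting quoted in the abstract). Because each angle carries exactly one length, a single center can only describe a contour that every ray crosses once, which is one way to see why concave shapes are difficult for a single Polar system.

```python
import math


def rays_to_contour(center, ray_lengths):
    """Convert ray lengths measured from `center` into (x, y) contour vertices.

    Assumption: ray i points at angle i * (2*pi / n), i.e. uniform spacing.
    """
    cx, cy = center
    n = len(ray_lengths)
    step = 2 * math.pi / n
    return [
        (cx + d * math.cos(i * step), cy + d * math.sin(i * step))
        for i, d in enumerate(ray_lengths)
    ]


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical example: 36 rays of equal length describe a 36-sided polygon
# that approximates a circle of radius 10 around (50, 50).
contour = rays_to_contour((50.0, 50.0), [10.0] * 36)
area = polygon_area(contour)
```

Under these assumptions a mask is encoded by one center plus 36 scalars, which is what makes the representation compact.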
Related papers
- PolarBEVDet: Exploring Polar Representation for Multi-View 3D Object Detection in Bird's-Eye-View [5.0458717114406975]
We propose to employ the polar BEV representation to substitute the Cartesian BEV representation.
Experiments on nuScenes show that PolarBEVDet achieves superior performance.
arXiv Detail & Related papers (2024-08-29T01:42:38Z)
- PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection [81.16859686137435]
We present PARTNER, a novel 3D object detector in polar coordinates.
Our method outperforms the previous polar-based works with remarkable margins of 3.68% and 9.15% on the Waymo and ONCE validation sets, respectively.
arXiv Detail & Related papers (2023-08-08T01:59:20Z)
- Polarimetric Multi-View Inverse Rendering [13.391866136230165]
A polarization camera has great potential for 3D reconstruction since the angle of polarization (AoP) and the degree of polarization (DoP) of reflected light are related to an object's surface normal.
We propose a novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering (Polarimetric MVIR) that effectively exploits geometric, photometric, and polarimetric cues extracted from input multi-view color-polarization images.
arXiv Detail & Related papers (2022-12-24T12:12:12Z)
- PolarFormer: Multi-camera 3D Object Detection with Polar Transformers [93.49713023975727]
3D object detection in autonomous driving aims to reason about "what" and "where" the objects of interest are in a 3D world.
Existing methods often adopt the canonical Cartesian coordinate system with perpendicular axes.
We propose a new Polar Transformer (PolarFormer) for more accurate 3D object detection in the bird's-eye-view (BEV) taking as input only multi-camera 2D images.
arXiv Detail & Related papers (2022-06-30T16:32:48Z)
- Shape from Polarization for Complex Scenes in the Wild [93.65746187211958]
We present a new data-driven approach with physics-based priors to scene-level normal estimation from a single polarization image.
We contribute the first real-world scene-level SfP dataset with paired input polarization images and ground-truth normal maps.
Our trained model can be generalized to far-field outdoor scenes as the relationship between polarized light and surface normals is not affected by distance.
arXiv Detail & Related papers (2021-12-21T17:30:23Z)
- SiamPolar: Semi-supervised Realtime Video Object Segmentation with Polar Representation [6.108508667949229]
We propose a semi-supervised real-time method based on the Siamese network using a new polar representation.
The polar representation can reduce the number of parameters needed to encode masks, with only a slight loss in accuracy.
An asymmetric siamese network is also developed to extract the features from different spatial scales.
arXiv Detail & Related papers (2021-10-27T21:10:18Z)
- PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond [47.518550130850755]
PolarMask reformulates the instance segmentation problem as predicting the contours of objects in polar coordinates.
Two modules are carefully designed (i.e. soft polar centerness and polar IoU loss) to sample high-quality center examples.
PolarMask is fully convolutional and can be easily embedded into most off-the-shelf detection methods.
arXiv Detail & Related papers (2021-05-05T16:55:53Z)
- PointINS: Point-based Instance Segmentation [117.38579097923052]
Mask representation in instance segmentation with Point-of-Interest (PoI) features is challenging because learning a high-dimensional mask feature for each instance requires a heavy computing burden.
We propose an instance-aware convolution, which decomposes this mask representation learning task into two tractable modules.
Along with instance-aware convolution, we propose PointINS, a simple and practical instance segmentation approach.
arXiv Detail & Related papers (2020-03-13T08:24:58Z)
- BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation [103.74690082121079]
In this work, we achieve improved mask prediction by effectively combining instance-level information with semantic information with lower-level fine-granularity.
Our main contribution is a blender module which draws inspiration from both top-down and bottom-up instance segmentation approaches.
BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer.
arXiv Detail & Related papers (2020-01-02T03:30:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.