Learning Universal Shape Dictionary for Realtime Instance Segmentation
- URL: http://arxiv.org/abs/2012.01050v1
- Date: Wed, 2 Dec 2020 09:44:49 GMT
- Title: Learning Universal Shape Dictionary for Realtime Instance Segmentation
- Authors: Tutian Tang, Wenqiang Xu, Ruolin Ye, Lixin Yang, Cewu Lu
- Abstract summary: We present a novel explicit shape representation for instance segmentation.
Based on how the object shape is modeled, current instance segmentation systems can be divided into two categories: implicit and explicit models.
The proposed USD-Seg adopts a linear model, sparse coding with dictionary, for object shapes.
- Score: 40.27913339054021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel explicit shape representation for instance segmentation.
Based on how to model the object shape, current instance segmentation systems
can be divided into two categories: implicit and explicit models. Implicit
methods, which represent the object mask/contour through intractable network
parameters and produce it via pixel-wise classification, are predominant.
Explicit methods, which instead parameterize the shape with simple,
explainable models, are less explored. Because the operations that generate
the final shape are lightweight, explicit methods hold a clear speed advantage
over implicit ones, which is crucial for real-world applications.
The proposed USD-Seg adopts a linear model, sparse coding with dictionary, for
object shapes.
First, it learns a dictionary from a large collection of shape datasets, so
that any shape can be decomposed into a linear combination of dictionary
atoms.
Hence the name "Universal Shape Dictionary".
Then it adds a simple shape-vector regression head to an ordinary object
detector, giving the detector segmentation ability with minimal overhead.
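The decomposition described above can be sketched with a toy sparse-coding example. Everything here is illustrative rather than USD-Seg's actual pipeline: the dictionary is random instead of learned from shape data, shapes are plain flattened vectors, and the sparse coder (least squares followed by hard thresholding and a refit) is a hypothetical stand-in for a proper solver such as orthogonal matching pursuit.

```python
import numpy as np

# Illustrative setup: shapes are flattened d-dimensional vectors; the
# dictionary D (atoms as columns) would in practice be learned from a
# large shape dataset, but here it is random for the sake of the sketch.
rng = np.random.default_rng(0)
d, n_atoms = 64, 16
D = rng.standard_normal((d, n_atoms))
D /= np.linalg.norm(D, axis=0)  # unit-norm atoms, standard in dictionary learning

def encode(shape, D, k=4):
    """Approximate a shape as a k-sparse linear combination of atoms:
    least squares on the full dictionary, keep the k largest coefficients,
    then refit on the selected atoms (a stand-in for a real sparse coder)."""
    coef, *_ = np.linalg.lstsq(D, shape, rcond=None)
    keep = np.argsort(np.abs(coef))[-k:]          # indices of the k largest coefficients
    sparse = np.zeros_like(coef)
    sparse[keep], *_ = np.linalg.lstsq(D[:, keep], shape, rcond=None)
    return sparse

def decode(coef, D):
    """Reconstruct the shape from its sparse code: a single matrix product,
    which is why explicit decoding is cheap at inference time."""
    return D @ coef

# A shape that truly is a sparse combination of three atoms.
true_code = np.zeros(n_atoms)
true_code[[1, 5, 9]] = [2.0, -1.5, 0.7]
shape = D @ true_code

code = encode(shape, D, k=4)
recon = decode(code, D)   # matches shape up to numerical error
```

The detector only needs to regress the short coefficient vector per instance; turning it back into a shape is one matrix product, which is the source of the speed advantage the abstract claims for explicit methods.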
For quantitative evaluation, we use both average precision (AP) and the
proposed Efficiency of AP (AP$_E$) metric, which also accounts for the
computational cost of the framework to meet the requirements of real-world
applications. We report experimental results on the challenging COCO dataset,
where our single model on a single Titan Xp GPU achieves 35.8 AP and 27.8
AP$_E$ at 65 fps with YOLOv4 as the base detector, and 34.1 AP and 28.6
AP$_E$ at 12 fps with FCOS as the base detector.
Related papers
- Instance-aware 3D Semantic Segmentation powered by Shape Generators and Classifiers [28.817905887080293]
We propose a novel instance-aware approach for 3D semantic segmentation.
Our method combines several geometry processing tasks supervised at instance-level to promote the consistency of the learned feature representation.
arXiv Detail & Related papers (2023-11-21T02:14:16Z)
- PDiscoNet: Semantically consistent part discovery for fine-grained recognition [62.12602920807109]
We propose PDiscoNet to discover object parts by using only image-level class labels along with priors encouraging the parts to be.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Number-Adaptive Prototype Learning for 3D Point Cloud Semantic Segmentation [46.610620464184926]
We propose to use an adaptive number of prototypes to dynamically describe the different point patterns within a semantic class.
Our method achieves 2.3% mIoU improvement over the baseline model based on the point-wise classification paradigm.
arXiv Detail & Related papers (2022-10-18T15:57:20Z)
- PointInst3D: Segmenting 3D Instances by Points [136.7261709896713]
We propose a fully-convolutional 3D point cloud instance segmentation method that works in a per-point prediction fashion.
We find the key to its success is assigning a suitable target to each sampled point.
Our approach achieves promising results on both ScanNet and S3DIS benchmarks.
arXiv Detail & Related papers (2022-04-25T02:41:46Z)
- Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results over several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
arXiv Detail & Related papers (2022-03-28T21:15:32Z)
- Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes [17.217954254022573]
We introduce a meta-learning-based method for few-shot 3D shape segmentation where only a few labeled samples are provided for the unseen classes.
We demonstrate the superior performance of our proposed method on the ShapeNet part dataset under the few-shot scenario, compared with well-established baselines and state-of-the-art semi-supervised methods.
arXiv Detail & Related papers (2021-07-07T01:47:00Z)
- Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU [87.48110331544885]
We propose a novel training methodology to train and scale the existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
arXiv Detail & Related papers (2020-12-14T13:12:38Z)
- EOLO: Embedded Object Segmentation only Look Once [0.0]
We introduce an anchor-free, single-shot instance segmentation method that is conceptually simple, with three independent, fully convolutional branches, and can be easily embedded into mobile and embedded devices.
Our method, referred to as EOLO, reformulates instance segmentation as jointly predicting semantic segmentation and distinguishing overlapping objects, through instance center classification and 4D distance regression at each pixel.
Without any bells and whistles, EOLO achieves 27.7% mask mAP at IoU50 and reaches 30 FPS on a 1080Ti GPU, with a single model and single-scale training/testing.
arXiv Detail & Related papers (2020-03-31T21:22:05Z)
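The per-pixel 4D distance regression that EOLO shares with FCOS-style detectors can be illustrated with a minimal decoding sketch. The function name and array layout below are assumptions for illustration, not EOLO's actual API: each pixel predicts its distances (left, top, right, bottom) to the instance boundary, and the enclosing box is recovered by simple arithmetic.

```python
import numpy as np

def decode_boxes(points, ltrb):
    """Decode FCOS/EOLO-style 4D distance predictions into boxes.

    points: (N, 2) pixel coordinates (x, y)
    ltrb:   (N, 4) predicted distances (left, top, right, bottom)
    returns (N, 4) boxes as (x1, y1, x2, y2)
    """
    x, y = points[:, 0], points[:, 1]
    l, t, r, b = ltrb.T
    return np.stack([x - l, y - t, x + r, y + b], axis=1)

pts = np.array([[10.0, 20.0]])
dist = np.array([[3.0, 4.0, 5.0, 6.0]])
decode_boxes(pts, dist)  # -> [[7., 16., 15., 26.]]
```

Because decoding is pure elementwise arithmetic over the feature map, it adds essentially no cost at inference, consistent with the real-time claims of these explicit, per-pixel formulations.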
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.