E2EC: An End-to-End Contour-based Method for High-Quality High-Speed
Instance Segmentation
- URL: http://arxiv.org/abs/2203.04074v1
- Date: Tue, 8 Mar 2022 13:36:23 GMT
- Title: E2EC: An End-to-End Contour-based Method for High-Quality High-Speed
Instance Segmentation
- Authors: Tao Zhang, Shiqing Wei, Shunping Ji
- Abstract summary: We introduce a novel contour-based method, named E2EC, for high-quality instance segmentation.
E2EC is efficient for use in real-time applications, with an inference speed of 36 fps for 512*512 images on an NVIDIA A6000 GPU.
- Score: 4.74225248496056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contour-based instance segmentation methods have developed rapidly recently
but feature rough and hand-crafted front-end contour initialization, which
restricts the model performance, and an empirical and fixed backend
predicted-label vertex pairing, which contributes to the learning difficulty.
In this paper, we introduce a novel contour-based method, named E2EC, for
high-quality instance segmentation. Firstly, E2EC applies a novel learnable
contour initialization architecture instead of hand-crafted contour
initialization. This consists of a contour initialization module for
constructing more explicit learning goals and a global contour deformation
module for taking advantage of all of the vertices' features better. Secondly,
we propose a novel label sampling scheme, named multi-direction alignment, to
reduce the learning difficulty. Thirdly, to improve the quality of the boundary
details, we dynamically match the most appropriate predicted-ground truth
vertex pairs and propose the corresponding loss function named dynamic matching
loss. The experiments showed that E2EC can achieve a state-of-the-art
performance on the KITTI INStance (KINS) dataset, the Semantic Boundaries
Dataset (SBD), the Cityscapes and the COCO dataset. E2EC is also efficient for
use in real-time applications, with an inference speed of 36 fps for 512*512
images on an NVIDIA A6000 GPU. Code will be released at
https://github.com/zhang-tao-whu/e2ec.
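The abstract names a dynamic matching loss but does not give its formula. As a rough illustration of the general idea only (pairing each predicted vertex with its closest ground-truth vertex instead of pairing by a fixed index), here is a minimal pure-Python sketch. The function names `nearest_match_loss` and `fixed_index_loss`, the L1 distance, and the symmetric averaging are all assumptions made for this example and are not taken from the E2EC paper or its code.

```python
def l1(p, q):
    """L1 distance between two 2D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def fixed_index_loss(pred, gt):
    """Baseline: vertex i of pred is always paired with vertex i of gt."""
    return sum(l1(p, g) for p, g in zip(pred, gt)) / len(pred)


def nearest_match_loss(pred, gt):
    """Hypothetical stand-in for a dynamically matched loss.

    The pairing is recomputed on every call: each vertex is matched to the
    closest vertex of the other contour, in both directions.
    """
    def one_way(src, dst):
        return sum(min(l1(s, d) for d in dst) for s in src) / len(src)

    return 0.5 * (one_way(pred, gt) + one_way(gt, pred))


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Same square, but the vertex list starts at a different corner.
rotated = square[1:] + square[:1]

print(fixed_index_loss(rotated, square))    # 1.0: penalised although the shapes coincide
print(nearest_match_loss(rotated, square))  # 0.0: the matching ignores the index offset
```

The toy case shows why a fixed pairing can add learning difficulty: two identical contours that differ only in their starting vertex get a non-zero loss under index pairing and zero loss under nearest-vertex matching.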
Related papers
- SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation [14.214197948110115]
This paper introduces a novel method, named SGIFormer, for 3D instance segmentation.
It is composed of the Semantic-guided Mix Query (SMQ) and the Geometric-enhanced Interleaving Transformer (GIT) decoder.
It attains state-of-the-art performance on ScanNet V2, ScanNet200, and the challenging high-fidelity ScanNet++ benchmark.
arXiv Detail & Related papers (2024-07-16T10:17:28Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art performance on both synthetic data and real-world data tracking.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- Edge-aware Plug-and-play Scheme for Semantic Segmentation [4.297988192695948]
The experimental results indicate that the proposed method can be seamlessly integrated into any state-of-the-art (SOTA) model with zero modification.
arXiv Detail & Related papers (2023-03-18T02:17:37Z)
- Parallel Vertex Diffusion for Unified Visual Grounding [38.94276071029081]
Unified visual grounding pursues a simple and generic technical route to leverage multi-task data with less task-specific design.
Most advanced methods typically present boxes and masks as a sequence to model referring detection and segmentation.
arXiv Detail & Related papers (2023-03-13T15:51:38Z)
- Deep Manifold Learning with Graph Mining [80.84145791017968]
We propose a novel graph deep model with a non-gradient decision layer for graph mining.
The proposed model has achieved state-of-the-art performance compared to the current models.
arXiv Detail & Related papers (2022-07-18T04:34:08Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- EV-VGCNN: A Voxel Graph CNN for Event-based Object Classification [18.154951807178943]
Event cameras report sparse intensity changes and hold noticeable advantages of low power consumption, high dynamic range, and high response speed for visual perception and understanding on portable devices.
Event-based learning methods have recently achieved massive success on object recognition by integrating events into dense frame-based representations to apply traditional 2D learning algorithms.
These approaches introduce much redundant information during the sparse-to-dense conversion and necessitate models with heavy-weight and large capacities, limiting the potential of event cameras on real-life applications.
arXiv Detail & Related papers (2021-06-01T04:07:03Z)
- Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on-par or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z)
- Heuristic Semi-Supervised Learning for Graph Generation Inspired by
Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
- 1st Place Solutions for OpenImage2019 -- Object Detection and Instance
Segmentation [116.25081559037872]
This article introduces the solutions of the two champion teams, 'MMfruit' for the detection track and 'MMfruitSeg' for the segmentation track, in OpenImage Challenge 2019.
It is commonly known that for an object detector, the shared feature at the end of the backbone is not appropriate for both classification and regression.
We propose the Decoupling Head (DH) to disentangle the object classification and regression via the self-learned optimal feature extraction.
arXiv Detail & Related papers (2020-03-17T06:45:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.