K-Net: Towards Unified Image Segmentation
- URL: http://arxiv.org/abs/2106.14855v1
- Date: Mon, 28 Jun 2021 17:18:21 GMT
- Title: K-Net: Towards Unified Image Segmentation
- Authors: Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
- Abstract summary: The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels.
K-Net can be trained in an end-to-end manner with bipartite matching, and its training and inference are naturally NMS-free and box-free.
- Score: 78.32096542571257
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Semantic, instance, and panoptic segmentations have been addressed using
different and specialized frameworks despite their underlying connections. This
paper presents a unified, simple, and effective framework for these essentially
similar tasks. The framework, named K-Net, segments both instances and semantic
categories consistently by a group of learnable kernels, where each kernel is
responsible for generating a mask for either a potential instance or a stuff
class. To remedy the difficulties of distinguishing various instances, we
propose a kernel update strategy that makes each kernel dynamic and
conditional on its meaningful group in the input image. K-Net can be trained in
an end-to-end manner with bipartite matching, and its training and inference
are naturally NMS-free and box-free. Without bells and whistles, K-Net
surpasses all previous state-of-the-art single-model results of panoptic
segmentation on MS COCO and semantic segmentation on ADE20K with 52.1% PQ and
54.3% mIoU, respectively. Its instance segmentation performance is also on par
with Cascade Mask R-CNN on MS COCO with 60%-90% faster inference speeds. Code
and models will be released at https://github.com/open-mmlab/mmdetection.
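The kernel-to-mask idea and the one-to-one matching described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`predict_masks`, `dice_cost`, `match`), the Dice-style cost, and the brute-force search over assignments are all assumptions made for illustration, and the kernel update strategy is left out entirely.

```python
import math
from itertools import permutations


def predict_masks(kernels, features):
    """Dot each kernel (length C) with every pixel feature (length C).

    kernels:  N lists of C floats (hypothetical learned kernels)
    features: H x W grid of C-dimensional feature vectors
    returns:  N soft masks, each H x W, with values in (0, 1)
    """
    masks = []
    for k in kernels:
        mask = [
            [1.0 / (1.0 + math.exp(-sum(kc * fc for kc, fc in zip(k, pix))))
             for pix in row]
            for row in features
        ]
        masks.append(mask)
    return masks


def dice_cost(pred, target, eps=1e-6):
    """1 - Dice overlap between a soft mask and a binary mask (assumed cost)."""
    inter = sum(p * t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(p for pr in pred for p in pr) + sum(t for tr in target for t in tr)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def match(pred_masks, gt_masks):
    """Brute-force one-to-one assignment with minimal total cost.

    Only practical for tiny inputs; a real system would use a Hungarian solver.
    Returns a dict mapping ground-truth index -> predicted-mask index.
    """
    cost = [[dice_cost(p, g) for g in gt_masks] for p in pred_masks]
    best = min(
        permutations(range(len(pred_masks)), len(gt_masks)),
        key=lambda perm: sum(cost[p][g] for g, p in enumerate(perm)),
    )
    return {g: p for g, p in enumerate(best)}


# Toy 2x2 feature map with C = 2: the top row and bottom row differ.
features = [[[4.0, -4.0], [4.0, -4.0]],
            [[-4.0, 4.0], [-4.0, 4.0]]]
# Three kernels: one fires on the bottom row, one on the top row, one is neutral.
kernels = [[-1.0, 1.0], [1.0, -1.0], [0.0, 0.0]]
# Two ground-truth masks: top row and bottom row.
gt_masks = [[[1, 1], [0, 0]],
            [[0, 0], [1, 1]]]

assignment = match(predict_masks(kernels, features), gt_masks)
print(assignment)
```

Because every ground-truth mask is assigned to exactly one kernel, duplicate predictions are not rewarded during training, which is the intuition behind the NMS-free claim; unmatched kernels (the neutral one here) would be treated as background.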
Related papers
- MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence [97.93517982908007]
In cross-domain few-shot classification, NCC aims to learn representations to construct a metric space where few-shot classification can be performed.
In this paper, we find that there exist high similarities between NCC-learned representations of two samples from different classes.
We propose a bi-level optimization framework, maximizing optimized kernel dependence (MOKD), to learn a set of class-specific representations that match the cluster structures indicated by labeled data.
arXiv Detail & Related papers (2024-05-29T05:59:52Z)
- OneFormer3D: One Transformer for Unified Point Cloud Segmentation [5.530212768657545]
This paper presents a unified, simple, and effective model addressing semantic, instance, and panoptic segmentation tasks jointly.
The model, named OneFormer3D, performs instance and semantic segmentation consistently, using a group of learnable kernels.
We also demonstrate state-of-the-art results in semantic, instance, and panoptic segmentation on the ScanNet, ScanNet200, and S3DIS datasets.
arXiv Detail & Related papers (2023-11-24T10:56:27Z) - Local Sample-weighted Multiple Kernel Clustering with Consensus
Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC possesses better local manifold representation and outperforms existing kernel or graph-based clustering algorithms.
arXiv Detail & Related papers (2022-07-05T05:00:38Z) - Video K-Net: A Simple, Strong, and Unified Baseline for Video
Segmentation [85.08156742410527]
Video K-Net is a framework for end-to-end video panoptic segmentation.
It unifies image segmentation via a group of learnable kernels.
Video K-Net learns to simultaneously segment and track "things" and "stuff".
arXiv Detail & Related papers (2022-04-10T11:24:47Z) - Unifying Instance and Panoptic Segmentation with Dynamic Rank-1
Convolutions [109.2706837177222]
DR1Mask is the first panoptic segmentation framework that exploits a shared feature map for both instance and semantic segmentation.
As a byproduct, DR1Mask is 10% faster and 1 point in mAP more accurate than previous state-of-the-art instance segmentation network BlendMask.
arXiv Detail & Related papers (2020-11-19T12:42:10Z) - Towards Bounding-Box Free Panoptic Segmentation [16.4548904544277]
We introduce a new Bounding-Box Free Network (BBFNet) for panoptic segmentation.
BBFNet predicts coarse watershed levels and uses them to detect large instance candidates where boundaries are well defined.
For smaller instances, whose boundaries are less reliable, BBFNet also predicts instance centers by means of Hough voting followed by mean-shift to reliably detect small objects.
arXiv Detail & Related papers (2020-02-18T16:34:01Z) - Unifying Training and Inference for Panoptic Segmentation [111.44758195510838]
We present an end-to-end network to bridge the gap between training and inference for panoptic segmentation.
Our system sets new records on the popular street scene dataset, Cityscapes, achieving 61.4 PQ with a ResNet-50 backbone.
Our network flexibly works with and without object mask cues, performing competitively under both settings.
arXiv Detail & Related papers (2020-01-14T18:58:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.