Improving Semantic Segmentation via Decoupled Body and Edge Supervision
- URL: http://arxiv.org/abs/2007.10035v2
- Date: Tue, 18 Aug 2020 03:41:10 GMT
- Title: Improving Semantic Segmentation via Decoupled Body and Edge Supervision
- Authors: Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi,
Zhouchen Lin, Shaohua Tan, Yunhai Tong
- Abstract summary: Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object details along their boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
- Score: 89.57847958016981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing semantic segmentation approaches either aim to improve the object's
inner consistency by modeling the global context, or refine object details
along their boundaries by multi-scale feature fusion. In this paper, a new
paradigm for semantic segmentation is proposed. Our insight is that appealing
performance of semantic segmentation requires \textit{explicitly} modeling the
object \textit{body} and \textit{edge}, which correspond to the high and low
frequency of the image. To do so, we first warp the image feature by learning a
flow field to make the object part more consistent. The resulting body feature
and the residual edge feature are further optimized under decoupled supervision
by explicitly sampling different parts (body or edge) pixels. We show that the
proposed framework with various baselines or backbone networks leads to better
object inner consistency and object boundaries. Extensive experiments on four
major road scene semantic segmentation benchmarks including
\textit{Cityscapes}, \textit{CamVid}, \textit{KITTI} and \textit{BDD} show that
our proposed approach establishes new state of the art while retaining high
efficiency in inference. In particular, we achieve 83.7\% mIoU on Cityscapes
with only fine-annotated data. Code and models are made available to foster any
further research (\url{https://github.com/lxtGH/DecoupleSegNets}).
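The body/edge decoupling described in the abstract (warp the feature map with a learned flow field so the object interior becomes more consistent, then take the residual as the edge feature) can be sketched in NumPy. This is a minimal illustration with bilinear warping on a single feature map; the function names and the hand-supplied flow are assumptions for the sketch, not the authors' implementation, which learns the flow field and applies decoupled body/edge supervision during training:

```python
import numpy as np

def warp_with_flow(feat, flow):
    """Bilinearly sample feat (H, W, C) at positions displaced by flow (H, W, 2)."""
    H, W, _ = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sampling coordinates, clamped to stay inside the grid.
    sy = np.clip(ys + flow[..., 0], 0, H - 1)
    sx = np.clip(xs + flow[..., 1], 0, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(sx).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (sy - y0)[..., None]; wx = (sx - x0)[..., None]
    # Standard bilinear interpolation over the four neighbors.
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bot = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def decouple_body_edge(feat, flow):
    """Body = feature warped along the flow field; edge = the residual."""
    body = warp_with_flow(feat, flow)
    edge = feat - body
    return body, edge
```

In the paper's framework, the body and edge features are then supervised separately by sampling body and edge pixels, and the paper trains the flow field end-to-end rather than supplying it as here.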
Related papers
- ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency [126.88107868670767]
We propose multi-View Consistent learning (ViewCo) for text-supervised semantic segmentation.
We first propose text-to-views consistency modeling to learn correspondence for multiple views of the same input image.
We also propose cross-view segmentation consistency modeling to address the ambiguity issue of text supervision.
arXiv Detail & Related papers (2023-01-31T01:57:52Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any efforts on dense annotations.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios.
Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z)
- Attention-based fusion of semantic boundary and non-boundary information to improve semantic segmentation [9.518010235273783]
This paper introduces a method for image semantic segmentation grounded on a novel fusion scheme.
The main goal of our proposal is to explore object boundary information to improve the overall segmentation performance.
Our proposed model achieved the best mIoU on the Cityscapes, CamVid, and Pascal Context datasets, and the second best on Mapillary Vistas.
arXiv Detail & Related papers (2021-08-05T20:46:53Z)
- A Unified Efficient Pyramid Transformer for Semantic Segmentation [40.20512714144266]
We advocate a unified framework(UN-EPT) to segment objects by considering both context information and boundary artifacts.
We first adapt a sparse sampling strategy to incorporate the transformer-based attention mechanism for efficient context modeling.
We demonstrate promising performance on three popular benchmarks for semantic segmentation with low memory footprint.
arXiv Detail & Related papers (2021-07-29T17:47:32Z)
- BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing, we treat the pixel-level segmentation problem as squeezing the object boundary.
Our method yields large gains on COCO, Cityscapes, for both instance and semantic segmentation and outperforms previous state-of-the-art PointRend in both accuracy and speed under the same setting.
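The dilation/erosion intuition behind boundary squeezing can be illustrated with a small NumPy sketch: dilating a binary mask and subtracting its erosion yields the band of pixels around the object boundary. This is only a toy rendering of the morphological idea; the function names are assumptions and BoundarySqueeze itself operates on learned features, not raw masks:

```python
import numpy as np

def dilate(mask, it=1):
    """Binary dilation with a 3x3 structuring element, via shifted maxima."""
    out = mask.astype(bool)
    H, W = out.shape
    for _ in range(it):
        p = np.pad(out, 1)
        new = np.zeros_like(out)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                new |= p[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        out = new
    return out

def erode(mask, it=1):
    """Erosion is dilation of the complement, complemented."""
    return ~dilate(~mask.astype(bool), it)

def boundary_band(mask, width=1):
    """Pixels within `width` of the mask boundary: dilation minus erosion."""
    return dilate(mask, width) & ~erode(mask, width)
```

A refinement method can then restrict its effort to this band, which is where most segmentation errors concentrate.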
arXiv Detail & Related papers (2021-05-25T04:58:51Z)
- Locate then Segment: A Strong Pipeline for Referring Image Segmentation [73.19139431806853]
Referring image segmentation aims to segment the objects referred by a natural language expression.
Previous methods usually focus on designing an implicit and recurrent interaction mechanism to fuse the visual-linguistic features to directly generate the final segmentation mask.
We present a "Locate-Then-Segment" scheme to tackle these problems.
Our framework is simple but surprisingly effective.
arXiv Detail & Related papers (2021-03-30T12:25:27Z)
- Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
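The decomposition of a holistic class representation into part-aware prototypes can be sketched as clustering foreground feature vectors, for example with k-means. This is a hedged illustration under assumed names and shapes; the paper generates and refines its prototypes with a graph neural network rather than plain k-means:

```python
import numpy as np

def part_prototypes(feats, mask, k=2, iters=10, seed=0):
    """Cluster masked support features (H, W, C) into k part-level prototypes."""
    fg = feats[mask.astype(bool)]                 # (N, C) foreground vectors
    rng = np.random.default_rng(seed)
    protos = fg[rng.choice(len(fg), k, replace=False)]
    for _ in range(iters):
        # Assign each foreground vector to its nearest prototype.
        d = ((fg[:, None] - protos[None]) ** 2).sum(-1)
        assign = d.argmin(1)
        # Recompute each prototype as the mean of its assigned vectors.
        for j in range(k):
            pts = fg[assign == j]
            if len(pts):
                protos[j] = pts.mean(0)
    return protos
```

Query pixels can then be matched against each part prototype instead of a single class mean, which preserves intra-class part structure.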
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.