Semantic Segmentation by Early Region Proxy
- URL: http://arxiv.org/abs/2203.14043v1
- Date: Sat, 26 Mar 2022 10:48:32 GMT
- Title: Semantic Segmentation by Early Region Proxy
- Authors: Yifan Zhang, Bo Pang, Cewu Lu
- Abstract summary: We present a novel and efficient modeling that starts from interpreting the image as a tessellation of learnable regions.
To model region-wise context, we exploit Transformer to encode regions in a sequence-to-sequence manner.
Semantic segmentation is now carried out as per-region prediction on top of the encoded region embeddings.
- Score: 53.594035639400616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Typical vision backbones manipulate structured features. As a compromise,
semantic segmentation has long been modeled as per-point prediction on dense
regular grids. In this work, we present a novel and efficient modeling that
starts from interpreting the image as a tessellation of learnable regions, each
of which has flexible geometrics and carries homogeneous semantics. To model
region-wise context, we exploit Transformer to encode regions in a
sequence-to-sequence manner by applying multi-layer self-attention on the
region embeddings, which serve as proxies of specific regions. Semantic
segmentation is now carried out as per-region prediction on top of the encoded
region embeddings using a single linear classifier, where a decoder is no
longer needed. The proposed RegProxy model discards the common Cartesian
feature layout and operates purely at region level. Hence, it exhibits the most
competitive performance-efficiency trade-off compared with the conventional
dense prediction methods. For example, on ADE20K, the small-sized RegProxy-S/16
outperforms the best CNN model using 25% of the parameters and 4% of the computation, while
the largest RegProxy-L/16 achieves 52.9 mIoU, which outperforms the
state of the art by 2.1% with fewer resources. Code and models are available
at https://github.com/YiF-Zhang/RegionProxy.
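
The per-region prediction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the Transformer has already produced one embedding per region, and that each pixel comes with soft assignment weights over the regions. The names `linear_classify`, `segment`, and `pixel_to_region` are hypothetical and chosen only for illustration. A single linear classifier scores each region, and each pixel takes the class with the highest assignment-weighted score.

```python
from typing import List

Vector = List[float]


def linear_classify(embedding: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Class logits for one region embedding: W @ e + b (a single linear layer)."""
    return [
        sum(w * e for w, e in zip(row, embedding)) + b
        for row, b in zip(weights, bias)
    ]


def argmax(values: Vector) -> int:
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def segment(
    region_embeddings: List[Vector],
    pixel_to_region: List[Vector],
    weights: List[Vector],
    bias: Vector,
) -> List[int]:
    """Per-region prediction followed by a soft region-to-pixel lookup.

    pixel_to_region[p][r] is an ASSUMED soft weight tying pixel p to region r;
    how such weights are obtained is outside the scope of this sketch.
    """
    # Step 1: classify every region once; no dense decoder is involved.
    region_logits = [linear_classify(e, weights, bias) for e in region_embeddings]

    # Step 2: each pixel mixes the logits of the regions it belongs to.
    num_classes = len(bias)
    labels = []
    for assignment in pixel_to_region:
        pixel_logits = [
            sum(a * region_logits[r][c] for r, a in enumerate(assignment))
            for c in range(num_classes)
        ]
        labels.append(argmax(pixel_logits))
    return labels


if __name__ == "__main__":
    # Toy data: 2 regions, 2-dimensional embeddings, 2 classes, 4 pixels.
    embeddings = [[2.0, 0.0], [0.0, 3.0]]
    W = [[1.0, 0.0], [0.0, 1.0]]
    b = [0.0, 0.0]
    assignments = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.3], [0.2, 0.8]]
    print(segment(embeddings, assignments, W, b))
```

The classifier runs once per region rather than once per pixel, so its cost scales with the number of regions; only the final lookup touches every pixel.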
Related papers
- Differentiable Reasoning about Knowledge Graphs with Region-based Graph Neural Networks [62.93577376960498]
Methods for knowledge graph (KG) completion need to capture semantic regularities and use these regularities to infer plausible knowledge that is not explicitly stated.
Most embedding-based methods are opaque in the kinds of regularities they can capture, although region-based KG embedding models have emerged as a more transparent alternative.
We propose RESHUFFLE, a simple model based on ordering constraints that can faithfully capture a much larger class of rule bases than existing approaches.
arXiv Detail & Related papers (2024-06-13T18:37:24Z)
- Adaptive Region Selection for Active Learning in Whole Slide Image Semantic Segmentation [3.1392713791311766]
Region-based active learning (AL) involves training the model on a limited number of annotated image regions.
We introduce a novel technique for selecting annotation regions adaptively, mitigating the reliance on this AL hyperparameter.
We evaluate our method using the task of breast cancer segmentation on the public CAMELYON16 dataset.
arXiv Detail & Related papers (2023-07-14T05:34:13Z)
- Region-Enhanced Feature Learning for Scene Semantic Segmentation [19.20735517821943]
We propose using regions as the intermediate representation of point clouds instead of fine-grained points or voxels to reduce the computational burden.
We design a region-based feature enhancement (RFE) module, which consists of a Semantic-Spatial Region Extraction stage and a Region Dependency Modeling stage.
Our REFL-Net achieves 1.8% mIoU gain on ScanNetV2 and 1.7% mIoU gain on S3DIS datasets with negligible computational cost.
arXiv Detail & Related papers (2023-04-15T06:35:06Z)
- Semantic Diffusion Network for Semantic Segmentation [1.933681537640272]
We introduce an operator-level approach to enhance semantic boundary awareness.
We propose a novel learnable approach called semantic diffusion network (SDN).
Our SDN aims to construct a differentiable mapping from the original feature to the inter-class boundary-enhanced feature.
arXiv Detail & Related papers (2023-02-04T01:39:16Z)
- Dense Siamese Network [86.23741104851383]
We present Dense Siamese Network (DenseSiam), a simple unsupervised learning framework for dense prediction tasks.
It learns visual representations by maximizing the similarity between two views of one image with two types of consistency, i.e., pixel consistency and region consistency.
It surpasses state-of-the-art segmentation methods by 2.1 mIoU with 28% of the training cost.
arXiv Detail & Related papers (2022-03-21T15:55:23Z)
- Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-02-19T17:46:02Z)
- Consistency-Regularized Region-Growing Network for Semantic Segmentation of Urban Scenes with Point-Level Annotations [17.13291434132985]
We propose a consistency-regularized region-growing network (CRGNet) to reduce the annotation burden.
CRGNet iteratively selects unlabeled pixels with high confidence to expand the annotated area from the original sparse points.
We find that such a simple regularization strategy is nonetheless very useful for controlling the quality of the region-growing mechanism.
arXiv Detail & Related papers (2022-02-08T09:27:01Z)
- Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to build new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z)
- Boundary-assisted Region Proposal Networks for Nucleus Segmentation [89.69059532088129]
Machine learning models cannot perform well because of the large number of crowded nuclei.
We devise a Boundary-assisted Region Proposal Network (BRP-Net) that achieves robust instance-level nucleus segmentation.
arXiv Detail & Related papers (2020-06-04T08:26:38Z)
- FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution [14.226301825772174]
We introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP).
It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information.
We achieve 68.4% mIoU at 84 fps on the Cityscapes test set with a single NVIDIA Titan X (Maxwell) GPU card.
arXiv Detail & Related papers (2020-03-09T03:53:57Z)