Region-Aware Metric Learning for Open World Semantic Segmentation via
Meta-Channel Aggregation
- URL: http://arxiv.org/abs/2205.08083v1
- Date: Tue, 17 May 2022 04:12:47 GMT
- Title: Region-Aware Metric Learning for Open World Semantic Segmentation via
Meta-Channel Aggregation
- Authors: Hexin Dong, Zifan Chen, Mingze Yuan, Yutong Xie, Jie Zhao, Fei Yu, Bin
Dong, Li Zhang
- Abstract summary: We propose a method called region-aware metric learning (RAML).
RAML separates the regions of the images and generates region-aware features for further metric learning.
We show that the proposed RAML achieves SOTA performance in both stages of open world segmentation.
- Score: 19.584457251137252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As one of the most challenging and practical segmentation tasks, open-world
semantic segmentation requires the model to segment the anomaly regions in the
images and incrementally learn to segment out-of-distribution (OOD) objects,
especially under a few-shot condition. The current state-of-the-art (SOTA)
method, Deep Metric Learning Network (DMLNet), relies on pixel-level metric
learning, with which the identification of similar regions having different
semantics is difficult. Therefore, we propose a method called region-aware
metric learning (RAML), which first separates the regions of the images and
generates region-aware features for further metric learning. RAML improves the
integrity of the segmented anomaly regions. Moreover, we propose a novel
meta-channel aggregation (MCA) module to further separate anomaly regions,
forming high-quality sub-region candidates and thereby improving the model
performance for OOD objects. To evaluate the proposed RAML, we have conducted
extensive experiments and ablation studies on Lost And Found and Road Anomaly
datasets for anomaly segmentation and the CityScapes dataset for incremental
few-shot learning. The results show that the proposed RAML achieves SOTA
performance in both stages of open world segmentation. Our code and appendix
are available at https://github.com/czifan/RAML.
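As a rough illustration of the idea in the abstract (pooling pixel embeddings into region-level features before measuring distances to known classes), the following is a minimal sketch. It is not the authors' implementation: the function names, the mean-pooling step, the Euclidean distance, the class prototypes, and the threshold are all assumptions made for illustration, and the MCA module is not modeled.

```python
import numpy as np

def region_features(pixel_feats, region_ids):
    """Average the pixel embeddings inside each region (assumed pooling).

    pixel_feats: (H, W, D) float array of per-pixel embeddings.
    region_ids:  (H, W) int array assigning every pixel to a region.
    Returns the sorted region ids and an (R, D) array of region features.
    """
    ids = np.unique(region_ids)
    feats = np.stack([pixel_feats[region_ids == r].mean(axis=0) for r in ids])
    return ids, feats

def anomaly_scores(feats, prototypes):
    """Distance of each region feature to its nearest class prototype.

    feats:      (R, D) region features.
    prototypes: (C, D) hypothetical prototypes of the known classes.
    Returns (min distance per region, index of the nearest prototype).
    """
    diffs = feats[:, None, :] - prototypes[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)  # shape (R, C)
    return dists.min(axis=1), dists.argmin(axis=1)

# Toy 2x2 image with 2-D embeddings and two regions (top row, bottom row).
pixel_feats = np.array([[[0.0, 0.0], [0.2, 0.0]],
                        [[5.0, 5.0], [5.0, 5.0]]])
region_ids = np.array([[0, 0],
                       [1, 1]])
prototypes = np.array([[0.0, 0.0],
                       [1.0, 1.0]])

ids, feats = region_features(pixel_feats, region_ids)
scores, nearest = anomaly_scores(feats, prototypes)

# Regions far from every known-class prototype are flagged as anomalous.
THRESHOLD = 1.0  # assumed value, chosen only for this toy example
is_anomaly = scores > THRESHOLD
```

In this toy case the whole bottom row is flagged as one unit, which is the intuition behind scoring regions rather than individual pixels: a region receives a single decision, so the segmented anomaly stays intact.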
Related papers
- SAM-Assisted Remote Sensing Imagery Semantic Segmentation with Object
and Boundary Constraints [9.238103649037951]
We present a framework aimed at leveraging the raw output of SAM by exploiting two novel concepts called SAM-Generated Object (SGO) and SAM-Generated Boundary (SGB).
Taking into account the content characteristics of SGO, we introduce the concept of object consistency to leverage segmented regions lacking semantic information.
The boundary loss capitalizes on the distinctive features of SGB by directing the model's attention to the boundary information of the object.
arXiv Detail & Related papers (2023-12-05T03:33:47Z)
- Optimization Efficient Open-World Visual Region Recognition [55.76437190434433]
RegionSpot integrates position-aware localization knowledge from a localization foundation model with semantic information from a ViL model.
Experiments in open-world object recognition show that our RegionSpot achieves significant performance gain over prior alternatives.
arXiv Detail & Related papers (2023-11-02T16:31:49Z)
- Region Generation and Assessment Network for Occluded Person
Re-Identification [43.49129366128688]
Person Re-identification (ReID) has played an increasingly crucial role in recent years, with a wide range of applications.
Most methods tackle such challenges by utilizing external tools to locate body parts or exploiting matching strategies.
We propose a Region Generation and Assessment Network (RGANet) to effectively and efficiently detect the human body regions.
arXiv Detail & Related papers (2023-09-07T08:41:47Z)
- R-MAE: Regions Meet Masked Autoencoders [113.73147144125385]
We explore regions as a potential visual analogue of words for self-supervised image representation learning.
Inspired by Masked Autoencoding (MAE), a generative pre-training baseline, we propose masked region autoencoding to learn from groups of pixels or regions.
arXiv Detail & Related papers (2023-06-08T17:56:46Z)
- Region-Enhanced Feature Learning for Scene Semantic Segmentation [19.20735517821943]
We propose using regions as the intermediate representation of point clouds instead of fine-grained points or voxels to reduce the computational burden.
We design a region-based feature enhancement (RFE) module, which consists of a Semantic-Spatial Region Extraction stage and a Region Dependency Modeling stage.
Our REFL-Net achieves 1.8% mIoU gain on ScanNetV2 and 1.7% mIoU gain on S3DIS datasets with negligible computational cost.
arXiv Detail & Related papers (2023-04-15T06:35:06Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly
Supervised Object Detection [54.24966006457756]
We propose a WSOD framework called the Spatial Likelihood Voting with Self-knowledge Distillation Network (SLV-SD Net).
SLV-SD Net converges region proposal localization without bounding box annotations.
Experiments on the PASCAL VOC 2007/2012 and MS-COCO datasets demonstrate the excellent performance of SLV-SD Net.
arXiv Detail & Related papers (2022-04-14T11:56:19Z)
- Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: the Global Enhancement Module (GEM) and the Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z)
- Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to set a new state of the art on major semantic segmentation benchmarks, including Cityscapes, ADE20K, Pascal Context, CamVid and COCO-Stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z)
- Remote Sensing Images Semantic Segmentation with General Remote Sensing
Vision Model via a Self-Supervised Contrastive Learning Method [13.479068312825781]
We propose the Global style and Local matching Contrastive Learning Network (GLCNet) for remote sensing semantic segmentation.
Specifically, the global style contrastive module is used to better learn an image-level representation.
The local features matching contrastive module is designed to learn representations of local regions, which is beneficial for semantic segmentation.
arXiv Detail & Related papers (2021-06-20T03:03:40Z)
- Rethinking Semantic Segmentation Evaluation for Explainability and Model
Selection [12.786648212233116]
We introduce a new metric to assess region-based over- and under-segmentation.
We analyze and compare it to other metrics, demonstrating that the use of our metric lends greater explainability to semantic segmentation model performance in real-world applications.
arXiv Detail & Related papers (2021-01-21T03:12:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.