CSL: Class-Agnostic Structure-Constrained Learning for Segmentation
Including the Unseen
- URL: http://arxiv.org/abs/2312.05538v2
- Date: Thu, 8 Feb 2024 17:20:38 GMT
- Title: CSL: Class-Agnostic Structure-Constrained Learning for Segmentation
Including the Unseen
- Authors: Hao Zhang, Fang Li, Lu Qi, Ming-Hsuan Yang, and Narendra Ahuja
- Abstract summary: Class-Agnostic Structure-Constrained Learning is a plug-in framework that can integrate with existing methods.
We propose soft assignment and mask split methodologies that enhance OOD object segmentation.
Empirical evaluations demonstrate CSL's prowess in boosting the performance of existing algorithms spanning OOD segmentation, ZS3, and DA segmentation.
- Score: 62.72636247006293
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-Of-Distribution (OOD) Segmentation and Zero-Shot Semantic
Segmentation (ZS3) are challenging because both require segmenting unseen classes.
Existing strategies adapt the class-agnostic Mask2Former (CA-M2F) tailored to
specific tasks. However, these methods cater to single tasks and demand training
from scratch, and we demonstrate that CA-M2F has certain deficiencies that affect
performance. We propose Class-Agnostic Structure-Constrained Learning
(CSL), a plug-in framework that can integrate with existing methods, thereby
embedding structural constraints and achieving performance gains on tasks that
involve unseen classes, specifically OOD, ZS3, and domain adaptation (DA) tasks. There are two
schemes for CSL to integrate with existing methods: (1) by distilling knowledge
from a base teacher network, enforcing constraints across the training and
inference phases, or (2) by leveraging established models to obtain per-pixel
distributions without retraining, appending constraints during the inference
phase. We propose soft assignment and mask split methodologies that enhance OOD
object segmentation. Empirical evaluations demonstrate CSL's prowess in
boosting the performance of existing algorithms spanning OOD segmentation, ZS3,
and DA segmentation, consistently surpassing the state of the art across all
three tasks.
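The second scheme (reusing an existing model's per-pixel distributions and adding constraints only at inference) can be illustrated with a minimal sketch. The abstract does not specify how the constraint is computed, so everything below is an assumption made for illustration: class-agnostic region masks are taken as given, each region's class distribution is the mask-weighted average of the per-pixel distributions, and each pixel is then reassigned a soft mixture of the region distributions. The function names `region_distributions` and `constrain` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the paper's algorithm. It assumes region masks
# are already available and pools an existing model's per-pixel class
# distributions inside each region at inference time.

def region_distributions(pixel_probs, masks):
    """Return one class distribution per mask (mask-weighted average of pixels)."""
    num_classes = len(pixel_probs[0][0])
    regions = []
    for mask in masks:
        totals = [0.0] * num_classes
        weight = 0.0
        for prob_row, mask_row in zip(pixel_probs, mask):
            for probs, m in zip(prob_row, mask_row):
                weight += m
                for c, p in enumerate(probs):
                    totals[c] += m * p
        if weight > 0:
            regions.append([t / weight for t in totals])
        else:
            # An empty mask falls back to a uniform distribution.
            regions.append([1.0 / num_classes] * num_classes)
    return regions


def constrain(pixel_probs, masks):
    """Replace each pixel's distribution with a soft mixture of region distributions."""
    regions = region_distributions(pixel_probs, masks)
    num_classes = len(pixel_probs[0][0])
    refined = []
    for y, prob_row in enumerate(pixel_probs):
        row = []
        for x, probs in enumerate(prob_row):
            weights = [mask[y][x] for mask in masks]
            total = sum(weights)
            if total == 0:
                # No mask covers this pixel: keep the original prediction.
                row.append(list(probs))
                continue
            row.append([
                sum(w * region[c] for w, region in zip(weights, regions)) / total
                for c in range(num_classes)
            ])
        refined.append(row)
    return refined


def argmax_labels(probs_2d):
    """Per-pixel label map: index of the most probable class."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs_2d]


# Toy 1x4 image with two classes; the second pixel is a noisy prediction.
pixel_probs = [[[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.1, 0.9]]]
# Two hypothetical class-agnostic masks: left half and right half.
masks = [[[1.0, 1.0, 0.0, 0.0]], [[0.0, 0.0, 1.0, 1.0]]]

before = argmax_labels(pixel_probs)                  # [[0, 1, 1, 1]]
after = argmax_labels(constrain(pixel_probs, masks))  # [[0, 0, 1, 1]]
```

In this toy case the noisy second pixel inherits the label of the region it belongs to, which is the kind of structural consistency a region-level constraint is meant to enforce. Because masks may take fractional values, a pixel can belong partially to several regions, which is one simple reading of a soft assignment.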
Related papers
- PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-native CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
(2024-03-14T17:55:03Z)
- Task-Specific Adaptation of Segmentation Foundation Model via Prompt Learning [7.6136466242670435]
We propose a task-specific adaptation of the segmentation foundation model via prompt learning tailored to the Segment Anything Model (SAM).
Our method involves a prompt learning module which adjusts input prompts into the embedding space to better align with peculiarities of the target task.
Experimental results on various customized segmentation scenarios demonstrate the effectiveness of the proposed method.
(2024-03-14T09:13:51Z)
- Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation [21.345548821276097]
Class activation maps (CAMs) are commonly employed in weakly supervised semantic segmentation (WSSS) to produce pseudo-labels.
We propose an end-to-end WSSS model incorporating guided CAMs, wherein our segmentation model is trained while concurrently optimizing CAMs online.
CoSA is the first single-stage approach to outperform all existing multi-stage methods including those with additional supervision.
(2024-02-27T21:08:23Z)
- Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation [89.41179071022121]
Self-training is a prevailing approach in cross-domain semantic segmentation.
We propose a novel approach called Semantic Connectivity-driven pseudo-labeling.
This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics.
(2023-12-11T12:29:51Z)
- AIMS: All-Inclusive Multi-Level Segmentation [93.5041381700744]
We propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation.
We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
(2023-05-28T16:28:49Z)
- Integrative Few-Shot Learning for Classification and Segmentation [37.50821005917126]
We introduce the integrative task of few-shot classification and segmentation (FS-CS).
FS-CS aims to classify and segment target objects in a query image when the target classes are given with a few examples.
We propose the integrative few-shot learning framework for FS-CS, which trains a learner to construct class-wise foreground maps.
(2022-03-29T16:14:40Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
(2021-12-12T06:11:16Z)
- SASO: Joint 3D Semantic-Instance Segmentation via Multi-scale Semantic Association and Salient Point Clustering Optimization [8.519716460338518]
We propose a novel 3D point cloud segmentation framework named SASO, which jointly performs semantic and instance segmentation tasks.
For the semantic segmentation task, inspired by the inherent correlation among objects in spatial context, we propose a Multi-scale Semantic Association (MSA) module.
For the instance segmentation task, unlike previous works that use clustering only in the inference procedure, we propose a Salient Point Clustering Optimization (SPCO) module.
(2020-06-25T08:55:25Z)
- Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) for both seen and unseen classes using a Conditional Variational Autoencoder (CVAE).
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
(2020-04-01T19:05:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.