Delving into Transformer for Incremental Semantic Segmentation
- URL: http://arxiv.org/abs/2211.10253v1
- Date: Fri, 18 Nov 2022 14:16:04 GMT
- Title: Delving into Transformer for Incremental Semantic Segmentation
- Authors: Zekai Xu, Mingyi Zhang, Jiayue Hou, Xing Gong, Chuan Wen, Chengjie
Wang, Junge Zhang
- Abstract summary: Incremental semantic segmentation (ISS) is an emerging task where an old model is updated by adding new classes.
In this work, we propose TISS, a Transformer-based method for ISS.
Under extensive experimental settings, our method significantly outperforms state-of-the-art incremental semantic segmentation methods.
- Score: 24.811247377533178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incremental semantic segmentation (ISS) is an emerging task where an old
model is updated by incrementally adding new classes. At present, methods based on
convolutional neural networks are dominant in ISS. However, studies have shown
that such methods have difficulty in learning new tasks while maintaining good
performance on old ones (catastrophic forgetting). In contrast, a Transformer-based
method has a natural advantage in curbing catastrophic forgetting due to
its ability to model both long-term and short-term tasks. In this work, we
explore the reasons why Transformer-based architectures are more suitable for
ISS, and accordingly propose TISS, a Transformer-based method for
Incremental Semantic Segmentation. In addition, to better alleviate
catastrophic forgetting while preserving transferability on ISS, we introduce
two patch-wise contrastive losses to imitate similar features and enhance
feature diversity respectively, which can further improve the performance of
TISS. Under extensive experimental settings with Pascal-VOC 2012 and ADE20K
datasets, our method significantly outperforms state-of-the-art incremental
semantic segmentation methods.
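The abstract does not spell out the exact form of the two patch-wise contrastive losses, but a minimal sketch of the general idea, treating each spatial location of a feature map as a patch and contrasting patches of the new model against those of the old one, might look as follows; the function name, the InfoNCE-style formulation, and the temperature value are illustrative assumptions rather than the paper's definition:

```python
import torch
import torch.nn.functional as F

def patchwise_contrastive_loss(feat_old, feat_new, temperature=0.1):
    """InfoNCE-style contrastive loss over spatial patches (illustrative sketch).

    feat_old, feat_new: (B, C, H, W) feature maps from the old and new models;
    each spatial location is treated as one patch embedding. Corresponding
    patches form positive pairs; all other patches in the batch act as negatives.
    """
    B, C, H, W = feat_new.shape
    # Flatten the spatial grid -> (B*H*W, C) and L2-normalize patch embeddings.
    q = F.normalize(feat_new.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    k = F.normalize(feat_old.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    # Similarity of every new-model patch to every old-model patch.
    logits = q @ k.t() / temperature          # (N, N) with N = B*H*W
    targets = torch.arange(q.size(0), device=q.device)
    # Pulling each patch toward its old-model counterpart imitates similar
    # features; pushing it away from all other patches encourages diversity.
    return F.cross_entropy(logits, targets)
```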
Related papers
- Birdie: Advancing State Space Models with Reward-Driven Objectives and Curricula [23.071384759427072]
State space models (SSMs) offer advantages over Transformers but struggle with tasks requiring long-range in-context retrieval, such as text copying, associative recall, and question answering over long contexts.
We propose a novel training procedure, Birdie, that significantly enhances the in-context retrieval capabilities of SSMs without altering their architecture.
arXiv Detail & Related papers (2024-11-01T21:01:13Z) - ConSept: Continual Semantic Segmentation via Adapter-based Vision
Transformer [65.32312196621938]
We propose Continual semantic Segmentation via an Adapter-based ViT, namely ConSept.
ConSept integrates lightweight attention-based adapters into vanilla ViTs.
We propose two key strategies: distillation with a deterministic old-classes boundary for improved anti-catastrophic forgetting, and dual dice losses to regularize segmentation maps.
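The snippet does not describe how ConSept's two dice losses are paired; purely as a reference point, a standard soft Dice loss over predicted segmentation maps, the kind of term such regularizers typically build on, can be sketched as below (the function name and reduction choices are assumptions):

```python
import torch

def soft_dice_loss(probs, target_onehot, eps=1e-6):
    """Soft Dice loss over a batch of segmentation maps (illustrative sketch).

    probs:         (B, K, H, W) per-class probabilities (e.g. softmax output)
    target_onehot: (B, K, H, W) one-hot ground-truth masks
    """
    dims = (0, 2, 3)                       # reduce over batch and spatial dims
    intersection = (probs * target_onehot).sum(dims)
    cardinality = probs.sum(dims) + target_onehot.sum(dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - dice.mean()               # average the per-class Dice scores
```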
arXiv Detail & Related papers (2024-02-26T15:51:45Z) - Waypoint Transformer: Reinforcement Learning via Supervised Learning
with Intermediate Targets [30.044393664203483]
We present a novel approach to enhance RvS methods by integrating intermediate targets.
We introduce the Waypoint Transformer (WT), using an architecture that builds upon the Decision Transformer (DT) framework and is conditioned on automatically generated waypoints.
The results show a significant increase in the final return compared to existing RvS methods, with performance on par or greater than existing state-of-the-art temporal difference learning-based methods.
arXiv Detail & Related papers (2023-06-24T22:25:29Z) - ISTRBoost: Importance Sampling Transfer Regression using Boosting [4.319090388509148]
Current Instance Transfer Learning (ITL) methodologies use domain adaptation and sub-space transformation to achieve successful transfer learning.
Boosting methodologies have been shown to reduce the risk of overfitting by iteratively re-weighting instances that have high residuals.
We introduce a simpler and more robust fix to this problem by building upon the popular boosting ITL regression methodology, two-stage TrAdaBoost.R2.
arXiv Detail & Related papers (2022-04-26T02:48:56Z) - Representation Compensation Networks for Continual Semantic Segmentation [79.05769734989164]
We study the continual semantic segmentation problem, where the deep neural networks are required to incorporate new classes continually without catastrophic forgetting.
We propose to use a structural re-parameterization mechanism, named representation compensation (RC) module, to decouple the representation learning of both old and new knowledge.
We conduct experiments on two challenging continual semantic segmentation scenarios, continual class segmentation and continual domain segmentation.
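The RC module's exact design is not given in this snippet, but structural re-parameterization generally means that parallel branches used during training are folded into a single equivalent operator afterwards. A minimal RepVGG-style sketch, assuming two parallel convolutions with identical geometry (same channels, kernel size, stride, padding, groups=1) whose outputs are summed, could look like this:

```python
import torch
import torch.nn as nn

def merge_parallel_convs(conv_a, conv_b):
    """Fold two parallel convolutions whose outputs are summed into a single
    equivalent convolution (RepVGG-style merge; illustrative sketch).

    Assumes both convs share in/out channels, kernel size, stride, padding,
    and use groups=1 and dilation=1.
    """
    merged = nn.Conv2d(conv_a.in_channels, conv_a.out_channels,
                       conv_a.kernel_size, stride=conv_a.stride,
                       padding=conv_a.padding, bias=True)
    with torch.no_grad():
        # Convolution is linear in its weights, so the branches add up exactly.
        merged.weight.copy_(conv_a.weight + conv_b.weight)
        bias = torch.zeros(conv_a.out_channels, device=conv_a.weight.device)
        if conv_a.bias is not None:
            bias = bias + conv_a.bias
        if conv_b.bias is not None:
            bias = bias + conv_b.bias
        merged.bias.copy_(bias)
    return merged
```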
arXiv Detail & Related papers (2022-03-10T14:48:41Z) - Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z) - InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z) - Continuous Transition: Improving Sample Efficiency for Continuous
Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
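The paper's precise interpolation scheme and the discriminator-guided construction are not reproduced here; a minimal sketch of the core idea, linearly interpolating two consecutive transitions in a continuous-control setting, might be (the dictionary layout and the Beta-distributed mixing coefficient are assumptions):

```python
import numpy as np

def interpolate_transitions(t1, t2, alpha=None):
    """Synthesize a new transition by linearly interpolating two consecutive
    transitions sampled from the replay buffer (illustrative sketch).

    t1, t2: dicts with keys 'obs', 'action', 'reward', 'next_obs', where t2
    is the transition that immediately follows t1 in the trajectory.
    """
    if alpha is None:
        alpha = np.random.beta(0.4, 0.4)   # mixup-style coefficient (assumed)
    mix = lambda a, b: alpha * a + (1.0 - alpha) * b
    return {
        'obs':      mix(t1['obs'], t2['obs']),
        'action':   mix(t1['action'], t2['action']),
        'reward':   mix(t1['reward'], t2['reward']),
        'next_obs': mix(t1['next_obs'], t2['next_obs']),
    }
```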
arXiv Detail & Related papers (2020-11-30T01:20:23Z) - Evolving Metric Learning for Incremental and Decremental Features [45.696514400861275]
We develop a new online Evolving Metric Learning model for incremental and decremental features.
Our model can handle the instance and feature evolutions simultaneously by incorporating with a smoothed Wasserstein metric distance.
In addition to tackling the challenges in the one-shot case, we also extend our model to the multi-shot scenario.
arXiv Detail & Related papers (2020-06-27T10:29:38Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)