Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation
- URL: http://arxiv.org/abs/2405.16486v1
- Date: Sun, 26 May 2024 08:51:39 GMT
- Title: Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation
- Authors: Rongyu Zhang, Aosong Cheng, Yulin Luo, Gaole Dai, Huanrui Yang, Jiaming Liu, Ran Xu, Li Du, Yuan Du, Yanbing Jiang, Shanghang Zhang
- Abstract summary: Continual Test-Time Adaptation (CTTA) aims to adapt the pre-trained model to ever-evolving target domains.
We explore the integration of a Mixture-of-Activation-Sparsity-Experts (MoASE) as an adapter for the CTTA task.
- Score: 37.79819260918366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual Test-Time Adaptation (CTTA), which aims to adapt the pre-trained model to ever-evolving target domains, emerges as an important task for vision models. As current vision models appear to be heavily biased towards texture, continuously adapting the model from one domain distribution to another can result in serious catastrophic forgetting. Drawing inspiration from the human visual system's adeptness at processing both shape and texture, as described by the famous Trichromatic Theory, we explore the integration of a Mixture-of-Activation-Sparsity-Experts (MoASE) as an adapter for the CTTA task. Given the distinct reactions of neurons with low/high activation to domain-specific/agnostic features, MoASE decomposes the neural activation into high-activation and low-activation components with a non-differentiable Spatial Differentiate Dropout (SDD). Based on this decomposition, we devise a multi-gate structure comprising a Domain-Aware Gate (DAG), which utilizes domain information to adaptively combine experts that process the post-SDD sparse activations of different strengths, and an Activation Sparsity Gate (ASG), which adaptively assigns the feature selection threshold of the SDD for different experts for more precise feature decomposition. Finally, we introduce a Homeostatic-Proximal (HP) loss to bypass the error accumulation problem when continuously adapting the model. Extensive experiments on four prominent benchmarks substantiate that our methodology achieves state-of-the-art performance in both classification and segmentation CTTA tasks. Our code is now available at https://github.com/RoyZry98/MoASE-Pytorch.
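The official implementation lives in the repository above; the sketch below is only a rough, self-contained illustration of the activation-decomposition idea described in the abstract: a hard threshold splits an activation map into high- and low-activation parts (standing in for the SDD), two toy experts process the parts, a single learned gate mixes them, and a proximal term keeps adapted weights near the source weights (standing in for the HP loss). The class names, the single fixed threshold, and the two-expert/one-gate simplification are assumptions for illustration, not the paper's MoASE, DAG, ASG, or HP formulation.

```python
import torch
import torch.nn as nn


class ThresholdSplit(nn.Module):
    """Toy stand-in for Spatial Differentiate Dropout (SDD): split an activation
    map into high- and low-activation parts with a hard, non-differentiable mask."""

    def __init__(self, threshold: float = 0.5):
        super().__init__()
        self.threshold = threshold

    def forward(self, x: torch.Tensor):
        mask = (x.abs() > self.threshold).float()
        return x * mask, x * (1.0 - mask)  # high- and low-activation components


class ToyMoASEAdapter(nn.Module):
    """Minimal mixture-of-experts adapter over the decomposed activations.
    One softmax gate mixes two linear experts; the actual method uses a
    Domain-Aware Gate, an Activation Sparsity Gate, and per-expert thresholds."""

    def __init__(self, dim: int, threshold: float = 0.5):
        super().__init__()
        self.split = ThresholdSplit(threshold)
        self.expert_high = nn.Linear(dim, dim)
        self.expert_low = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        high, low = self.split(x)
        w = torch.softmax(self.gate(x), dim=-1)  # per-token mixing weights
        mixed = w[..., :1] * self.expert_high(high) + w[..., 1:] * self.expert_low(low)
        return x + mixed  # residual adapter output


def proximal_penalty(model: nn.Module, source_params, coeff: float = 1e-3) -> torch.Tensor:
    """Generic proximal term that keeps adapted weights close to frozen source
    weights; the paper's Homeostatic-Proximal loss may differ in its exact form."""
    return coeff * sum((p - q.detach()).pow(2).sum()
                       for p, q in zip(model.parameters(), source_params))


if __name__ == "__main__":
    adapter = ToyMoASEAdapter(dim=64)
    frozen = [p.clone() for p in adapter.parameters()]  # snapshot of "source" weights
    tokens = torch.randn(4, 16, 64)  # (batch, tokens, channels)
    out = adapter(tokens)
    print(out.shape, proximal_penalty(adapter, frozen).item())
```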
Related papers
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- MedMAP: Promoting Incomplete Multi-modal Brain Tumor Segmentation with Alignment [20.358300924109162]
In clinical practice, certain modalities of MRI may be missing, which presents a more difficult scenario.
Knowledge Distillation, Domain Adaptation, and Shared Latent Space have emerged as promising strategies.
We propose a novel paradigm that aligns latent features of the involved modalities to a well-defined distribution anchor as a substitute for the pre-trained model.
arXiv Detail & Related papers (2024-08-18T13:16:30Z)
- Diffusion Models with Ensembled Structure-Based Anomaly Scoring for Unsupervised Anomaly Detection [35.46541584018842]
Unsupervised anomaly detection (UAD) emerges as a viable alternative for pathology segmentation.
Recent UAD anomaly scoring functions often focus on intensity only and neglect structural differences, which impedes the segmentation performance.
Structural Similarity (SSIM) captures both intensity and structural disparities and can be advantageous over the classical $l1$ error (a minimal SSIM-scoring sketch appears after this list).
arXiv Detail & Related papers (2024-03-21T09:50:39Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network connected to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Domain-Adaptive Learning: Unsupervised Adaptation for Histology Images with Improved Loss Function Combination [3.004632712148892]
This paper presents a novel approach for unsupervised domain adaptation (UDA) targeting H&E stained histology images.
We propose a novel loss function, combined with carefully selected existing loss functions, tailored to address the challenges specific to histology images.
The proposed method is extensively evaluated in terms of accuracy, robustness, and generalization, surpassing state-of-the-art techniques for histology images.
arXiv Detail & Related papers (2023-09-29T12:11:16Z)
- ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation [48.039156140237615]
A Continual Test-Time Adaptation task is proposed to adapt the pre-trained model to continually changing target domains.
We design a Visual Domain Adapter (ViDA) for CTTA, explicitly handling both domain-specific and domain-shared knowledge.
Our proposed method achieves state-of-the-art performance in both classification and segmentation CTTA tasks.
arXiv Detail & Related papers (2023-06-07T11:18:53Z)
- Cross-Site Severity Assessment of COVID-19 from CT Images via Domain Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images offers great help in estimating intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z)
- Consistent Posterior Distributions under Vessel-Mixing: A Regularization for Cross-Domain Retinal Artery/Vein Classification [30.30848090813239]
We propose a vessel-mixing-based consistency regularization framework for cross-domain learning in retinal A/V classification.
Our method achieves state-of-the-art cross-domain performance, which is also close to the upper bound obtained by fully supervised learning on the target domain.
arXiv Detail & Related papers (2021-03-16T14:18:35Z)
- Simple and Effective Prevention of Mode Collapse in Deep One-Class Classification [93.2334223970488]
We propose two regularizers to prevent hypersphere collapse in deep SVDD.
The first regularizer is based on injecting random noise via the standard cross-entropy loss.
The second regularizer penalizes the minibatch variance when it becomes too small (a minimal variance-penalty sketch appears after this list).
arXiv Detail & Related papers (2020-01-24T03:44:47Z)
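For the SSIM-based scoring mentioned in the "Diffusion Models with Ensembled Structure-Based Anomaly Scoring" entry above, the sketch below contrasts a plain $l1$ residual map with a simple 1 - SSIM map between an image and its reconstruction. It is a generic single-scale SSIM with a uniform window, written as an assumption for illustration, not that paper's ensembled scoring function.

```python
import torch
import torch.nn.functional as F


def l1_anomaly_map(x: torch.Tensor, recon: torch.Tensor) -> torch.Tensor:
    """Classical intensity-only residual: per-pixel absolute error."""
    return (x - recon).abs().mean(dim=1, keepdim=True)


def ssim_anomaly_map(x, recon, window: int = 11, c1: float = 0.01 ** 2, c2: float = 0.03 ** 2):
    """Structure-aware residual: 1 - local SSIM, computed with a uniform window.
    Inputs are expected in [0, 1]; this is a simplified single-scale SSIM."""
    pad = window // 2
    c = x.shape[1]
    kernel = torch.ones(c, 1, window, window, device=x.device) / window ** 2
    mu_x = F.conv2d(x, kernel, padding=pad, groups=c)
    mu_y = F.conv2d(recon, kernel, padding=pad, groups=c)
    var_x = F.conv2d(x * x, kernel, padding=pad, groups=c) - mu_x ** 2
    var_y = F.conv2d(recon * recon, kernel, padding=pad, groups=c) - mu_y ** 2
    cov = F.conv2d(x * recon, kernel, padding=pad, groups=c) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return (1.0 - ssim).mean(dim=1, keepdim=True)  # high where structure differs


if __name__ == "__main__":
    img = torch.rand(1, 1, 64, 64)
    recon = img.clone()
    recon[..., 20:30, 20:30] = 0.0  # simulate a structural defect in the reconstruction
    print(l1_anomaly_map(img, recon).max().item(), ssim_anomaly_map(img, recon).max().item())
```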
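Likewise, for the two mode-collapse regularizers in the "Simple and Effective Prevention of Mode Collapse in Deep One-Class Classification" entry, the snippet below sketches only the second idea: a hinge penalty that activates when the minibatch variance of the embeddings drops below a floor. The floor value and hinge form are illustrative assumptions rather than that paper's exact formulation.

```python
import torch


def variance_floor_penalty(embeddings: torch.Tensor, floor: float = 1.0) -> torch.Tensor:
    """Penalize the minibatch only when its average per-dimension embedding variance
    falls below `floor`, discouraging the collapse of all points onto the center."""
    var = embeddings.var(dim=0).mean()
    return torch.relu(floor - var)


# Usage sketch inside a deep SVDD-style step (encoder and hypersphere center assumed given):
#   z = encoder(batch)
#   loss = ((z - center) ** 2).sum(dim=1).mean() + lam * variance_floor_penalty(z)
```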