EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
- URL: http://arxiv.org/abs/2303.01904v4
- Date: Tue, 23 May 2023 05:33:02 GMT
- Authors: Junha Song, Jungsoo Lee, In So Kweon, Sungha Choi
- Abstract summary: TTA may primarily be conducted on edge devices with limited memory.
Long-term adaptation often leads to catastrophic forgetting and error accumulation.
We present lightweight meta networks that can adapt the frozen original networks to the target domain.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a simple yet effective approach that improves continual
test-time adaptation (TTA) in a memory-efficient manner. TTA may primarily be
conducted on edge devices with limited memory, so reducing memory is crucial
but has been overlooked in previous TTA studies. In addition, long-term
adaptation often leads to catastrophic forgetting and error accumulation, which
hinders applying TTA in real-world deployments. Our approach consists of two
components to address these issues. First, we present lightweight meta networks
that can adapt the frozen original networks to the target domain. This novel
architecture minimizes memory consumption by decreasing the size of
intermediate activations required for backpropagation. Second, our novel
self-distilled regularization controls the output of the meta networks not to
deviate significantly from the output of the frozen original networks, thereby
preserving well-trained knowledge from the source domain. Without additional
memory, this regularization prevents error accumulation and catastrophic
forgetting, resulting in stable performance even in long-term test-time
adaptation. We demonstrate that our simple yet effective strategy outperforms
other state-of-the-art methods on various benchmarks for image classification
and semantic segmentation tasks. Notably, our proposed method with ResNet-50
and WideResNet-40 uses 86% and 80% less memory than the recent
state-of-the-art method, CoTTA.
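To make the two components concrete, here is a minimal PyTorch sketch for a single backbone stage. The residual conv+BN structure of the meta block, the mean-squared-error form of the regularizer, and all names are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaBlock(nn.Module):
    """Lightweight meta network attached to one frozen backbone stage.
    The residual conv+BN structure here is an illustrative assumption."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return x + self.bn(self.conv(x))

def adapt_stage(frozen_stage, meta_block, x, lam=0.5):
    """One stage of test-time adaptation with self-distilled regularization.
    Only the meta block's (small) activations are kept for backprop; the
    regularizer pulls the adapted features toward the frozen ones to
    preserve source knowledge."""
    with torch.no_grad():               # frozen original network: no grads
        feat = frozen_stage(x)
    adapted = meta_block(feat)
    reg_loss = lam * F.mse_loss(adapted, feat)
    return adapted, reg_loss
```

In continual TTA, an unsupervised objective such as prediction entropy would be added to this regularization term, with only the meta blocks receiving gradient updates.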
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative memory-efficient transfer learning (METL) strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, using a minimal number of late pre-trained layers reduces the peak memory overhead; a heavily simplified sketch follows this entry.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
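Taken at face value, the two-route description might look like the following PyTorch sketch. The learned weighted sum standing in for the anti-redundancy operation, and the single late block, are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class TwoRouteHead(nn.Module):
    """Illustrative two-route head in the spirit of SHERL (assumed design).
    Early route: fuse detached intermediate features with a learned
    weighted sum (a stand-in for the paper's anti-redundancy operation).
    Late route: run the fused feature through a few late pre-trained
    layers, so only this head and those layers store activations."""
    def __init__(self, num_feats, late_layers, dim, num_classes):
        super().__init__()
        self.mix = nn.Parameter(torch.zeros(num_feats))
        self.late = late_layers                  # e.g. last pre-trained block
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, feats):                    # list of same-shaped tensors
        w = torch.softmax(self.mix, dim=0)
        fused = sum(wi * f.detach() for wi, f in zip(w, feats))  # early route
        return self.classifier(self.late(fused))                 # late route
```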
- UniPTS: A Unified Framework for Proficient Post-Training Sparsity [67.16547529992928]
Post-training Sparsity (PTS) is a recently emerged direction that pursues efficient network sparsity with only limited data.
In this paper, we attempt to close this gap by carrying over three key factors that strongly influence the performance of conventional sparsity methods into the PTS setting.
Our framework, termed UniPTS, is shown to be substantially superior to existing PTS methods across extensive benchmarks.
arXiv Detail & Related papers (2024-05-29T06:53:18Z)
- Layer-wise Auto-Weighting for Non-Stationary Test-Time Adaptation [40.03897994619606]
We introduce a layer-wise auto-weighting algorithm for continual and gradual TTA.
We propose an exponential min-max scaler that keeps less important layers nearly frozen while mitigating outliers (see the sketch after this entry).
Experiments on CIFAR-10C, CIFAR-100C, and ImageNet-C show our method outperforms conventional continual and gradual TTA approaches.
arXiv Detail & Related papers (2023-11-10T03:54:40Z)
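The blurb gives enough to sketch one plausible form of the scaler: min-max normalize per-layer importance scores, then warp them exponentially. The exact formula and the name layer_importance are assumptions.

```python
import torch

def exp_min_max_scale(layer_importance, tau=4.0):
    """Assumed form of the exponential min-max scaler: min-max
    normalization bounds outlier scores to [0, 1]; the exponential warp
    then pushes low and mid scores toward zero, leaving those layers
    nearly frozen."""
    lo, hi = layer_importance.min(), layer_importance.max()
    x = (layer_importance - lo) / (hi - lo + 1e-12)
    return torch.expm1(tau * x) / torch.expm1(torch.tensor(tau))

# Usage sketch: scale each layer's learning rate by its weight.
# for w, group in zip(exp_min_max_scale(scores), optimizer.param_groups):
#     group["lr"] = base_lr * w.item()
```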
- PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
- UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory [69.33445217944029]
Parameter-efficient transfer learning (PETL) is an effective strategy for adapting pre-trained models to downstream domains.
Recent PETL works focus on the more valuable memory-efficient characteristic.
We propose a new memory-efficient PETL strategy, Universal Parallel Tuning (UniPT); a generic sketch of the parallel-tuning idea follows this entry.
arXiv Detail & Related papers (2023-08-28T05:38:43Z)
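UniPT's exact architecture is not described here; the sketch below shows only the generic memory-saving pattern such parallel-tuning methods share: a small trainable module consuming detached backbone features, so no backbone activations are stored for backprop.

```python
import torch
import torch.nn as nn

class ParallelTuner(nn.Module):
    """Generic parallel-tuning module (not UniPT's actual architecture).
    A small trainable network runs alongside the frozen backbone and
    consumes detached intermediate features, so backpropagation never
    touches the backbone."""
    def __init__(self, feat_dims, out_dim):
        super().__init__()
        self.proj = nn.ModuleList([nn.Linear(d, out_dim) for d in feat_dims])
        self.mix = nn.Parameter(torch.zeros(len(feat_dims)))

    def forward(self, feats):                    # feats: list of (B, d_i)
        w = torch.softmax(self.mix, dim=0)
        return sum(wi * p(f.detach())
                   for wi, p, f in zip(w, self.proj, feats))
```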
- Fused Depthwise Tiling for Memory Optimization in TinyML Deep Neural Network Inference [1.6094180182513644]
Memory optimization for deep neural network (DNN) inference gains high relevance with the emergence of TinyML.
DNN inference requires large intermediate run-time buffers to store activations and other intermediate data, which leads to high memory usage.
We propose a new Fused Depthwise Tiling (FDT) method for the memory optimization of DNNs; a toy illustration of the underlying tiling idea follows this entry.
arXiv Detail & Related papers (2023-03-31T08:26:17Z)
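The paper's fused tiling scheme is not spelled out in the blurb; the toy example below shows only the underlying idea of spatial tiling, where an operator runs strip by strip so live activation buffers stay small.

```python
import torch
import torch.nn.functional as F

def depthwise_conv_tiled(x, weight, tile_h=16):
    """Toy spatially tiled 3x3 depthwise conv (stride 1, 'same' padding).
    Running the conv strip by strip bounds the live intermediate buffer
    to one strip instead of the whole feature map. For clarity the input
    is zero-padded once up front; a real tiled implementation would pad
    per tile. x: (B, C, H, W), weight: (C, 1, 3, 3)."""
    B, C, H, W = x.shape
    xp = F.pad(x, (1, 1, 1, 1))                 # halo source
    strips = []
    for top in range(0, H, tile_h):
        h = min(tile_h, H - top)
        strip = xp[:, :, top:top + h + 2]       # tile rows + 1-row halo
        strips.append(F.conv2d(strip, weight, groups=C))
    return torch.cat(strips, dim=2)             # equals the untiled conv
```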
- Self-Attentive Pooling for Efficient Deep Learning [6.822466048176652]
We propose a novel non-local self-attentive pooling method that can be used as a drop-in replacement for standard pooling layers (a generic attention-pooling sketch follows this entry).
We surpass the test accuracy of existing pooling techniques on different variants of MobileNet-V2 on ImageNet by an average of 1.2%.
Our approach achieves 1.43% higher test accuracy compared to SOTA techniques with iso-memory footprints.
arXiv Detail & Related papers (2022-09-16T00:35:14Z)
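The paper's non-local formulation is not given here; the sketch below shows a generic attention-weighted pooling layer, where a 1x1-conv score map replaces uniform averaging with a softmax-weighted sum per window. The scoring function and window handling are assumptions.

```python
import torch
import torch.nn as nn

class AttentivePool2d(nn.Module):
    """Generic attention-weighted pooling (an assumed simplification of
    the paper's non-local method). A 1x1 conv scores every location;
    each k x k window is reduced by a softmax-weighted sum instead of a
    uniform average, so salient activations dominate. Assumes H and W
    are divisible by k."""
    def __init__(self, channels, k=2):
        super().__init__()
        self.k = k
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                        # x: (B, C, H, W)
        k = self.k
        s = self.score(x)                        # (B, 1, H, W)
        xw = x.unfold(2, k, k).unfold(3, k, k)   # (B, C, H/k, W/k, k, k)
        sw = s.unfold(2, k, k).unfold(3, k, k)   # (B, 1, H/k, W/k, k, k)
        a = torch.softmax(sw.flatten(-2), -1).view_as(sw)
        return (xw * a).sum(dim=(-1, -2))        # (B, C, H/k, W/k)
```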
- LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning [82.93130407930762]
It is costly to update the entire parameter set of large pre-trained models.
PETL techniques allow updating a small subset of parameters inside a pre-trained backbone network for a new task.
We propose Ladder Side-Tuning (LST), a new PETL technique that reduces training memory requirements substantially more than prior PETL techniques (a simplified sketch follows this entry).
arXiv Detail & Related papers (2022-06-13T23:51:56Z)
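A simplified sketch of the ladder-side idea: a slim side network fuses detached intermediate backbone features with its own state via learned gates, so gradients never traverse the frozen backbone. The linear blocks and scalar gates are simplifications of the paper's design.

```python
import torch
import torch.nn as nn

class LadderSideNet(nn.Module):
    """Simplified ladder-side-tuning sketch. The slim side network fuses
    detached backbone features with its own state through learned gates;
    since only this module requires gradients, the large backbone
    activations never need to be stored for backprop."""
    def __init__(self, backbone_dims, side_dim, num_classes):
        super().__init__()
        self.down = nn.ModuleList([nn.Linear(d, side_dim)
                                   for d in backbone_dims])
        self.blocks = nn.ModuleList([nn.Linear(side_dim, side_dim)
                                     for _ in backbone_dims])
        self.gates = nn.Parameter(torch.zeros(len(backbone_dims)))
        self.head = nn.Linear(side_dim, num_classes)

    def forward(self, backbone_feats):           # list of (B, d_i) tensors
        h = 0.0
        for f, down, blk, g in zip(backbone_feats, self.down, self.blocks,
                                   torch.sigmoid(self.gates)):
            h = blk(g * h + (1 - g) * down(f.detach()))  # gated ladder fusion
        return self.head(h)
```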