SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity
- URL: http://arxiv.org/abs/2503.20354v1
- Date: Wed, 26 Mar 2025 09:27:09 GMT
- Title: SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity
- Authors: Ke Ma, Jiaqi Tang, Bin Guo, Fan Dang, Sicong Liu, Zhui Zhu, Lei Wu, Cheng Fang, Ying-Cong Chen, Zhiwen Yu, Yunhao Liu
- Abstract summary: Test-time adaptation (TTA) has emerged to improve the performance of deep models by adapting them to unlabeled target data online. Yet, the significant memory cost, particularly in resource-constrained terminals, impedes the effective deployment of most backward-propagation-based TTA methods. To tackle memory constraints, we introduce SURGEON, a method that substantially reduces memory cost while preserving comparable accuracy improvements.
- Score: 30.260783715373382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the growing integration of deep models into mobile terminals, the accuracy of these models declines significantly due to various deployment interferences. Test-time adaptation (TTA) has emerged to improve the performance of deep models by adapting them to unlabeled target data online. Yet, the significant memory cost, particularly in resource-constrained terminals, impedes the effective deployment of most backward-propagation-based TTA methods. To tackle memory constraints, we introduce SURGEON, a method that substantially reduces memory cost while preserving comparable accuracy improvements during fully test-time adaptation (FTTA) without relying on specific network architectures or modifications to the original training procedure. Specifically, we propose a novel dynamic activation sparsity strategy that directly prunes activations at layer-specific dynamic ratios during adaptation, allowing for flexible control of learning ability and memory cost in a data-sensitive manner. Within this strategy, two metrics, Gradient Importance and Layer Activation Memory, are considered to determine the layer-wise pruning ratios, reflecting accuracy contribution and memory efficiency, respectively. Experimentally, our method surpasses the baselines by not only reducing memory usage but also achieving superior accuracy, delivering SOTA performance across diverse datasets, architectures, and tasks.
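To make the mechanism concrete, below is a minimal PyTorch sketch under stated assumptions: the ratio rule (normalizing the two metrics and pruning more where gradient importance is low and activation memory is high), the magnitude-based pruning, and the linear-layer example are illustrative stand-ins, not the paper's implementation. The point it shows is that the activation cached for the backward pass is pruned at a layer-specific ratio, trading exact weight gradients for memory.

```python
import torch

def layer_pruning_ratios(grad_importance, act_memory, base_ratio=0.5):
    """Illustrative layer-wise ratio rule (an assumption, not the paper's formula):
    layers that contribute little gradient signal but cache large activations
    receive higher pruning ratios."""
    names = list(grad_importance)
    g = torch.tensor([grad_importance[n] for n in names], dtype=torch.float32)
    m = torch.tensor([act_memory[n] for n in names], dtype=torch.float32)
    g = (g - g.min()) / (g.max() - g.min() + 1e-8)   # normalize to [0, 1]
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)
    score = (1.0 - g) * m                            # high score = cheap to prune
    ratios = (base_ratio * score / (score.mean() + 1e-8)).clamp(0.0, 0.95)
    return dict(zip(names, ratios.tolist()))

class SparseCacheLinear(torch.autograd.Function):
    """Hypothetical example layer: the input activation is pruned by magnitude
    before being stored for the backward pass."""

    @staticmethod
    def forward(ctx, x, weight, ratio):
        # x: (batch, in_features), weight: (out_features, in_features)
        keep = max(1, int(x.numel() * (1.0 - ratio)))
        thresh = x.detach().abs().flatten().kthvalue(x.numel() - keep + 1).values
        x_cached = torch.where(x.detach().abs() >= thresh, x.detach(),
                               torch.zeros_like(x))
        ctx.save_for_backward(x_cached, weight)      # pruned copy is what gets cached
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x_cached, weight = ctx.saved_tensors
        grad_x = grad_out @ weight                   # exact gradient w.r.t. the input
        grad_w = grad_out.t() @ x_cached             # approximate: uses pruned activation
        return grad_x, grad_w, None
```

In practice the pruned activation would be stored in a compressed or sparse layout so the memory saving is actually realized; the dense `torch.where` copy above only marks which entries a real implementation would keep.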
Related papers
- LeanTTA: A Backpropagation-Free and Stateless Approach to Quantized Test-Time Adaptation on Edge Devices [13.355021314836852]
We present LeanTTA, a novel backpropagation-free and stateless framework for quantized test-time adaptation tailored to edge devices.
Our approach minimizes computational costs by dynamically updating normalization statistics without backpropagation.
We validate our framework across sensor modalities, demonstrating significant improvements over state-of-the-art TTA methods.
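As a rough illustration of the backpropagation-free idea, the sketch below re-estimates BatchNorm running statistics from the incoming test batch; this is a common statistic-adaptation baseline and an assumption here, not LeanTTA's actual quantized, stateless update rule.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_norm_stats(model: nn.Module, x: torch.Tensor, momentum: float = 0.1):
    """Blend stored source statistics with the current test batch, no backprop.
    (Baseline sketch; LeanTTA's exact, quantization-aware rule is not shown.)"""
    norm_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    for m in model.modules():
        if isinstance(m, norm_types):
            m.train()              # use batch statistics and update running buffers
            m.momentum = momentum  # how strongly the test batch shifts the stats
    _ = model(x)                   # one forward pass updates the buffers in place
    model.eval()                   # back to inference mode for prediction

# Usage sketch: adapt_norm_stats(model, test_batch); preds = model(test_batch).argmax(1)
```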
arXiv Detail & Related papers (2025-03-20T06:27:09Z)
- Structure-Activation Synergy: A Dual Efficiency Framework for Parameter-Memory Optimized Transfer Learning [8.602744958104969]
We present Structure-Activation Synergy (S2A), an innovative framework achieving dual optimization of parameters and memory.
We show S2A's superior efficiency, reducing GPU memory consumption by 75% (an average reduction of 4.2x) while maintaining 98.7% of full fine-tuning accuracy with only 0.9% tunable parameters.
arXiv Detail & Related papers (2025-03-11T08:10:03Z)
- Memory Efficient Transformer Adapter for Dense Predictions [42.413108132475855]
We propose META, a memory-efficient ViT adapter that improves the model's memory efficiency and reduces the time spent on memory access.
Within the proposed block, cross-shaped self-attention is employed to reduce frequent reshaping operations.
META substantially enhances the predicted quality, while achieving a new state-of-the-art accuracy-efficiency trade-off.
arXiv Detail & Related papers (2025-02-04T03:19:33Z)
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and require long-term memory.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
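The summary does not spell out the update rule, so the following is only a guess at the general form suggested by the name: a recurrent memory calibrated and refreshed through element-wise (Hadamard) operations. Module and parameter names are hypothetical.

```python
import torch
import torch.nn as nn

class HadamardStyleMemory(nn.Module):
    """Illustrative element-wise memory update, not the paper's exact equations:
    the memory is calibrated by an element-wise gate and refreshed by an
    element-wise write term at every step."""

    def __init__(self, input_dim: int, mem_dim: int):
        super().__init__()
        self.gate = nn.Linear(input_dim, mem_dim)    # calibration term
        self.write = nn.Linear(input_dim, mem_dim)   # update term

    def forward(self, x_seq: torch.Tensor) -> torch.Tensor:
        # x_seq: (batch, time, input_dim); returns the final memory state.
        batch, steps, _ = x_seq.shape
        memory = torch.zeros(batch, self.gate.out_features, device=x_seq.device)
        for t in range(steps):
            x = x_seq[:, t]
            memory = torch.sigmoid(self.gate(x)) * memory + torch.tanh(self.write(x))
        return memory
```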
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, a minimal set of late pre-trained layers is adapted, alleviating the peak memory overhead.
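A minimal sketch of such an early/late split, assuming the "anti-redundancy operation" can be stood in for by a learned weighted combination of intermediate features (an assumption; class and attribute names are illustrative): the frozen early route runs without gradient tracking, so only the small late block ever caches activations.

```python
import torch
import torch.nn as nn

class TwoRouteTransfer(nn.Module):
    """Early route: frozen blocks run under no_grad and their intermediate outputs
    are consolidated. Late route: a single trainable block, so peak activation
    memory stays low. (Sketch under assumptions, not SHERL's exact design.)"""

    def __init__(self, early_blocks: nn.ModuleList, late_block: nn.Module):
        super().__init__()
        self.early_blocks = early_blocks
        self.late_block = late_block
        self.mix = nn.Parameter(torch.zeros(len(early_blocks)))  # consolidation weights
        for p in self.early_blocks.parameters():
            p.requires_grad_(False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = []
        with torch.no_grad():                     # early route caches nothing for backward
            h = x
            for blk in self.early_blocks:
                h = blk(h)                        # assumes blocks share an output shape
                feats.append(h)
        weights = torch.softmax(self.mix, dim=0)
        consolidated = sum(w * f for w, f in zip(weights, feats))
        return self.late_block(consolidated)      # only these activations are cached
```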
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
- CAME: Confidence-guided Adaptive Memory Efficient Optimization [20.009302737137787]
Adaptive gradient methods have demonstrated excellent performance in the training of large language models.
Maintaining second-moment estimates of the gradients, however, incurs a high cost in extra memory.
Several memory-efficient optimizers have been proposed to drastically reduce this auxiliary memory usage, but at a performance penalty.
We propose CAME to simultaneously achieve two goals: fast convergence as in traditional adaptive methods, and low memory usage as in memory-efficient methods.
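For context on where the auxiliary memory goes, the snippet below sketches the factored second-moment idea used by memory-efficient optimizers such as Adafactor: per-row and per-column statistics replace the full per-parameter second moment. It is offered only as background and is not CAME's confidence-guided update.

```python
import torch

def factored_second_moment_step(grad, row_acc, col_acc, beta2=0.999, eps=1e-8):
    """For an (n, m) weight matrix, store only an n-vector and an m-vector of
    statistics instead of a full (n, m) second moment (Adafactor-style sketch;
    not CAME's confidence-guided rule)."""
    sq = grad.pow(2) + eps
    row_acc.mul_(beta2).add_(sq.mean(dim=1), alpha=1 - beta2)   # shape (n,)
    col_acc.mul_(beta2).add_(sq.mean(dim=0), alpha=1 - beta2)   # shape (m,)
    # Rank-1 reconstruction of the per-parameter second moment, built on the fly.
    v_hat = torch.outer(row_acc, col_acc) / row_acc.mean().clamp_min(eps)
    return grad / v_hat.sqrt().clamp_min(eps)                   # preconditioned step

# State init sketch: row_acc = torch.zeros(W.shape[0]); col_acc = torch.zeros(W.shape[1])
```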
arXiv Detail & Related papers (2023-07-05T06:05:36Z)
- EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization [71.70414291057332]
In practice, TTA is primarily conducted on edge devices with limited memory.
Long-term adaptation often leads to catastrophic forgetting and error accumulation.
We present lightweight meta networks that can adapt the frozen original networks to the target domain.
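A rough sketch of the frozen-backbone-plus-lightweight-adapter pattern described above; the residual 1x1-conv design, the module names, and the MSE form of the self-distilled regularizer are assumptions for illustration, not EcoTTA's exact architecture or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaAdapter(nn.Module):
    """Small trainable module attached to a frozen block. The regularizer keeps the
    adapted features close to the frozen ones, limiting forgetting and error
    accumulation during long-term adaptation. (Illustrative sketch.)"""

    def __init__(self, frozen_block: nn.Module, channels: int):
        super().__init__()
        self.frozen_block = frozen_block.eval()
        for p in self.frozen_block.parameters():
            p.requires_grad_(False)
        self.meta = nn.Sequential(nn.Conv2d(channels, channels, kernel_size=1),
                                  nn.BatchNorm2d(channels))

    def forward(self, x: torch.Tensor):
        with torch.no_grad():
            frozen_out = self.frozen_block(x)
        adapted = frozen_out + self.meta(frozen_out)          # residual adaptation
        reg = F.mse_loss(adapted, frozen_out.detach())        # self-distillation term
        return adapted, reg

# Test-time objective sketch: prediction entropy + a weighted sum of the per-block regs.
```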
arXiv Detail & Related papers (2023-03-03T13:05:30Z)
- Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification [57.36281142038042]
We present a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance.
We also present a new training protocol based on Coordinate-Descent called UpperCaSE that exploits meta-trained CaSE blocks and fine-tuning routines for efficient adaptation.
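The squeeze-and-excitation mechanism itself is standard; below is a sketch of an adaptive SE-style block where the channel gates are computed from the whole task batch so they carry task context. Pooling over the batch dimension is an assumption about the "contextual" part, not the exact CaSE definition.

```python
import torch
import torch.nn as nn

class ContextualSEBlock(nn.Module):
    """SE-style gating conditioned on the task batch (illustrative sketch)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W). Squeeze over space and the task batch so the
        # channel gates reflect the task as a whole rather than a single image.
        context = x.mean(dim=(0, 2, 3))            # (channels,)
        gates = self.mlp(context)                  # (channels,)
        return x * gates.view(1, -1, 1, 1)         # excite: channel-wise rescaling
```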
arXiv Detail & Related papers (2022-06-20T15:25:08Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
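The "squeeze into a single convolution" step relies on the linearity of convolution; the sketch below shows the generic merge of two parallel branches into one kernel, assuming matching shapes, strides, padding, and default groups/dilation. OREPA's contribution is doing this squeezing online during training, which the sketch does not cover.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def merge_parallel_convs(conv_a: nn.Conv2d, conv_b: nn.Conv2d) -> nn.Conv2d:
    """Two parallel convolutions whose outputs are summed are equivalent to a
    single convolution whose kernel and bias are the element-wise sums of the
    branches' (generic re-parameterization step, not OREPA's full pipeline)."""
    merged = nn.Conv2d(conv_a.in_channels, conv_a.out_channels, conv_a.kernel_size,
                       stride=conv_a.stride, padding=conv_a.padding, bias=True)
    merged.weight.copy_(conv_a.weight + conv_b.weight)   # assumes identical kernel shapes
    bias_a = conv_a.bias if conv_a.bias is not None else torch.zeros(conv_a.out_channels)
    bias_b = conv_b.bias if conv_b.bias is not None else torch.zeros(conv_b.out_channels)
    merged.bias.copy_(bias_a + bias_b)
    return merged

# Sanity check: conv_a(x) + conv_b(x) matches merge_parallel_convs(conv_a, conv_b)(x).
```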
arXiv Detail & Related papers (2022-04-02T09:50:19Z)