Adaptive Test-Time Augmentation for Low-Power CPU
- URL: http://arxiv.org/abs/2105.06183v1
- Date: Thu, 13 May 2021 10:50:13 GMT
- Title: Adaptive Test-Time Augmentation for Low-Power CPU
- Authors: Luca Mocerino, Roberto G. Rizzo, Valentino Peluso, Andrea Calimera,
Enrico Macii
- Abstract summary: Test-Time Augmentation (TTA) techniques aim to alleviate such accuracy loss at inference time.
We propose AdapTTA, an adaptive implementation of TTA that controls the number of feed-forward passes dynamically.
Experimental results on state-of-the-art ConvNets for image classification deployed on a commercial ARM Cortex-A CPU demonstrate that AdapTTA achieves substantial latency savings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (ConvNets) are trained offline on the
limited data available and may therefore suffer substantial accuracy loss when
deployed in the field, where unseen input patterns received under unpredictable
external conditions can mislead the model. Test-Time Augmentation (TTA)
techniques aim to alleviate this common side effect at inference time: first,
multiple feed-forward passes are run on a set of altered versions of the same
input sample; then, the final outcome is computed as a consensus of the
aggregated predictions. Unfortunately, implementing TTA on embedded CPUs
introduces latency penalties that limit its adoption in edge applications. To
tackle this issue, we propose AdapTTA, an adaptive implementation of TTA that
controls the number of feed-forward passes dynamically, depending on the
complexity of the input. Experimental results on state-of-the-art ConvNets for
image classification deployed on a commercial ARM Cortex-A CPU demonstrate
that AdapTTA achieves substantial latency savings, from 1.49x to 2.21x, and
hence a higher frame rate than static TTA, while preserving the same accuracy
gain.
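To make the mechanism concrete, here is a minimal Python sketch of the adaptive early-exit idea: augmented forward passes are aggregated one at a time, and inference stops as soon as the running consensus is confident enough. The max-softmax stopping rule, the threshold value, and every name in the snippet (`adaptive_tta`, `conf_threshold`, `toy_model`) are illustrative assumptions, not the paper's exact policy.

```python
import numpy as np

def adaptive_tta(model, augmentations, x, conf_threshold=0.9):
    """Adaptive TTA sketch: run augmented forward passes one at a time and
    stop early once the averaged prediction is confident enough."""
    aggregated = None
    for n, augment in enumerate(augmentations, start=1):
        probs = model(augment(x))              # one feed-forward pass (softmax output)
        aggregated = probs if aggregated is None else aggregated + probs
        consensus = aggregated / n             # consensus of the predictions so far
        if consensus.max() >= conf_threshold:  # "easy" input: exit early, save latency
            break
    return int(np.argmax(consensus)), n        # predicted class and passes actually used

# Toy usage with a fake 3-class classifier and identity "augmentations";
# in a real deployment the augmentations would be flips, crops, etc.
rng = np.random.default_rng(0)

def toy_model(x):
    z = rng.normal(size=3) + np.array([3.0, 0.0, 0.0])  # logits biased toward class 0
    return np.exp(z) / np.exp(z).sum()                   # softmax

label, passes = adaptive_tta(toy_model, [lambda v: v] * 5, x=np.zeros(4))
print(f"class {label} after {passes} pass(es)")
```

Easy inputs exit after a single pass while ambiguous ones consume the full augmentation budget, which is exactly where the frame-rate advantage over static (fixed-pass) TTA comes from.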
Related papers
- Test-Time Model Adaptation with Only Forward Passes [68.11784295706995]
Test-time adaptation has proven effective in adapting a given trained model to unseen test samples with potential distribution shifts.
We propose a test-time Forward-Optimization Adaptation (FOA) method.
FOA runs on a quantized 8-bit ViT, outperforms gradient-based TENT on a full-precision 32-bit ViT, and achieves up to a 24-fold memory reduction on ImageNet-C.
arXiv Detail & Related papers (2024-04-02T05:34:33Z)
- E2USD: Efficient-yet-effective Unsupervised State Detection for Multivariate Time Series [18.02694168117277]
We propose E2USD, which enables efficient-yet-accurate unsupervised state detection.
E2USD exploits a Fast Fourier Transform-based Time Series Compressor and a Decomposed Dual-view Embedding Module.
We also propose a False Negative Cancellation Contrastive Learning method to counteract the effects of false negatives.
arXiv Detail & Related papers (2024-02-21T10:16:57Z)
- Optimization-Free Test-Time Adaptation for Cross-Person Activity Recognition [30.350005654271868]
Test-Time Adaptation aims to utilize the test stream to adjust predictions during real-time inference.
High computational cost makes it intractable to run on resource-constrained edge devices.
We propose an Optimization-Free Test-Time Adaptation framework for sensor-based human activity recognition (HAR).
arXiv Detail & Related papers (2023-10-28T02:20:33Z)
- Decoder Tuning: Efficient Language Understanding as Decoding [84.68266271483022]
We present Decoder Tuning (DecT), which instead optimizes task-specific decoder networks on the output side.
With gradient-based optimization, DecT can be trained within several seconds and requires only one PLM query per sample.
We conduct extensive natural language understanding experiments and show that DecT significantly outperforms state-of-the-art algorithms with a $200\times$ speed-up.
arXiv Detail & Related papers (2022-12-16T11:15:39Z)
- Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing [85.74517957717363]
HALP accelerates inference by orchestrating seamless collaboration among edge devices (EDs) in edge computing.
Experiments show that distributed inference with HALP achieves a 1.7x inference acceleration for VGG-16.
Model selection combined with distributed inference under HALP is also shown to significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z)
- Robust Continual Test-time Adaptation: Instance-aware BN and Prediction-balanced Memory [58.72445309519892]
We present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams.
Our novelty is mainly twofold: (a) Instance-Aware Batch Normalization (IABN), which corrects normalization for out-of-distribution samples, and (b) Prediction-Balanced Reservoir Sampling (PBRS), which simulates an i.i.d. data stream from a non-i.i.d. stream in a class-balanced manner (see the first sketch after this list).
arXiv Detail & Related papers (2022-08-10T03:05:46Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models of increasing complexity and associate each with a corresponding HEC layer.
Then, we design an adaptive model selection scheme, formulated as a contextual-bandit problem and solved with a reinforcement learning policy network (see the second sketch after this list).
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to eliminate floating-point computation.
Our AQD achieves comparable or even better performance than its full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
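As referenced above, here is a simplified Python sketch of the prediction-balanced memory from the Robust Continual Test-time Adaptation entry: a per-class reservoir keyed on predicted labels. The paper's PBRS additionally balances eviction across classes, so the class layout and eviction details below are assumptions of this sketch, not the authors' exact algorithm.

```python
import random
from collections import defaultdict

class PredictionBalancedMemory:
    """Fixed-capacity buffer kept balanced across *predicted* labels, so a
    non-i.i.d. test stream can be replayed as a roughly i.i.d. batch."""

    def __init__(self, capacity: int, num_classes: int, seed: int = 0):
        self.per_class = capacity // num_classes  # equal slots per predicted class
        self.buckets = defaultdict(list)          # predicted label -> stored samples
        self.seen = defaultdict(int)              # stream count per predicted label
        self.rng = random.Random(seed)

    def add(self, sample, pred_label):
        self.seen[pred_label] += 1
        bucket = self.buckets[pred_label]
        if len(bucket) < self.per_class:
            bucket.append(sample)                 # free slot for this class
        else:
            # classic reservoir step: every sample of this class seen so far
            # keeps an equal probability of residing in the buffer
            j = self.rng.randrange(self.seen[pred_label])
            if j < self.per_class:
                bucket[j] = sample

    def replay_batch(self):
        return [s for bucket in self.buckets.values() for s in bucket]
```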
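Likewise, for the hierarchical edge computing entry, a minimal stand-in for the adaptive model selection: an epsilon-greedy contextual bandit choosing among detectors of increasing complexity, one per HEC layer. The paper solves the bandit with a reinforcement learning policy network, so this tabular version (and names like `EpsilonGreedySelector`) is only an assumption-laden sketch.

```python
import random
from collections import defaultdict

class EpsilonGreedySelector:
    """Contextual-bandit model selection: each arm is a detection model of
    increasing complexity hosted on one HEC layer."""

    def __init__(self, num_models: int, epsilon: float = 0.1, seed: int = 0):
        self.num_models = num_models
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context_bucket, arm) -> pulls
        self.values = defaultdict(float)  # (context_bucket, arm) -> mean reward

    def select(self, context_bucket):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_models)  # explore a random layer
        # exploit: the model with the best estimated reward for this context
        return max(range(self.num_models),
                   key=lambda arm: self.values[(context_bucket, arm)])

    def update(self, context_bucket, arm, reward):
        # the reward should trade off detection accuracy against the latency
        # of escalating to a more complex model on a higher HEC layer
        key = (context_bucket, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]
```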