Test-time Batch Normalization
- URL: http://arxiv.org/abs/2205.10210v1
- Date: Fri, 20 May 2022 14:33:39 GMT
- Title: Test-time Batch Normalization
- Authors: Tao Yang, Shenglong Zhou, Yuwang Wang, Yan Lu, Nanning Zheng
- Abstract summary: Deep neural networks often suffer from the data distribution shift between training and testing.
We revisit batch normalization (BN) in the training process and reveal two key insights that benefit test-time optimization.
We propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing the entropy loss.
- Score: 61.292862024903584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks often suffer from the data distribution shift between
training and testing, and the batch statistics are observed to reflect the
shift. In this paper, aiming to alleviate the distribution shift at test time,
we revisit batch normalization (BN) in the training process and reveal two
key insights benefiting test-time optimization: $(i)$ preserving the same
gradient backpropagation form as training, and $(ii)$ using dataset-level
statistics for robust optimization and inference. Based on the two insights, we
propose a novel test-time BN layer design, GpreBN, which is optimized during
testing by minimizing the entropy loss. We verify the effectiveness of our method
on two typical settings with distribution shift, i.e., domain generalization
and robustness tasks. Our GpreBN significantly improves the test-time
performance and achieves state-of-the-art results.
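The following is a minimal PyTorch sketch of how the two insights could be combined in one test-time BN layer. It is one plausible reading of the abstract, not the authors' reference implementation: the class name `GpreBNSketch`, the reuse of the source model's running estimates as the dataset-level statistics, and the stop-gradient re-scaling trick are all assumptions.

```python
import torch
import torch.nn as nn


class GpreBNSketch(nn.Module):
    """Illustrative test-time BN layer built from a frozen source BN layer."""

    def __init__(self, bn: nn.BatchNorm2d, eps: float = 1e-5):
        super().__init__()
        # Affine parameters: the only weights optimized at test time.
        self.gamma = nn.Parameter(bn.weight.detach().clone())
        self.beta = nn.Parameter(bn.bias.detach().clone())
        # Dataset-level statistics: here taken from the source model's
        # running estimates (an assumption made for this sketch).
        self.register_buffer("mu_d", bn.running_mean.detach().clone())
        self.register_buffer("var_d", bn.running_var.detach().clone())
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Insight (i): normalize with the current batch statistics so the
        # gradient keeps the same backpropagation form as in training ...
        mu_b = x.mean(dim=(0, 2, 3), keepdim=True)
        std_b = torch.sqrt(
            x.var(dim=(0, 2, 3), unbiased=False, keepdim=True) + self.eps)
        x_hat = (x - mu_b) / std_b
        # ... then undo that normalization with stop-gradient (detached)
        # factors: numerically x_rec == x, but gradients flow through x_hat.
        x_rec = x_hat * std_b.detach() + mu_b.detach()
        # Insight (ii): use dataset-level statistics for the actual output.
        mu_d = self.mu_d.view(1, -1, 1, 1)
        std_d = torch.sqrt(self.var_d.view(1, -1, 1, 1) + self.eps)
        y = (x_rec - mu_d) / std_d
        return self.gamma.view(1, -1, 1, 1) * y + self.beta.view(1, -1, 1, 1)


def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mean prediction entropy, the test-time objective named in the abstract."""
    log_p = logits.log_softmax(dim=1)
    return -(log_p.exp() * log_p).sum(dim=1).mean()
```

In use, one would swap each BN layer of the source model for such a module, freeze all other weights, and take a few gradient steps on `entropy_loss` per test batch; that usage pattern is borrowed from entropy-minimization TTA methods such as Tent and is again an assumption here.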
Related papers
- Discover Your Neighbors: Advanced Stable Test-Time Adaptation in Dynamic World [8.332531696256666]
Discover Your Neighbors (DYN) is the first backward-free approach specialized for dynamic test-time adaptation (TTA).
Our DYN consists of layer-wise instance statistics clustering (LISC) and cluster-aware batch normalization (CABN); a speculative sketch of this idea follows the entry.
Experimental results validate DYN's robustness and effectiveness, demonstrating that performance is maintained under dynamic data-stream patterns.
arXiv Detail & Related papers (2024-06-08T09:22:32Z)
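Since the summary only names the two components, the sketch below is a speculative, backward-free reading of them: LISC is approximated by nearest-centroid assignment of per-instance channel means, and CABN by per-cluster running statistics. Every name and design choice here is guessed from the component names alone.

```python
import torch


class ClusterAwareBNSketch:
    """Speculative sketch: cluster instance statistics (LISC) and normalize
    each sample with its cluster's running statistics (CABN)."""

    def __init__(self, centroids: torch.Tensor,
                 momentum: float = 0.1, eps: float = 1e-5):
        self.centroids = centroids          # (K, C) cluster centres, assumed given
        n_clusters, n_ch = centroids.shape
        self.mu = torch.zeros(n_clusters, n_ch)   # per-cluster running mean
        self.var = torch.ones(n_clusters, n_ch)   # per-cluster running variance
        self.momentum = momentum
        self.eps = eps

    @torch.no_grad()                        # backward-free: no gradients anywhere
    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        inst = x.mean(dim=(2, 3))           # (N, C) per-instance channel means
        assign = torch.cdist(inst, self.centroids).argmin(dim=1)
        y = torch.empty_like(x)
        for k in assign.unique():
            idx = (assign == k).nonzero(as_tuple=True)[0]
            xb = x[idx]
            # EMA update of this cluster's statistics from its members.
            m = xb.mean(dim=(0, 2, 3))
            v = xb.var(dim=(0, 2, 3), unbiased=False)
            self.mu[k] = (1 - self.momentum) * self.mu[k] + self.momentum * m
            self.var[k] = (1 - self.momentum) * self.var[k] + self.momentum * v
            y[idx] = (xb - self.mu[k].view(1, -1, 1, 1)) / torch.sqrt(
                self.var[k].view(1, -1, 1, 1) + self.eps)
        return y
```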
Optimization-Free Test-Time Adaptation for Cross-Person Activity Recognition [30.350005654271868]
Test-Time Adaptation aims to utilize the test stream to adjust predictions in real-time inference.
The high computational cost of optimization-based TTA makes it intractable to run on resource-constrained edge devices.
We propose an Optimization-Free Test-Time Adaptation framework for sensor-based human activity recognition (HAR).
arXiv Detail & Related papers (2023-10-28T02:20:33Z)
Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation [67.77048565738728]
Continual learning involves learning a sequence of tasks and balancing their knowledge appropriately.
We propose Adaptive Balance of BN (AdaB$^2$N), which appropriately incorporates a Bayesian-based strategy to adapt task-wise contributions.
Our approach achieves significant performance gains across a wide range of benchmarks.
arXiv Detail & Related papers (2023-10-13T04:50:40Z)
TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation [28.63285970880039]
Recent test-time adaptation methods heavily rely on transductive batch normalization (TBN).
Adopting TBN, which employs test batch statistics, mitigates the performance degradation caused by the domain shift.
We present a new test-time normalization (TTN) method that interpolates the statistics by adjusting the importance between conventional batch normalization (CBN, which uses source statistics) and TBN according to the domain-shift sensitivity of each BN layer (see the sketch after this entry).
arXiv Detail & Related papers (2023-02-10T10:25:29Z)
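A minimal sketch of the interpolation step alone, assuming per-layer weights `alpha` in [0, 1] are already available (how TTN derives them from domain-shift sensitivity is not reproduced here). The cross term in the variance is the standard rule for mixing the statistics of two distributions and may differ from the paper's exact formula.

```python
import torch


def interpolate_bn_stats(mu_src: torch.Tensor, var_src: torch.Tensor,
                         mu_test: torch.Tensor, var_test: torch.Tensor,
                         alpha: float):
    """Blend CBN (source) and TBN (test-batch) statistics with weight alpha.

    alpha = 0 recovers CBN, alpha = 1 recovers TBN. The cross term makes the
    result the exact variance of the alpha-weighted mixture of the two
    distributions; a plain linear blend of variances is another option.
    """
    mu = alpha * mu_test + (1.0 - alpha) * mu_src
    var = (alpha * var_test + (1.0 - alpha) * var_src
           + alpha * (1.0 - alpha) * (mu_test - mu_src) ** 2)
    return mu, var
```

In use, `alpha` would be one scalar per BN layer, larger for layers whose statistics are more sensitive to the domain shift.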
DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two unfavorable defects are concealed in prevalent adaptation methodologies such as test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are determined entirely by the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes.
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
Sample-Efficient Optimisation with Probabilistic Transformer Surrogates [66.98962321504085]
This paper investigates the feasibility of employing state-of-the-art probabilistic transformers in Bayesian optimisation.
We observe two drawbacks stemming from their training procedure and loss definition, hindering their direct deployment as proxies in black-box optimisation.
We introduce two components: 1) a BO-tailored training prior supporting non-uniformly distributed points, and 2) a novel approximate posterior regulariser that trades off accuracy and input sensitivity to filter favourable stationary points for improved predictive performance.
arXiv Detail & Related papers (2022-05-27T11:13:17Z)
Test-time Batch Statistics Calibration for Covariate Shift [66.7044675981449]
We propose to adapt the deep models to the novel environment during inference.
We present a general formulation, $\alpha$-BN, to calibrate the batch statistics (its mixing rule is written out after this entry).
We also present a novel loss function to form a unified test-time adaptation framework, Core.
arXiv Detail & Related papers (2021-10-06T08:45:03Z)
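As a hedged rendering of the calibration (the summary does not spell it out): assume $\mu_s, \sigma_s^2$ are the source running statistics, $\mu_t, \sigma_t^2$ the current test-batch statistics, and $\alpha \in [0, 1]$ a mixing weight; which side $\alpha$ weights is an assumption here.
$\bar{\mu} = \alpha\,\mu_s + (1 - \alpha)\,\mu_t, \quad \bar{\sigma}^2 = \alpha\,\sigma_s^2 + (1 - \alpha)\,\sigma_t^2$
Normalization then uses $\bar{\mu}$ and $\bar{\sigma}^2$ in place of either pure statistic.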
Double Forward Propagation for Memorized Batch Normalization [68.34268180871416]
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs).
We propose a memorized batch normalization (MBN) which considers multiple recent batches to obtain more accurate and robust statistics (a minimal sketch follows this entry).
Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference.
arXiv Detail & Related papers (2020-10-10T08:48:41Z)
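A minimal sketch of the memorized-statistics idea, assuming a plain sliding window over the last `k` batches; the actual MBN weights the contributions of recent batches rather than averaging them uniformly, and the "double forward propagation" of the title is not reproduced here.

```python
from collections import deque

import torch


class MemorizedBNStats:
    """Sliding-window estimate of BN statistics over the last k batches."""

    def __init__(self, k: int = 8):
        self.means = deque(maxlen=k)   # per-batch channel means
        self.vars = deque(maxlen=k)    # per-batch channel variances

    def update(self, x: torch.Tensor) -> None:
        self.means.append(x.mean(dim=(0, 2, 3)))
        self.vars.append(x.var(dim=(0, 2, 3), unbiased=False))

    def statistics(self):
        mu_b = torch.stack(tuple(self.means))   # (k, C)
        var_b = torch.stack(tuple(self.vars))   # (k, C)
        mu = mu_b.mean(dim=0)
        # Pooled variance: mean within-batch variance plus the spread of
        # the batch means around the memorized mean (exact for equal-sized
        # batches).
        var = var_b.mean(dim=0) + mu_b.var(dim=0, unbiased=False)
        return mu, var
```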