Robust Question Answering against Distribution Shifts with Test-Time
Adaptation: An Empirical Study
- URL: http://arxiv.org/abs/2302.04618v1
- Date: Thu, 9 Feb 2023 13:10:53 GMT
- Title: Robust Question Answering against Distribution Shifts with Test-Time
Adaptation: An Empirical Study
- Authors: Hai Ye, Yuyang Ding, Juntao Li, Hwee Tou Ng
- Abstract summary: A deployed question answering (QA) model can easily fail when the test data has a distribution shift compared to the training data.
We evaluate test-time adaptation (TTA) to improve a model after deployment.
We also propose a novel TTA method called online imitation learning (OIL).
- Score: 24.34217596145152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A deployed question answering (QA) model can easily fail when the test data
has a distribution shift compared to the training data. Robustness tuning (RT)
methods have been widely studied to enhance model robustness against
distribution shifts before model deployment. However, can we improve a model
after deployment? To answer this question, we evaluate test-time adaptation
(TTA) to improve a model after deployment. We first introduce COLDQA, a unified
evaluation benchmark for robust QA against text corruption and changes in
language and domain. We then evaluate previous TTA methods on COLDQA and
compare them to RT methods. We also propose a novel TTA method called online
imitation learning (OIL). Through extensive experiments, we find that TTA is
comparable to RT methods, and applying TTA after RT can significantly boost the
performance on COLDQA. Our proposed OIL improves TTA to be more robust to
variation in hyper-parameters and test distributions over time.
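
The abstract describes TTA only at a high level, so the following is a minimal toy sketch of the general idea: a deployed model keeps updating itself on unlabeled test inputs. It is not OIL and not any method evaluated in the paper. The 1-D classifier, the running-mean re-centering rule, and every name in it (`predict`, `run_stream`, `make_shifted_stream`, `momentum`, `shift`) are assumptions made up for illustration, and the re-centering trick only works here because the two classes are roughly balanced in the stream.

```python
import random


def predict(x, mean):
    """Classify a 1-D input after re-centering it with the current mean estimate."""
    return 1 if x - mean > 0.0 else 0


def run_stream(stream, adapt, momentum=0.05):
    """Return accuracy on a stream of (x, label) pairs.

    When adapt is True, the running mean is updated from each unlabeled
    input before predicting; labels are used only to score the result.
    """
    mean = 0.0  # statistic assumed to have been estimated on training data
    correct = 0
    for x, label in stream:
        if adapt:
            # Test-time update: exponential moving average of the inputs.
            mean = (1.0 - momentum) * mean + momentum * x
        correct += int(predict(x, mean) == label)
    return correct / len(stream)


def make_shifted_stream(n, shift, seed=0):
    """Build a synthetic test stream whose inputs are offset by `shift`."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        stream.append((center + rng.gauss(0.0, 0.3) + shift, label))
    return stream


stream = make_shifted_stream(n=2000, shift=3.0)
acc_frozen = run_stream(stream, adapt=False)   # model left unchanged after deployment
acc_adapted = run_stream(stream, adapt=True)   # model adapted on the test stream
print(f"frozen: {acc_frozen:.3f}  adapted: {acc_adapted:.3f}")
```

In this toy setting the frozen model labels almost every shifted input as class 1, while the adapted model recovers once its running mean has caught up with the shift. Real TTA methods for QA update neural network parameters rather than a single statistic.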
Related papers
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- RDumb: A simple approach that questions our progress in continual test-time adaptation [12.374649969346441]
Test-Time Adaptation (TTA) allows pre-trained models to be adapted to changing data distributions at deployment time.
Recent work proposed and applied methods for continual adaptation over long timescales.
We find that eventually all but one of the state-of-the-art methods collapse and perform worse than a non-adapting model.
arXiv Detail & Related papers (2023-06-08T17:52:34Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- Test-Time Adaptation with Perturbation Consistency Learning [32.58879780726279]
We propose a simple test-time adaptation method that encourages the model to make stable predictions for samples with distribution shifts.
Our method can achieve higher or comparable performance with less inference time over strong PLM backbones.
arXiv Detail & Related papers (2023-04-25T12:29:22Z)
- A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [143.14128737978342]
Test-time adaptation, an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
- Robust Test-Time Adaptation in Dynamic Scenarios [9.475271284789969]
Test-time adaptation (TTA) intends to adapt the pretrained model to test distributions with only unlabeled test data streams.
We develop a Robust Test-Time Adaptation (RoTTA) method for the complex data streams of the practical test-time adaptation (PTTA) setting.
Our method is easy to implement, making it a good choice for rapid deployment.
arXiv Detail & Related papers (2023-03-24T10:19:14Z)
- Towards Stable Test-Time Adaptation in Dynamic Wild World [60.98073673220025]
Test-time adaptation (TTA) has been shown to be effective at tackling distribution shifts between training and testing data by adapting a given model on test samples.
Online model updating of TTA may be unstable and this is often a key obstacle preventing existing TTA methods from being deployed in the real world.
arXiv Detail & Related papers (2023-02-24T02:03:41Z)
- A Probabilistic Framework for Lifelong Test-Time Adaptation [34.07074915005366]
Test-time adaptation (TTA) is the problem of updating a pre-trained source model at inference time given test input(s) from a different target domain.
We present PETAL (Probabilistic lifElong Test-time Adaptation with seLf-training prior), which solves lifelong TTA using a probabilistic approach.
Our method achieves better results than the current state-of-the-art for online lifelong test-time adaptation across various benchmarks.
arXiv Detail & Related papers (2022-12-19T18:42:19Z)
- Robust Continual Test-time Adaptation: Instance-aware BN and Prediction-balanced Memory [58.72445309519892]
We present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams.
Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN), which corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS), which simulates an i.i.d. data stream from a non-i.i.d. stream in a class-balanced manner.
arXiv Detail & Related papers (2022-08-10T03:05:46Z)
- Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.