Test-Time Training for Speech
- URL: http://arxiv.org/abs/2309.10930v2
- Date: Thu, 28 Sep 2023 21:06:02 GMT
- Title: Test-Time Training for Speech
- Authors: Sri Harsha Dumpala and Chandramouli Sastry and Sageev Oore
- Abstract summary: We introduce distribution-shifts to the test datasets of standard speech-classification tasks.
We explore how Test-Time Training (TTT) can help adjust to the distribution-shift.
- Score: 6.697702130929691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the application of Test-Time Training (TTT) as a
solution to handling distribution shifts in speech applications. In particular,
we introduce distribution-shifts to the test datasets of standard
speech-classification tasks -- for example, speaker-identification and
emotion-detection -- and explore how Test-Time Training (TTT) can help adjust
to the distribution-shift. In our experiments that include distribution shifts
due to background noise and natural variations in speech such as gender and
age, we identify some key challenges with TTT, including sensitivity to
optimization hyperparameters (e.g., number of optimization steps and subset of
parameters chosen for TTT) and scalability (e.g., as each example gets its own
set of parameters, TTT is not scalable). Finally, we propose using BitFit -- a
parameter-efficient fine-tuning algorithm proposed for text applications that
only considers the bias parameters for fine-tuning -- as a solution to the
aforementioned challenges and demonstrate that it is consistently more stable
than fine-tuning all the parameters of the model.
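To make this concrete, below is a minimal PyTorch sketch of TTT with BitFit-style bias-only updates; the speech classifier, the self-supervised auxiliary loss, and all hyperparameter values (steps, lr) are illustrative assumptions rather than the paper's exact configuration.

import copy
import torch

def test_time_adapt(model, x, self_supervised_loss, steps=10, lr=1e-4):
    # Clone the source model so each test utterance gets its own copy;
    # this per-example state is the scalability issue noted in the abstract.
    adapted = copy.deepcopy(model)
    for p in adapted.parameters():
        p.requires_grad_(False)
    # BitFit: only the bias terms are trainable.
    biases = [p for n, p in adapted.named_parameters() if n.endswith("bias")]
    for p in biases:
        p.requires_grad_(True)
    opt = torch.optim.Adam(biases, lr=lr)

    adapted.train()
    for _ in range(steps):  # `steps` is one of the sensitive hyperparameters
        opt.zero_grad()
        loss = self_supervised_loss(adapted, x)  # placeholder auxiliary task
        loss.backward()
        opt.step()

    adapted.eval()
    with torch.no_grad():
        return adapted(x).argmax(dim=-1)

Because each test example adapts its own copy of the model, restricting the update to bias terms keeps the per-example state small and, per the abstract, tends to be more stable than updating all parameters.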
Related papers
- Adaptive Learn-then-Test: Statistically Valid and Efficient Hyperparameter Selection [35.88667386998423]
We introduce adaptive learn-then-test (aLTT), which provides finite-sample statistical guarantees on the population risk of AI models.
aLTT can reduce the number of testing rounds, making it well-suited for scenarios in which testing is costly or presents safety risks.
arXiv Detail & Related papers (2024-09-24T08:14:26Z)
- Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models [4.655740975414312]
This paper introduces Test-Time Low-rank adaptation (TTL) as an alternative to prompt tuning for zero-shot generalization of large-scale vision-language models (VLMs).
TTL offers a test-time-efficient adaptation approach that updates the attention weights of the transformer by maximizing prediction confidence.
arXiv Detail & Related papers (2024-07-22T17:59:19Z)
- Improved Test-Time Adaptation for Domain Generalization [48.239665441875374]
Test-time training (TTT) adapts the learned model with test data.
This work addresses two main factors: selecting an appropriate auxiliary TTT task for updating and identifying reliable parameters to update during the test phase.
We introduce additional adaptive parameters for the trained model, and we suggest only updating the adaptive parameters during the test phase.
arXiv Detail & Related papers (2023-04-10T10:12:38Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
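As a rough illustration of the sensitivity-based selection idea behind SPT, the sketch below scores parameters by accumulated gradient magnitude and keeps only the top-scoring weights trainable; the scoring rule, the budget, and the masking scheme are stand-ins, not necessarily the paper's actual criterion or allocation strategy.

import torch

def select_sensitive_parameters(model, loss_fn, data_loader, budget=1000):
    # Accumulate a simple first-order sensitivity score per parameter:
    # the summed absolute gradient of the task loss (an illustrative proxy).
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.train()
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.abs()
    # Keep only the `budget` highest-scoring individual weights trainable.
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = min(budget, flat.numel())
    threshold = flat.topk(k).values.min()
    masks = {n: (s >= threshold) for n, s in scores.items()}
    return masks  # e.g., apply p.grad.mul_(masks[n]) before each optimizer step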
- Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding [40.27182770995891]
Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models.
We introduce the Speech UndeRstanding Evaluation (SURE) benchmark for parameter-efficient learning for various speech-processing tasks.
arXiv Detail & Related papers (2023-03-02T08:57:33Z)
- Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models [107.05966685291067]
We propose test-time prompt tuning (TPT) to learn adaptive prompts on the fly with a single test sample.
TPT improves the zero-shot top-1 accuracy of CLIP by 3.6% on average.
In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.
arXiv Detail & Related papers (2022-09-15T17:55:11Z)
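The sketch below illustrates the single-sample, confidence-maximization idea behind TPT above (and, with low-rank attention updates instead of a soft prompt, behind TTL earlier in this list): only a small prompt tensor is optimized to minimize prediction entropy on one test input. The augmented views, confidence filtering, and CLIP-specific machinery of the actual methods are omitted, and frozen_model(prompt, x) is an assumed interface that returns class logits conditioned on the learnable prompt.

import torch
import torch.nn.functional as F

def tune_prompt_on_test_sample(frozen_model, prompt, x, steps=5, lr=5e-3):
    # Only the soft prompt is updated; the model's weights stay frozen.
    prompt = prompt.detach().clone().requires_grad_(True)
    opt = torch.optim.AdamW([prompt], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        probs = F.softmax(frozen_model(prompt, x), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
        entropy.backward()  # minimizing entropy == maximizing confidence
        opt.step()
    with torch.no_grad():
        return frozen_model(prompt, x).argmax(dim=-1)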
- STT: Soft Template Tuning for Few-Shot Adaptation [72.46535261444151]
We propose a new prompt-tuning framework, called Soft Template Tuning (STT).
STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task.
It can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
arXiv Detail & Related papers (2022-07-18T07:07:22Z)
- IDPG: An Instance-Dependent Prompt Generation Method [58.45110542003139]
Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.
We propose a conditional prompt generation method to generate prompts for each input instance.
arXiv Detail & Related papers (2022-04-09T15:45:27Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
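As a simplified illustration of the prefix-tuning idea, the sketch below prepends a small block of trainable continuous vectors to a frozen backbone's input embeddings; real prefix-tuning injects prefix key/value vectors at every attention layer and reparameterizes them with an MLP, which is omitted here. The inputs_embeds/attention_mask interface is an assumption about the backbone.

import torch
import torch.nn as nn

class PrefixTunedModel(nn.Module):
    # Simplified: the frozen backbone is assumed to accept `inputs_embeds`
    # and `attention_mask`, as many transformer implementations do.
    def __init__(self, backbone, embed_dim, prefix_len=10):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        self.prefix = nn.Parameter(torch.randn(prefix_len, embed_dim) * 0.02)

    def forward(self, inputs_embeds, attention_mask):
        b = inputs_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(b, -1, -1)
        embeds = torch.cat([prefix, inputs_embeds], dim=1)
        prefix_mask = torch.ones(b, self.prefix.size(0),
                                 device=attention_mask.device,
                                 dtype=attention_mask.dtype)
        mask = torch.cat([prefix_mask, attention_mask], dim=1)
        return self.backbone(inputs_embeds=embeds, attention_mask=mask)

Only the prefix tensor receives gradients, which is what keeps the learned parameters to a tiny fraction of the backbone's size.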