Divergence-Based Domain Transferability for Zero-Shot Classification
- URL: http://arxiv.org/abs/2302.05735v1
- Date: Sat, 11 Feb 2023 16:04:38 GMT
- Title: Divergence-Based Domain Transferability for Zero-Shot Classification
- Authors: Alexander Pugantsov, Richard McCreadie
- Abstract summary: Transferring learned patterns from pretrained neural language models has been shown to significantly improve effectiveness across a variety of language-based tasks.
Further tuning on intermediate tasks has been demonstrated to provide additional performance benefits, provided the intermediate task is sufficiently related to the target task.
However, how to identify related tasks is an open problem, and brute-force searching for effective task combinations is prohibitively expensive.
- Score: 78.55044112903148
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transferring learned patterns from pretrained neural language models has been
shown to significantly improve effectiveness across a variety of language-based
tasks; meanwhile, further tuning on intermediate tasks has been demonstrated to
provide additional performance benefits, provided the intermediate task is
sufficiently related to the target task. However, how to identify related tasks
is an open problem, and brute-force searching for effective task combinations is
prohibitively expensive. Hence, the question arises, are we able to improve the
effectiveness and efficiency of tasks with no training examples through
selective fine-tuning? In this paper, we explore statistical measures that
approximate the divergence between domain representations as a means to
estimate whether tuning using one task pair will exhibit performance benefits
over tuning another. This estimation can then be used to reduce the number of
task pairs that need to be tested by eliminating pairs that are unlikely to
provide benefits. Through experimentation over 58 tasks and over 6,600 task
pair combinations, we demonstrate that statistical measures can distinguish
effective task pairs, and the resulting estimates can reduce end-to-end runtime
by up to 40%.
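The paper's specific divergence measures are not listed here; as a minimal sketch, one cheap proxy for domain divergence is the Jensen-Shannon divergence between smoothed unigram distributions of two task corpora (the example texts and the choice of JSD are illustrative assumptions, not taken from the paper):

```python
import math
from collections import Counter

def unigram_dist(tokens, vocab):
    """Add-one-smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions (base 2, in [0, 1])."""
    def kl(a, b):
        return sum(a[w] * math.log2(a[w] / b[w]) for w in a if a[w] > 0)
    m = {w: 0.5 * (p[w] + q[w]) for w in p}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Rank candidate intermediate tasks by divergence from the target domain:
target = "flood warning issued for the river valley".split()
task_a = "evacuation warning issued after river flooding".split()
task_b = "the model achieves state of the art accuracy".split()

vocab = set(target) | set(task_a) | set(task_b)
d_a = js_divergence(unigram_dist(target, vocab), unigram_dist(task_a, vocab))
d_b = js_divergence(unigram_dist(target, vocab), unigram_dist(task_b, vocab))
assert d_a < d_b  # the lexically closer task has lower divergence
```

Pairs with high estimated divergence can then be pruned before any fine-tuning run, which is how such estimates translate into end-to-end runtime savings.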
Related papers
- Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning [21.652389166495407]
We show that the transfer performance exhibits severe variance across different source tasks and training seeds.
Compared to embedding-free methods and text embeddings, task embeddings constructed from fine-tuned weights can better estimate task transferability.
We introduce a novel method that measures pairwise token similarity using maximum inner product search, leading to the highest performance in task prediction.
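One plausible reading of the maximum-inner-product-search similarity described above, sketched with toy vectors (the function name, shapes, and averaging scheme are assumptions for illustration, not the paper's exact method):

```python
# For each token embedding in the target task, take its maximum inner product
# over the source task's token embeddings, then average the maxima.
def max_inner_product_similarity(target_embs, source_embs):
    """Average over target tokens of the max inner product with any source token."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    scores = [max(dot(t, s) for s in source_embs) for t in target_embs]
    return sum(scores) / len(scores)

# Toy 2-d "token embeddings" for three tasks:
target = [[1.0, 0.0], [0.7, 0.7]]
related = [[0.9, 0.1], [0.6, 0.8]]
unrelated = [[-1.0, 0.0], [0.0, -1.0]]

assert max_inner_product_similarity(target, related) > \
       max_inner_product_similarity(target, unrelated)
```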
arXiv Detail & Related papers (2024-07-23T07:31:43Z)
- Localizing Task Information for Improved Model Merging and Compression [61.16012721460561]
We show that the information required to solve each task is still preserved after merging as different tasks mostly use non-overlapping sets of weights.
We propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches.
arXiv Detail & Related papers (2024-05-13T14:54:37Z)
- Instruction Matters, a Simple yet Effective Task Selection Approach in Instruction Tuning for Specific Tasks [51.15473776489712]
We show that leveraging instruction information alone enables the identification of pertinent tasks for instruction tuning.
By learning the unique instructional template style of the meta-dataset, we observe an improvement in task selection accuracy.
Experimental results demonstrate that training on a small set of tasks, chosen solely based on the instructions, leads to substantial performance improvements.
arXiv Detail & Related papers (2024-04-25T08:49:47Z)
- Task Selection and Assignment for Multi-modal Multi-task Dialogue Act Classification with Non-stationary Multi-armed Bandits [11.682678945754837]
Multi-task learning (MTL) aims to improve the performance of a primary task by jointly learning with related auxiliary tasks.
Previous studies suggest that a random selection of auxiliary tasks may not be helpful, and can even be harmful to performance.
This paper proposes a method for selecting and assigning tasks based on non-stationary multi-armed bandits.
arXiv Detail & Related papers (2023-09-18T14:51:51Z)
- Efficiently Tuned Parameters are Task Embeddings [26.587153525003636]
Intermediate-task transfer can benefit a wide range of NLP tasks with properly selected source datasets.
It is computationally infeasible to experiment with all intermediate transfer combinations.
We propose to exploit these efficiently tuned parameters as off-the-shelf task embeddings.
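Treating efficiently tuned parameters as off-the-shelf task embeddings suggests ranking source tasks by the similarity of their tuned parameter vectors to the target's. A minimal sketch using cosine similarity (the vectors and the choice of cosine are illustrative assumptions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two flattened parameter vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Toy "task embeddings" built from efficiently tuned parameters
# (e.g. flattened adapter or prefix weights; values are purely illustrative):
target_task = [0.21, -0.09, 0.41, 0.04]
source_close = [0.20, -0.10, 0.40, 0.05]
source_far = [0.00, 0.30, -0.20, 0.10]

# A source whose tuned parameters resemble the target's ranks higher:
assert cosine(source_close, target_task) > cosine(source_far, target_task)
```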
arXiv Detail & Related papers (2022-10-21T03:19:54Z)
- Identifying Suitable Tasks for Inductive Transfer Through the Analysis of Feature Attributions [78.55044112903148]
We use explainability techniques to predict whether task pairs will be complementary, through comparison of neural network activation between single-task models.
Our results show that, through this approach, it is possible to reduce training time by up to 83.5% at a cost of only 0.034 reduction in positive-class F1 on the TREC-IS 2020-A dataset.
arXiv Detail & Related papers (2022-02-02T15:51:07Z)
- Weighted Training for Cross-Task Learning [71.94908559469475]
We introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning.
We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees.
As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning.
arXiv Detail & Related papers (2021-05-28T20:27:02Z)
- Task Uncertainty Loss Reduce Negative Transfer in Asymmetric Multi-task Feature Learning [0.0]
Multi-task learning (MTL) can improve task performance overall relative to single-task learning (STL), but can hide negative transfer (NT).
Asymmetric multitask feature learning (AMTFL) is an approach that tries to address this by allowing tasks with higher loss values to have smaller influence on feature representations for learning other tasks.
We present examples of NT in two datasets (image recognition and pharmacogenomics) and tackle this challenge by using aleatoric homoscedastic uncertainty to capture the relative confidence between tasks, and set weights for task loss.
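Aleatoric homoscedastic uncertainty weighting is commonly implemented in the style of Kendall et al., with a learnable log-variance per task; a minimal sketch of that loss shape (the specific formula is an assumption about this entry's method, not taken from the paper):

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Total = sum_i exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2).
    Tasks with higher learned uncertainty are automatically down-weighted."""
    return sum(math.exp(-s) * L + s for L, s in zip(task_losses, log_vars))

losses = [2.0, 0.5]       # per-task losses
confident = [0.0, 0.0]    # s = 0 -> both tasks weighted 1.0
noisy_first = [1.0, 0.0]  # higher uncertainty on task 1 shrinks its weight

# Down-weighting the noisy task lowers its contribution to the total:
assert uncertainty_weighted_loss(losses, noisy_first) < \
       uncertainty_weighted_loss(losses, confident)
```

The additive `+ s_i` term penalizes the model for declaring every task uncertain, which is what keeps the learned weights from collapsing to zero.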
arXiv Detail & Related papers (2020-12-17T13:30:45Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization in which data collected across different domains help improving the learning performance at each other task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
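One common way to couple task-specific parameters so they "stay close to each other" is a quadratic penalty pulling each task's parameter vector toward the group mean; this proximal sketch is an illustrative assumption, not the paper's exact objective:

```python
def coupling_penalty(task_params, lam=0.1):
    """lam * sum_i ||w_i - w_mean||^2 over per-task parameter vectors.
    Added to the sum of per-task losses, it lets data from one task's
    domain influence the parameters learned for the others."""
    k = len(task_params)
    dim = len(task_params[0])
    mean = [sum(w[j] for w in task_params) / k for j in range(dim)]
    return lam * sum((w[j] - mean[j]) ** 2
                     for w in task_params for j in range(dim))

# Identical task parameters incur no penalty; spread-out ones are penalized.
assert coupling_penalty([[1.0, 2.0], [1.0, 2.0]]) == 0.0
assert coupling_penalty([[1.0, 0.0], [0.0, 1.0]]) > 0.0
```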
This list is automatically generated from the titles and abstracts of the papers in this site.