Divergence-Based Domain Transferability for Zero-Shot Classification
- URL: http://arxiv.org/abs/2302.05735v1
- Date: Sat, 11 Feb 2023 16:04:38 GMT
- Title: Divergence-Based Domain Transferability for Zero-Shot Classification
- Authors: Alexander Pugantsov, Richard McCreadie
- Abstract summary: Transferring learned patterns from pretrained neural language models has been shown to significantly improve effectiveness across a variety of language-based tasks.
Further tuning on intermediate tasks has been demonstrated to provide additional performance benefits, provided the intermediate task is sufficiently related to the target task.
However, how to identify related tasks is an open problem, and a brute-force search for effective task combinations is prohibitively expensive.
- Score: 78.55044112903148
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transferring learned patterns from pretrained neural language models has been shown to significantly improve effectiveness across a variety of language-based tasks; meanwhile, further tuning on intermediate tasks has been demonstrated to provide additional performance benefits, provided the intermediate task is sufficiently related to the target task. However, how to identify related tasks is an open problem, and a brute-force search for effective task combinations is prohibitively expensive. Hence, the question arises: can we improve the effectiveness and efficiency of tasks with no training examples through selective fine-tuning? In this paper, we explore statistical measures that approximate the divergence between domain representations as a means to estimate whether tuning on one task pair will exhibit performance benefits over tuning on another. This estimate can then be used to reduce the number of task pairs that need to be tested by eliminating pairs that are unlikely to provide benefits. Through experimentation over 58 tasks and over 6,600 task pair combinations, we demonstrate that statistical measures can distinguish effective task pairs and that the resulting estimates can reduce end-to-end runtime by up to 40%.
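The abstract does not name the specific divergence measures used; as a minimal sketch under that caveat, the following estimates Jensen-Shannon divergence between the unigram distributions of two task corpora and ranks candidate task pairs by it. The whitespace tokenisation and the "lower divergence = more promising pair" ordering are illustrative assumptions, not the paper's exact procedure.

```python
from collections import Counter
import math

def unigram_dist(texts, vocab):
    # Count token frequencies over a task's corpus, normalised over a shared vocabulary.
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts[w] for w in vocab) or 1
    return [counts[w] / total for w in vocab]

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence: a symmetric, bounded measure of distributional distance.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log((ai + eps) / (bi + eps)) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def rank_task_pairs(corpora):
    # corpora: dict mapping task name -> list of example texts for that task.
    vocab = sorted({tok for texts in corpora.values() for t in texts for tok in t.lower().split()})
    dists = {name: unigram_dist(texts, vocab) for name, texts in corpora.items()}
    pairs = [(a, b, js_divergence(dists[a], dists[b]))
             for a in corpora for b in corpora if a < b]
    # Lower divergence = more similar domains = a task pair worth actually fine-tuning.
    return sorted(pairs, key=lambda x: x[2])
```

Pairs near the bottom of the ranking would be the ones eliminated before any fine-tuning run, which is where the claimed runtime savings come from.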
Related papers
- Revisiting Weight Averaging for Model Merging [16.503826062785773]
We present results showing that weight averaging implicitly induces task vectors centered at the weight average itself.
Applying a low-rank approximation to these centered task vectors significantly improves merging performance.
We evaluate our method on eight image classification tasks, demonstrating that it outperforms prior methods by a significant margin.
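A rough sketch of that idea with numpy, assuming a list of per-task fine-tuned weight matrices: the average plays the role of the merged model, each task's deviation from it is a centered task vector, and a truncated SVD gives its low-rank approximation. How the low-rank vectors are recombined with the average, and the rank itself, are assumptions here.

```python
import numpy as np

def merge_with_lowrank_centered_task_vectors(task_weights, rank=8):
    """task_weights: list of 2-D weight matrices, one per fine-tuned task model."""
    avg = np.mean(task_weights, axis=0)          # plain weight averaging
    merged = avg.copy()
    for W in task_weights:
        tau = W - avg                            # task vector centered at the average
        U, S, Vt = np.linalg.svd(tau, full_matrices=False)
        tau_lr = U[:, :rank] @ np.diag(S[:rank]) @ Vt[:rank, :]   # low-rank approximation
        merged += tau_lr / len(task_weights)     # re-add the denoised task-specific part
    return merged
```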
arXiv Detail & Related papers (2024-12-11T06:29:20Z)
- Efficient Task Grouping Through Samplewise Optimisation Landscape Analysis [23.177220600984665]
This paper introduces an efficient task grouping framework to reduce the overwhelming computational demands of the existing methods.
A graph-based clustering algorithm is employed to pinpoint near-optimal task groups.
Empirical assessments conducted on 8 different datasets highlight the effectiveness of the proposed framework.
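The summary does not spell out the clustering step; as an illustrative sketch, one can threshold a pairwise task-affinity matrix and group tasks by connected components of the resulting graph. Both the source of the affinities and the threshold are assumptions.

```python
import numpy as np

def group_tasks(affinity, threshold=0.5):
    """affinity: symmetric (n_tasks, n_tasks) matrix of pairwise task affinities."""
    n = affinity.shape[0]
    adj = affinity >= threshold                  # edge between sufficiently compatible tasks
    groups, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()             # collect one connected component
        while stack:
            i = stack.pop()
            if i in comp:
                continue
            comp.add(i)
            stack.extend(j for j in range(n) if adj[i, j] and j not in comp)
        seen |= comp
        groups.append(sorted(comp))
    return groups
```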
arXiv Detail & Related papers (2024-12-05T18:33:59Z)
- Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning [21.652389166495407]
We show that the transfer performance exhibits severe variance across different source tasks and training seeds.
Compared to embedding-free methods and text embeddings, task embeddings constructed from fine-tuned weights can better estimate task transferability.
We introduce a novel method that measures pairwise token similarity using maximum inner product search, leading to the highest performance in task prediction.
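A minimal sketch of pairwise token similarity via maximum inner product search, assuming each task is represented by a matrix of token embeddings; aggregating the per-token maxima by their mean is an assumption.

```python
import numpy as np

def mips_task_similarity(src_tokens, tgt_tokens):
    """src_tokens, tgt_tokens: (n_tokens, dim) arrays of token embeddings for two tasks."""
    scores = src_tokens @ tgt_tokens.T   # inner products of every source token with every target token
    best = scores.max(axis=1)            # maximum inner product search: best match per source token
    return best.mean()                   # aggregate into a single transferability score (assumed: mean)
```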
arXiv Detail & Related papers (2024-07-23T07:31:43Z)
- Localizing Task Information for Improved Model Merging and Compression [61.16012721460561]
We show that the information required to solve each task is still preserved after merging as different tasks mostly use non-overlapping sets of weights.
We propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches.
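The summary leaves the selection rule implicit; a hedged sketch of the idea: build task vectors, keep only weight entries that a minimum number of tasks rely on (here judged by relative magnitude), and merge those. The magnitude criterion, the tolerance, and the threshold k are assumptions rather than the paper's exact algorithm.

```python
import numpy as np

def consensus_merge(pretrained, task_weights, k=2, rel_tol=0.1):
    """pretrained, task_weights[i]: flat weight vectors of the base and fine-tuned models."""
    task_vectors = [w - pretrained for w in task_weights]
    stacked = np.stack(task_vectors)                       # (n_tasks, n_params)
    # A parameter counts as "used" by a task if its task-vector entry is relatively large.
    used = np.abs(stacked) >= rel_tol * np.abs(stacked).max(axis=1, keepdims=True)
    consensus = used.sum(axis=0) >= k                      # kept only with enough agreement
    merged_vector = stacked.mean(axis=0) * consensus       # eliminate low-consensus weights
    return pretrained + merged_vector
```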
arXiv Detail & Related papers (2024-05-13T14:54:37Z)
- Efficiently Tuned Parameters are Task Embeddings [26.587153525003636]
Intermediate-task transfer can benefit a wide range of NLP tasks with properly selected source datasets.
It is computationally infeasible to experiment with all intermediate transfer combinations.
We propose to exploit these efficiently tuned parameters as off-the-shelf task embeddings.
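A sketch of using efficiently tuned parameters (for example, prompt or adapter weights) as off-the-shelf task embeddings: flatten them into one vector per task and rank candidate source tasks by cosine similarity to the target. The cosine ranking is the obvious choice but is still an assumption about the method.

```python
import numpy as np

def task_embedding(tuned_params):
    # Flatten the small set of efficiently tuned parameters into a single vector.
    return np.concatenate([p.ravel() for p in tuned_params])

def rank_sources(target_params, source_params_by_task):
    t = task_embedding(target_params)
    sims = {}
    for name, params in source_params_by_task.items():
        s = task_embedding(params)
        sims[name] = float(t @ s / (np.linalg.norm(t) * np.linalg.norm(s) + 1e-12))
    # Higher cosine similarity = more promising intermediate (source) task.
    return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)
```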
arXiv Detail & Related papers (2022-10-21T03:19:54Z)
- Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
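A generic sketch of such a joint objective, not the paper's exact architecture: optimise the supervised dense-prediction loss together with a weighted self-supervised auxiliary loss on the same batch. The batch keys, the auxiliary loss interface, and `aux_weight` are assumptions.

```python
import torch

def composite_step(model, batch, target_loss_fn, aux_loss_fn, optimizer, aux_weight=0.5):
    """One training step: supervised dense-prediction loss plus a self-supervised auxiliary loss."""
    optimizer.zero_grad()
    preds = model(batch["images"])
    loss_target = target_loss_fn(preds, batch["labels"])   # labelled target task
    loss_aux = aux_loss_fn(model, batch["images"])         # label-free auxiliary task
    loss = loss_target + aux_weight * loss_aux
    loss.backward()
    optimizer.step()
    return loss_target.item(), loss_aux.item()
```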
arXiv Detail & Related papers (2022-10-13T17:59:16Z)
- Identifying Suitable Tasks for Inductive Transfer Through the Analysis of Feature Attributions [78.55044112903148]
We use explainability techniques to predict whether task pairs will be complementary, through comparison of neural network activation between single-task models.
Our results show that, through this approach, it is possible to reduce training time by up to 83.5% at a cost of only 0.034 reduction in positive-class F1 on the TREC-IS 2020-A dataset.
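A hedged sketch of the activation-comparison idea: run the same probe inputs through two single-task models, summarise hidden activations per layer, and compare the summaries with cosine similarity. The mean pooling and the similarity function are assumptions; the original work uses feature-attribution techniques rather than this simplified comparison.

```python
import numpy as np

def layer_signature(activations):
    """activations: list of (n_examples, hidden_dim) arrays, one per layer, for shared probe inputs."""
    return [a.mean(axis=0) for a in activations]      # one mean activation vector per layer

def activation_similarity(acts_a, acts_b):
    # Average per-layer cosine similarity between two single-task models on the same probes.
    sims = []
    for va, vb in zip(layer_signature(acts_a), layer_signature(acts_b)):
        sims.append(float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-12)))
    return sum(sims) / len(sims)
```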
arXiv Detail & Related papers (2022-02-02T15:51:07Z)
- Weighted Training for Cross-Task Learning [71.94908559469475]
We introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning.
We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees.
As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning.
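A small sketch of the weighting idea only, under assumed inputs: given a representation-based distance from each source task to the target, assign larger training weight to closer source tasks via a softmax. This is not TAWT's actual update rule, just an illustration of distance-driven weighting.

```python
import numpy as np

def source_task_weights(task_distances, temperature=1.0):
    """task_distances: dict mapping source task name -> representation-based distance to the target."""
    names = list(task_distances)
    d = np.array([task_distances[n] for n in names])
    w = np.exp(-d / temperature)   # closer source tasks (smaller distance) get larger weight
    w /= w.sum()
    return dict(zip(names, w))

# The weighted cross-task objective is then sum_i w_i * loss_i over the source tasks.
```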
arXiv Detail & Related papers (2021-05-28T20:27:02Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve learning performance on each task.
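A small sketch of that coupling: each task keeps its own parameters but pays a penalty for drifting from the mean of all task parameters. The quadratic form of the penalty and the coupling strength are assumptions.

```python
import numpy as np

def coupled_objective(task_params, task_losses, coupling=0.1):
    """task_params: list of per-task parameter vectors; task_losses: list of per-task loss values."""
    center = np.mean(task_params, axis=0)
    # Each task fits its own domain (its own loss) while staying close to the shared center.
    proximity = sum(np.sum((w - center) ** 2) for w in task_params)
    return sum(task_losses) + coupling * proximity
```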
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.