Duality Diagram Similarity: a generic framework for initialization
selection in task transfer learning
- URL: http://arxiv.org/abs/2008.02107v1
- Date: Wed, 5 Aug 2020 13:00:34 GMT
- Title: Duality Diagram Similarity: a generic framework for initialization
selection in task transfer learning
- Authors: Kshitij Dwivedi, Jiahui Huang, Radoslaw Martin Cichy, Gemma Roig
- Abstract summary: We propose a new, highly efficient and accurate approach based on duality diagram similarity (DDS) between deep neural networks (DNNs).
We validate our approach on the Taskonomy dataset by measuring the correspondence between actual transfer learning performance rankings and predicted rankings.
- Score: 20.87279811893808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we tackle an open research question in transfer learning,
which is selecting a model initialization to achieve high performance on a new
task, given several pre-trained models. We propose a new highly efficient and
accurate approach based on duality diagram similarity (DDS) between deep neural
networks (DNNs). DDS is a generic framework to represent and compare data of
different feature dimensions. We validate our approach on the Taskonomy dataset
by measuring the correspondence between actual transfer learning performance
rankings on 17 Taskonomy tasks and predicted rankings. Computing the DDS-based
ranking for $17\times17$ transfers requires less than 2 minutes and shows a
high correlation ($0.86$) with actual transfer learning rankings, outperforming
state-of-the-art methods by a large margin ($10\%$) on the Taskonomy benchmark.
We also demonstrate the robustness of our model selection approach to a new
task, namely Pascal VOC semantic segmentation. Additionally, we show that our
method can be applied to select the best layer locations within a DNN for
transfer learning on 2D, 3D and semantic tasks on NYUv2 and Pascal VOC
datasets.
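To make the ranking idea concrete, the sketch below shows one way a DDS-style score between two networks could be computed: extract features from both models on the same probe images, build a pairwise similarity matrix per model, and correlate the two matrices. This is a minimal illustration of the general idea under assumptions, not the authors' exact DDS implementation; the z-scoring, linear kernel, Pearson correlation, and all function and model names here are illustrative choices.

```python
import numpy as np
from scipy.stats import pearsonr, zscore


def pairwise_similarity(features):
    """Centered linear-kernel similarity matrix over the probe images.

    features: array of shape (n_images, n_dims) -- activations of one model.
    """
    f = zscore(features, axis=0)           # normalize each feature dimension
    f = f - f.mean(axis=1, keepdims=True)  # center each image's feature vector
    return f @ f.T                         # (n_images, n_images) matrix


def dds_style_score(features_a, features_b):
    """Correlate the two models' similarity matrices (off-diagonal entries only).

    The feature dimensions of the two models may differ; only the number of
    probe images has to match.
    """
    sim_a = pairwise_similarity(features_a)
    sim_b = pairwise_similarity(features_b)
    iu = np.triu_indices_from(sim_a, k=1)
    r, _ = pearsonr(sim_a[iu], sim_b[iu])
    return r


if __name__ == "__main__":
    # Rank hypothetical candidate source models against target-task features
    # extracted on the same 200 probe images (random data for illustration).
    rng = np.random.default_rng(0)
    target_feats = rng.standard_normal((200, 64))
    candidates = {f"model_{i}": rng.standard_normal((200, 128)) for i in range(3)}
    ranking = sorted(candidates,
                     key=lambda name: dds_style_score(candidates[name], target_feats),
                     reverse=True)
    print("Predicted transfer ranking:", ranking)
```

In the paper's setting, the candidate and target features would come from real pre-trained networks, and the resulting scores would be used to rank source models (or layers) for transfer to the target task.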
Related papers
- Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning [5.119396962985841]
Intermediate task transfer learning can greatly improve model performance.
We conduct the largest study on NLP task transferability and task selection with 12k source-target pairs.
Applying embedding space maps (ESMs) to a prior method reduces execution time and disk space usage by factors of 10 and 278, respectively.
arXiv Detail & Related papers (2024-10-19T16:22:04Z)
- Pre-Trained Model Recommendation for Downstream Fine-tuning [22.343011779348682]
Model selection aims to rank off-the-shelf pre-trained models and select the most suitable one for the new target task.
Existing model selection techniques are often constrained in their scope and tend to overlook the nuanced relationships between models and tasks.
We present a pragmatic framework, Fennec, delving into a diverse, large-scale model repository.
arXiv Detail & Related papers (2024-03-11T02:24:32Z)
- Co-guiding for Multi-intent Spoken Language Understanding [53.30511968323911]
We propose a novel model, Co-guiding Net, which implements a two-stage framework that achieves mutual guidance between the two tasks.
For the first stage, we propose single-task supervised contrastive learning, and for the second stage, we propose co-guiding supervised contrastive learning.
Experiment results on multi-intent SLU show that our model outperforms existing models by a large margin.
arXiv Detail & Related papers (2023-11-22T08:06:22Z)
- Deep Active Ensemble Sampling For Image Classification [8.31483061185317]
Active learning frameworks aim to reduce the cost of data annotation by actively requesting the labeling for the most informative data points.
Proposed approaches include uncertainty-based techniques, geometric methods, and implicit combinations of the two.
We present an innovative integration of recent progress in both uncertainty-based and geometric frameworks to enable an efficient exploration/exploitation trade-off in the sample selection strategy.
Our framework provides two advantages: (1) accurate posterior estimation, and (2) a tunable trade-off between computational overhead and accuracy.
arXiv Detail & Related papers (2022-10-11T20:20:20Z)
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning: simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
- Team Cogitat at NeurIPS 2021: Benchmarks for EEG Transfer Learning Competition [55.34407717373643]
Building subject-independent deep learning models for EEG decoding faces the challenge of strong covariate shift.
Our approach is to explicitly align feature distributions at various layers of the deep learning model.
The methodology won first place in the 2021 Benchmarks in EEG Transfer Learning competition, hosted at the NeurIPS conference.
arXiv Detail & Related papers (2022-02-01T11:11:08Z)
- Pareto-wise Ranking Classifier for Multi-objective Evolutionary Neural Architecture Search [15.454709248397208]
This study focuses on how to find feasible deep models under diverse design objectives.
We propose a classification-wise Pareto evolution approach for one-shot NAS, where an online classifier is trained to predict the dominance relationship between the candidate and constructed reference architectures.
We find a number of neural architectures with model sizes ranging from 2M to 6M parameters under diverse objectives and constraints.
arXiv Detail & Related papers (2021-09-14T13:28:07Z)
- Learning Invariant Representations across Domains and Tasks [81.30046935430791]
We propose a novel Task Adaptation Network (TAN) to solve the unsupervised task transfer problem.
In addition to learning transferable features via domain-adversarial training, we propose a novel task semantic adaptor that uses the learning-to-learn strategy to adapt the task semantics.
TAN significantly increases recall and F1 score by 5.0% and 7.8%, respectively, compared to recent strong baselines.
arXiv Detail & Related papers (2021-03-03T11:18:43Z)
- Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.