Using a thousand optimization tasks to learn hyperparameter search
strategies
- URL: http://arxiv.org/abs/2002.11887v3
- Date: Wed, 1 Apr 2020 00:35:05 GMT
- Title: Using a thousand optimization tasks to learn hyperparameter search
strategies
- Authors: Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben
Poole, Jascha Sohl-Dickstein
- Abstract summary: We present TaskSet, a dataset of tasks for use in training and evaluating optimizers.
TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional networks, to variational autoencoders, to non-volume preserving flows on a variety of datasets.
- Score: 53.318615663332274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present TaskSet, a dataset of tasks for use in training and evaluating
optimizers. TaskSet is unique in its size and diversity, containing over a
thousand tasks ranging from image classification with fully connected or
convolutional neural networks, to variational autoencoders, to non-volume
preserving flows on a variety of datasets. As an example application of such a
dataset we explore meta-learning an ordered list of hyperparameters to try
sequentially. By learning this hyperparameter list from data generated using
TaskSet we achieve large speedups in sample efficiency over random search. Next
we use the diversity of the TaskSet and our method for learning hyperparameter
lists to empirically explore the generalization of these lists to new
optimization tasks in a variety of settings including ImageNet classification
with Resnet50 and LM1B language modeling with transformers. As part of this
work we have open-sourced code for all tasks, as well as ~29 million training
curves for these problems and the corresponding hyperparameters.
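
The hyperparameter-list idea above lends itself to a compact illustration: given a table of how each candidate configuration performs on each task, an ordered list can be built greedily so that trying its entries in sequence drives down the average best-loss-so-far across tasks. The sketch below is a minimal NumPy illustration of that idea; the `perf` array layout, the greedy criterion, and the function name are assumptions for exposition, not the authors' released implementation.

```python
import numpy as np

def build_hparam_list(perf, list_length=10):
    """Greedily build an ordered list of hyperparameter configurations.

    perf: array of shape (num_tasks, num_configs), where perf[t, c] is the
          (normalized) final loss of configuration c on task t; lower is better.
    Returns configuration indices ordered so that trying them sequentially
    minimizes the average best-loss-so-far across tasks.
    """
    num_tasks, num_configs = perf.shape
    chosen = []
    # Best loss achieved so far on each task (infinity before any trial).
    best_so_far = np.full(num_tasks, np.inf)
    for _ in range(list_length):
        # For each remaining config, the average loss we would have after
        # additionally trying that config on every task.
        remaining = [c for c in range(num_configs) if c not in chosen]
        scores = [np.mean(np.minimum(best_so_far, perf[:, c])) for c in remaining]
        best = remaining[int(np.argmin(scores))]
        chosen.append(best)
        best_so_far = np.minimum(best_so_far, perf[:, best])
    return chosen

# Example: 5 tasks, 100 candidate configurations with random "losses".
rng = np.random.default_rng(0)
perf = rng.uniform(size=(5, 100))
print(build_hparam_list(perf, list_length=5))
```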
Related papers
- Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$\infty$ is a new multi-way and adaptive data selection approach for Lifelong Instruction Tuning.
We construct pseudo-skill clusters by grouping gradient-based sample vectors.
We select the best-performing data selector for each skill cluster from a pool of selector experts.
arXiv Detail & Related papers (2024-10-14T15:48:09Z)
- A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z)
- HyperLoader: Integrating Hypernetwork-Based LoRA and Adapter Layers into Multi-Task Transformers for Sequence Labelling [5.955463697605461]
We present HyperLoader, a simple approach that combines different parameter-efficient fine-tuning methods in a multi-task setting.
Our method combines the benefits of multi-task learning by capturing the structure of all tasks.
We provide empirical evidence that HyperLoader outperforms previous approaches on most datasets.
arXiv Detail & Related papers (2024-07-01T16:00:53Z)
- AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning [19.201899503691266]
We measure the task dominance degree of a parameter by the total updates of each task on this parameter.
We propose a task-wise adaptive learning rate approach, AdaTask, to separate the accumulative gradients and hence the learning rate of each task.
Experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks.
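
As a rough illustration of the idea described above (not the paper's exact algorithm), the sketch below keeps an Adagrad-style squared-gradient accumulator per task for a shared parameter, so each task's step is scaled by its own accumulated magnitude and a dominant task does not dictate the effective learning rate for the others. The function name, accumulator choice, and update rule are assumptions for exposition.

```python
import numpy as np

def adatask_style_update(param, task_grads, accumulators, lr=0.01, eps=1e-8):
    """One shared-parameter update with per-task gradient accumulators.

    task_grads:   dict mapping task id -> gradient of that task's loss w.r.t. param.
    accumulators: dict mapping task id -> running sum of squared gradients
                  (same shape as param), updated in place.
    """
    total_step = np.zeros_like(param)
    for task, g in task_grads.items():
        accumulators[task] += g ** 2                       # task-wise accumulative gradient
        total_step += g / (np.sqrt(accumulators[task]) + eps)  # task-wise adaptive scaling
    return param - lr * total_step

# Example: a single shared weight vector updated from two tasks' gradients.
param = np.zeros(4)
acc = {"task_a": np.zeros(4), "task_b": np.zeros(4)}
grads = {"task_a": np.ones(4) * 5.0, "task_b": np.ones(4) * 0.1}
param = adatask_style_update(param, grads, acc)
```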
arXiv Detail & Related papers (2022-11-28T04:24:38Z)
- Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks [36.34331439747556]
We propose Polyhistor and Polyhistor-Lite to share information across different tasks with a few trainable parameters.
Specifically, Polyhistor achieves competitive accuracy compared to the state-of-the-art while only using 10% of their trainable parameters.
arXiv Detail & Related papers (2022-10-07T00:25:02Z)
- Attentional Mixtures of Soft Prompt Tuning for Parameter-efficient Multi-task Knowledge Sharing [53.399742232323895]
ATTEMPT is a new modular, multi-task, and parameter-efficient language model (LM) tuning approach.
It combines knowledge transferred across different tasks via a mixture of soft prompts while keeping the original LM unchanged.
It is parameter-efficient (e.g., updates 1,600 times fewer parameters than fine-tuning) and enables multi-task learning and flexible extensions.
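
The mixture-of-prompts idea can be sketched as attention over a set of pretrained source-task prompts plus a trainable target prompt, with the weights computed from a pooled representation of the input. The shapes, pooling, and scoring below are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def mix_soft_prompts(x_repr, source_prompts, target_prompt, proj):
    """Compute an instance-specific mixture of soft prompts (sketch).

    x_repr:         (d,) pooled representation of the input example.
    source_prompts: (k, p, d) soft prompts pretrained on k source tasks.
    target_prompt:  (p, d) prompt being trained for the target task.
    proj:           (d, d) projection used to score prompts against the input.
    Returns a (p, d) prompt to prepend to the frozen LM's input embeddings.
    """
    prompts = np.concatenate([source_prompts, target_prompt[None]], axis=0)  # (k+1, p, d)
    keys = prompts.mean(axis=1)                          # one key per prompt: (k+1, d)
    logits = keys @ (proj @ x_repr)                      # attention scores over prompts
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                             # softmax over the k+1 prompts
    return np.einsum("k,kpd->pd", weights, prompts)      # weighted mixture of prompts
```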
arXiv Detail & Related papers (2022-05-24T10:48:33Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
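
A loose sketch of the layer-selection idea, using a toy ReLU MLP: each layer carries a gate that decides whether the frozen shared weights or a task-specific copy is used, and a sparsity penalty on the gates (not shown) would keep the task-specific subset small. The gating scheme and threshold here are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def taps_style_forward(x, shared_layers, task_layers, gates, threshold=0.1):
    """Forward pass where each layer is either shared (frozen) or task-specific.

    shared_layers / task_layers: lists of (W, b) tuples with matching shapes.
    gates: per-layer scores in [0, 1]; a layer whose gate exceeds the threshold
           uses its task-specific copy instead of the shared weights.
    """
    for (Ws, bs), (Wt, bt), g in zip(shared_layers, task_layers, gates):
        W, b = (Wt, bt) if g > threshold else (Ws, bs)
        x = np.maximum(x @ W + b, 0.0)   # simple ReLU layer for illustration
    return x
```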
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
We propose MultiRavens, a new benchmark suite aimed at compositional tasks, which allows defining custom task combinations.
We also propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- Exceeding the Limits of Visual-Linguistic Multi-Task Learning [0.0]
We construct 1000 unique classification tasks that share similarly-structured input data.
These classification tasks focus on learning the product hierarchy of different e-commerce websites.
We solve these tasks in unison using multi-task learning (MTL).
arXiv Detail & Related papers (2021-07-27T19:42:14Z)
- Efficient Continual Adaptation for Generative Adversarial Networks [97.20244383723853]
We present a continual learning approach for generative adversarial networks (GANs).
Our approach is based on learning a set of global and task-specific parameters.
We show that the feature-map transformation based approach outperforms state-of-the-art continual GAN methods.
arXiv Detail & Related papers (2021-03-06T05:09:37Z)