Trace norm regularization for multi-task learning with scarce data
- URL: http://arxiv.org/abs/2202.06742v1
- Date: Mon, 14 Feb 2022 14:18:31 GMT
- Title: Trace norm regularization for multi-task learning with scarce data
- Authors: Etienne Boursier and Mikhail Konobeev and Nicolas Flammarion
- Abstract summary: This work provides the first estimation error bound for the trace norm regularized estimator when the number of samples per task is small.
The advantages of trace norm regularization for learning data-scarce tasks extend to meta-learning and are confirmed empirically on synthetic datasets.
- Score: 20.085733305266572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning leverages structural similarities between multiple tasks
to learn despite very few samples. Motivated by the recent success of neural
networks applied to data-scarce tasks, we consider a linear low-dimensional
shared representation model. Despite an extensive literature, existing
theoretical results either guarantee weak estimation rates or require a large
number of samples per task. This work provides the first estimation error bound
for the trace norm regularized estimator when the number of samples per task is
small. The advantages of trace norm regularization for learning data-scarce
tasks extend to meta-learning and are confirmed empirically on synthetic
datasets.
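As a rough illustration of the kind of estimator the abstract refers to, the sketch below fits a trace norm regularized multi-task linear regression by proximal gradient descent with singular value soft-thresholding. It is a minimal sketch under stated assumptions, not the authors' implementation: the squared loss, the normalization by the total sample count, the solver, the untuned value of `lam`, and all names (`svt`, `objective`, `trace_norm_mtl`) are illustrative choices that do not come from the paper.

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: proximal operator of tau * (trace norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(W, Xs, ys, lam):
    """Assumed objective: (1 / (2N)) * sum_t ||y_t - X_t w_t||^2 + lam * ||W||_*."""
    N = sum(len(y) for y in ys)
    loss = sum(np.sum((X @ W[:, t] - y) ** 2) for t, (X, y) in enumerate(zip(Xs, ys)))
    return 0.5 * loss / N + lam * np.linalg.norm(W, "nuc")

def trace_norm_mtl(Xs, ys, lam, n_iter=500):
    """Proximal gradient descent on the assumed objective above.

    Xs[t] is the (n_t, d) design matrix of task t and ys[t] its (n_t,) targets.
    Returns a (d, T) matrix whose columns are the per-task parameter vectors.
    """
    d, T = Xs[0].shape[1], len(Xs)
    N = sum(len(y) for y in ys)
    # The smooth part separates across tasks, so its gradient is Lipschitz
    # with constant max_t ||X_t||_2^2 / N.
    L = max(np.linalg.norm(X, 2) ** 2 for X in Xs) / N
    step = 1.0 / L
    W = np.zeros((d, T))
    for _ in range(n_iter):
        G = np.column_stack(
            [X.T @ (X @ W[:, t] - y) for t, (X, y) in enumerate(zip(Xs, ys))]
        ) / N
        W = svt(W - step * G, step * lam)
    return W

# Synthetic data-scarce setting (hypothetical sizes): fewer samples per task
# than dimensions, with task parameters sharing a k-dimensional subspace.
rng = np.random.default_rng(0)
d, T, k, n = 20, 50, 2, 5
W_true = rng.standard_normal((d, k)) @ rng.standard_normal((k, T))
Xs = [rng.standard_normal((n, d)) for _ in range(T)]
ys = [X @ W_true[:, t] + 0.1 * rng.standard_normal(n) for t, X in enumerate(Xs)]

lam = 0.05  # untuned, illustrative regularization strength
W_hat = trace_norm_mtl(Xs, ys, lam)
rel_err = np.linalg.norm(W_hat - W_true) / np.linalg.norm(W_true)
print(f"relative Frobenius error: {rel_err:.3f}")
```

The trace norm penalty encourages the stacked parameter matrix to be low rank, which is how information is shared across tasks that each have too few samples to be fitted alone. In practice `lam` would need to be chosen, for example by cross-validation.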
Related papers
- Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples [53.95282502030541]
Neural Network-based active learning (NAL) is a cost-effective data selection technique that utilizes neural networks to select and train on a small subset of samples.
We take a step forward by offering a unified explanation, from a feature learning view, for the success of both types of query criteria-based NAL.
arXiv Detail & Related papers (2024-06-06T10:38:01Z)
- Limits of Transformer Language Models on Learning to Compose Algorithms [77.2443883991608]
We evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that require learning a composition of several discrete sub-tasks.
Our results indicate that compositional learning in state-of-the-art Transformer language models is highly sample inefficient.
arXiv Detail & Related papers (2024-02-08T16:23:29Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z)
- Uncertainty-Aware Meta-Learning for Multimodal Task Distributions [3.7470451129384825]
We present UnLiMiTD (uncertainty-aware meta-learning for multimodal task distributions).
We take a probabilistic perspective and train a parametric, tuneable distribution over tasks on the meta-dataset.
We demonstrate that UnLiMiTD's predictions compare favorably to, and outperform in most cases, the standard baselines.
arXiv Detail & Related papers (2022-10-04T20:02:25Z)
- Cross-Task Consistency Learning Framework for Multi-Task Learning [9.991706230252708]
We propose a new learning framework for the 2-task MTL problem.
We define two new loss terms inspired by cycle-consistency loss and contrastive learning.
We theoretically prove that both losses help the model learn more efficiently and that the cross-task consistency loss is better in terms of alignment with the straightforward predictions.
arXiv Detail & Related papers (2021-11-28T11:55:19Z)
- BAMLD: Bayesian Active Meta-Learning by Disagreement [39.59987601426039]
This paper introduces an information-theoretic active task selection mechanism to decrease the number of labeling requests for meta-training tasks.
We report empirical results that compare favourably against existing acquisition mechanisms.
arXiv Detail & Related papers (2021-10-19T13:06:51Z)
- On the relationship between disentanglement and multi-task learning [62.997667081978825]
We take a closer look at the relationship between disentanglement and multi-task learning based on hard parameter sharing.
We show that disentanglement appears naturally during the process of multi-task neural network training.
arXiv Detail & Related papers (2021-10-07T14:35:34Z)
- Sample Efficient Subspace-based Representations for Nonlinear Meta-Learning [28.2312127482203]
This work explores a more general class of nonlinear tasks with applications ranging from binary classification to neural nets.
We prove that subspace-based representations can be learned in a sample-efficient manner and provably benefit future tasks in terms of sample complexity.
arXiv Detail & Related papers (2021-02-14T17:40:04Z)
- Combat Data Shift in Few-shot Learning with Knowledge Graph [42.59886121530736]
In real-world applications, the few-shot learning paradigm often suffers from data shift.
Most existing few-shot learning approaches are not designed with the consideration of data shift.
We propose a novel metric-based meta-learning framework to extract task-specific representations and task-shared representations.
arXiv Detail & Related papers (2021-01-27T12:35:18Z)
- Neural Complexity Measures [96.06344259626127]
We propose Neural Complexity (NC), a meta-learning framework for predicting generalization.
Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way.
arXiv Detail & Related papers (2020-08-07T02:12:10Z)