Statistical Deficiency for Task Inclusion Estimation
- URL: http://arxiv.org/abs/2503.05491v2
- Date: Thu, 13 Mar 2025 08:41:29 GMT
- Title: Statistical Deficiency for Task Inclusion Estimation
- Authors: Loïc Fosse, Frédéric Béchet, Benoît Favre, Géraldine Damnati, Gwénolé Lecorvé, Maxime Darrin, Philippe Formont, Pablo Piantanida
- Abstract summary: Tasks are central in machine learning, as they are the most natural objects to assess the capabilities of current models. This study proposes a theoretically grounded setup to define the notion of task and to compute the inclusion between two tasks from a statistical deficiency point of view.
- Score: 24.755448493709604
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Tasks are central in machine learning, as they are the most natural objects to assess the capabilities of current models. The trend is to build general models able to address any task. Even though transfer learning and multitask learning try to leverage the underlying task space, no well-founded tools are available to study its structure. This study proposes a theoretically grounded setup to define the notion of task and to compute the inclusion between two tasks from a statistical deficiency point of view. We propose a tractable proxy, information sufficiency, to estimate the degree of inclusion between tasks, show its soundness on synthetic data, and use it to empirically reconstruct the classic NLP pipeline.
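The abstract does not spell out the estimator, but the underlying intuition is that a task A "includes" a task B when B's outputs can be recovered from A's. As a loose illustration of such an information-sufficiency-style proxy (a sketch under stated assumptions, not the paper's actual method), the snippet below fits a small probe from task A's predictive distributions to task B's labels and reports the fraction of B's label uncertainty the probe removes; the probe choice, function names, and normalization are assumptions made for illustration only.

```python
# Hypothetical inclusion proxy: how predictable are task B's labels from task A's outputs?
# This is NOT the paper's estimator; the linear probe and the normalization are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def inclusion_proxy(preds_task_a: np.ndarray, labels_task_b: np.ndarray) -> float:
    """Rough proxy for 'task A includes task B'.

    preds_task_a: (n_samples, d) array of task A's predictive distributions or features.
    labels_task_b: (n_samples,) array of integer class labels 0..K-1 for task B.
    Returns a score in [0, 1]: fraction of B's label uncertainty removed by A's outputs.
    """
    # Probe: predict B's labels from A's outputs.
    probe = LogisticRegression(max_iter=1000).fit(preds_task_a, labels_task_b)
    ce_probe = log_loss(labels_task_b, probe.predict_proba(preds_task_a))

    # Baseline: cross-entropy of predicting B's labels from their marginal frequencies.
    marginal = np.bincount(labels_task_b) / len(labels_task_b)
    ce_marginal = -np.mean(np.log(marginal[labels_task_b]))
    if ce_marginal == 0.0:  # degenerate case: task B has a single label
        return 1.0

    # Fraction of uncertainty about B removed by A's outputs, clipped to [0, 1].
    return float(np.clip(1.0 - ce_probe / ce_marginal, 0.0, 1.0))
```

A score near 1 suggests that A's outputs carry essentially all the information needed for B, while a score near 0 suggests little overlap; the paper's statistical-deficiency formulation is more general than this rough estimate.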
Related papers
- Provable Benefits of Task-Specific Prompts for In-context Learning [44.768199865867494]
In this work, we consider a novel setting where the global task distribution can be partitioned into a union of conditional task distributions.
We then examine the use of task-specific prompts and prediction heads for learning the prior information associated with the conditional task distribution using a one-layer attention model.
arXiv Detail & Related papers (2025-03-03T22:37:03Z) - Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks.
We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z) - Exploiting Task Relationships for Continual Learning Using Transferability-Aware Task Embeddings [8.000144830397911]
Continual learning (CL) has been an essential topic in the contemporary application of deep neural networks.
We propose a transferability-aware task embedding named H-embedding and train a hypernet under its guidance to learn task-conditioned model weights for CL tasks.
arXiv Detail & Related papers (2025-02-17T09:52:19Z) - Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning.
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z) - A Tensor Low-Rank Approximation for Value Functions in Multi-Task Reinforcement Learning [10.359616364592073]
In pursuit of reinforcement learning systems that could train in physical environments, we investigate multi-task approaches.
A low-rank structure enforces the notion of similarity, without the need to explicitly prescribe which tasks are similar.
The efficiency of our low-rank tensor approach to multi-task learning is demonstrated in two numerical experiments.
arXiv Detail & Related papers (2025-01-17T20:07:11Z) - Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z) - Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances within each task and then group them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z) - Active Multi-Task Representation Learning [50.13453053304159]
We give the first formal study on resource task sampling by leveraging the techniques from active learning.
We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
arXiv Detail & Related papers (2022-02-02T08:23:24Z) - Neural Approximate Sufficient Statistics for Implicit Models [34.44047460667847]
We frame the task of constructing sufficient statistics as learning mutual information maximizing representations of the data with the help of deep neural networks.
We apply our approach to both traditional approximate Bayesian computation and recent neural likelihood methods, boosting their performance on a range of tasks.
arXiv Detail & Related papers (2020-10-20T07:11:40Z)