An empirical study of task and feature correlations in the reuse of pre-trained models
- URL: http://arxiv.org/abs/2506.01975v2
- Date: Tue, 08 Jul 2025 03:46:46 GMT
- Title: An empirical study of task and feature correlations in the reuse of pre-trained models
- Authors: Jama Hussein Mohamud, Willie Brink
- Abstract summary: Pre-trained neural networks are commonly used and reused in the machine learning community. This paper introduces an experimental setup through which factors contributing to Bob's empirical success could be studied in silico. We show in controlled real-world scenarios that Bob can effectively reuse Alice's pre-trained network if there are semantic correlations between his and Alice's tasks.
- Score: 1.0128808054306186
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained neural networks are commonly used and reused in the machine learning community. Alice trains a model for a particular task, and a part of her neural network is reused by Bob for a different task, often to great effect. To what can we ascribe Bob's success? This paper introduces an experimental setup through which factors contributing to Bob's empirical success could be studied in silico. As a result, we demonstrate that Bob might just be lucky: his task accuracy increases monotonically with the correlation between his task and Alice's. Even when Bob has provably uncorrelated tasks and input features from Alice's pre-trained network, he can achieve significantly better than random performance due to Alice's choice of network and optimizer. When there is little correlation between tasks, reusing only the lower pre-trained layers is preferable, and we hypothesize the converse: that the optimal number of retrained layers is indicative of task and feature correlation. Finally, we show in controlled real-world scenarios that Bob can effectively reuse Alice's pre-trained network if there are semantic correlations between his and Alice's tasks.
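The layer-reuse setup described in the abstract can be made concrete with a short sketch. This is a minimal illustration, not the authors' code: the synthetic tasks, the small PyTorch MLP, and the training loop below are all assumptions chosen for brevity. "Alice" trains the full network on her task; "Bob" freezes her lower layers and retrains only a new head on his own task.

```python
# Minimal sketch (not the paper's code) of reusing Alice's lower layers for Bob's task.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(n=2048, d=32, k=4):
    """Hypothetical synthetic classification task: random inputs, random linear labelling."""
    X = torch.randn(n, d)
    W = torch.randn(d, k)
    y = (X @ W).argmax(dim=1)
    return X, y

def train(model, X, y, epochs=50, lr=1e-2):
    """Full-batch training; only parameters with requires_grad=True are updated."""
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return (model(X).argmax(dim=1) == y).float().mean().item()

# "Alice": lower layers (feature extractor) plus her own task head.
lower = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
alice = nn.Sequential(lower, nn.Linear(64, 4))
Xa, ya = make_task()
print("Alice's accuracy on her task:", train(alice, Xa, ya))

# "Bob": reuses Alice's (now frozen) lower layers and retrains only a new head
# on a different, independently generated task.
for p in lower.parameters():
    p.requires_grad = False
bob = nn.Sequential(lower, nn.Linear(64, 4))
Xb, yb = make_task()
print("Bob's accuracy with Alice's frozen lower layers:", train(bob, Xb, yb))
```

Varying how Bob's labels are generated (derived from Alice's labelling rule versus drawn independently, as here) is the kind of knob that lets the effect of task correlation on Bob's accuracy be studied in such a setup.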
Related papers
- Additive-Effect Assisted Learning [17.408937094829007]
We develop a two-stage assisted learning architecture for an agent, Alice, to seek assistance from another agent, Bob.
In the first stage, we propose a privacy-aware hypothesis testing-based screening method for Alice to decide on the usefulness of the data from Bob.
We show that Alice can achieve the oracle performance as if the training were from centralized data, both theoretically and numerically.
arXiv Detail & Related papers (2024-05-13T23:24:25Z) - Quantum advantage in a unified scenario and secure detection of resources [49.1574468325115]
We consider a single communication task to study different approaches to observing quantum advantage.
In our task, there are three parties - the Manager, Alice, and Bob.
We show that the goal of the task can be achieved when Alice sends a qubit.
arXiv Detail & Related papers (2023-09-22T23:06:20Z) - Multitask Learning with No Regret: from Improved Confidence Bounds to
Active Learning [79.07658065326592]
Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning.
We provide novel multitask confidence intervals in the challenging setting when neither the similarity between tasks nor the tasks' features are available to the learner.
We propose a novel online learning algorithm that achieves such improved regret without knowing this parameter in advance.
arXiv Detail & Related papers (2023-08-03T13:08:09Z) - Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z) - Identifying Suitable Tasks for Inductive Transfer Through the Analysis
of Feature Attributions [78.55044112903148]
We use explainability techniques to predict whether task pairs will be complementary, by comparing neural network activations between single-task models.
Our results show that, through this approach, it is possible to reduce training time by up to 83.5% at the cost of only a 0.034 reduction in positive-class F1 on the TREC-IS 2020-A dataset.
arXiv Detail & Related papers (2022-02-02T15:51:07Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning
Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better 'stability-plasticity' trade-off.
RER can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Encoding priors in the brain: a reinforcement learning model for mouse
decision making [1.14219428942199]
We study the International Brain Laboratory task, in which a grating appears on either the right or left side of a screen, and a mouse has to move a wheel to bring the grating to the center.
We model this as a reinforcement learning task, using a feedforward neural network to map states to actions, and adjust the weights of the network to maximize reward, learning via policy gradient.
Our model reproduces the main experimental finding - that the psychometric curve with respect to contrast shifts after a block switch in about 10 trials.
arXiv Detail & Related papers (2021-12-10T20:16:36Z) - Asymmetric self-play for automatic goal discovery in robotic
manipulation [12.573331269520077]
We rely on asymmetric self-play for goal discovery, where two agents, Alice and Bob, play a game.
We show that this method can discover highly diverse and complex goals without any human priors.
Our method scales, resulting in a single policy that can generalize to many unseen tasks.
arXiv Detail & Related papers (2021-01-13T05:20:20Z) - COG: Connecting New Skills to Past Experience with Offline Reinforcement
Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z) - Learning to Branch for Multi-Task Learning [12.49373126819798]
We present an automated multi-task learning algorithm that learns where to share or branch within a network.
We propose a novel tree-structured design space that casts a tree branching operation as a Gumbel-Softmax sampling procedure (a minimal sketch of this sampling trick appears after this list).
arXiv Detail & Related papers (2020-06-02T19:23:21Z) - Hierarchical Reinforcement Learning as a Model of Human Task
Interleaving [60.95424607008241]
We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
arXiv Detail & Related papers (2020-01-04T17:53:28Z)