An Exploration of Data Efficiency in Intra-Dataset Task Transfer for
Dialog Understanding
- URL: http://arxiv.org/abs/2210.11729v1
- Date: Fri, 21 Oct 2022 04:36:46 GMT
- Title: An Exploration of Data Efficiency in Intra-Dataset Task Transfer for
Dialog Understanding
- Authors: Josiah Ross, Luke Yoffe, Alon Albalak, William Yang Wang
- Abstract summary: This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Unintuitively, our data shows that often target task training data size has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
- Score: 65.75873687351553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning is an exciting area of Natural Language Processing that has
the potential to both improve model performance and increase data efficiency.
This study explores the effects of varying quantities of target task training
data on sequential transfer learning in the dialog domain. We hypothesize that
a model can utilize the information learned from a source task to better learn
a target task, thereby reducing the number of target task training samples
required. Unintuitively, our data shows that often target task training data
size has minimal effect on how sequential transfer learning performs compared
to the same model without transfer learning. Our results lead us to believe
that this unexpected result could be due to the effects of catastrophic
forgetting, motivating further work into methods that prevent such forgetting.
Related papers
- Characterization of Transfer Using Multi-task Learning Curves [0.0]
We show that perturbing the data set by including more samples, instead of perturbing the model by gradient updates, provides a complementary and more fundamental characterization of transfer effects.
We model transfer effects using multi-task learning curves approximating the inductive performance over varying sample sizes.
Our results show that learning curves can better capture the effects of multi-task learning and their multi-task extensions can delineate pairwise and contextual transfer effects in foundation models.
arXiv Detail & Related papers (2025-12-31T13:55:18Z) - Latent Traits and Cross-Task Transfer: Deconstructing Dataset Interactions in LLM Fine-tuning [8.082936847467638]
We propose an analysis framework, building a transfer learning matrix and dimensionality reduction, to dissect cross-task interactions.
We train and analyze 10 models to identify latent abilities (e.g., Reasoning, Sentiment Classification, NLU, Arithmetic) and discover the side effects of the transfer learning.
arXiv Detail & Related papers (2025-09-17T01:45:42Z) - Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training.
We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO.
As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z) - Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal and the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
arXiv Detail & Related papers (2023-03-02T17:32:11Z) - TIDo: Source-free Task Incremental Learning in Non-stationary Environments [0.0]
Updating a model-based agent to learn new target tasks requires us to store past training data.
Few-shot task incremental learning methods overcome the limitation of labeled target datasets.
We propose a one-shot task incremental learning approach that can adapt to non-stationary source and target tasks.
arXiv Detail & Related papers (2023-01-28T02:19:45Z) - Frozen Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks [27.17697714584768]
We study the generalization behavior of transfer learning of deep neural networks (DNNs).
We show that the test error evolution during the target training has a more significant double descent effect when the target training dataset is sufficiently large.
Also, we show that the double descent phenomenon may make a transfer from a less related source task better than a transfer from a more related source task.
arXiv Detail & Related papers (2022-11-20T20:26:23Z) - A Data-Based Perspective on Transfer Learning [76.30206800557411]
We take a closer look at the role of the source dataset's composition in transfer learning.
Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness.
arXiv Detail & Related papers (2022-07-12T17:58:28Z) - The Effect of Task Ordering in Continual Learning [12.571389210876315]
We show that reordering tasks significantly affects the amount of catastrophic forgetting.
We show that the effect of task ordering can be exploited to modify continual learning performance.
arXiv Detail & Related papers (2022-05-26T12:56:15Z) - TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z) - Adaptive Transfer Learning on Graph Neural Networks [4.233435459239147]
Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data.
Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation.
We propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task.
arXiv Detail & Related papers (2021-07-19T11:46:28Z) - Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z) - Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.