Related papers: An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding

URL: http://arxiv.org/abs/2210.11729v1
Date: Fri, 21 Oct 2022 04:36:46 GMT
Title: An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding
Authors: Josiah Ross, Luke Yoffe, Alon Albalak, William Yang Wang
Abstract summary: This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain. Unintuitively, our data shows that often target task training data size has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
Score: 65.75873687351553
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transfer learning is an exciting area of Natural Language Processing that has the potential to both improve model performance and increase data efficiency. This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain. We hypothesize that a model can utilize the information learned from a source task to better learn a target task, thereby reducing the number of target task training samples required. Unintuitively, our data shows that often target task training data size has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning. Our results lead us to believe that this unexpected result could be due to the effects of catastrophic forgetting, motivating further work into methods that prevent such forgetting.

Related papers

Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training. We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO. As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z)
Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data. The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task. We show that this protocol is often sub-optimal and the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
arXiv Detail & Related papers (2023-03-02T17:32:11Z)
TIDo: Source-free Task Incremental Learning in Non-stationary Environments [0.0]
Updating a model-based agent to learn new target tasks requires us to store past training data. Few-shot task incremental learning methods overcome the limitation of labeled target datasets. We propose a one-shot task incremental learning approach that can adapt to non-stationary source and target tasks.
arXiv Detail & Related papers (2023-01-28T02:19:45Z)
Frozen Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks [27.17697714584768]
We study the generalization behavior of transfer learning of deep neural networks (DNNs). We show that the test error evolution during the target training has a more significant double descent effect when the target training dataset is sufficiently large. Also, we show that the double descent phenomenon may make a transfer from a less related source task better than a transfer from a more related source task.
arXiv Detail & Related papers (2022-11-20T20:26:23Z)
A Data-Based Perspective on Transfer Learning [76.30206800557411]
We take a closer look at the role of the source dataset's composition in transfer learning. Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness.
arXiv Detail & Related papers (2022-07-12T17:58:28Z)
The Effect of Task Ordering in Continual Learning [12.571389210876315]
We show that reordering tasks significantly affects the amount of catastrophic forgetting. We show that the effect of task ordering can be exploited to modify continual learning performance.
arXiv Detail & Related papers (2022-05-26T12:56:15Z)
TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model. Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning. To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
Adaptive Transfer Learning on Graph Neural Networks [4.233435459239147]
Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data. Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation. We propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task.
arXiv Detail & Related papers (2021-07-19T11:46:28Z)
Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems. Our results show that transfer learning is more beneficial than previously thought. We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL). Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks. As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
