Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models
- URL: http://arxiv.org/abs/2210.12607v1
- Date: Sun, 23 Oct 2022 03:22:34 GMT
- Title: Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models
- Authors: Victor S. Bursztyn, David Demeter, Doug Downey, Larry Birnbaum
- Abstract summary: Compositional fine-tuning (CFT) is an approach based on explicitly decomposing a target task into component tasks and then fine-tuning smaller language models on a curriculum of those component tasks.
We show that CFT outperforms end-to-end learning even with equal amounts of data.
- Score: 20.173322408302134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How to usefully encode compositional task structure has long been a core
challenge in AI. Recent work in chain of thought prompting has shown that for
very large neural language models (LMs), explicitly demonstrating the
inferential steps involved in a target task may improve performance over
end-to-end learning that focuses on the target task alone. However, chain of
thought prompting has significant limitations due to its dependency on huge
pretrained LMs. In this work, we present compositional fine-tuning (CFT): an
approach based on explicitly decomposing a target task into component tasks,
and then fine-tuning smaller LMs on a curriculum of such component tasks. We
apply CFT to recommendation tasks in two domains, world travel and local
dining, as well as a previously studied inferential task (sports
understanding). We show that CFT outperforms end-to-end learning even with
equal amounts of data, and gets consistently better as more component tasks are
modeled via fine-tuning. Compared with chain of thought prompting, CFT performs
at least as well using LMs only 7.4% of the size, and is moreover applicable to
task domains for which data are not available during pretraining.
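To make the approach concrete, here is a minimal sketch of the data-preparation idea behind CFT as described in the abstract: component-task examples are constructed explicitly and ordered before end-to-end target-task examples, and a smaller pretrained LM is then fine-tuned on that curriculum. The specific component tasks, prompt formats, and the `train_step` callback are illustrative assumptions, not the authors' actual setup.

```python
"""Minimal sketch of compositional fine-tuning (CFT) curriculum construction.

Assumptions (not from the paper): the concrete component tasks, prompt
formats, and the `train_step` callback are illustrative placeholders.
"""
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    prompt: str
    completion: str


def component_task_examples() -> List[List[Example]]:
    # Hypothetical component tasks for a travel-recommendation target task:
    # (1) recall a destination's attributes, (2) match attributes to a preference.
    attributes = [
        Example("Q: What is Lisbon known for?\nA:",
                " Lisbon is known for mild weather and fresh seafood."),
    ]
    preference_match = [
        Example("Q: The user likes mild weather. Does Lisbon match?\nA:",
                " Yes, Lisbon has mild weather."),
    ]
    return [attributes, preference_match]


def target_task_examples() -> List[Example]:
    # End-to-end target task: recommend a destination from stated preferences.
    return [
        Example("Q: The user likes mild weather and seafood. Recommend a city.\nA:",
                " I recommend Lisbon."),
    ]


def build_cft_curriculum() -> List[Example]:
    """Order component-task examples before target-task examples,
    mirroring CFT's curriculum over explicitly decomposed component tasks."""
    curriculum: List[Example] = []
    for task in component_task_examples():
        curriculum.extend(task)
    curriculum.extend(target_task_examples())
    return curriculum


def fine_tune(examples: List[Example], train_step: Callable[[str], None]) -> None:
    # Placeholder loop: `train_step` stands in for one fine-tuning update of a
    # smaller pretrained LM on the serialized prompt + completion text.
    for ex in examples:
        train_step(ex.prompt + ex.completion)


if __name__ == "__main__":
    fine_tune(build_cft_curriculum(), train_step=lambda text: None)
```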
Related papers
- Task Addition in Multi-Task Learning by Geometrical Alignment [4.220885199861056]
We propose a task addition approach for GATE to improve performance on target tasks with limited data.
It is achieved through supervised multi-task pre-training on a large dataset, followed by the addition and training of task-specific modules for each target task.
Our experiments demonstrate the superior performance of the task addition strategy for GATE over conventional multi-task methods, with comparable computational costs.
arXiv Detail & Related papers (2024-09-25T05:56:00Z)
- Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to learn multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then arrange them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful on classification tasks with little or even non-overlapping annotation.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- TaskLAMA: Probing the Complex Task Understanding of Language Models [13.336015994186955]
Structured Complex Task Decomposition (SCTD) is a problem of breaking down a complex real-world task into a directed acyclic graph over individual steps that contribute to achieving the task.
We probe how accurately SCTD can be performed using the knowledge extracted from Large Language Models (LLMs).
Our experiments reveal that LLMs are able to decompose complex tasks into individual steps effectively, with a relative improvement of 15% to 280% over the best baseline.
arXiv Detail & Related papers (2023-08-29T13:36:45Z)
- Task Residual for Tuning Vision-Language Models [69.22958802711017]
We propose a new, efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes).
TaskRes explicitly decouples the prior knowledge of the pre-trained models and new knowledge regarding a target task.
The proposed TaskRes is simple yet effective, significantly outperforming previous methods on 11 benchmark datasets.
arXiv Detail & Related papers (2022-11-18T15:09:03Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- On-edge Multi-task Transfer Learning: Model and Practice with Data-driven Task Allocation [20.20889051697198]
We show that task allocation with task importance for Multi-task Transfer Learning (TATIM) is a variant of the NP-complete Knapsack problem.
We propose a Data-driven Cooperative Task Allocation (DCTA) approach to solve TATIM with high computational efficiency.
Our DCTA reduces processing time by a factor of 3.24 and saves 48.4% of energy consumption compared with the state of the art when solving TATIM.
arXiv Detail & Related papers (2021-07-06T08:24:25Z)
- Weighted Training for Cross-Task Learning [71.94908559469475]
We introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning.
We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees.
As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning.
arXiv Detail & Related papers (2021-05-28T20:27:02Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)