AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning
- URL: http://arxiv.org/abs/2407.07094v1
- Date: Tue, 9 Jul 2024 17:59:56 GMT
- Title: AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning
- Authors: Jiaxi Cui, Wentao Zhang, Jing Tang, Xudong Tong, Zhenwei Zhang, Amie, Jing Wen, Rongsheng Wang, Pengfei Wu,
- Abstract summary: This work introduces textbfAnyTaskTune, a novel fine-tuning methodology to elevate model performance on a diverse array of domain-specific tasks.
We conducted experiments across over twenty different sub-tasks derived from the domains of finance, healthcare, law, psychology, consumer services, and human resources.
Our findings demonstrate that models fine-tuned using the textbfTask-Fine-Tune methodology not only achieve superior performance on these specific tasks but also significantly outperform models with higher general capabilities in their respective domains.
- Score: 22.208506419271043
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The pervasive deployment of Large Language Models-LLMs in various sectors often neglects the nuanced requirements of individuals and small organizations, who benefit more from models precisely tailored to their specific business contexts rather than those with broadly superior general capabilities. This work introduces \textbf{AnyTaskTune}, a novel fine-tuning methodology coined as \textbf{Task-Fine-Tune}, specifically developed to elevate model performance on a diverse array of domain-specific tasks. This method involves a meticulous process to identify and define targeted sub-tasks within a domain, followed by the creation of specialized enhancement datasets for fine-tuning, thereby optimizing task-specific model performance. We conducted comprehensive fine-tuning experiments not only in the legal domain for tasks such as keyword extraction and sentence prediction but across over twenty different sub-tasks derived from the domains of finance, healthcare, law, psychology, consumer services, and human resources. To substantiate our approach and facilitate community engagement, we will open-source these bilingual task datasets. Our findings demonstrate that models fine-tuned using the \textbf{Task-Fine-Tune} methodology not only achieve superior performance on these specific tasks but also significantly outperform models with higher general capabilities in their respective domains. Our work is publicly available at \url{https://github.com/PandaVT/DataTager}.
Related papers
- Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner [12.360598915420255]
We propose textbfSODP, a two-stage framework that learns a textbfDiffusion textbfPlanner, which is generalizable for various downstream tasks.
In the pre-training stage, we train a foundation diffusion planner that extracts general planning capabilities by modeling the versatile distribution of multi-task trajectories.
Then for downstream tasks, we adopt RL-based fine-tuning with task-specific rewards to fast refine the diffusion planner.
arXiv Detail & Related papers (2024-09-30T05:05:37Z) - BoRA: Bayesian Hierarchical Low-Rank Adaption for Multi-task Large Language Models [0.0]
This paper introduces Bayesian Hierarchical Low-Rank Adaption (BoRA), a novel method for finetuning multi-task Large Language Models (LLMs)
BoRA addresses trade-offs by leveraging a Bayesian hierarchical model that allows tasks to share information through global hierarchical priors.
Our experimental results show that BoRA outperforms both individual and unified model approaches, achieving lower perplexity and better generalization across tasks.
arXiv Detail & Related papers (2024-07-08T06:38:50Z) - Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little, or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - MultiMatch: Multi-task Learning for Semi-supervised Domain Generalization [55.06956781674986]
We resort to solving the semi-supervised domain generalization task, where there are a few label information in each source domain.
We propose MultiMatch, extending FixMatch to the multi-task learning framework, producing the high-quality pseudo-label for SSDG.
A series of experiments validate the effectiveness of the proposed method, and it outperforms the existing semi-supervised methods and the SSDG method on several benchmark DG datasets.
arXiv Detail & Related papers (2022-08-11T14:44:33Z) - Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning across various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z) - Semi-supervised Multi-task Learning for Semantics and Depth [88.77716991603252]
Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance.
We propose the Semi-supervised Multi-Task Learning (MTL) method to leverage the available supervisory signals from different datasets.
We present a domain-aware discriminator structure with various alignment formulations to mitigate the domain discrepancy issue among datasets.
arXiv Detail & Related papers (2021-10-14T07:43:39Z) - XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation [80.18830380517753]
We develop a new task-agnostic distillation framework XtremeDistilTransformers.
We study the transferability of several source tasks, augmentation resources and model architecture for distillation.
arXiv Detail & Related papers (2021-06-08T17:49:33Z) - Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
arXiv Detail & Related papers (2021-04-28T16:45:56Z) - Modelling Latent Skills for Multitask Language Generation [15.126163032403811]
We present a generative model for multitask conditional language generation.
Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks.
We instantiate this task embedding space as a latent variable in a latent variable sequence-to-sequence model.
arXiv Detail & Related papers (2020-02-21T20:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.