SkillNet-NLG: General-Purpose Natural Language Generation with a
Sparsely Activated Approach
- URL: http://arxiv.org/abs/2204.12184v1
- Date: Tue, 26 Apr 2022 09:37:01 GMT
- Title: SkillNet-NLG: General-Purpose Natural Language Generation with a
Sparsely Activated Approach
- Authors: Junwei Liao, Duyu Tang, Fan Zhang, Shuming Shi
- Abstract summary: SkillNet-NLG is a sparsely activated approach that handles many natural language generation tasks with one model.
We evaluate on Chinese natural language generation tasks.
- Score: 32.79493780508332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present SkillNet-NLG, a sparsely activated approach that handles many
natural language generation tasks with one model. Different from traditional
dense models that always activate all the parameters, SkillNet-NLG selectively
activates relevant parts of the parameters to accomplish a task, where the
relevance is controlled by a set of predefined skills. The strength of such a
model design is that it provides an opportunity to precisely adapt relevant
skills to learn new tasks effectively. We evaluate SkillNet-NLG on Chinese
natural language generation tasks. Results show that, with only one model file,
SkillNet-NLG outperforms the previous best-performing methods on four of five
tasks. SkillNet-NLG also performs better than two multi-task learning baselines
(a dense model and a Mixture-of-Experts model) and achieves performance
comparable to task-specific models. Lastly, SkillNet-NLG surpasses the baseline
systems when adapted to new tasks.
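To make the sparse-activation idea concrete, here is a minimal PyTorch-style sketch of a feed-forward sub-layer composed of predefined skill modules, of which only the skills relevant to the current task are executed. The skill names, the task-to-skill mapping, and the averaging of skill outputs below are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class SkillFFN(nn.Module):
    """Feed-forward sub-layer made of per-skill modules.

    Only the modules whose skills are relevant to the current task are
    executed; the remaining parameters stay untouched (sparse activation).
    """

    def __init__(self, d_model: int, d_hidden: int, skills: list[str]):
        super().__init__()
        self.experts = nn.ModuleDict({
            s: nn.Sequential(nn.Linear(d_model, d_hidden),
                             nn.GELU(),
                             nn.Linear(d_hidden, d_model))
            for s in skills
        })

    def forward(self, x: torch.Tensor, active_skills: list[str]) -> torch.Tensor:
        # Average the outputs of the activated skill modules
        # (a simplifying assumption about how outputs are combined).
        outs = [self.experts[s](x) for s in active_skills]
        return torch.stack(outs, dim=0).mean(dim=0)

# Hypothetical skill inventory and task-to-skill mapping.
SKILLS = ["open_ended_generation", "conversation", "data_to_text",
          "summarization", "general"]
TASK_SKILLS = {
    "dialogue": ["open_ended_generation", "conversation", "general"],
    "advertisement_generation": ["data_to_text", "general"],
}

layer = SkillFFN(d_model=768, d_hidden=3072, skills=SKILLS)
x = torch.randn(2, 16, 768)               # (batch, seq_len, d_model)
y = layer(x, TASK_SKILLS["dialogue"])     # only 3 of the 5 skills run
print(y.shape)                            # torch.Size([2, 16, 768])
```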
Related papers
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z)
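As a loose illustration of the instruction-conditioned setup above, the snippet below shows how a single model could be pointed at different SLU tasks by a natural-language instruction. The task names and instruction wording are invented for illustration, and the real UniverSLU consumes speech rather than text transcripts.

```python
# Hypothetical instruction templates: one model, many SLU tasks,
# each selected at inference time by a natural-language instruction.
TASK_INSTRUCTIONS = {
    "intent_classification": "Classify the intent of the spoken utterance.",
    "emotion_recognition": "Identify the emotion expressed by the speaker.",
    "slot_filling": "List the slot-value pairs mentioned in the utterance.",
}

def build_prompt(task: str, transcript: str) -> str:
    """Prepend a task instruction so one model can switch behaviour per task."""
    return f"{TASK_INSTRUCTIONS[task]}\nUtterance: {transcript}\nAnswer:"

print(build_prompt("intent_classification", "play some jazz in the kitchen"))
```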
- SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills [51.74947795895178]
This paper proposes a general multilingual multitask model, named SkillNet-X.
We define several language-specific skills and task-specific skills, each of which corresponds to a skill module.
We evaluate SkillNet-X on eleven natural language understanding datasets in four languages.
arXiv Detail & Related papers (2023-06-28T12:53:30Z)
- Skill-Based Few-Shot Selection for In-Context Learning [123.26522773708683]
Skill-KNN is a skill-based few-shot selection method for in-context learning.
It does not require training or fine-tuning of any models, making it suitable for frequently expanding or changing example banks.
Experimental results across five cross-domain semantic parsing datasets and six backbone models show that Skill-KNN significantly outperforms existing methods.
arXiv Detail & Related papers (2023-05-23T16:28:29Z)
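A rough sketch of the skill-based selection idea above, assuming each candidate example and the test query are mapped to a skill embedding by some off-the-shelf encoder and compared with cosine similarity; the paper's exact skill representation may differ.

```python
import numpy as np

def select_demonstrations(query_skill_vec: np.ndarray,
                          bank_skill_vecs: np.ndarray,
                          k: int = 4) -> np.ndarray:
    """Return indices of the k examples whose skill embeddings are most
    similar (cosine) to the query's. No training is needed, so the example
    bank can grow or change freely."""
    q = query_skill_vec / np.linalg.norm(query_skill_vec)
    b = bank_skill_vecs / np.linalg.norm(bank_skill_vecs, axis=1, keepdims=True)
    sims = b @ q
    return np.argsort(-sims)[:k]

# Toy usage with random vectors standing in for skill embeddings.
rng = np.random.default_rng(0)
bank = rng.normal(size=(100, 384))   # 100 candidate examples
query = rng.normal(size=384)
print(select_demonstrations(query, bank, k=4))
```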
- One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code [26.40920402395547]
This paper presents an approach that excels at handling multiple modalities of information with a single model.
We develop our model for five modalities including text, image, sound, video and code.
Our model supports self-supervised pretraining in the same sparsely activated manner, resulting in better-initialized parameters for the different modalities.
arXiv Detail & Related papers (2022-05-12T14:39:21Z)
- One Model, Multiple Tasks: Pathways for Natural Language Understanding [34.58880663537492]
This paper presents a Pathways approach to handle many tasks at once.
Unlike prevailing single-purpose models that overspecialize on individual tasks and learn from scratch when extended to new tasks, our approach is general-purpose, with the ability to stitch together existing skills to learn new tasks more effectively.
arXiv Detail & Related papers (2022-03-07T11:48:09Z)
- GLGE: A New General Language Generation Evaluation Benchmark [139.25515221280767]
General Language Generation Evaluation (GLGE) is a new multi-task benchmark for evaluating the generalization capabilities of NLG models.
To encourage research on pretraining and transfer learning on NLG models, we make GLGE publicly available and build a leaderboard with strong baselines.
arXiv Detail & Related papers (2020-11-24T06:59:45Z)
- HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections [96.64246471034195]
We propose HyperGrid, a new approach for highly effective multi-task learning.
Our method helps bridge the gap between fine-tuning and multi-task learning approaches.
arXiv Detail & Related papers (2020-07-12T02:49:16Z)
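The "grid-wise decomposable hyper projection" can be loosely pictured as a task-conditioned gating grid, built from the outer product of generated row and column gates and applied to a shared weight matrix. The sketch below is only an interpretation of that idea; the actual HyperGrid parameterization (block-wise repetition, placement inside the Transformer feed-forward layer, and so on) differs in detail.

```python
import torch
import torch.nn as nn

class GridGatedLinear(nn.Module):
    """Shared linear layer whose weight is modulated, per task, by a
    low-rank (outer-product) gating grid generated from a task embedding."""

    def __init__(self, d_in: int, d_out: int, d_task: int):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out)      # weights shared across tasks
        self.to_rows = nn.Linear(d_task, d_out)   # generates row (output) gates
        self.to_cols = nn.Linear(d_task, d_in)    # generates column (input) gates

    def forward(self, x: torch.Tensor, task_emb: torch.Tensor) -> torch.Tensor:
        rows = torch.sigmoid(self.to_rows(task_emb))   # (d_out,)
        cols = torch.sigmoid(self.to_cols(task_emb))   # (d_in,)
        grid = torch.outer(rows, cols)                 # (d_out, d_in) gating grid
        return x @ (self.shared.weight * grid).T + self.shared.bias

layer = GridGatedLinear(d_in=512, d_out=2048, d_task=64)
x = torch.randn(8, 512)
task = torch.randn(64)           # a learned task embedding in practice
print(layer(x, task).shape)      # torch.Size([8, 2048])
```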
- Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning [70.81910984985683]
We propose an effective way to fine-tune multiple down-stream generation tasks simultaneously using a single, large pre-trained model.
The experiments on five diverse language generation tasks show that by using only an additional 2-3% of parameters for each task, our model can maintain or even improve the performance of fine-tuning the whole model.
arXiv Detail & Related papers (2020-04-08T06:18:44Z)
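To make the "additional 2-3% parameters per task" concrete, here is a generic adapter-style sketch in which small bottleneck modules are trained per task while the large pretrained model stays frozen; this illustrates parameter-efficient transfer in general, not the paper's exact module design.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck block added per task; only these weights are trained."""

    def __init__(self, d_model: int, d_bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(torch.relu(self.down(h)))   # residual connection

# Freeze the big pretrained model and train one tiny adapter per task.
# `pretrained_layer` stands in for a frozen sub-layer of a pretrained LM.
pretrained_layer = nn.Linear(768, 768)
for p in pretrained_layer.parameters():
    p.requires_grad = False

adapters = {task: Adapter(768) for task in ["summarization", "dialogue"]}

h = torch.randn(4, 768)
out = adapters["dialogue"](pretrained_layer(h))   # task-specific forward path
extra = sum(p.numel() for p in adapters["dialogue"].parameters())
print(f"adapter parameters per task: {extra:,}")  # tiny next to a full LM
```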
- Modelling Latent Skills for Multitask Language Generation [15.126163032403811]
We present a generative model for multitask conditional language generation.
Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks.
We model these skills in a shared task embedding space, which we instantiate as a latent variable in a latent-variable sequence-to-sequence model.
arXiv Detail & Related papers (2020-02-21T20:39:09Z)
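A toy sketch of the latent-skill idea above: each task is represented as a soft mixture over a small inventory of shared skill vectors, and the resulting task embedding would condition a sequence-to-sequence decoder. The paper treats the task embedding as a proper latent variable with probabilistic inference, which this deterministic sketch omits; all dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class LatentSkillTaskEmbedding(nn.Module):
    """Task embedding expressed as a soft mixture over shared skill vectors."""

    def __init__(self, num_tasks: int, num_skills: int, d_emb: int):
        super().__init__()
        self.skill_vectors = nn.Parameter(torch.randn(num_skills, d_emb))
        self.task_logits = nn.Parameter(torch.zeros(num_tasks, num_skills))

    def forward(self, task_id: int) -> torch.Tensor:
        weights = torch.softmax(self.task_logits[task_id], dim=-1)  # mixture over skills
        return weights @ self.skill_vectors                         # (d_emb,)

# The task embedding would condition a seq2seq decoder, e.g. by being
# concatenated to every decoder input step (one simple choice among many).
task_emb = LatentSkillTaskEmbedding(num_tasks=6, num_skills=4, d_emb=32)
z = task_emb(task_id=2)
print(z.shape)    # torch.Size([32])
```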