SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence
Understanding
- URL: http://arxiv.org/abs/2308.10529v1
- Date: Mon, 21 Aug 2023 07:31:19 GMT
- Title: SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence
Understanding
- Authors: Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, Wei
Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu
Zhang, Pengjun Xie, Fei Huang, Yong Jiang
- Abstract summary: Large language models (LLMs) have shown impressive ability for open-domain NLP tasks.
We present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding.
- Score: 103.34092301324425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have shown impressive ability on open-domain
NLP tasks. However, LLMs are sometimes too unconstrained for natural language
understanding (NLU) tasks, which typically require restricted input and output formats.
Their performance on NLU tasks is highly sensitive to prompts and demonstrations,
and they have been shown to perform poorly on several representative NLU tasks, such
as event extraction and entity typing. To this end, we present SeqGPT, a
bilingual (i.e., English and Chinese) open-source autoregressive model
specially enhanced for open-domain natural language understanding. We express
all NLU tasks through two atomic tasks whose fixed instructions restrict the input
and output format while remaining "open" to arbitrarily varied label sets. The
model is first instruction-tuned on extremely fine-grained labeled data synthesized
by ChatGPT and then further fine-tuned on 233 different atomic tasks from 152
datasets across various domains. The experimental results show that SeqGPT has
decent classification and extraction ability and is capable of performing language
understanding tasks on unseen domains. We also conduct empirical studies on the
scaling of data and model size as well as on transfer across tasks. Our model is
accessible at https://github.com/Alibaba-NLP/SeqGPT.
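To make the two-atomic-task interface concrete, the sketch below shows how classification and extraction queries with user-supplied label sets might be issued through a standard Hugging Face causal-LM interface. The checkpoint id and the prompt template are illustrative assumptions, not the official SeqGPT format; see the linked repository for the released checkpoints and prompts.

```python
# Minimal sketch of the two atomic tasks (classification and extraction) with
# fixed instructions but an open, user-supplied label set. The checkpoint id
# and prompt template are assumptions for illustration, not the official
# SeqGPT format; see https://github.com/Alibaba-NLP/SeqGPT for the real ones.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Alibaba-NLP/SeqGPT"  # placeholder checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

def query(task: str, text: str, labels: list[str]) -> str:
    """Issue one atomic task: `task` is "classification" or "extraction"."""
    prompt = (
        f"Input: {text}\n"
        f"Task: {task}\n"
        f"Labels: {', '.join(labels)}\n"
        "Output:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

# Classification over an arbitrary label set.
print(query("classification",
            "The new phone's battery dies within two hours.",
            ["praise", "complaint", "question"]))

# Extraction (entity-typing style) over an arbitrary label set.
print(query("extraction",
            "Alan Turing was born in London in 1912.",
            ["person", "location", "date"]))
```

The point of the sketch is the shape of the interaction: the instruction and output format stay fixed across tasks, while the label set is supplied freely at inference time.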
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks on the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Scaling Behavior of Machine Translation with Large Language Models under Prompt Injection Attacks [4.459306403129608]
Large Language Models (LLMs) are increasingly becoming the preferred foundation platforms for many Natural Language Processing tasks.
Their generality opens them up to subversion by end users, who may embed instructions in their requests that cause the model to behave in unauthorized and possibly unsafe ways.
We study these Prompt Injection Attacks (PIAs) on multiple families of LLMs on a Machine Translation task, focusing on the effects of model size on the attack success rates.
arXiv Detail & Related papers (2024-03-14T19:39:10Z)
- Few-Shot Cross-Lingual Transfer for Prompting Large Language Models in Low-Resource Languages [0.0]
"prompting" is where a user provides a description of a task and some completed examples of the task to a PLM as context before prompting the PLM to perform the task on a new example.
We consider three methods: few-shot prompting (prompt), language-adaptive fine-tuning (LAFT), and neural machine translation (translate)
We find that translate and prompt settings are a compute-efficient and cost-effective method of few-shot prompting for the selected low-resource languages.
arXiv Detail & Related papers (2024-03-09T21:36:13Z)
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z)
- Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text [5.532477732693001]
We show that a large language model can serve as a highly effective few-shot semantic parser; a rough pipeline sketch appears after this list.
It can convert natural language sentences into a logical form that serves as input for answer set programs.
We demonstrate that this method achieves state-of-the-art performance on several benchmarks, including bAbI, StepGame, CLUTRR, and gSCAN.
arXiv Detail & Related papers (2023-07-15T03:29:59Z)
- EXnet: Efficient In-context Learning for Data-less Text classification [0.0]
We present EXnet, a model specifically designed to perform in-context learning without limitations on the number of examples.
We argue that in-context learning is an effective method to increase task accuracy, and providing examples facilitates cross-task generalization.
With extensive experiments, we show that even our smallest model (15M parameters) generalizes to several unseen classification tasks and domains.
arXiv Detail & Related papers (2023-05-24T01:40:57Z)
- Crosslingual Generalization through Multitask Finetuning [80.8822603322471]
Multitask prompted finetuning (MTF) has been shown to help large language models generalize to new tasks in a zero-shot setting.
We apply MTF to the pretrained multilingual BLOOM and mT5 model families to produce finetuned variants called BLOOMZ and mT0.
We find that finetuning large multilingual language models on English tasks with English prompts allows task generalization to non-English languages.
arXiv Detail & Related papers (2022-11-03T13:19:32Z)
- Multitask Prompted Training Enables Zero-Shot Task Generalization [70.12770442071657]
We develop a system for mapping general natural language tasks into a human-readable prompted form.
We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks.
The model attains strong zero-shot performance on several standard datasets, often outperforming models 16x its size.
arXiv Detail & Related papers (2021-10-15T17:08:57Z)
- Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start [125.23550801424328]
We introduce Universal Few-shot textual Entailment (UFO-Entail).
We demonstrate that this framework enables a pretrained entailment model to work well on new entailment domains in a few-shot setting.
arXiv Detail & Related papers (2020-10-06T09:50:25Z)
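The "prompting" setup described in the Few-Shot Cross-Lingual Transfer entry above can be made concrete with a minimal prompt-construction sketch: a task description, a handful of completed examples, and the new input to label. The template wording and the toy examples below are assumptions for illustration; they do not reproduce that paper's prompts.

```python
# Minimal sketch of few-shot prompt construction: task description, a handful
# of completed examples, then the new input for the PLM to complete. Template
# wording and examples are illustrative assumptions, not the paper's prompts.
TASK_DESCRIPTION = "Classify the sentiment of the sentence as positive or negative."

FEW_SHOT_EXAMPLES = [
    ("The food was wonderful.", "positive"),
    ("The service was painfully slow.", "negative"),
]

def build_prompt(new_sentence: str) -> str:
    lines = [TASK_DESCRIPTION, ""]
    for sentence, label in FEW_SHOT_EXAMPLES:
        lines += [f"Sentence: {sentence}", f"Label: {label}", ""]
    lines += [f"Sentence: {new_sentence}", "Label:"]  # the PLM continues from here
    return "\n".join(lines)

print(build_prompt("I would happily come back again."))
```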
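For the Coupling Large Language Models with Logic Programming entry, the described pipeline (an LLM converts sentences into logical facts that an answer set program then reasons over) might be wired up roughly as below. The clingo Python API is real, but the predicate names, the toy rules, and the stubbed "LLM" call are assumptions for illustration, not that paper's implementation.

```python
# Rough sketch of the LLM + answer set programming pipeline: an LLM (stubbed
# here) parses sentences into facts, and the clingo ASP solver reasons over
# them with hand-written rules. Predicates and rules are illustrative only.
from clingo import Control

def llm_to_facts(sentences):
    # Stand-in for a few-shot LLM call that converts each sentence to facts;
    # hardcoded for the toy bAbI-style story below.
    return """
    location(mary, kitchen).
    moved(mary, garden).
    """

STORY = ["Mary is in the kitchen.", "Mary went to the garden."]

# Tiny rule base: a recorded move overrides the initially stated location.
RULES = """
    has_moved(P) :- moved(P, _).
    where(P, L) :- moved(P, L).
    where(P, L) :- location(P, L), not has_moved(P).
"""

ctl = Control()
ctl.add("base", [], llm_to_facts(STORY) + RULES)
ctl.ground([("base", [])])
ctl.solve(on_model=lambda model: print("Answer set:", model))
```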
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.