Related papers: CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

URL: http://arxiv.org/abs/2305.05711v2
Date: Thu, 11 May 2023 01:27:43 GMT
Title: CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors
Authors: Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu
Abstract summary: Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks. In this paper, we propose to recast the structured output in the form of code instead of natural language.
Score: 92.17328076003628
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks. A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it. However, it is nontrivial to perform information extraction (IE) tasks with NL-LLMs since the output of the IE task is usually structured and therefore is hard to be converted into plain text. In this paper, we propose to recast the structured output in the form of code instead of natural language and utilize generative LLMs of code (Code-LLMs) such as Codex to perform IE tasks, in particular, named entity recognition and relation extraction. In contrast to NL-LLMs, we show that Code-LLMs can be well-aligned with these IE tasks by designing code-style prompts and formulating these IE tasks as code generation tasks. Experiment results on seven benchmarks show that our method consistently outperforms fine-tuning moderate-size pre-trained models specially designed for IE tasks (e.g., UIE) and prompting NL-LLMs under few-shot settings. We further conduct a series of in-depth analyses to demonstrate the merits of leveraging Code-LLMs for IE tasks.

Related papers

ADELIE: Aligning Large Language Models on Information Extraction [55.60192044049083]
Large language models (LLMs) usually fall short on information extraction tasks. In this paper, we introduce ADELIE, an aligned LLM that effectively solves various IE tasks. We show that our models achieve state-of-the-art (SoTA) performance among open-source models.
arXiv Detail & Related papers (2024-05-08T12:24:52Z)
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [59.07490387145391]
Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks. Their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language. We introduce a novel instruction tuning dataset, INTERS, encompassing 20 tasks across three fundamental IR categories.
arXiv Detail & Related papers (2024-01-12T12:10:28Z)
Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation. We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques. We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
Exploring Large Language Models for Code Explanation [3.2570216147409514]
Large Language Models (LLMs) have made remarkable strides in Natural Language Processing. This study specifically delves into the task of generating natural-language summaries for code snippets, using various LLMs.
arXiv Detail & Related papers (2023-10-25T14:38:40Z)
Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction [46.09887436555637]
This paper introduces a fine-grained IE benchmark dataset tailored for Large Language Models (LLMs) Through extensive evaluations, we observe that encoder-decoder models, particularly T5 and FLAN-T5, perform well in generalizing to unseen information types.
arXiv Detail & Related papers (2023-10-08T09:41:18Z)
Universal Information Extraction with Meta-Pretrained Self-Retrieval [39.69130086395689]
Universal Information Extraction(Universal IE) aims to solve different extraction tasks in a uniform text-to-structure generation manner. Retrieving knowledge from external knowledge bases may help models to overcome this problem but it is impossible to construct a knowledge base suitable for various IE tasks. We propose MetaRetriever to retrieve task-specific knowledge from PLMs to enhance universal IE.
arXiv Detail & Related papers (2023-06-18T00:16:00Z)
ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT [89.49161588240061]
Zero-shot information extraction (IE) aims to build IE systems from the unannotated text. Recent efforts on large language models (LLMs, e.g., GPT-3, ChatGPT) show promising performance on zero-shot settings. We transform the zero-shot IE task into a multi-turn question-answering problem with a two-stage framework (ChatIE)
arXiv Detail & Related papers (2023-02-20T12:57:12Z)
Language Models of Code are Few-Shot Commonsense Learners [106.1531522893209]
Given a natural language input, the goal is to generate a graph such as an event -- or a reasoning-graph. Existing approaches serialize the output graph as a flat list of nodes and edges. We show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language.
arXiv Detail & Related papers (2022-10-13T16:09:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.