ADELIE: Aligning Large Language Models on Information Extraction
- URL: http://arxiv.org/abs/2405.05008v1
- Date: Wed, 8 May 2024 12:24:52 GMT
- Title: ADELIE: Aligning Large Language Models on Information Extraction
- Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
- Abstract summary: Large language models (LLMs) usually fall short on information extraction tasks.
In this paper, we introduce ADELIE, an aligned LLM that effectively solves various IE tasks.
We show that our models achieve state-of-the-art (SoTA) performance among open-source models.
- Score: 55.60192044049083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow their complex instructions. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effectively solves various IE tasks, including closed IE, open IE, and on-demand IE. We first collect and construct a high-quality alignment corpus, IEInstruct, for IE. Then we train ADELIE_SFT using instruction tuning on IEInstruct. We further train ADELIE_SFT with a direct preference optimization (DPO) objective, resulting in ADELIE_DPO. Extensive experiments on various held-out IE datasets demonstrate that our models (ADELIE_SFT and ADELIE_DPO) achieve state-of-the-art (SoTA) performance among open-source models. We further explore the general capabilities of ADELIE, and experimental results reveal no noticeable decline in these capabilities. We will release the code, data, and models to facilitate further research.
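To make the two-stage recipe above concrete, the following is a minimal sketch of the standard DPO objective used in the second stage. It assumes a PyTorch setting; the function name, argument names, and beta value are illustrative assumptions, not code from the ADELIE release.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss over a batch of preference pairs.

    Each tensor holds the summed token log-probabilities of the chosen or
    rejected answer under the trained policy or the frozen reference model.
    """
    # Implicit rewards: how much more likely each answer is under the
    # policy than under the reference, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Minimize -log sigmoid of the reward margin, pushing the policy to
    # prefer the chosen answer more strongly than the reference does.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

In this setup the frozen reference would naturally be the ADELIE_SFT checkpoint, so the DPO stage only reweights outputs relative to what instruction tuning already produces.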
Related papers
- MetaIE: Distilling a Meta Model from LLM for All Kinds of Information Extraction Tasks [40.84745946091173]
We propose a novel framework, MetaIE, to build a small LM as a meta-model by learning to extract "important information".
Specifically, MetaIE obtains the small LM via a symbolic distillation from an LLM following the label-to-span scheme.
We construct the distillation dataset via sampling sentences from language model pre-training datasets.
We evaluate the meta-model under the few-shot adaptation setting.
arXiv Detail & Related papers (2024-03-30T19:43:45Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction [25.028613319081696]
We propose GoLLIE (Guideline-following Large Language Model for IE), a model able to improve zero-shot results on unseen IE tasks.
GoLLIE is able to generalize to and follow unseen guidelines, outperforming previous attempts at zero-shot information extraction (see the sketch after this entry).
arXiv Detail & Related papers (2023-10-05T16:43:13Z)
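A hypothetical sketch of the guideline-as-code idea GoLLIE describes: annotation guidelines are rendered as Python class definitions whose docstrings carry the instructions, and the model is prompted to instantiate them for a sentence. The class names, fields, and expected completion below are assumptions for illustration, not the paper's exact prompt format.

```python
import inspect
from dataclasses import dataclass

@dataclass
class Person:
    """A named human individual mentioned in the text."""
    span: str

@dataclass
class Organization:
    """A named company, agency, or institution mentioned in the text."""
    span: str

# The prompt concatenates the guideline classes (as source code) and the input.
guidelines = inspect.getsource(Person) + inspect.getsource(Organization)
prompt = f'{guidelines}\ntext = "Tim Cook spoke at Apple\'s keynote."\nresult ='
# A guideline-following model is expected to complete with code such as:
#   [Person(span="Tim Cook"), Organization(span="Apple")]
```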
- PIVOINE: Instruction Tuning for Open-world Information Extraction [53.98073623222221]
We consider the problem of Open-world Information Extraction (Open-world IE), which extracts comprehensive entity profiles from unstructured texts.
We develop a large language model (LLM) that is able to perform Open-world IE to extract desirable entity profiles characterized by (possibly fine-grained) natural language instructions.
In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions.
arXiv Detail & Related papers (2023-05-24T08:52:08Z)
- InteractiveIE: Towards Assessing the Strength of Human-AI Collaboration in Improving the Performance of Information Extraction [48.45550809455558]
We show how proxy human supervision on-the-fly (termed InteractiveIE) can boost the performance of learning template-based information extraction from documents.
Experiments on biomedical and legal documents, where obtaining training data is expensive, reveal encouraging trends of performance improvement using InteractiveIE over an AI-only baseline.
arXiv Detail & Related papers (2023-05-24T02:53:22Z)
- InstructIE: A Bilingual Instruction-based Information Extraction Dataset [44.65162892808696]
Large language models can perform well on general natural language tasks, but their effectiveness is still suboptimal for information extraction (IE).
Recent works indicate that the main reason lies in the lack of extensive data on IE instructions.
We introduce InstructIE, a bilingual instruction-based IE dataset, which covers 12 diverse domains.
arXiv Detail & Related papers (2023-05-19T08:51:11Z)
- CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors [92.17328076003628]
Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks.
In this paper, we propose to recast the structured output in the form of code instead of natural language (see the sketch after this entry).
arXiv Detail & Related papers (2023-05-09T18:40:31Z)
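The recasting CodeIE proposes can be pictured as below: the IE task is framed as completing a Python function skeleton, so a code LLM emits structured append calls rather than free text. The function name and record fields here are illustrative assumptions rather than the paper's exact format.

```python
def named_entity_extraction(input_text: str) -> list:
    """Extract named entities from input_text as (text, type) records."""
    entity_list = []
    # Given input_text = "Steve Jobs co-founded Apple in 1976.", a few-shot
    # prompted code LLM is expected to continue the function body with
    # structured append calls like these:
    entity_list.append({"text": "Steve Jobs", "type": "person"})
    entity_list.append({"text": "Apple", "type": "organization"})
    return entity_list
```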
- LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction [0.9966318185310058]
We introduce a new dataset by converting the QA-SRL 2.0 dataset to a large-scale Open Information Extraction (OIE) dataset (LSOIE).
Our LSOIE dataset is 20 times larger than the next largest human-annotated OIE dataset.
arXiv Detail & Related papers (2021-01-27T02:49:26Z)
- SciREX: A Challenge Dataset for Document-Level Information Extraction [56.83748634747753]
It is challenging to create a large-scale information extraction dataset at the document level.
We introduce SciREX, a document level IE dataset that encompasses multiple IE tasks.
We develop a neural model as a strong baseline that extends previous state-of-the-art IE models to document-level IE.
arXiv Detail & Related papers (2020-05-01T17:30:10Z)