An Effective System for Multi-format Information Extraction
- URL: http://arxiv.org/abs/2108.06957v1
- Date: Mon, 16 Aug 2021 08:25:17 GMT
- Title: An Effective System for Multi-format Information Extraction
- Authors: Yaduo Liu, Longhui Zhang, Shujuan Yin, Xiaofeng Zhao, Feiliang Ren
- Abstract summary: The 2021 Language and Intelligence Challenge is designed to evaluate information extraction from different dimensions.
Here we describe our system for this multi-format information extraction competition task.
Our system ranks No.4 on the test set leader-board of this multi-format information extraction task.
- Score: 1.027461951217988
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The multi-format information extraction task in the 2021 Language and
Intelligence Challenge is designed to comprehensively evaluate information
extraction from different dimensions. It consists of a multiple-slots relation
extraction subtask and two event extraction subtasks that extract events at
both the sentence and document levels. Here we describe our system for this
multi-format information extraction competition task. Specifically, for the
relation extraction subtask, we convert it to a traditional triple extraction
task and design a voting based method that makes full use of existing models.
For the sentence-level event extraction subtask, we convert it to a NER task
and use a pointer labeling based method for extraction. Furthermore,
considering the annotated trigger information may be helpful for event
extraction, we design an auxiliary trigger recognition model and use the
multi-task learning mechanism to integrate the trigger features into the event
extraction model. For the document-level event extraction subtask, we design an
Encoder-Decoder based method and propose a Transformer-alike decoder.
Finally, our system ranks No.4 on the test set leader-board of this
multi-format information extraction task, and its F1 scores for the relation
extraction, sentence-level event extraction, and document-level event
extraction subtasks are 79.887%, 85.179%, and 70.828%, respectively. The code
for our model is available at
{https://github.com/neukg/MultiIE}.
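The abstract describes casting sentence-level event extraction as an NER task solved with pointer labeling: each token receives a start score and an end score per label, and argument spans are decoded by pairing starts with ends. The snippet below is a minimal sketch of that decoding step, not the authors' code; the threshold and the greedy nearest-end pairing rule are illustrative assumptions.

```python
def decode_spans(start_probs, end_probs, threshold=0.5):
    """Decode (start, end) token-index pairs from per-token pointer scores.

    start_probs / end_probs: per-token probabilities that a token begins
    or ends an argument span for a given label.
    """
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [i for i, p in enumerate(end_probs) if p >= threshold]
    spans = []
    for s in starts:
        # Greedily pair each start with the nearest end at or after it.
        candidates = [e for e in ends if e >= s]
        if candidates:
            spans.append((s, candidates[0]))
    return spans

# Toy scores for: ["Acme", "hired", "Jane", "Doe", "yesterday"]
start_probs = [0.9, 0.1, 0.8, 0.2, 0.1]
end_probs   = [0.9, 0.1, 0.1, 0.7, 0.2]
print(decode_spans(start_probs, end_probs))  # [(0, 0), (2, 3)]
```

In practice the scores would come from a pretrained encoder with one start and one end classification head per event role; the decoding logic above stays the same.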
Related papers
- Are Triggers Needed for Document-Level Event Extraction? [16.944314894087075]
We provide the first investigation of the role of triggers for the more difficult and much less studied task of document-level event extraction.
We analyze their usefulness in multiple end-to-end and pipelined neural event extraction models for three document-level event extraction datasets.
Our research shows that trigger effectiveness varies based on the extraction task's characteristics and data quality, with basic, automatically-generated triggers serving as a viable alternative to human-annotated ones.
arXiv Detail & Related papers (2024-11-13T15:50:38Z)
- Automated Few-shot Classification with Instruction-Finetuned Language Models [76.69064714392165]
We show that AuT-Few outperforms state-of-the-art few-shot learning methods.
We also show that AuT-Few is the best ranking method across datasets on the RAFT few-shot benchmark.
arXiv Detail & Related papers (2023-05-21T21:50:27Z)
- Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents [1.6249267147413522]
This paper introduces a new information extraction model for business documents.
It takes advantage of both span extraction and sequence labeling.
The model is trained end-to-end to jointly optimize the two tasks.
arXiv Detail & Related papers (2022-05-26T15:37:24Z)
- AIFB-WebScience at SemEval-2022 Task 12: Relation Extraction First -- Using Relation Extraction to Identify Entities [0.0]
We present an end-to-end joint entity and relation extraction approach based on transformer-based language models.
In contrast to existing approaches, which perform entity and relation extraction in sequence, our system incorporates information from relation extraction into entity extraction.
arXiv Detail & Related papers (2022-03-10T12:19:44Z)
- Joint Multimedia Event Extraction from Video and Article [51.159034070824056]
We propose the first approach to jointly extract events from video and text articles.
First, we propose the first self-supervised multimodal event coreference model.
Second, we introduce the first multimodal transformer which extracts structured event information jointly from both videos and text documents.
arXiv Detail & Related papers (2021-09-27T03:22:12Z)
- Zero-Shot Information Extraction as a Unified Text-to-Triple Translation [56.01830747416606]
We cast a suite of information extraction tasks into a text-to-triple translation framework.
We formalize the task as a translation between task-specific input text and output triples.
We study the zero-shot performance of this framework on open information extraction.
arXiv Detail & Related papers (2021-09-23T06:54:19Z)
- Document-Level Event Argument Extraction by Conditional Generation [75.73327502536938]
Event extraction has long been treated as a sentence-level task in the IE community.
We propose a document-level neural event argument extraction model by formulating the task as conditional generation following event templates.
We also compile a new document-level event extraction benchmark dataset WikiEvents.
arXiv Detail & Related papers (2021-04-13T03:36:38Z)
- IMoJIE: Iterative Memory-Based Joint Open Information Extraction [37.487044478970965]
We present IMoJIE, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples.
IMoJIE outperforms CopyAttention by about 18 F1 pts, and a BERT-based strong baseline by 2 F1 pts.
arXiv Detail & Related papers (2020-05-17T07:04:08Z)
- Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding [40.13163091122463]
Event extraction is a difficult task since it requires a view of a larger context to determine which spans of text correspond to event role fillers.
We first investigate how end-to-end neural sequence models perform on document-level role filler extraction.
We show that our best system performs substantially better than prior work.
arXiv Detail & Related papers (2020-05-13T20:42:17Z)
- Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic Conditional Random Fields [67.51177964010967]
We compare different models for low resource multi-task sequence tagging that leverage dependencies between label sequences for different tasks.
We find that explicit modeling of inter-dependencies between task predictions outperforms single-task as well as standard multi-task models.
arXiv Detail & Related papers (2020-05-01T07:11:34Z)
- At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization [110.54963847339775]
We show that extracting full sentences introduces unnecessary and redundant content.
We propose extracting sub-sentential units based on the constituency parsing tree.
arXiv Detail & Related papers (2020-04-06T13:35:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.