A Multi-Format Transfer Learning Model for Event Argument Extraction via
Variational Information Bottleneck
- URL: http://arxiv.org/abs/2208.13017v2
- Date: Tue, 30 Aug 2022 02:32:01 GMT
- Title: A Multi-Format Transfer Learning Model for Event Argument Extraction via
Variational Information Bottleneck
- Authors: Jie Zhou and Qi Zhang and Qin Chen and Liang He and Xuanjing Huang
- Abstract summary: Event argument extraction (EAE) aims to extract arguments with given roles from texts.
We propose a multi-format transfer learning model with variational information bottleneck.
We conduct extensive experiments on three benchmark datasets, and obtain new state-of-the-art performance on EAE.
- Score: 68.61583160269664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event argument extraction (EAE) aims to extract arguments with given roles
from texts and has been widely studied in natural language processing. Most
previous works achieve good performance on specific EAE datasets with
dedicated neural architectures. However, these architectures are usually
difficult to adapt to new datasets or scenarios with different annotation schemas or
formats. Furthermore, they rely on large-scale labeled data for training, which
is often unavailable due to the high labeling cost. In this paper, we
propose a multi-format transfer learning model with a variational information
bottleneck, which exploits the information, especially the common knowledge,
in existing datasets to improve EAE on new datasets. Specifically, we introduce a
shared-specific prompt framework to learn both format-shared and
format-specific knowledge from datasets with different formats. To
further absorb the common knowledge for EAE and eliminate irrelevant noise,
we integrate a variational information bottleneck into our architecture to refine
the shared representation. We conduct extensive experiments on three benchmark
datasets and obtain new state-of-the-art performance on EAE.
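The abstract gives no implementation details, so the following is only a minimal sketch of the general variational information bottleneck idea it refers to: an encoder predicts a mean and log-variance for the shared representation, a latent code is sampled via the reparameterization trick, and a KL term against a standard normal prior penalizes task-irrelevant information. All names here (SharedVIB, hidden_dim, bottleneck_dim, beta) are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch only: a generic variational information bottleneck (VIB)
# layer that refines a shared representation. Not the paper's implementation.
import torch
import torch.nn as nn


class SharedVIB(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int, beta: float = 1e-3):
        super().__init__()
        self.mu_head = nn.Linear(hidden_dim, bottleneck_dim)      # mean of q(z|h)
        self.logvar_head = nn.Linear(hidden_dim, bottleneck_dim)  # log-variance of q(z|h)
        self.beta = beta  # weight of the KL (compression) term

    def forward(self, shared_repr: torch.Tensor):
        mu = self.mu_head(shared_repr)
        logvar = self.logvar_head(shared_repr)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # KL(q(z|h) || N(0, I)), averaged over the batch; discourages keeping
        # information in z that is irrelevant to the downstream task.
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
        return z, self.beta * kl


# Usage sketch: the weighted KL term would be added to the task loss.
vib = SharedVIB(hidden_dim=768, bottleneck_dim=128)
shared_repr = torch.randn(8, 768)  # e.g., format-shared representations
z, kl_loss = vib(shared_repr)
# total_loss = task_loss + kl_loss
```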
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Semantic-Aware Representation of Multi-Modal Data for Data Ingress: A Literature Review [1.8590097948961688]
Generative AI such as Large Language Models (LLMs) sees broad adoption to process multi-modal data such as text, images, audio, and video.
Managing this data efficiently has become a significant practical challenge in industry; twice as much data is not twice as good.
This study focuses on the different semantic-aware techniques to extract embeddings from mono-modal, multi-modal, and cross-modal data.
arXiv Detail & Related papers (2024-07-17T09:49:11Z)
- EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images [27.36816896426097]
Information Extraction from document images is challenging due to the high variability of layout formats.
We propose a novel approach, EIGEN, which combines rule-based methods with deep learning models using data programming approaches.
We empirically show that our EIGEN framework can significantly improve the performance of state-of-the-art deep models even when only a few labeled data instances are available.
arXiv Detail & Related papers (2023-11-23T13:20:42Z)
- Aggregating Intrinsic Information to Enhance BCI Performance through Federated Learning [29.65566062475597]
Insufficient data is a long-standing challenge for Brain-Computer Interfaces (BCI) when building high-performance deep learning models.
We propose a hierarchical personalized Federated Learning EEG decoding framework to surmount this challenge.
arXiv Detail & Related papers (2023-08-14T08:59:44Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
Our framework achieves state-of-the-art results with 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge can be more than is needed, since the output description may cover only the most significant facts.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.