Effective Few-Shot Named Entity Linking by Meta-Learning
- URL: http://arxiv.org/abs/2207.05280v1
- Date: Tue, 12 Jul 2022 03:23:02 GMT
- Title: Effective Few-Shot Named Entity Linking by Meta-Learning
- Authors: Xiuxing Li, Zhenyu Li, Zhengyan Zhang, Ning Liu, Haitao Yuan, Wei
Zhang, Zhiyuan Liu, Jianyong Wang
- Abstract summary: We propose a novel weak supervision strategy to generate non-trivial synthetic entity-mention pairs.
We also design a meta-learning mechanism to assign different weights to each synthetic entity-mention pair automatically.
Experiments on real-world datasets show that the proposed method substantially improves the state-of-the-art few-shot entity linking model.
- Score: 34.70028855572534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entity linking aims to link ambiguous mentions to their corresponding
entities in a knowledge base, which is significant and fundamental for various
downstream applications, e.g., knowledge base completion, question answering,
and information extraction. While great efforts have been devoted to this task,
most of these studies follow the assumption that large-scale labeled data is
available. However, when the labeled data is insufficient for specific domains
due to labor-intensive annotation work, the performance of existing algorithms
will suffer an intolerable decline. In this paper, we endeavor to solve the
problem of few-shot entity linking, which only requires a minimal amount of
in-domain labeled data and is more practical in real situations. Specifically,
we first propose a novel weak supervision strategy to generate non-trivial
synthetic entity-mention pairs based on mention rewriting. Since the quality of
the synthetic data has a critical impact on effective model training, we
further design a meta-learning mechanism to assign different weights to each
synthetic entity-mention pair automatically. In this way, we can fully
exploit the rich semantic information in the synthetic data to derive a
well-trained entity linking model under the few-shot setting. Experiments
on real-world datasets show that the proposed method substantially improves
the state-of-the-art few-shot entity linking model and achieves impressive
performance when only a small amount of labeled data is available. Moreover,
we also demonstrate the model's strong transferability.
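As a concrete illustration of the two ideas above, the sketch below pairs a toy mention-rewriting rule with a learning-to-reweight meta-step that scores each synthetic entity-mention pair by how much a virtual update on it reduces the loss on the few clean in-domain examples. This is a minimal sketch, not the authors' released code: the alias-substitution rewrite, the function names (`rewrite_mention`, `meta_reweight`), and the use of the general learning-to-reweight recipe in place of the paper's exact mechanism are all illustrative assumptions.

```python
# A minimal sketch, NOT the paper's implementation. Assumes PyTorch >= 2.0
# (for torch.func.functional_call) and a classifier-style linker whose
# forward maps encoded pairs to entity logits.
import torch
import torch.nn.functional as F
from torch.func import functional_call


def rewrite_mention(context: str, mention: str, alias: str) -> str:
    """Hypothetical weak-supervision rewrite: replace the mention's surface
    form with an entity alias to create a harder, non-trivial synthetic pair."""
    return context.replace(mention, alias)


def meta_reweight(model, synth_x, synth_y, clean_x, clean_y, inner_lr=0.1):
    """Assign a weight to each synthetic entity-mention pair via a virtual
    SGD step (learning-to-reweight style): pairs whose gradient direction
    lowers the clean few-shot loss receive positive weight."""
    params = {k: v.detach().clone().requires_grad_(True)
              for k, v in model.named_parameters()}

    # Per-example weights start at zero so their gradient isolates each
    # synthetic pair's marginal effect on the clean loss.
    eps = torch.zeros(synth_y.size(0), requires_grad=True)
    losses = F.cross_entropy(functional_call(model, params, (synth_x,)),
                             synth_y, reduction="none")
    weighted_loss = (eps * losses).sum()

    # Virtual update: params' = params - inner_lr * d(weighted_loss)/d(params).
    grads = torch.autograd.grad(weighted_loss, list(params.values()),
                                create_graph=True)
    virtual = {k: p - inner_lr * g
               for (k, p), g in zip(params.items(), grads)}

    # Clean few-shot loss after the virtual step; its gradient w.r.t. eps
    # measures how helpful each synthetic pair would have been.
    clean_loss = F.cross_entropy(functional_call(model, virtual, (clean_x,)),
                                 clean_y)
    eps_grad = torch.autograd.grad(clean_loss, eps)[0]

    # Keep only helpful pairs (negative gradient) and normalize.
    w = torch.clamp(-eps_grad, min=0.0)
    return w / (w.sum() + 1e-8)
```

In training, one would then take a real optimizer step on the weighted synthetic loss plus the clean few-shot loss; the clamp-and-normalize at the end mirrors the common learning-to-reweight heuristic of discarding synthetic pairs whose gradients hurt the clean objective.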
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs).
Our work delves into the specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z)
- Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation [9.068045557591612]
We propose a cost-effective alternative that harnesses multi-source data with only partial or sparse segmentation labels for training.
We devise strategies for model self-disambiguation, prior knowledge incorporation, and imbalance mitigation to tackle challenges associated with inconsistently labeled multi-source data.
arXiv Detail & Related papers (2023-11-17T18:28:32Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Does Synthetic Data Make Large Language Models More Efficient? [0.0]
This paper explores the nuances of synthetic data generation in NLP.
We highlight its advantages, including data augmentation potential and the introduction of structured variety.
We demonstrate the impact of template-based synthetic data on the performance of modern transformer models.
arXiv Detail & Related papers (2023-10-11T19:16:09Z)
- Combining Public Human Activity Recognition Datasets to Mitigate Labeled Data Scarcity [1.274578243851308]
We propose a novel strategy to combine publicly available datasets with the goal of learning a generalized HAR model.
Our experimental evaluation, which includes experimenting with different state-of-the-art neural network architectures, shows that combining public datasets can significantly reduce the number of labeled samples.
arXiv Detail & Related papers (2023-06-23T18:51:22Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- Probing transfer learning with a model of synthetic correlated datasets [11.53207294639557]
Transfer learning can significantly improve the sample efficiency of neural networks.
We rethink a solvable model of synthetic data as a framework for modeling correlation between datasets.
We show that our model can capture a range of salient features of transfer learning with real data.
arXiv Detail & Related papers (2021-06-09T22:15:41Z)