Metadata Shaping: Natural Language Annotations for the Tail
- URL: http://arxiv.org/abs/2110.08430v1
- Date: Sat, 16 Oct 2021 01:00:47 GMT
- Title: Metadata Shaping: Natural Language Annotations for the Tail
- Authors: Simran Arora, Sen Wu, Enci Liu, Christopher Re
- Abstract summary: Language models (LMs) have made remarkable progress, but still struggle to generalize beyond the training data to rare linguistic patterns.
We propose metadata shaping, a method in which readily available metadata, such as entity descriptions and categorical tags, are appended to examples based on information theoretic metrics.
With no changes to the LM whatsoever, metadata shaping exceeds the BERT baseline by up to 5.3 F1 points and achieves or competes with state-of-the-art results.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models (LMs) have made remarkable progress, but still struggle to
generalize beyond the training data to rare linguistic patterns. Since rare
entities and facts are prevalent in the queries users submit to popular
applications such as search and personal assistant systems, improving the
ability of LMs to reliably capture knowledge over rare entities is a pressing
challenge studied in significant prior work. Noticing that existing approaches
primarily modify the LM architecture or introduce auxiliary objectives to
inject useful entity knowledge, we ask to what extent we can match the
quality of these architectures with a base LM architecture, changing only
the data. We propose metadata shaping, a method in which readily available
metadata, such as entity descriptions and categorical tags, are appended to
examples based on information theoretic metrics. Intuitively, if metadata
corresponding to popular entities overlap with metadata for rare entities, the
LM may be able to better reason about the rare entities using patterns learned
from similar popular entities. On standard entity-rich tasks (TACRED, FewRel,
OpenEntity), with no changes to the LM whatsoever, metadata shaping exceeds the
BERT baseline by up to 5.3 F1 points and achieves or competes with
state-of-the-art results. We further show the improvements are up to 10x larger
on examples containing tail versus popular entities.
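The shaping step described above can be sketched in code. This is a minimal illustration, not the paper's implementation: the example fields (`text`, `tags`, `label`, `description`), the `[SEP]` joining convention, and the use of pointwise mutual information as the information-theoretic metric are all assumptions made for the sketch.

```python
# Sketch of metadata shaping: append entity metadata (category tags and a
# description) to each input example, keeping only the tags that are most
# informative about the task label under a PMI score. Toy data and field
# names are hypothetical.
import math

def tag_label_pmi(examples, tag, label):
    """Pointwise mutual information between a metadata tag and a task label."""
    n = len(examples)
    p_tag = sum(tag in ex["tags"] for ex in examples) / n
    p_label = sum(ex["label"] == label for ex in examples) / n
    p_joint = sum(tag in ex["tags"] and ex["label"] == label for ex in examples) / n
    if p_joint == 0 or p_tag == 0 or p_label == 0:
        return float("-inf")
    return math.log(p_joint / (p_tag * p_label))

def shape(example, examples, top_k=2):
    """Return the example text with its top-k most informative tags appended."""
    scored = sorted(
        example["tags"],
        key=lambda t: tag_label_pmi(examples, t, example["label"]),
        reverse=True,
    )
    shaped = example["text"] + " [SEP] " + " ; ".join(scored[:top_k])
    if example.get("description"):
        shaped += " [SEP] " + example["description"]
    return shaped
```

The shaped string is then fed to the unmodified LM in place of the raw text; if a rare entity shares tags with popular training entities, the model can reuse patterns learned for those tags.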
Related papers
- GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding [39.67113788660731]
  We introduce a framework for developing Graph-aligned LAnguage Models (GLaM).
  We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning.
  arXiv: 2024-02-09T19:53:29Z
- Utilising a Large Language Model to Annotate Subject Metadata: A Case Study in an Australian National Research Data Catalogue [18.325675189960833]
  In support of open and reproducible research, a rapidly increasing number of datasets have been made available for research.
  As the availability of datasets increases, it becomes more important to have quality metadata for discovering and reusing them.
  This paper proposes to leverage large language models (LLMs) for cost-effective annotation of subject metadata through LLM-based in-context learning.
  arXiv: 2023-10-17T14:52:33Z
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
  We propose an alternative approach to precisely and robustly extract key information from document images.
  We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
  The proposed method achieves significantly enhanced performance on entity labeling and linking compared with previous state-of-the-art models.
  arXiv: 2023-03-23T08:21:16Z
- Multi-Modal Fusion by Meta-Initialization [0.0]
  We propose an extension to the Model-Agnostic Meta-Learning (MAML) algorithm.
  This allows the model to adapt using auxiliary information as well as task experience.
  FuMI significantly outperforms uni-modal baselines such as MAML in the few-shot regime.
  arXiv: 2022-10-10T17:00:58Z
- Meta Knowledge Condensation for Federated Learning [65.20774786251683]
  Existing federated learning paradigms usually extensively exchange distributed models at a central solver to achieve a more powerful model.
  This incurs a severe communication burden between the server and multiple clients, especially when data distributions are heterogeneous.
  Unlike existing paradigms, we introduce an alternative perspective to significantly decrease the communication cost in federated learning.
  arXiv: 2022-09-29T15:07:37Z
- A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck [68.61583160269664]
  Event argument extraction (EAE) aims to extract arguments with given roles from texts.
  We propose a multi-format transfer learning model with a variational information bottleneck.
  We conduct extensive experiments on three benchmark datasets and obtain new state-of-the-art performance on EAE.
  arXiv: 2022-08-27T13:52:01Z
- Entity Cloze By Date: What LMs Know About Unseen Entities [79.34707800653597]
  Language models (LMs) are typically trained once on a large-scale corpus and used for years without being updated.
  We propose a framework to analyze what LMs can infer about new entities that did not exist when the LMs were pretrained.
  We derive a dataset of entities indexed by their origination date and paired with their English Wikipedia articles, from which we can find sentences about each entity.
  arXiv: 2022-05-05T17:59:31Z
- Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
  Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
  AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
  Our framework achieves state-of-the-art results with an 8.21% improvement on average over methods based on supervised learning.
  arXiv: 2021-10-27T15:20:41Z
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification [59.326456778057384]
  We propose the Memory-based Multi-Source Meta-Learning framework to train a generalizable model for unseen domains.
  We also present a meta batch normalization layer (MetaBN) to diversify meta-test features.
  Experiments demonstrate that our M$3$L can effectively enhance the generalization ability of the model for unseen domains.
  arXiv: 2020-12-01T11:38:16Z
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.