Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For
Language Model Synonym-Aware Pretraining
- URL: http://arxiv.org/abs/2303.14425v1
- Date: Sat, 25 Mar 2023 10:19:14 GMT
- Title: Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For
Language Model Synonym-Aware Pretraining
- Authors: Zhouhong Gu, Sihang Jiang, Wenhao Huang, Jiaqing Liang, Hongwei Feng,
Yanghua Xiao
- Abstract summary: Many Pretrained Language Models (PLMs) lack synonym knowledge due to the limitations of small-scale synsets and PLMs' pretraining objectives.
We propose a framework called Sem4SAP to mine synsets from an Open Knowledge Graph (Open-KG) and use the mined synsets for synonym-aware pretraining of language models.
We also propose two novel and effective synonym-aware pretraining methods for injecting synonym knowledge into PLMs.
- Score: 17.68675964560931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A model's ability to understand synonymous expressions is crucial in many
kinds of downstream tasks. It helps the model better understand the similarity
between contexts and makes it more robust to synonym substitution attacks.
However, many Pretrained Language Models (PLMs) lack synonym knowledge due to the
limitations of small-scale synsets and PLMs' pretraining objectives. In this
paper, we propose a framework called Sem4SAP to mine synsets from an Open
Knowledge Graph (Open-KG) and use the mined synsets for synonym-aware
pretraining of language models. We propose to coarsely filter the content in the
Open-KG and to use frequency information to better guide the clustering process
under low-resource, unsupervised conditions. We expand the mined synsets by
migrating core semantics between synonymous expressions. We also propose two
novel and effective synonym-aware pretraining methods for injecting synonym
knowledge into PLMs. Extensive experiments demonstrate that Sem4SAP
dramatically outperforms the original PLMs and other baselines on ten different
tasks.
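As a rough illustration of the mining step, the sketch below clusters alias strings from an Open-KG into candidate synsets, visiting high-frequency expressions first so that rare, noisy aliases attach to established clusters rather than seeding new ones. This is not the paper's actual filtering or clustering algorithm; the surface-overlap similarity, threshold, and input format are illustrative assumptions.

```python
# Hypothetical sketch of frequency-guided synset clustering (not Sem4SAP's exact
# algorithm): group alias strings from an Open-KG into candidate synsets, ordered
# by corpus frequency.
from collections import Counter


def char_overlap(a: str, b: str) -> float:
    """Crude surface similarity; a real system would use a learned semantic score."""
    sa, sb = set(a.lower()), set(b.lower())
    return len(sa & sb) / max(len(sa | sb), 1)


def mine_synsets(alias_lists, sim_threshold=0.6):
    """alias_lists: iterable of alias groups taken from Open-KG entries.
    Returns a list of candidate synsets (lists of expressions)."""
    # Frequency over the whole corpus decides the clustering order.
    freq = Counter(alias for aliases in alias_lists for alias in aliases)
    expressions = sorted(freq, key=freq.get, reverse=True)

    synsets: list[list[str]] = []
    for expr in expressions:
        # Attach to the best-matching existing cluster, if any is similar enough.
        best, best_sim = None, sim_threshold
        for cluster in synsets:
            sim = max(char_overlap(expr, member) for member in cluster)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is not None:
            best.append(expr)
        else:
            synsets.append([expr])
    return synsets


if __name__ == "__main__":
    toy_kg = [["NYC", "New York City", "New York"], ["Big Apple", "New York City"]]
    print(mine_synsets(toy_kg))
```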
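Similarly hedged, here is one way mined synsets could be turned into synonym-aware pretraining data: an expression in a sentence is swapped for a mined synonym, and the original sentence is kept as the target, pushing the PLM to treat synonymous surface forms interchangeably. The paper's two actual pretraining objectives are not reproduced here; every function and variable name below is hypothetical.

```python
# Hypothetical illustration of building synonym-substitution pretraining pairs
# from mined synsets (not the paper's specific objectives).
import random


def make_synonym_aware_examples(sentences, synsets, seed=0):
    """Build (corrupted_input, original_target) pairs for pretraining.

    sentences: list of raw text strings.
    synsets:   list of lists of synonymous expressions mined from the Open-KG.
    """
    rng = random.Random(seed)
    examples = []
    for sent in sentences:
        for synset in synsets:
            if len(synset) < 2:
                continue
            # Find an expression from this synset that occurs in the sentence.
            hits = [e for e in synset if e in sent]
            if not hits:
                continue
            original = hits[0]
            substitute = rng.choice([e for e in synset if e != original])
            corrupted = sent.replace(original, substitute, 1)
            # The model sees the substituted sentence and reconstructs the original,
            # or (in a contrastive variant) is trained to embed both sentences nearby.
            examples.append((corrupted, sent))
    return examples


if __name__ == "__main__":
    sents = ["The Big Apple hosts the UN headquarters."]
    syns = [["New York City", "Big Apple", "NYC"]]
    print(make_synonym_aware_examples(sents, syns))
```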
Related papers
- Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or new tasks using a few samples as demonstrations within the prompts.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach, "vocabulary-defined semantics", based on latent space clustering.
arXiv Detail & Related papers (2024-01-29T14:29:48Z) - LLM-TAKE: Theme Aware Keyword Extraction Using Large Language Models [10.640773460677542]
We explore using Large Language Models (LLMs) to generate keywords for items, inferred from the items' textual metadata.
Our modeling framework includes several stages to refine the results by avoiding keywords that are non-informative or sensitive.
We propose two variations of the framework for generating extractive and abstractive themes for products in an E-commerce setting.
arXiv Detail & Related papers (2023-12-01T20:13:08Z) - Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics
Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z) - Embracing Ambiguity: Improving Similarity-oriented Tasks with Contextual
Synonym Knowledge [30.010315144903885]
Contextual synonym knowledge is crucial for similarity-oriented tasks.
Most Pre-trained Language Models (PLMs) lack synonym knowledge due to inherent limitations of their pre-training objectives.
We propose PICSO, a flexible framework that supports the injection of contextual synonym knowledge from multiple domains into PLMs.
arXiv Detail & Related papers (2022-11-20T15:25:19Z) - An Explanation of In-context Learning as Implicit Bayesian Inference [117.19809377740188]
We study the role of the pretraining distribution on the emergence of in-context learning.
We prove that in-context learning occurs implicitly via Bayesian inference of the latent concept.
We empirically find that scaling model size improves in-context accuracy even when the pretraining loss is the same.
arXiv Detail & Related papers (2021-11-03T09:12:33Z) - HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain
Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z) - Masked Language Modeling and the Distributional Hypothesis: Order Word
Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pretraining is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z) - Syntactic and Semantic-driven Learning for Open Information Extraction [42.65591370263333]
One of the biggest bottlenecks in building accurate, high-coverage neural open IE systems is the need for large labelled corpora.
We propose a syntactic and semantic-driven learning approach, which can learn neural open IE models without any human-labelled data.
arXiv Detail & Related papers (2021-03-05T02:59:40Z) - Introducing Syntactic Structures into Target Opinion Word Extraction
with Deep Learning [89.64620296557177]
We propose to incorporate the syntactic structures of the sentences into the deep learning models for targeted opinion word extraction.
We also introduce a novel regularization technique to improve the performance of the deep learning models.
The proposed model is extensively analyzed and achieves the state-of-the-art performance on four benchmark datasets.
arXiv Detail & Related papers (2020-10-26T07:13:17Z) - SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and
Synonym Discovery [66.24624547470175]
SynSetExpan is a novel framework that enables entity set expansion and synonym discovery to mutually enhance each other.
We create the first large-scale Synonym-Enhanced Set Expansion dataset via crowdsourcing.
Experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both entity set expansion and synonym discovery tasks.
arXiv Detail & Related papers (2020-09-29T07:32:17Z) - MICE: Mining Idioms with Contextual Embeddings [0.0]
Idiomatic expressions can be problematic for natural language processing applications.
We present an approach that uses contextual embeddings for that purpose.
We show that deep neural networks using both embeddings perform much better than existing approaches.
arXiv Detail & Related papers (2020-08-13T08:56:40Z)