Embracing Ambiguity: Improving Similarity-oriented Tasks with Contextual
Synonym Knowledge
- URL: http://arxiv.org/abs/2211.10997v1
- Date: Sun, 20 Nov 2022 15:25:19 GMT
- Authors: Yangning Li, Jiaoyan Chen, Yinghui Li, Tianyu Yu, Xi Chen, Hai-Tao
Zheng
- Abstract summary: Contextual synonym knowledge is crucial for similarity-oriented tasks.
Most Pre-trained Language Models (PLMs) lack synonym knowledge due to inherent limitations of their pre-training objectives.
We propose PICSO, a flexible framework that supports the injection of contextual synonym knowledge from multiple domains into PLMs.
- Score: 30.010315144903885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual synonym knowledge is crucial for those similarity-oriented tasks
whose core challenge lies in capturing semantic similarity between entities in
their contexts, such as entity linking and entity matching. However, most
Pre-trained Language Models (PLMs) lack synonym knowledge due to inherent
limitations of their pre-training objectives such as masked language modeling
(MLM). Existing works which inject synonym knowledge into PLMs often suffer
from two severe problems: (i) neglecting the ambiguity of synonyms, and (ii)
undermining the semantic understanding of the original PLMs, which is caused by
the inconsistency between the exact semantic similarity of the synonyms and the
broad conceptual relevance learned from the original corpus. To address these
issues, we propose PICSO, a flexible framework that supports the injection of
contextual synonym knowledge from multiple domains into PLMs via a novel
entity-aware Adapter which focuses on the semantics of the entities (synonyms)
in the contexts. Meanwhile, PICSO stores the synonym knowledge in additional
parameters of the Adapter structure, which prevents it from corrupting the
semantic understanding of the original PLM. Extensive experiments demonstrate
that PICSO can dramatically outperform the original PLMs and other
knowledge- and synonym-injection models on four different similarity-oriented
tasks. In addition, experiments on GLUE show that PICSO also benefits general
natural language understanding tasks. Code and data will be made public.
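For intuition, the general adapter pattern the abstract relies on can be sketched in a few lines of PyTorch. The following is a minimal, generic bottleneck adapter with illustrative dimensions, not the PICSO implementation; its entity-aware design is omitted.

```python
# A minimal, generic bottleneck adapter sketch. This is NOT the PICSO
# implementation: the entity-aware parts are omitted and all dimensions
# are illustrative. It only shows the pattern described above, i.e.
# storing injected knowledge in extra parameters while the original
# PLM weights stay frozen.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # project down
        self.up = nn.Linear(bottleneck, hidden_size)    # project back up
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter only adds a learned delta,
        # so bypassing it recovers the original PLM behaviour.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

adapter = BottleneckAdapter()
frozen_layer_output = torch.randn(2, 16, 768)  # (batch, seq_len, hidden)
fused = adapter(frozen_layer_output)
print(fused.shape)  # torch.Size([2, 16, 768])
```

Because the adapter only adds a residual delta on top of frozen PLM activations, the injected knowledge lives entirely in the adapter's own parameters, which is what lets this family of methods avoid corrupting the original model's semantic understanding.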
Related papers
- Do LLMs Really Adapt to Domains? An Ontology Learning Perspective [2.0755366440393743]
Large Language Models (LLMs) have demonstrated unprecedented prowess across natural language processing tasks in a wide range of application domains.
Recent studies show that LLMs can be leveraged to perform lexical semantic tasks, such as Knowledge Base Completion (KBC) or Ontology Learning (OL).
This paper investigates the question: Do LLMs really adapt to domains and remain consistent in the extraction of structured knowledge, or do they only learn lexical senses instead of reasoning?
arXiv Detail & Related papers (2024-07-29T13:29:43Z)
- Improving Language Models Meaning Understanding and Consistency by
Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions that yield contradictory results.
We propose a practical approach that alleviates this inconsistent behaviour by improving PLMs' meaning awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For
Language Model Synonym-Aware Pretraining [17.68675964560931]
Many Pre-trained Language Models (PLMs) lack synonym knowledge due to the limitations of small-scale synsets and PLMs' pre-training objectives.
We propose a framework called Sem4SAP to mine synsets from an Open Knowledge Graph (Open-KG) and use the mined synsets for synonym-aware pre-training of language models.
We also propose two novel and effective synonym-aware pre-training methods for injecting synonym knowledge into PLMs.
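As a generic illustration only (the summary does not describe Sem4SAP's two actual pre-training methods, so the data structure and example below are invented), mined synsets can be turned into paraphrased training pairs by synonym substitution:

```python
# Illustrative only: a naive way to turn mined synsets into
# synonym-substituted pre-training pairs. `mined_synsets` is a
# hypothetical structure standing in for synsets mined from an Open-KG;
# the paper's actual pre-training methods are not reproduced here.
import random

mined_synsets = [
    {"acetylsalicylic acid", "aspirin", "ASA"},  # toy synset
]

def synonym_substitute(sentence: str, synsets) -> str:
    """Replace a synset member found in the sentence with a random
    synonym from the same synset, yielding a paraphrased pair."""
    for synset in synsets:
        for term in synset:
            if term in sentence:
                alternatives = [t for t in synset if t != term]
                if alternatives:
                    return sentence.replace(term, random.choice(alternatives))
    return sentence

original = "The patient was given aspirin after surgery."
print(synonym_substitute(original, mined_synsets))
```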
arXiv Detail & Related papers (2023-03-25T10:19:14Z)
- Imitation Learning-based Implicit Semantic-aware Communication Networks:
Multi-layer Representation and Collaborative Reasoning [68.63380306259742]
Despite their promising potential, semantic communications and semantic-aware networking are still in their infancy.
We propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of cloud data centers (CDCs) and edge servers to collaborate.
We introduce a new multi-layer representation of semantic information that takes into account both the hierarchical structure of implicit semantics and the personalized inference preferences of individual users.
arXiv Detail & Related papers (2022-10-28T13:26:08Z)
- Dictionary-Assisted Supervised Contrastive Learning [0.0]
We introduce the dictionary-assisted supervised contrastive learning (DASCL) objective, allowing researchers to leverage specialized dictionaries.
The text is first keyword-simplified: a common, fixed token replaces any word in the corpus that appears in the dictionary (or dictionaries) relevant to the concept of interest.
Combining DASCL with cross-entropy improves classification performance in few-shot learning settings and social science applications.
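The keyword-simplification step is concrete enough to sketch; a minimal version, with an invented concept dictionary and placeholder token, might look like:

```python
# A minimal sketch of the keyword-simplification step described above:
# every corpus word that appears in a concept dictionary is replaced by
# one common, fixed token. The dictionary and token are invented for
# illustration.
import re

economy_dictionary = {"inflation", "recession", "unemployment", "gdp"}
FIXED_TOKEN = "<econ>"

def keyword_simplify(text: str, dictionary: set, token: str) -> str:
    words = text.split()
    simplified = [token if re.sub(r"\W", "", w).lower() in dictionary else w
                  for w in words]
    return " ".join(simplified)

print(keyword_simplify("Inflation and unemployment rose sharply.",
                       economy_dictionary, FIXED_TOKEN))
# -> "<econ> and <econ> rose sharply."
```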
arXiv Detail & Related papers (2022-10-27T04:57:43Z)
- The Better Your Syntax, the Better Your Semantics? Probing Pretrained
Language Models for the English Comparative Correlative [7.03497683558609]
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics.
We present an investigation of PLMs' capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC).
Our results show that all three investigated PLMs are able to recognise the structure of the CC but fail to use its meaning.
arXiv Detail & Related papers (2022-10-24T13:01:24Z)
- Keywords and Instances: A Hierarchical Contrastive Learning Framework
Unifying Hybrid Granularities for Text Generation [59.01297461453444]
We propose a hierarchical contrastive learning mechanism, which can unify semantic meaning at hybrid granularities in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
arXiv Detail & Related papers (2022-05-26T13:26:03Z)
- Integrating Language Guidance into Vision-based Deep Metric Learning [78.18860829585182]
We propose to learn metric spaces that encode semantic similarities as embedding-space distances.
These spaces should be transferable to classes beyond those seen during training.
However, relying only on binary class assignments causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relations between classes.
arXiv Detail & Related papers (2022-03-16T11:06:50Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions at their positions.
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
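A rough sketch of this mask-and-predict idea, using the Hugging Face transformers API: mask a word shared by both texts and compare the MLM's predicted distributions at that position. The divergence shown is a plain KL; the paper's exact NDD formulation may differ.

```python
# Rough sketch of the mask-and-predict strategy described above: mask a
# shared (neighboring) word in each text, let an MLM predict the token
# distribution at that position, and compare the two distributions.
# Plain KL is used here; the paper's NDD formulation may differ.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def masked_distribution(text: str, word: str) -> torch.Tensor:
    """Log-distribution the MLM predicts at `word`'s position when masked."""
    masked = text.replace(word, tok.mask_token, 1)
    inputs = tok(masked, return_tensors="pt")
    pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, pos]
    return F.log_softmax(logits, dim=-1)

# "policy" is the shared word; its predicted distribution shifts with context.
a = masked_distribution("The committee approved the new policy.", "policy")
b = masked_distribution("The committee rejected the new policy.", "policy")
print(F.kl_div(a, b, log_target=True, reduction="sum").item())
```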
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional
semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for multi-sense embeddings (M-SE) from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
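For flavour, WordNet's sense graph can be walked with nltk; the toy random walk below over hypernym/hyponym edges only gestures at the kind of traversal such methods build on, not the paper's actual procedure.

```python
# Toy illustration of walking WordNet's semantic graph with nltk; the
# paper's graph walk over word senses is more elaborate, so this only
# shows how neighbouring senses can be enumerated and sampled.
import random
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def random_sense_walk(word: str, steps: int = 3):
    """Start at one sense of `word` and hop along hypernym/hyponym edges."""
    current = wn.synsets(word)[0]
    path = [current]
    for _ in range(steps):
        neighbors = current.hypernyms() + current.hyponyms()
        if not neighbors:
            break
        current = random.choice(neighbors)
        path.append(current)
    return path

for synset in random_sense_walk("bank"):
    print(synset.name(), "-", synset.definition())
```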
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in the pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
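As a generic illustration of the contrastive family ERICA belongs to (its actual entity- and relation-discrimination objectives differ in how positives and negatives are constructed), an InfoNCE-style loss over entity representations:

```python
# Generic InfoNCE-style contrastive loss over entity representations,
# shown only to illustrate the family of objectives; ERICA's actual
# losses build positives/negatives from entity pairs and relations.
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             negatives: torch.Tensor, temperature: float = 0.07):
    """anchor/positive: (d,); negatives: (n, d). Pulls the anchor toward
    its positive and pushes it away from the negatives."""
    pos_sim = F.cosine_similarity(anchor, positive, dim=0) / temperature
    neg_sim = F.cosine_similarity(anchor.unsqueeze(0), negatives) / temperature
    logits = torch.cat([pos_sim.view(1), neg_sim])
    return -F.log_softmax(logits, dim=0)[0]

d = 128
loss = info_nce(torch.randn(d), torch.randn(d), torch.randn(8, d))
print(loss.item())
```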
arXiv Detail & Related papers (2020-12-30T03:35:22Z)