Towards Unsupervised Recognition of Token-level Semantic Differences in
Related Documents
- URL: http://arxiv.org/abs/2305.13303v3
- Date: Fri, 20 Oct 2023 12:27:41 GMT
- Title: Towards Unsupervised Recognition of Token-level Semantic Differences in
Related Documents
- Authors: Jannis Vamvas and Rico Sennrich
- Abstract summary: We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
- Score: 61.63208012250885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatically highlighting words that cause semantic differences between two
documents could be useful for a wide range of applications. We formulate
recognizing semantic differences (RSD) as a token-level regression task and
study three unsupervised approaches that rely on a masked language model. To
assess the approaches, we begin with basic English sentences and gradually move
to more complex, cross-lingual document pairs. Our results show that an
approach based on word alignment and sentence-level contrastive learning has a
robust correlation to gold labels. However, all unsupervised approaches still
leave a large margin for improvement. Code to reproduce our experiments is
available at https://github.com/ZurichNLP/recognizing-semantic-differences
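The abstract mentions a word-alignment-based approach that scores tokens with contextual representations from a masked language model. The sketch below is a minimal illustration of that general idea, not the authors' exact method: it embeds the subword tokens of two sentences with a multilingual encoder, greedily aligns each token to its most similar counterpart by cosine similarity, and treats weakly aligned tokens as likely semantic differences. The model choice (xlm-roberta-base), the greedy max-similarity alignment, and the 1 − similarity score are assumptions made for illustration; see the linked repository for the actual implementation.

```python
# Minimal sketch (illustrative, not the paper's exact method): score token-level
# semantic differences between two sentences by aligning tokens via cosine
# similarity of contextual embeddings from a masked language model.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "xlm-roberta-base"  # assumed encoder; any multilingual masked LM could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def token_embeddings(text: str):
    """Return subword tokens and their contextual embeddings for one sentence."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return tokens, hidden

def difference_scores(text_a: str, text_b: str):
    """Score each token of text_a by 1 - max cosine similarity to any token of text_b."""
    tokens_a, emb_a = token_embeddings(text_a)
    _, emb_b = token_embeddings(text_b)
    emb_a = torch.nn.functional.normalize(emb_a, dim=-1)
    emb_b = torch.nn.functional.normalize(emb_b, dim=-1)
    sim = emb_a @ emb_b.T                      # (len_a, len_b) cosine similarities
    scores = 1.0 - sim.max(dim=1).values       # poorly aligned tokens score high
    return list(zip(tokens_a, scores.tolist()))

for tok, score in difference_scores(
    "The meeting was moved to Friday.",
    "The meeting was moved to Monday.",
):
    print(f"{tok}\t{score:.3f}")
```

In this toy example, the tokens that differ between the two sentences ("Friday" vs. "Monday") should receive the highest scores, while shared tokens align well and score near zero; the paper evaluates such token-level scores as a regression against gold labels.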
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods such as typos and word order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view
Contrastive Learning [54.523172171533645]
Cross-lingual named entity recognition (CrossNER) faces challenges stemming from uneven performance due to the scarcity of multilingual corpora.
We propose Multi-view Contrastive Learning for Cross-lingual Named Entity Recognition (mCL-NER).
Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of mCL-NER over prior data-driven and model-based approaches.
arXiv Detail & Related papers (2023-08-17T16:02:29Z) - RankCSE: Unsupervised Sentence Representations Learning via Learning to
Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z) - Stance Detection: A Practical Guide to Classifying Political Beliefs in Text [0.0]
This paper advances text analysis methods by precisely defining stance detection.
I present three distinct approaches: supervised classification, natural language inference, and in-context learning with generative language models.
I provide guidance on application and validation techniques, as well as coding tutorials for implementation.
arXiv Detail & Related papers (2023-05-02T18:49:12Z) - Semantic-aware Contrastive Learning for More Accurate Semantic Parsing [32.74456368167872]
We propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations.
Experiments on two standard datasets show that our approach achieves significant improvements over MLE baselines.
arXiv Detail & Related papers (2023-01-19T07:04:32Z) - Constructing Phrase-level Semantic Labels to Form Multi-Grained
Supervision for Image-Text Retrieval [48.20798265640068]
We introduce additional phrase-level supervision to better identify mismatched units in the text.
We construct text scene graphs for the matched sentences and extract entities and triples as the phrase-level labels.
For training, we propose multi-scale matching losses from both global and local perspectives.
arXiv Detail & Related papers (2021-09-12T14:21:15Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - Dynamic Semantic Matching and Aggregation Network for Few-shot Intent
Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.