Novel Aficionados and Doppelgängers: a referential task for semantic
representations of individual entities
- URL: http://arxiv.org/abs/2104.10270v1
- Date: Tue, 20 Apr 2021 22:24:19 GMT
- Title: Novel Aficionados and Doppelgängers: a referential task for semantic
representations of individual entities
- Authors: Andrea Bruera and Aurélie Herbelot
- Abstract summary: We show that the semantic distinction between proper names and common nouns is reflected in their linguistic distributions.
The results indicate that the distributional representations of different individual entities are less clearly distinguishable from each other than those of common nouns.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In human semantic cognition, proper names (names which refer to individual
entities) are harder to learn and retrieve than common nouns. This seems to be
the case for machine learning algorithms too, but the linguistic and
distributional reasons for this behaviour have not been investigated in depth
so far. To tackle this issue, we show that the semantic distinction between
proper names and common nouns is reflected in their linguistic distributions by
employing an original task for distributional semantics, the Doppelgänger
test, an extensive set of models, and a new dataset, the Novel Aficionados
dataset. The results indicate that the distributional representations of
different individual entities are less clearly distinguishable from each other
than those of common nouns, an outcome which intriguingly mirrors human
cognition.
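As a rough illustration of the kind of question a Doppelgänger-style test asks (this is a toy sketch, not the authors' actual protocol; the `confusability` measure, the random vectors, and the use of cosine similarity are all illustrative assumptions), one can ask how separable the distributional vectors of two referents are:

```python
# Toy sketch: how distinguishable are two referents' distributional
# vectors? High cross-referent similarity relative to within-referent
# similarity means the two entities are easily confused.
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def confusability(vectors_a, vectors_b):
    """Mean cross-referent similarity minus mean within-referent
    similarity; higher values mean the two referents are harder to
    tell apart from their vectors alone."""
    within = [cosine(u, v)
              for vecs in (vectors_a, vectors_b)
              for i, u in enumerate(vecs)
              for v in vecs[i + 1:]]
    across = [cosine(u, v) for u in vectors_a for v in vectors_b]
    return float(np.mean(across) - np.mean(within))

rng = np.random.default_rng(0)
base = rng.normal(size=50)
# Two "entities" drawn as small perturbations of a shared base vector:
# their mention vectors overlap heavily, so confusability is high.
entity_a = [base + rng.normal(scale=0.1, size=50) for _ in range(5)]
entity_b = [base + rng.normal(scale=0.1, size=50) for _ in range(5)]
print(confusability(entity_a, entity_b))
```

The paper's finding corresponds to individual entities scoring higher on this kind of measure than common nouns do.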
Related papers
- Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models [88.07940818022468]
We take an initial step on measuring the role of shared semantics among subwords in the encoder-only multilingual language models (mLMs)
We form "semantic tokens" by merging the semantically similar subwords and their embeddings.
Inspections of the grouped subwords show that they exhibit a wide range of semantic similarities.
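A minimal sketch of the "semantic token" idea, grouping subwords whose embeddings are similar and averaging them (the greedy threshold-based merge below is an illustrative assumption, not the paper's actual procedure):

```python
# Toy sketch: greedily merge subwords whose embeddings are close in
# cosine similarity into "semantic tokens", each represented by the
# mean of its members' embeddings.
import numpy as np

def merge_semantic_tokens(vocab, embeddings, threshold=0.9):
    """Return (groups, merged) where `groups` partitions `vocab` and
    `merged[i]` is the mean embedding of groups[i]."""
    norms = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    groups, merged = [], []
    for i, word in enumerate(vocab):
        for g_idx, group in enumerate(groups):
            centroid = merged[g_idx] / np.linalg.norm(merged[g_idx])
            if float(norms[i] @ centroid) > threshold:
                group.append(word)
                merged[g_idx] = np.mean(
                    [embeddings[vocab.index(w)] for w in group], axis=0)
                break
        else:  # no sufficiently similar group: start a new one
            groups.append([word])
            merged.append(embeddings[i].copy())
    return groups, merged

vocab = ["run", "runs", "cat"]
emb = np.array([[1.0, 0.0], [0.98, 0.05], [0.0, 1.0]])
groups, merged = merge_semantic_tokens(vocab, emb, threshold=0.9)
print(groups)  # → [['run', 'runs'], ['cat']]
```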
arXiv Detail & Related papers (2024-11-07T08:38:32Z)
- Identifying and interpreting non-aligned human conceptual representations using language modeling [0.0]
We show that congenital blindness induces conceptual reorganization in both a-modal and sensory-related verbal domains.
We find that blind individuals more strongly associate social and cognitive meanings to verbs related to motion.
For some verbs, representations of blind and sighted are highly similar.
arXiv Detail & Related papers (2024-03-10T13:02:27Z)
- Multicultural Name Recognition For Previously Unseen Names [65.268245109828]
This paper attempts to improve recognition of person names, a diverse category that can grow any time someone is born or changes their name.
I look at names from 103 countries to compare how well the model performs on names from different cultures.
I find that a model with combined character and word input outperforms word-only models and may improve accuracy compared to classical NER models.
arXiv Detail & Related papers (2024-01-23T17:58:38Z)
- Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization [51.89486520806639]
We propose a novel Synonym Generalization (SynGen) framework that recognizes the biomedical concepts contained in the input text using span-based predictions.
We extensively evaluate our approach on a wide range of benchmarks and the results verify that SynGen outperforms previous dictionary-based models by notable margins.
arXiv Detail & Related papers (2023-05-22T14:36:32Z)
- Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
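The claim that the mean alone misses semantic variation can be illustrated with a small sketch (the synthetic cohorts and the spread measure below are illustrative assumptions, not the paper's method): two cohorts of contextualised embeddings can share a mean yet differ sharply in spread.

```python
# Toy sketch: two cohorts of "contextualised embeddings" with nearly
# identical means but very different spreads. A mean-only comparison
# would call them the same; the full cohort reveals the difference.
import numpy as np

rng = np.random.default_rng(1)
center = np.zeros(8)
cohort_narrow = rng.normal(loc=center, scale=0.1, size=(100, 8))
cohort_wide = rng.normal(loc=center, scale=1.0, size=(100, 8))

mean_gap = np.linalg.norm(cohort_narrow.mean(0) - cohort_wide.mean(0))
spread_gap = abs(cohort_narrow.std() - cohort_wide.std())
# The means nearly coincide, but the spreads differ markedly:
print(float(mean_gap), float(spread_gap))
```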
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- A Linguistic Investigation of Machine Learning based Contradiction Detection Models: An Empirical Analysis and Future Perspectives [0.34998703934432673]
We analyze two Natural Language Inference data sets with respect to their linguistic features.
The goal is to identify those syntactic and semantic properties that are particularly hard to comprehend for a machine learning model.
arXiv Detail & Related papers (2022-10-19T10:06:03Z)
- Visual Comparison of Language Model Adaptation [55.92129223662381]
Adapters are lightweight alternatives for model adaptation.
In this paper, we discuss several designs and alternatives for interactive, comparative visual explanation methods.
We show that, for instance, an adapter trained on the language debiasing task according to context-0 embeddings introduces a new type of bias.
arXiv Detail & Related papers (2022-08-17T09:25:28Z)
- Multi-sense embeddings through a word sense disambiguation process [2.2344764434954256]
Most Suitable Sense Annotation (MSSA) disambiguates and annotates each word by its specific sense, considering the semantic effects of its context.
We test our approach on six different benchmarks for the word similarity task, showing that our approach can produce state-of-the-art results.
arXiv Detail & Related papers (2021-01-21T16:22:34Z)
- Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards [13.753240692520098]
We present a neural network architecture for joint coreference resolution and semantic role labeling for English.
We use reinforcement learning to encourage global coherence over the document and between semantic annotations.
This leads to improvements on both tasks in multiple datasets from different domains.
arXiv Detail & Related papers (2020-10-12T09:36:24Z)
- Pairwise Learning for Name Disambiguation in Large-Scale Heterogeneous Academic Networks [81.00481125272098]
We introduce Multi-view Attention-based Pairwise Recurrent Neural Network (MA-PairRNN) to solve the name disambiguation problem.
MA-PairRNN combines heterogeneous graph embedding learning and pairwise similarity learning into a framework.
Results on two real-world datasets demonstrate that our framework has a significant and consistent improvement of performance on the name disambiguation task.
arXiv Detail & Related papers (2020-08-30T06:08:20Z)
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOCOg, a standard benchmark for visual referring expression recognition.
We show that 83.7% of test instances do not require reasoning on linguistic structure.
We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.