From cart to truck: meaning shift through words in English in the last two centuries
- URL: http://arxiv.org/abs/2408.16209v1
- Date: Thu, 29 Aug 2024 02:05:39 GMT
- Title: From cart to truck: meaning shift through words in English in the last two centuries
- Authors: Esteban Rodríguez Betancourt, Edgar Casasola Murillo
- Abstract summary: This onomasiological study uses diachronic word embeddings to explore how different words represented the same concepts over time.
We identify shifts in energy, transport, entertainment, and computing domains, revealing connections between language and societal changes.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This onomasiological study uses diachronic word embeddings to explore how different words represented the same concepts over time, drawing on historical word data from 1800 to 2000. We identify shifts in the energy, transport, entertainment, and computing domains, revealing connections between language and societal change. Our approach consisted of training diachronic word embeddings with word2vec (skip-gram) and aligning them using orthogonal Procrustes. We discuss possible difficulties linked to the relationships the method identifies. Moreover, we examine the ethical aspects of interpreting the results, highlighting the need for expert insight to understand the method's significance.
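The alignment step the abstract describes can be sketched with orthogonal Procrustes: given two embedding matrices over a shared vocabulary, find the orthogonal rotation that best maps one space onto the other. A minimal sketch, assuming NumPy and toy random matrices in place of the paper's actual word2vec models:

```python
import numpy as np

def procrustes_align(source, target):
    """Find the orthogonal matrix Q minimizing ||source @ Q - target||_F
    (orthogonal Procrustes), and return the aligned source embeddings."""
    # SVD of the cross-covariance matrix between the two spaces
    u, _, vt = np.linalg.svd(source.T @ target)
    q = u @ vt  # orthogonal rotation (and possible reflection)
    return source @ q

# Toy check: a rotated copy of a random embedding matrix aligns back exactly.
rng = np.random.default_rng(0)
emb_1800 = rng.normal(size=(50, 8))            # 50 "words", 8 dimensions
rotation = np.linalg.qr(rng.normal(size=(8, 8)))[0]
emb_2000 = emb_1800 @ rotation                 # same space, rotated
aligned = procrustes_align(emb_1800, emb_2000)
print(np.allclose(aligned, emb_2000))          # True
```

After alignment, the cosine distance between a word's vector in the two periods gives a per-word shift signal; in practice the matrices would come from word2vec models trained on the corpora of each period.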
Related papers
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Neighboring Words Affect Human Interpretation of Saliency Explanations [65.29015910991261]
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of that word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04T09:50:25Z)
- Dialectograms: Machine Learning Differences between Discursive Communities [0.0]
We take a step towards leveraging the richness of the full embedding space by using word embeddings to map out how words are used differently.
We provide a new measure of the degree to which words are used differently, one that overcomes the tendency of existing measures to pick out low-frequency or polysemous words.
arXiv Detail & Related papers (2023-02-11T11:32:08Z)
- Towards a Theoretical Understanding of Word and Relation Representation [8.020742121274418]
Representing words by vectors, or embeddings, enables computational reasoning.
We focus on word embeddings learned from text corpora and knowledge graphs.
arXiv Detail & Related papers (2022-02-01T15:34:58Z)
- Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora [54.757845511368814]
The problem of comparing two bodies of text and searching for words that differ in their usage arises often in digital humanities and computational social science.
This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large.
We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word.
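One way such a neighbor-based comparison might look (an illustrative sketch under toy assumptions, not the paper's actual implementation): for each word, take its k nearest neighbors in each corpus's embedding space and score their overlap, with no vector-space alignment required.

```python
import numpy as np

def top_k_neighbors(embeddings, vocab, word, k=5):
    """Return the k nearest neighbors of `word` by cosine similarity."""
    idx = vocab.index(word)
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed[idx]   # cosine similarity to every word
    sims[idx] = -np.inf           # exclude the word itself
    return {vocab[i] for i in np.argsort(sims)[-k:]}

def neighbor_overlap(emb_a, emb_b, vocab, word, k=5):
    """Fraction of shared nearest neighbors across two corpora;
    a low score suggests the word is used differently."""
    shared = (top_k_neighbors(emb_a, vocab, word, k)
              & top_k_neighbors(emb_b, vocab, word, k))
    return len(shared) / k

# Toy sanity check: identical embedding spaces share every neighbor.
rng = np.random.default_rng(1)
vocab = [f"w{i}" for i in range(20)]
emb = rng.normal(size=(20, 6))
print(neighbor_overlap(emb, emb, vocab, "w0"))  # 1.0
```

Because the score compares neighbor identities rather than raw vectors, it sidesteps the alignment step entirely, which is the point of the proposed alternative.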
arXiv Detail & Related papers (2021-12-28T23:46:00Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- Competition in Cross-situational Word Learning: A Computational Study [10.069127121936281]
Children learn word meanings by tapping into the commonalities across different situations in which words are used.
In a set of computational studies, we show that to successfully learn word meanings in the face of uncertainty, a learner needs to use two types of competition.
arXiv Detail & Related papers (2020-12-06T20:32:56Z)
- UoB at SemEval-2020 Task 1: Automatic Identification of Novel Word Senses [0.6980076213134383]
This paper presents an approach to lexical semantic change detection based on Bayesian word sense induction suitable for novel word sense identification.
The same approach is also applied to a corpus gleaned from 15 years of Twitter data, the results of which are then used to identify words which may be instances of slang.
arXiv Detail & Related papers (2020-10-18T19:27:06Z)
- Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader.
arXiv Detail & Related papers (2020-09-12T17:20:01Z)
- Cultural Cartography with Word Embeddings [0.0]
We show how word embeddings are commensurate with prevailing theories of meaning in sociology.
First, one can hold terms constant and measure how the embedding space moves around them.
Second, one can also hold the embedding space constant and see how documents or authors move relative to it.
arXiv Detail & Related papers (2020-07-09T01:58:28Z)
- Temporal Embeddings and Transformer Models for Narrative Text Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, which are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.