Cultural Cartography with Word Embeddings
- URL: http://arxiv.org/abs/2007.04508v4
- Date: Mon, 3 May 2021 21:13:27 GMT
- Title: Cultural Cartography with Word Embeddings
- Authors: Dustin S. Stoltz and Marshall A. Taylor
- Abstract summary: We show how word embeddings are commensurate with prevailing theories of meaning in sociology.
First, one can hold terms constant and measure how the embedding space moves around them.
Second, one can also hold the embedding space constant and see how documents or authors move relative to it.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Using the frequency of keywords is a classic approach in the formal analysis
of text, but has the drawback of glossing over the relationality of word
meanings. Word embedding models overcome this problem by constructing a
standardized and continuous "meaning space" where words are assigned a location
based on their relations of similarity to other words, as observed in
natural language samples. We show how word embeddings are commensurate with
prevailing theories of meaning in sociology and can be put to the task of
interpretation via two kinds of navigation. First, one can hold terms constant
and measure how the embedding space moves around them--much like astronomers
measured the shifting positions of celestial bodies across the seasons. Second, one can
also hold the embedding space constant and see how documents or authors move
relative to it--just as ships use the stars on a given night to determine their
location. Using the empirical case of immigration discourse in the United
States, we demonstrate the merits of these two broad strategies for advancing
important topics in cultural theory, including social marking, media fields,
echo chambers, and cultural diffusion and change more broadly.
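The two navigational strategies can be made concrete with a small sketch. The tiny hand-made vectors, the antonym-pair axis, and the example vocabulary below are all illustrative assumptions for demonstration, not the paper's actual data or code:

```python
import numpy as np

# Tiny hand-made "embedding" standing in for a trained model such as
# word2vec or GloVe; all vectors and the vocabulary are illustrative.
emb = {
    "immigrant": np.array([0.9, 0.1, 0.2]),
    "citizen":   np.array([0.7, 0.3, 0.1]),
    "illegal":   np.array([0.1, 0.9, 0.0]),
    "legal":     np.array([0.2, 0.1, 0.9]),
    "border":    np.array([0.8, 0.6, 0.1]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Strategy two: hold the space constant and locate a document in it.
# Summarize the document by the centroid of its word vectors, then
# project the centroid onto a semantic axis built from an antonym pair.
axis = emb["legal"] - emb["illegal"]              # a "legality" axis
doc = ["immigrant", "border", "illegal"]
centroid = np.mean([emb[w] for w in doc], axis=0)
position = cosine(centroid, axis)                 # negative: tilts to the "illegal" pole

# Strategy one: hold a term constant and ask how the space moves around
# it, e.g. by listing the term's nearest neighbors; refitting the model
# on a different corpus or period and re-running this reveals the shift.
def neighbors(model, term, k=2):
    sims = {w: cosine(model[term], v) for w, v in model.items() if w != term}
    return sorted(sims, key=sims.get, reverse=True)[:k]

closest = neighbors(emb, "immigrant")             # ["citizen", "border"]
```

Holding the space fixed locates documents (the ships-and-stars navigation); refitting the space on different corpora or periods and comparing a term's neighbors implements the astronomers' navigation.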
Related papers
- From cart to truck: meaning shift through words in English in the last two centuries
This onomasiological study uses diachronic word embeddings to explore how different words represented the same concepts over time.
We identify shifts in energy, transport, entertainment, and computing domains, revealing connections between language and societal changes.
arXiv Detail & Related papers (2024-08-29)
- RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
Text-to-image customization aims to synthesize text-driven images for the given subjects.
Existing works follow the pseudo-word paradigm, i.e., represent the given subjects as pseudo-words and then compose them with the given text.
We present RealCustom that, for the first time, disentangles similarity from controllability by precisely limiting subject influence to relevant parts only.
arXiv Detail & Related papers (2024-03-01)
- Neighboring Words Affect Human Interpretation of Saliency Explanations
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of the word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04)
- Dialectograms: Machine Learning Differences between Discursive Communities
We take a step towards leveraging the richness of the full embedding space by using word embeddings to map out how words are used differently.
We provide a new measure of the degree to which words are used differently that overcomes the tendency of existing measures to pick out low-frequency or polysemous words.
arXiv Detail & Related papers (2023-02-11)
- HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding
We present a novel framework that introduces hyperbolic embeddings to represent words and topics.
With the tree-likeness property of hyperbolic space, the underlying semantic hierarchy can be better exploited to mine more interpretable topics.
arXiv Detail & Related papers (2022-10-16)
- Latent Topology Induction for Understanding Contextualized Representations
We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
arXiv Detail & Related papers (2022-06-03)
- Towards a Theoretical Understanding of Word and Relation Representation
Representing words by vectors, or embeddings, enables computational reasoning.
We focus on word embeddings learned from text corpora and knowledge graphs.
arXiv Detail & Related papers (2022-02-01)
- Theoretical foundations and limits of word embeddings: what types of meaning can they capture?
Measuring meaning is a central problem in cultural sociology.
I theorize the ways in which word embeddings model three core premises of a structural linguistic theory of meaning.
arXiv Detail & Related papers (2021-07-22)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30)
- Lexical semantic change for Ancient Greek and Latin
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22)
- Enriching Word Embeddings with Temporal and Spatial Information
We present a model for learning word representation conditioned on time and location.
We train our model on time- and location-stamped corpora, and show using both quantitative and qualitative evaluations that it can capture semantics across time and locations.
arXiv Detail & Related papers (2020-10-02)
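Several of the entries above ("From cart to truck", "Fake it Till You Make it", "Lexical semantic change for Ancient Greek and Latin") detect meaning shift by comparing embedding spaces trained on different time slices, which requires aligning the spaces first. A minimal sketch of one standard alignment technique, orthogonal Procrustes, on synthetic data; the vocabulary, matrices, and injected drift below are assumptions for demonstration, not any listed paper's actual method or data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for two embedding matrices trained on different
# time slices, rows indexed by a shared vocabulary (synthetic data; the
# papers above use real diachronic corpora).
vocab = ["cart", "truck", "horse", "engine", "road", "wheel", "driver", "street"]
X = rng.normal(size=(8, 4))                      # e.g. a 19th-century space
R = np.linalg.qr(rng.normal(size=(4, 4)))[0]     # an arbitrary rotation
Y = X @ R                                        # a 20th-century space: same
Y[vocab.index("cart")] += 2.0                    # geometry, but "cart" drifts

# Orthogonal Procrustes: find the rotation Q minimizing ||XQ - Y||_F so
# the two spaces become comparable word by word.
U, _, Vt = np.linalg.svd(X.T @ Y)
Q = U @ Vt

# Per-word drift: distance between a word's aligned old position and its
# new position; large values flag candidates for semantic change.
drift = np.linalg.norm(X @ Q - Y, axis=1)
most_changed = vocab[int(np.argmax(drift))]      # "cart"
```

Ranking words by their drift score after alignment surfaces candidates whose usage has shifted between the two corpora, while words whose geometry is preserved score near zero.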
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.