The Echoes of the 'I': Tracing Identity with Demographically Enhanced Word Embeddings
- URL: http://arxiv.org/abs/2407.00340v1
- Date: Sat, 29 Jun 2024 06:59:35 GMT
- Title: The Echoes of the 'I': Tracing Identity with Demographically Enhanced Word Embeddings
- Authors: Ivan Smirnov
- Abstract summary: Identity is one of the most commonly studied constructs in social science.
This paper introduces a novel approach to studying identity by enhancing word embeddings with socio-demographic information.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identity is one of the most commonly studied constructs in social science. However, despite extensive theoretical work on identity, there remains a need for additional empirical data to validate and refine existing theories. This paper introduces a novel approach to studying identity by enhancing word embeddings with socio-demographic information. As a proof of concept, we demonstrate that our approach successfully reproduces and extends established findings regarding gendered self-views. Our methodology can be applied in a wide variety of settings, allowing researchers to tap into a vast pool of naturally occurring data, such as social media posts. Unlike similar methods already introduced in computer science, our approach allows for the study of differences between social groups. This could be particularly appealing to social scientists and may encourage the faster adoption of computational methods in the field.
Related papers
- A Primer on Word Embeddings: AI Techniques for Text Analysis in Social Work [0.0]
This paper introduces word embeddings to social work researchers.
We discuss fundamental concepts, technical foundations, and practical applications.
We conclude that successfully implementing embedding technologies in social work requires developing domain-specific models, creating accessible tools, and establishing best practices aligned with social work's ethical principles.
arXiv Detail & Related papers (2024-11-11T17:33:51Z)
- Causal Representation Learning from Multimodal Biological Observations [57.00712157758845]
We aim to develop flexible identification conditions for multimodal data.
We establish identifiability guarantees for each latent component, extending the subspace identification results from prior work.
Our key theoretical ingredient is the structural sparsity of the causal connections among distinct modalities.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
- Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models [51.43538150982291]
We study how to learn human-interpretable concepts from data.
Weaving together ideas from both fields, we show that concepts can be provably recovered from diverse data.
arXiv Detail & Related papers (2024-02-14T15:23:59Z)
- Getting aligned on representational alignment [93.08284685325674]
We examine the study of representational alignment in cognitive science, neuroscience, and machine learning.
Despite their overlapping interests, there is limited knowledge transfer between these fields.
We propose a unifying framework that can serve as a common language for research on representational alignment.
arXiv Detail & Related papers (2023-10-18T17:47:58Z)
- Self-supervised Hypergraph Representation Learning for Sociological Analysis [52.514283292498405]
We propose a fundamental methodology to support the further fusion of data mining techniques and sociological behavioral criteria.
First, we propose an effective hypergraph awareness and a fast line graph construction framework.
Second, we propose a novel hypergraph-based neural network to learn social influence flowing from users to users.
arXiv Detail & Related papers (2022-12-22T01:20:29Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Word Embedding for Social Sciences: An Interdisciplinary Survey [9.657531563610767]
We build a taxonomy to illustrate the methods and procedures used in the surveyed papers.
This survey also conducts a simple experiment to warn that common similarity measurements used in the literature could yield different results.
arXiv Detail & Related papers (2022-07-07T04:49:21Z)
- Subverting machines, fluctuating identities: Re-learning human categorization [1.3106063755117399]
The default paradigm in AI research envisions identity with essential attributes that are discrete and static.
In stark contrast, strands of thought within critical theory present a conception of identity as malleable and constructed entirely through interaction.
arXiv Detail & Related papers (2022-05-27T03:09:25Z)
- Rumor Detection with Self-supervised Learning on Texts and Social Graph [101.94546286960642]
We propose contrastive self-supervised learning on heterogeneous information sources, so as to reveal their relations and characterize rumors better.
We term this framework Self-supervised Rumor Detection (SRD).
Extensive experiments on three real-world datasets validate the effectiveness of SRD for automatic rumor detection on social media.
arXiv Detail & Related papers (2022-04-19T12:10:03Z)
- Gender Recognition in Informal and Formal Language Scenarios via Transfer Learning [11.048994919361034]
Recognition and identification of demographic traits such as gender, age, location, or personality based on text data can help to improve different marketing strategies.
This paper proposes the use of recurrent and convolutional neural networks, and a transfer learning strategy for gender recognition in documents written in informal and formal languages.
arXiv Detail & Related papers (2021-06-23T15:32:50Z)
- Friend or Foe: A Review and Synthesis of Computational Models of the Identity Labeling Problem [3.180013942295509]
We introduce the identity labeling problem - given an individual in a social situation, can we predict what identity(ies) they will be labeled with by someone else?
This problem remains a theoretical gap and methodological challenge, evidenced by the fact that models of social-cognition often sidestep the issue by treating identities as already known.
We build on insights from existing models to develop a new framework, entitled Latent Cognitive Social Spaces, that can incorporate multiple social cues including sentiment information, socio-demographic characteristics, and institutional associations to estimate the most culturally expected identity.
arXiv Detail & Related papers (2021-05-10T15:59:31Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.