Are Information Retrieval Approaches Good at Harmonising Longitudinal Survey Questions in Social Science?
- URL: http://arxiv.org/abs/2504.20679v1
- Date: Tue, 29 Apr 2025 12:00:33 GMT
- Authors: Wing Yan Li, Zeqiang Wang, Jon Johnson, Suparna De
- Abstract summary: We present a new information retrieval task to identify concept equivalence across question and response options. This paper investigates multiple unsupervised approaches on a survey dataset spanning 1946-2020. We show that IR-specialised neural models achieve the highest overall performance with other approaches performing comparably.
- Score: 2.769064123193329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated detection of semantically equivalent questions in longitudinal social science surveys is crucial for long-term studies informing empirical research in the social, economic, and health sciences. Retrieving equivalent questions faces dual challenges: inconsistent representation of theoretical constructs (i.e. concept/sub-concept) across studies as well as between question and response options, and the evolution of vocabulary and structure in longitudinal text. To address these challenges, our multi-disciplinary collaboration of computer scientists and survey specialists presents a new information retrieval (IR) task of identifying concept (e.g. Housing, Job, etc.) equivalence across question and response options to harmonise longitudinal population studies. This paper investigates multiple unsupervised approaches on a survey dataset spanning 1946-2020, including probabilistic models, linear probing of language models, and pre-trained neural networks specialised for IR. We show that IR-specialised neural models achieve the highest overall performance with other approaches performing comparably. Additionally, the re-ranking of the probabilistic model's results with neural models only introduces modest improvements of 0.07 at most in F1-score. Qualitative post-hoc evaluation by survey specialists shows that models generally have a low sensitivity to questions with high lexical overlap, particularly in cases where sub-concepts are mismatched. Altogether, our analysis serves to further research on harmonising longitudinal studies in social science.
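The retrieval setup described in the abstract can be illustrated with a minimal sketch of a probabilistic ranker. This assumes Okapi BM25 as the probabilistic model, which the abstract does not name; the function names, the parameter defaults `k1=1.5` and `b=0.75`, and the example questions are all hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip(".,?!()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (index, score) pairs for documents, best match first.

    Uses the Lucene-style IDF, log(1 + (N - df + 0.5) / (df + 0.5)),
    which is always positive. This is an illustrative choice, not
    necessarily the variant used in the paper.
    """
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for idx, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical survey questions, invented for illustration only.
questions = [
    "Do you own or rent your home?",
    "What is your current job title?",
    "How many rooms does your accommodation have?",
]
ranking = bm25_rank("Is your home owned or rented?", questions)
```

In this toy example the housing tenure question ranks first only because it shares "home", "or", and "your" with the query; "owned" and "own" do not match lexically. That kind of vocabulary drift is one reason a purely lexical ranker might be paired with a neural re-ranker, as the abstract discusses.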
Related papers
- Random Forest-of-Thoughts: Uncertainty-aware Reasoning for Computational Social Science [9.870701840926923]
We propose a novel large language model prompting method called Random Forest of Thoughts (RFoT). RFoT allows LLMs to perform deliberate decision-making by generating a diverse thought space and randomly selecting sub-thoughts to build the forest of thoughts. Our experiments show that RFoT significantly enhances language models' abilities on two novel social survey analysis problems requiring non-trivial reasoning.
arXiv Detail & Related papers (2025-02-26T00:52:44Z)
- Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions. As a testbed, we use country-level results from two global cultural surveys. We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions.
arXiv Detail & Related papers (2025-02-10T21:59:27Z)
- Quriosity: Analyzing Human Questioning Behavior and Causal Inquiry through Curiosity-Driven Queries [91.70689724416698]
We present Quriosity, a collection of 13.5K naturally occurring questions from three diverse sources. Our analysis reveals a significant presence of causal questions (up to 42%) in the dataset.
arXiv Detail & Related papers (2024-05-30T17:55:28Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- A Survey on Interpretable Cross-modal Reasoning [64.37362731950843]
Cross-modal reasoning (CMR) has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.
This survey delves into the realm of interpretable cross-modal reasoning (I-CMR).
This survey presents a comprehensive overview of the typical methods with a three-level taxonomy for I-CMR.
arXiv Detail & Related papers (2023-09-05T05:06:48Z)
- Around the GLOBE: Numerical Aggregation Question-Answering on Heterogeneous Genealogical Knowledge Graphs with Deep Neural Networks [0.934612743192798]
We present a new end-to-end methodology for numerical aggregation QA for genealogical trees.
The proposed architecture, GLOBE, outperforms the state-of-the-art models and pipelines by achieving 87% accuracy for this task.
This study may have practical implications for genealogical information centers and museums.
arXiv Detail & Related papers (2023-07-30T12:09:00Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- Penalized Estimation and Forecasting of Multiple Subject Intensive Longitudinal Data [7.780531445879182]
We present a novel modeling framework that addresses a number of topical challenges and open questions in the psychological literature on modeling dynamic processes.
First, how can we model and forecast intensive longitudinal data (ILD) when the length of individual time series and the number of variables collected are roughly equivalent?
Second, how can we best take advantage of the cross-sectional (between-person) information inherent to most ILD scenarios while acknowledging individuals differ both quantitatively and qualitatively?
arXiv Detail & Related papers (2020-07-09T20:34:23Z)
- Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond [35.32894170512829]
In this survey, we systematically review the work on random features from the past ten years.
First, the motivations, characteristics and contributions of representative random features based algorithms are summarized.
Second, we review theoretical results that center around the following key question: how many random features are needed to ensure a high approximation quality.
Third, we provide a comprehensive evaluation of popular random features based algorithms on several large-scale benchmark datasets.
arXiv Detail & Related papers (2020-04-23T13:44:48Z)
- Uncovering the Data-Related Limits of Human Reasoning Research: An Analysis based on Recommender Systems [1.7478203318226309]
Cognitive science pursues the goal of modeling human-like intelligence from a theory-driven perspective.
Syllogistic reasoning is one of the core domains of human reasoning research.
Recent analyses of models' predictive performances revealed a stagnation in improvement.
arXiv Detail & Related papers (2020-03-11T10:12:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.