Knowing Your Uncertainty -- On the application of LLM in social sciences
- URL: http://arxiv.org/abs/2512.05461v1
- Date: Fri, 05 Dec 2025 06:36:15 GMT
- Title: Knowing Your Uncertainty -- On the application of LLM in social sciences
- Authors: Bolun Zhang, Linzhuo Li, Yunqi Chen, Qinlin Zhao, Zihan Zhu, Xiaoyuan Yi, Xing Xie
- Abstract summary: Large language models (LLMs) are rapidly being integrated into computational social science research. This article argues that applying LLMs to social scientific tasks requires explicit assessment of uncertainty.
- Score: 37.703249716862054
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) are rapidly being integrated into computational social science research, yet their blackboxed training and designed stochastic elements in inference pose unique challenges for scientific inquiry. This article argues that applying LLMs to social scientific tasks requires explicit assessment of uncertainty, an expectation long established in both quantitative methodology in the social sciences and machine learning. We introduce a unified framework for evaluating LLM uncertainty along two dimensions: the task type (T), which distinguishes between classification, short-form, and long-form generation, and the validation type (V), which captures the availability of reference data or evaluative criteria. Drawing from both computer science and social science literature, we map existing uncertainty quantification (UQ) methods to this T-V typology and offer practical recommendations for researchers. Our framework provides both a methodological safeguard and a practical guide for integrating LLMs into rigorous social science research.
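For the classification task type (T) with reference-free validation (V), a common family of UQ methods samples the model repeatedly and measures disagreement across the samples. The sketch below is an illustration of that general idea, not the paper's own method: the `samples` list stands in for repeated LLM classifications of one text (in practice drawn from the model at temperature > 0), and predictive entropy plus majority agreement serve as simple disagreement metrics.

```python
import math
from collections import Counter

def predictive_entropy(labels):
    """Shannon entropy (in nats) of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Hypothetical repeated LLM classifications of a single document.
samples = ["positive", "positive", "neutral", "positive", "negative"]

entropy = predictive_entropy(samples)
agreement = Counter(samples).most_common(1)[0][1] / len(samples)
print(f"entropy={entropy:.3f} nats, majority agreement={agreement:.0%}")
```

Higher entropy (or lower agreement) flags items where the model's output is unstable, which a researcher might route to human annotation rather than trusting a single sampled label.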
Related papers
- Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research [0.0]
We introduce a framework that situates large language model (LLM) usage along two dimensions: interpretive depth and autonomy. We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science.
arXiv Detail & Related papers (2025-10-29T11:55:21Z) - Generative Large Language Models for Knowledge Representation: A Systematic Review of Concept Map Generation [1.163826615891678]
The rise of generative large language models (LLMs) has opened new opportunities for automating knowledge representation through concept maps. This review systematically synthesizes the emerging body of research on LLM-enabled concept map generation. Findings reveal six major methodological categories: human-in-the-loop systems, weakly supervised learning models, fine-tuned domain-specific LLMs, pre-trained LLMs with prompt engineering, hybrid systems integrating knowledge bases, and modular frameworks combining symbolic and statistical tools.
arXiv Detail & Related papers (2025-09-18T02:36:54Z) - Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning [53.82037883518254]
We introduce SciReas, a diverse suite of existing benchmarks for scientific reasoning tasks. We then propose KRUX, a probing framework for studying the distinct roles of reasoning and knowledge in scientific tasks.
arXiv Detail & Related papers (2025-08-26T17:04:23Z) - Chain of Methodologies: Scaling Test Time Computation without Training [77.85633949575046]
Large Language Models (LLMs) often struggle with complex reasoning tasks due to insufficient in-depth insights in their training data. This paper introduces the Chain of Methodologies (CoM) framework, which enhances structured thinking by integrating human methodological insights.
arXiv Detail & Related papers (2025-06-08T03:46:50Z) - Navigating the Risks of Using Large Language Models for Text Annotation in Social Science Research [3.276333240221372]
Large language models (LLMs) have the potential to revolutionize computational social science. We conduct a systematic evaluation of the promises and risks associated with using LLMs for text classification tasks.
arXiv Detail & Related papers (2025-03-27T23:33:36Z) - Intelligent Computing Social Modeling and Methodological Innovations in Political Science in the Era of Large Language Models [18.364402500460248]
The recent wave of artificial intelligence, epitomized by large language models (LLMs), presents opportunities and challenges for methodological innovation in political science. This paper proposes the "Intelligent Computing Social Modeling" (ICSM) method to address these issues.
arXiv Detail & Related papers (2024-10-07T06:30:59Z) - Systematic Task Exploration with LLMs: A Study in Citation Text Generation [63.50597360948099]
Large language models (LLMs) bring unprecedented flexibility in defining and executing complex, creative natural language generation (NLG) tasks.
We propose a three-component research framework that consists of systematic input manipulation, reference data, and output measurement.
We use this framework to explore citation text generation -- a popular scholarly NLP task that lacks consensus on the task definition and evaluation metric.
arXiv Detail & Related papers (2024-07-04T16:41:08Z) - Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence [5.147767778946168]
We critically assess 23 state-of-the-art Large Language Models (LLMs) benchmarks.
Our research uncovered significant limitations, including biases, difficulties in measuring genuine reasoning, adaptability, implementation inconsistencies, prompt engineering complexity, diversity, and the overlooking of cultural and ideological norms.
arXiv Detail & Related papers (2024-02-15T11:08:10Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Rethinking Model Evaluation as Narrowing the Socio-Technical Gap [47.632123167141245]
We argue that model evaluation practices must take on a critical task to cope with the challenges and responsibilities brought by this homogenization. We urge the community to develop evaluation methods based on real-world contexts and human requirements.
arXiv Detail & Related papers (2023-06-01T00:01:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.