Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research
- URL: http://arxiv.org/abs/2510.25432v1
- Date: Wed, 29 Oct 2025 11:55:21 GMT
- Title: Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research
- Authors: Ali Sanaei, Ali Rajabzadeh,
- Abstract summary: We introduce a framework that situates large language models (LLMs) usage along two dimensions, interpretive depth and autonomy.<n>We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are increasingly utilized by researchers across a wide range of domains, and qualitative social science is no exception; however, this adoption faces persistent challenges, including interpretive bias, low reliability, and weak auditability. We introduce a framework that situates LLM usage along two dimensions, interpretive depth and autonomy, thereby offering a straightforward way to classify LLM applications in qualitative research and to derive practical design recommendations. We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science that use LLMs as a tool and not strictly as the subject of study. Rather than granting models expansive freedom, our approach encourages researchers to decompose tasks into manageable segments, much as they would when delegating work to capable undergraduate research assistants. By maintaining low levels of autonomy and selectively increasing interpretive depth only where warranted and under supervision, one can plausibly reap the benefits of LLMs while preserving transparency and reliability.
Related papers
- LLM-Assisted Thematic Analysis: Opportunities, Limitations, and Recommendations [5.660100855602864]
Large Language Models (LLMs) are increasingly used to assist qualitative research in Software Engineering (SE)<n>This study investigates how experienced SE researchers conceptualize the opportunities, risks, and methodological implications of integrating LLMs into thematic analysis.
arXiv Detail & Related papers (2025-11-18T14:32:48Z) - LLM-REVal: Can We Trust LLM Reviewers Yet? [70.58742663985652]
Large language models (LLMs) have inspired researchers to integrate them extensively into the academic workflow.<n>This study focuses on how the deep integration of LLMs into both peer-review and research processes may influence scholarly fairness.
arXiv Detail & Related papers (2025-10-14T10:30:20Z) - Large Language Models Penetration in Scholarly Writing and Peer Review [43.600778691549706]
We evaluate the penetration of Large Language Models across academic perspectives and dimensions.<n>Our experiments demonstrate the effectiveness of textttLLMetrica, revealing the increasing role of LLMs in scholarly processes.<n>These findings emphasize the need for transparency, accountability, and ethical practices in LLM usage to maintain academic credibility.
arXiv Detail & Related papers (2025-02-16T16:37:34Z) - Practical Considerations for Agentic LLM Systems [5.455744338342196]
This paper frames actionable insights and considerations from the research community in the context of established application paradigms.<n> Namely, we position relevant research findings into four broad categories--Planning, Memory Tools, and Control Flow--based on common practices in application-focused literature.
arXiv Detail & Related papers (2024-12-05T11:57:49Z) - A Reality check of the benefits of LLM in business [1.9181612035055007]
Large language models (LLMs) have achieved remarkable performance in language understanding and generation tasks.
This paper thoroughly examines the usefulness and readiness of LLMs for business processes.
arXiv Detail & Related papers (2024-06-09T02:36:00Z) - Characterizing Truthfulness in Large Language Model Generations with
Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs)
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - A Survey on Large Language Model based Autonomous Agents [105.2509166861984]
Large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence.<n>This paper delivers a systematic review of the field of LLM-based autonomous agents from a holistic perspective.<n>We present a comprehensive overview of the diverse applications of LLM-based autonomous agents in the fields of social science, natural science, and engineering.
arXiv Detail & Related papers (2023-08-22T13:30:37Z) - Through the Lens of Core Competency: Survey on Evaluation of Large
Language Models [27.271533306818732]
Large language model (LLM) has excellent performance and wide practical uses.
Existing evaluation tasks are difficult to keep up with the wide range of applications in real-world scenarios.
We summarize 4 core competencies of LLM, including reasoning, knowledge, reliability, and safety.
Under this competency architecture, similar tasks are combined to reflect corresponding ability, while new tasks can also be easily added into the system.
arXiv Detail & Related papers (2023-08-15T17:40:34Z) - A Comprehensive Overview of Large Language Models [68.22178313875618]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
arXiv Detail & Related papers (2023-07-12T20:01:52Z) - Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.