Exploring Qualitative Research Using LLMs
- URL: http://arxiv.org/abs/2306.13298v1
- Date: Fri, 23 Jun 2023 05:21:36 GMT
- Title: Exploring Qualitative Research Using LLMs
- Authors: Muneera Bano, Didar Zowghi, Jon Whittle
- Abstract summary: This study aimed to compare and contrast the comprehension capabilities of humans and AI-driven large language models (LLMs).
We conducted an experiment with a small sample of Alexa app reviews, initially classified by a human analyst.
LLMs were then asked to classify these reviews and provide the reasoning behind each classification.
- Score: 8.545798128849091
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of AI-driven large language models (LLMs) has stirred discussions
about their role in qualitative research. Some view these models as tools to enrich
human understanding, while others perceive them as threats to the core values
of the discipline. This study aimed to compare and contrast the comprehension
capabilities of humans and LLMs. We conducted an experiment with a small sample
of Alexa app reviews, initially classified by a human analyst. LLMs were then
asked to classify these reviews and provide the reasoning behind each
classification. We compared the results with the human classification and
reasoning. The research indicated a significant alignment between human and
ChatGPT 3.5 classifications in one third of cases, and a slightly lower
alignment with GPT-4 in over a quarter of cases. The two AI models showed
higher alignment with each other, observed in more than half of the instances.
However, a consensus across all three methods was seen in only about one fifth
of the classifications. In the comparison of human and LLM reasoning, it
appears that human analysts lean heavily on their individual experiences. As
expected, LLMs, on the other hand, base their reasoning on the specific word
choices found in app reviews and on the functional components of the app
itself. Our results highlight the potential for effective human-LLM
collaboration, suggesting a synergistic rather than competitive relationship.
Researchers must continuously evaluate LLMs' role in their work, thereby
fostering a future where AI and humans jointly enrich qualitative research.
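The agreement rates reported above (pairwise alignment and three-way consensus) can be computed straightforwardly once each annotator's labels are in a common list. A minimal sketch, using hypothetical labels rather than the study's actual data:

```python
# Sketch of pairwise-agreement and three-way-consensus computation.
# The label values below are invented for illustration only.

def agreement(a, b):
    """Fraction of items on which two annotators assign the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def consensus(*annotators):
    """Fraction of items on which ALL annotators assign the same label."""
    return sum(len(set(labels)) == 1 for labels in zip(*annotators)) / len(annotators[0])

# Toy classifications of six app reviews (hypothetical data).
human = ["usability", "bug", "feature", "privacy", "bug", "feature"]
gpt35 = ["usability", "bug", "privacy", "privacy", "feature", "bug"]
gpt4  = ["usability", "praise", "privacy", "privacy", "feature", "bug"]

print(f"human vs GPT-3.5: {agreement(human, gpt35):.2f}")
print(f"GPT-3.5 vs GPT-4: {agreement(gpt35, gpt4):.2f}")
print(f"three-way consensus: {consensus(human, gpt35, gpt4):.2f}")
```

Raw percent agreement is the simplest alignment measure; chance-corrected statistics such as Cohen's kappa are the usual next step when sample sizes permit.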
Related papers
- LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing [106.45895712717612]
Large language models (LLMs) have shown remarkable versatility in various generative tasks.
This study focuses on how LLMs can assist NLP researchers.
To our knowledge, this is the first work to provide such a comprehensive analysis.
arXiv Detail & Related papers (2024-06-24T01:30:22Z) - Explicit and Implicit Large Language Model Personas Generate Opinions but Fail to Replicate Deeper Perceptions and Biases [14.650234624251716]
Large language models (LLMs) are increasingly being used in human-centered social scientific tasks.
These tasks are highly subjective and dependent on human factors, such as one's environment, attitudes, beliefs, and lived experiences.
We examine the role of prompting LLMs with human-like personas and ask the models to answer as if they were a specific human.
arXiv Detail & Related papers (2024-06-20T16:24:07Z) - Human-Instruction-Free LLM Self-Alignment with Limited Samples [64.69906311787055]
We propose an algorithm that can self-align large language models (LLMs) iteratively without active human involvement.
Unlike existing works, our algorithm relies on neither human-crafted instructions nor labeled rewards, significantly reducing human involvement.
We show that our method can unlock the LLMs' self-generalization ability to perform alignment with near-zero human supervision.
arXiv Detail & Related papers (2024-01-06T14:00:12Z) - Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity [1.7947441434255664]
Large-scale generative Language Models (LLMs) can simulate free responses to interview questions like those traditionally analyzed using qualitative research methods.
Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative methods.
arXiv Detail & Related papers (2023-09-06T15:00:44Z) - Exploring the psychology of LLMs' Moral and Legal Reasoning [0.0]
Large language models (LLMs) exhibit expert-level performance in tasks across a wide range of different domains.
Ethical issues raised by LLMs and the need to align future versions make it important to know how state-of-the-art models reason about moral and legal issues.
We replicate eight studies from the experimental literature with instances of Google's Gemini Pro, Anthropic's Claude 2.1, OpenAI's GPT-4, and Meta's Llama 2 Chat 70b.
We find that alignment with human responses shifts from one experiment to another, and that models differ amongst themselves as to their overall
arXiv Detail & Related papers (2023-08-02T16:36:58Z) - Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies, including the following aspects.
arXiv Detail & Related papers (2023-07-24T17:44:58Z) - Revisiting the Reliability of Psychological Scales on Large Language Models [66.31055885857062]
This study aims to determine the reliability of applying personality assessments to Large Language Models (LLMs).
By shedding light on the personalization of LLMs, our study endeavors to pave the way for future explorations in this field.
arXiv Detail & Related papers (2023-05-31T15:03:28Z) - Can Large Language Models Be an Alternative to Human Evaluations? [80.81532239566992]
Large language models (LLMs) have demonstrated exceptional performance on unseen tasks when only the task instructions are provided.
We show that the result of LLM evaluation is consistent with the results obtained by expert human evaluation.
arXiv Detail & Related papers (2023-05-03T07:28:50Z) - Who's Thinking? A Push for Human-Centered Evaluation of LLMs using the XAI Playbook [30.985555463848264]
We draw parallels between the relatively mature field of XAI and the rapidly evolving research boom around large language models.
We argue that humans' tendencies should rest front and center when evaluating deployed large language models.
arXiv Detail & Related papers (2023-03-10T22:15:49Z) - Can ChatGPT Assess Human Personalities? A General Evaluation Framework [70.90142717649785]
Large Language Models (LLMs) have produced impressive results in various areas, but their potential human-like psychology is still largely unexplored.
This paper presents a generic evaluation framework for LLMs to assess human personalities based on Myers-Briggs Type Indicator (MBTI) tests.
arXiv Detail & Related papers (2023-03-01T06:16:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.