LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis
- URL: http://arxiv.org/abs/2310.15100v1
- Date: Mon, 23 Oct 2023 17:05:59 GMT
- Title: LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis
- Authors: Shih-Chieh Dai, Aiping Xiong, Lun-Wei Ku
- Abstract summary: Thematic analysis (TA) has been widely used for analyzing qualitative data in many disciplines and fields.
Human coders develop and deepen their data interpretation and coding over multiple iterations, making TA labor-intensive and time-consuming.
We propose a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct TA with in-context learning (ICL).
- Score: 18.775126929754833
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Thematic analysis (TA) has been widely used for analyzing qualitative data in
many disciplines and fields. To ensure reliable analysis, the same piece of
data is typically assigned to at least two human coders. Moreover, to produce
meaningful and useful analysis, human coders develop and deepen their data
interpretation and coding over multiple iterations, making TA labor-intensive
and time-consuming. Recently, the emerging field of large language model (LLM)
research has shown that LLMs have the potential to replicate human-like behavior
in various tasks: in particular, LLMs outperform crowd workers on
text-annotation tasks, suggesting an opportunity to leverage LLMs on TA. We
propose a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct
TA with in-context learning (ICL). The framework uses prompts to frame
discussions with an LLM (e.g., GPT-3.5) and to generate the final codebook for TA.
We demonstrate the utility of this framework using survey datasets on the
aspects of the music listening experience and the usage of a password manager.
Results of the two case studies show that the proposed framework yields similar
coding quality to that of human coders but reduces TA's labor and time demands.
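To make the workflow concrete, the sketch below shows one round of LLM-assisted coding, assuming the OpenAI chat-completions API and a GPT-3.5-class model. The prompt wording, the few-shot examples, and the `propose_codes` helper are illustrative assumptions rather than the authors' exact protocol; the paper's full framework also iterates between human review and LLM discussion until the codebook stabilizes.

```python
# Illustrative sketch of one LLM-in-the-loop coding round; not the authors'
# exact prompts. Assumes the OpenAI chat-completions API (openai>=1.0).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# In-context learning: a few human-coded examples frame the task for the model.
ICL_EXAMPLES = """\
Response: "I listen to music to unwind after work."
Codes: relaxation; stress relief

Response: "I reuse one password because remembering many is hard."
Codes: password reuse; memorability
"""

def propose_codes(response_text: str) -> str:
    """Ask the LLM to propose candidate codes for one survey response."""
    prompt = (
        "You are collaborating with a human researcher on thematic analysis.\n"
        "Given a survey response, propose short descriptive codes.\n\n"
        f"{ICL_EXAMPLES}\n"
        f'Response: "{response_text}"\nCodes:'
    )
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output eases comparison with human coders
    )
    return reply.choices[0].message.content.strip()

# A human coder reviews each proposal; accepted codes feed the next
# iteration's examples until the codebook stabilizes.
print(propose_codes("I keep all my logins in a password manager."))
```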
Related papers
- The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead? [60.01746782465275]
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks.
This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
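A minimal sketch of the self-synthetic idea summarized in the SELF-GUIDE entry above: the student model labels its own synthetic inputs and the pairs are saved for finetuning. The model, the instruction, and the seed inputs are assumptions for illustration, not the paper's multi-stage pipeline.

```python
# Hedged sketch of self-synthetic data generation in the spirit of SELF-GUIDE;
# gpt2 is a stand-in student model, and the task and seeds are invented.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

INSTRUCTION = "Classify the sentiment of a product review as positive or negative."

def synthesize_pair(seed_input: str) -> dict:
    """Have the student model produce the output half of its own training pair."""
    prompt = f"Instruction: {INSTRUCTION}\nInput: {seed_input}\nOutput:"
    out = generator(prompt, max_new_tokens=8, do_sample=False)[0]["generated_text"]
    completion = out[len(prompt):].strip()
    label = completion.splitlines()[0] if completion else ""
    return {"instruction": INSTRUCTION, "input": seed_input, "output": label}

# Seed inputs could themselves be model-generated; fixed ones keep the sketch short.
seeds = ["The battery died within a week.", "Setup took two minutes and it just works."]
with open("self_synth.jsonl", "w") as f:
    for s in seeds:
        f.write(json.dumps(synthesize_pair(s)) + "\n")
# The resulting JSONL would then be used to finetune the same student model.
```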
- CIBench: Evaluating Your LLMs with a Code Interpreter Plugin [68.95137938214862]
We propose an interactive evaluation framework, named CIBench, to comprehensively assess LLMs' ability to utilize code interpreters for data science tasks.
The evaluation dataset is constructed using an LLM-human cooperative approach and simulates an authentic workflow by leveraging consecutive and interactive IPython sessions.
We conduct extensive experiments to analyze the ability of 24 LLMs on CIBench and provide valuable insights for future LLMs in code interpreter utilization.
arXiv Detail & Related papers (2024-07-15T07:43:55Z)
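The stateful, consecutive execution that CIBench's IPython-style sessions evaluate can be approximated with a persistent namespace, as in the toy harness below; the hard-coded cells stand in for LLM-generated code, and this sketch omits the sandboxing a real evaluation would need.

```python
# Toy illustration of interactive, stateful code execution: cells share one
# namespace, mimicking consecutive IPython inputs. Not CIBench's harness.
import io
import contextlib

def run_cell(code: str, namespace: dict) -> str:
    """Execute one 'cell' in a persistent namespace and capture its stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, namespace)  # a real harness would sandbox this call
    return buf.getvalue()

session: dict = {}
cells = [
    "import statistics\ndata = [2, 4, 4, 4, 5, 5, 7, 9]",
    "print(statistics.mean(data))",  # later cells see earlier definitions
]
for cell in cells:
    output = run_cell(cell, session)
    if output:  # captured output would be fed back to the model as feedback
        print("interpreter ->", output.strip())
```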
- Benchmarking Large Language Models for Molecule Prediction Tasks [7.067145619709089]
Large Language Models (LLMs) stand at the forefront of a number of Natural Language Processing (NLP) tasks.
This paper explores a fundamental question: Can LLMs effectively handle molecule prediction tasks?
We identify several classification and regression prediction tasks across six standard molecule datasets.
We compare LLM performance with existing Machine Learning (ML) models, which include text-based models and those specifically designed for analyzing the geometric structure of molecules.
arXiv Detail & Related papers (2024-03-08T05:59:56Z)
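For a sense of what text-only molecule prediction looks like, here is a hypothetical zero-shot prompt; the SMILES string, the target property, and the wording are illustrative assumptions, not taken from the benchmark.

```python
# Hypothetical zero-shot prompt for LLM-based molecule property prediction.
def molecule_prompt(smiles: str, prop: str = "blood-brain barrier penetration") -> str:
    return (
        f"You are given a molecule as a SMILES string: {smiles}\n"
        f"Question: does this molecule exhibit {prop}? Answer yes or no."
    )

print(molecule_prompt("CC(=O)OC1=CC=CC=C1C(=O)O"))  # aspirin
# A text-based LLM sees only this string; the geometry-aware ML baselines the
# paper compares against consume the molecular graph or 3D structure directly.
```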
- Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation [128.01050030936028]
We propose an information refinement training method named InFO-RAG.
InFO-RAG is low-cost and general across various tasks.
It improves the performance of LLaMA2 by an average of 9.39% relative points.
arXiv Detail & Related papers (2024-02-28T08:24:38Z)
- Benchmarking LLMs on the Semantic Overlap Summarization Task [9.656095701778975]
This paper comprehensively evaluates Large Language Models (LLMs) on the Semantic Overlap Summarization (SOS) task.
We report well-established metrics like ROUGE, BERTScore, and SEM-F1 on two different datasets of alternative narratives.
arXiv Detail & Related papers (2024-02-26T20:33:50Z)
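Two of the metrics reported in the SOS entry above can be computed with the standard rouge-score and bert-score packages, as sketched below on invented sentences; SEM-F1 is the task's own semantic metric and is not reproduced here.

```python
# Sketch of scoring a candidate overlap summary against a reference with
# ROUGE and BERTScore; the sentences are invented examples.
from rouge_score import rouge_scorer
from bert_score import score as bert_score

reference = "Both narratives agree the storm closed the coastal highway."
candidate = "The two accounts both report the highway was shut by the storm."

# ROUGE: n-gram overlap between candidate and reference.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
rouge = scorer.score(reference, candidate)
print("ROUGE-L F1:", round(rouge["rougeL"].fmeasure, 3))

# BERTScore: token similarity in contextual embedding space (downloads a model).
P, R, F1 = bert_score([candidate], [reference], lang="en")
print("BERTScore F1:", round(F1.item(), 3))
```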
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Under the Surface: Tracking the Artifactuality of LLM-Generated Data [21.002983022237604]
This work delves into the expanding role of large language models (LLMs) in generating artificial data.
To the best of our knowledge, this is the first study to aggregate various types of LLM-generated text data.
Despite artificial data's capability to match human performance, this paper reveals significant hidden disparities.
arXiv Detail & Related papers (2024-01-26T07:53:27Z)
- Large Language Models for Code Analysis: Do LLMs Really Do Their Job? [13.48555476110316]
Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks.
This paper offers a comprehensive evaluation of LLMs' capabilities in performing code analysis tasks.
arXiv Detail & Related papers (2023-10-18T22:02:43Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
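As a toy version of the LLM-versus-SLM comparison in the last entry, the sketch below scores an off-the-shelf fine-tuned sentiment model on two invented examples and shows the kind of prompt a zero-shot LLM would receive instead; the model name and data are illustrative, and the paper's actual evaluation spans 13 tasks and 26 datasets.

```python
# Toy LLM-vs-SLM sentiment comparison scaffold; model and data are illustrative.
from transformers import pipeline

examples = [
    ("The plot dragged and the acting was wooden.", "negative"),
    ("A genuinely moving, beautifully shot film.", "positive"),
]

# SLM baseline: a small model fine-tuned for sentiment classification.
slm = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
correct = sum(slm(text)[0]["label"].lower() == gold for text, gold in examples)
print(f"SLM accuracy: {correct}/{len(examples)}")

# An LLM would instead answer a prompt like this, and its free-text reply
# would be normalized to the label space before scoring.
PROMPT = "Classify the sentiment of this review as positive or negative:\n{text}"
```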