Fugu-MT Paper Translation (Abstract): Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis

Paper overview: Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis

arxiv url: http://arxiv.org/abs/2512.00046v1
Date: Mon, 17 Nov 2025 13:03:27 GMT
Status: Translation complete
Last updated in system: 2025-12-07 19:06:32.405751
Title: Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis
Authors: Angelina Parfenova, Andreas Marfurt, Alexander Denzler, Juergen Pfeffer
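Abstract summary: This study evaluates the performance of six open-source large language models (LLMs) compared to human experts. Human coders consistently perform well when labeling complex sentences but struggle with simpler ones, while LLMs exhibit the opposite trend.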
Reference score (independently computed attention score): 44.08932633077333
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper investigates the automation of qualitative data analysis, focusing on inductive coding using large language models (LLMs). Unlike traditional approaches that rely on deductive methods with predefined labels, this research investigates the inductive process where labels emerge from the data. The study evaluates the performance of six open-source LLMs compared to human experts. As part of the evaluation, experts rated the perceived difficulty of the quotes they coded. The results reveal a peculiar dichotomy: human coders consistently perform well when labeling complex sentences but struggle with simpler ones, while LLMs exhibit the opposite trend. Additionally, the study explores systematic deviations in both human and LLM generated labels by comparing them to the golden standard from the test set. While human annotations may sometimes differ from the golden standard, they are often rated more favorably by other humans. In contrast, some LLMs demonstrate closer alignment with the true labels but receive lower evaluations from experts.
