Fugu-MT Paper Translation (Abstract): Inference Gap in Domain Expertise and Machine Intelligence in Named Entity Recognition: Creation of and Insights from a Substance Use-related Dataset

Paper abstract: Inference Gap in Domain Expertise and Machine Intelligence in Named Entity Recognition: Creation of and Insights from a Substance Use-related Dataset

arxiv url: http://arxiv.org/abs/2508.19467v1
Date: Tue, 26 Aug 2025 23:09:43 GMT
Status: Translation complete
Last updated in system: 2025-08-28 19:07:41.442506
Title: Inference Gap in Domain Expertise and Machine Intelligence in Named Entity Recognition: Creation of and Insights from a Substance Use-related Dataset
Title (reference translation): The Inference Gap Between Domain Experts and Machine Intelligence in Named Entity Recognition
Authors: Sumon Kanti Dey, Jeanne M. Powell, Azra Ismail, Jeanmarie Perrone, Abeed Sarker
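Abstract summary: Nonmedical opioid use is an urgent public health challenge. We present a named entity recognition (NER) framework to extract two categories of self-reported consequences from social media narratives. We evaluate both fine-tuned encoder-based models and state-of-the-art large language models (LLMs) under zero- and few-shot in-context learning settings.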
Reference score (independently computed attention score): 6.343399421398501
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Nonmedical opioid use is an urgent public health challenge, with far-reaching clinical and social consequences that are often underreported in traditional healthcare settings. Social media platforms, where individuals candidly share first-person experiences, offer a valuable yet underutilized source of insight into these impacts. In this study, we present a named entity recognition (NER) framework to extract two categories of self-reported consequences from social media narratives related to opioid use: ClinicalImpacts (e.g., withdrawal, depression) and SocialImpacts (e.g., job loss). To support this task, we introduce RedditImpacts 2.0, a high-quality dataset with refined annotation guidelines and a focus on first-person disclosures, addressing key limitations of prior work. We evaluate both fine-tuned encoder-based models and state-of-the-art large language models (LLMs) under zero- and few-shot in-context learning settings. Our fine-tuned DeBERTa-large model achieves a relaxed token-level F1 of 0.61 [95% CI: 0.43-0.62], consistently outperforming LLMs in precision, span accuracy, and adherence to task-specific guidelines. Furthermore, we show that strong NER performance can be achieved with substantially less labeled data, emphasizing the feasibility of deploying robust models in resource-limited settings. Our findings underscore the value of domain-specific fine-tuning for clinical NLP tasks and contribute to the responsible development of AI tools that may enhance addiction surveillance, improve interpretability, and support real-world healthcare decision-making. The best performing model, however, still significantly underperforms compared to inter-expert agreement (Cohen's kappa: 0.81), demonstrating that a gap persists between expert intelligence and current state-of-the-art NER/AI capabilities for tasks requiring deep domain knowledge.
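Abstract (reference translation): Nonmedical opioid use is an urgent public health challenge, with far-reaching clinical and social consequences that often go unreported in traditional healthcare settings. Social media platforms, where individuals can share first-person experiences, offer valuable insight into these impacts. In this study, we propose a named entity recognition (NER) framework that extracts two categories of self-reported consequences from social media narratives related to opioid use. To support this task, we introduce RedditImpacts 2.0. We evaluate both fine-tuned encoder models and state-of-the-art large language models (LLMs) under zero-shot and few-shot in-context learning settings. Our fine-tuned DeBERTa-large model achieves a relaxed token-level F1 of 0.61 (95% CI: 0.43-0.62). Furthermore, we show that strong NER performance can be achieved with substantially less labeled data, highlighting the feasibility of deploying robust models in resource-limited settings. Our work underscores the value of domain-specific fine-tuning for clinical NLP tasks and contributes to the development of AI tools that strengthen addiction surveillance, improve interpretability, and support real-world healthcare decision-making. However, the best-performing model still falls well short of inter-expert agreement (Cohen's kappa: 0.81), showing that a gap persists between expert intelligence and current state-of-the-art NER/AI capabilities for tasks requiring deep domain knowledge.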
