Fugu-MT Paper Translation (Summary): OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive

Paper summary: OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive

arxiv url: http://arxiv.org/abs/2511.09914v2
Date: Fri, 14 Nov 2025 03:04:58 GMT
Status: translation complete
Last updated in system: 2025-11-17 14:38:02.173617
Title: OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive
Title (reference translation): OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive
Authors: Xuan Shen, Brian Wingenroth, Zichao Wang, Jason Kuen, Wanrong Zhu, Ruiyi Zhang, Yiwei Wang, Lichun Ma, Anqi Liu, Hongfu Liu, Tong Sun, Kevin S. Hawkins, Kate Tasker, G. Caleb Alexander, Jiuxiang Gu,
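Abstract summary: The opioid crisis is a significant moment for public health. Analyzing it requires innovative approaches for exploring the data and documents disclosed in the UCSF-JHU Opioid Industry Documents Archive (OIDA). This paper tackles that challenge by organizing the original dataset according to document attributes.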
Reference score (independently computed attention measure): 50.468138755368805
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The opioid crisis represents a significant moment in public health that reveals systemic shortcomings across regulatory systems, healthcare practices, corporate governance, and public policy. Analyzing how these interconnected systems simultaneously failed to protect public health requires innovative analytic approaches for exploring the vast amounts of data and documents disclosed in the UCSF-JHU Opioid Industry Documents Archive (OIDA). The complexity, multimodal nature, and specialized characteristics of these healthcare-related legal and corporate documents necessitate more advanced methods and models tailored to specific data types and detailed annotations, ensuring precision and professionalism in the analysis. In this paper, we tackle this challenge by organizing the original dataset according to document attributes and constructing a benchmark with 400k training documents and 10k for testing. From each document, we extract rich multimodal information (including textual content, visual elements, and layout structures) to capture a comprehensive range of features. Using multiple AI models, we then generate a large-scale dataset comprising 360k training QA pairs and 10k testing QA pairs. Building on this foundation, we develop domain-specific multimodal Large Language Models (LLMs) and explore the impact of multimodal inputs on task performance. To further enhance response accuracy, we incorporate historical QA pairs as contextual grounding for answering current queries. Additionally, we incorporate page references within the answers and introduce an importance-based page classifier, further improving the precision and relevance of the information provided. Preliminary results indicate improvements with our AI assistant in document information extraction and question-answering tasks. The dataset is available at: https://huggingface.co/datasets/opioidarchive/oida-qa
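Abstract (reference translation): The opioid crisis is a significant moment in public health that reveals systemic shortcomings across regulatory systems, healthcare practice, corporate governance, and public policy. Analyzing how these interconnected systems failed to protect public health requires innovative analytic methods for exploring the large volume of data and documents disclosed in the UCSF-JHU Opioid Industry Documents Archive (OIDA). The complexity, multimodal nature, and specialized characteristics of these healthcare-related legal and corporate documents call for more advanced methods and models tailored to specific data types and detailed annotations, to ensure precision and professionalism in the analysis. In this paper, we address this problem by organizing the original dataset according to document attributes and constructing a benchmark with 400k training documents and 10k test documents. From each document, we extract rich multimodal information, including textual content, visual elements, and layout structures, to capture comprehensive features. Using multiple AI models, we generate a large-scale dataset of 360k training QA pairs and 10k testing QA pairs. On this foundation, we develop domain-specific multimodal Large Language Models (LLMs) and examine the impact of multimodal inputs on task performance. To further improve response accuracy, we incorporate past QA pairs as contextual grounding for answering the current query. In addition, we incorporate page references into the answers and introduce an importance-based page classifier, further improving the precision and relevance of the information provided. Preliminary results indicate improvements from our AI assistant on document information extraction and question-answering tasks. The dataset is available at: https://huggingface.co/datasets/opioidarchive/oida-qa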
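
The abstract mentions using historical QA pairs as contextual grounding and including page references in answers. The following is a minimal sketch of how such a prompt might be assembled. It is not the authors' implementation: the `QAPair` structure, the `build_prompt` function, the `max_history` parameter, and the prompt wording are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    """One earlier question/answer exchange about a document (hypothetical structure)."""
    question: str
    answer: str
    pages: list[int]  # page numbers the answer refers to


def build_prompt(document_text: str, history: list[QAPair], question: str,
                 max_history: int = 3) -> str:
    """Assemble a prompt that grounds the current question in earlier QA pairs.

    Only the most recent `max_history` pairs are kept (an assumed policy).
    """
    # history[-0:] would return the whole list, so handle zero explicitly.
    recent = history[-max_history:] if max_history > 0 else []
    lines = ["Document:", document_text, ""]
    if recent:
        lines.append("Earlier questions and answers:")
        for pair in recent:
            refs = ", ".join(f"p. {p}" for p in pair.pages)
            suffix = f" [{refs}]" if refs else ""
            lines.append(f"Q: {pair.question}")
            lines.append(f"A: {pair.answer}{suffix}")
        lines.append("")
    lines.append(f"Current question: {question}")
    lines.append("Answer with page references.")
    return "\n".join(lines)
```

Keeping only the latest few pairs is one simple way to bound prompt length; the paper's actual selection strategy is not described in the abstract.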


Related papers