Fugu-MT Paper Translation (Abstract): Long-Context Long-Form Question Answering for Legal Domain

Paper overview: Long-Context Long-Form Question Answering for Legal Domain

arxiv url: http://arxiv.org/abs/2602.07190v1
Date: Fri, 06 Feb 2026 20:51:13 GMT
Status: Translation complete
Last updated in system: 2026-02-10 20:26:24.493013
Title: Long-Context Long-Form Question Answering for Legal Domain
Authors: Anagha Kulkarni, Parin Rajesh Jhaveri, Prasha Shrestha, Yu Tong Han, Reza Amini, Behrouz Madahian
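Abstract summary: The paper addresses the challenges of long-context question answering in the context of long-form answers, given the idiosyncrasies of legal documents. It proposes a question answering system that can (a) deconstruct domain-specific vocabulary to improve retrieval from source documents, (b) parse complex document layouts, isolating sections and footnotes and linking them appropriately, and (c) generate comprehensive answers using precise domain-specific vocabulary.
Reference score (independently computed attention score): 1.2776569352615768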
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Legal documents have complex document layouts involving multiple nested sections and lengthy footnotes, and further use specialized linguistic devices like intricate syntax and domain-specific vocabulary to ensure precision and authority. These inherent characteristics of legal documents make question answering challenging, and particularly so when the answer to the question spans several pages (i.e., requires long context) and is required to be comprehensive (i.e., a long-form answer). In this paper, we address the challenges of long-context question answering in the context of long-form answers given the idiosyncrasies of legal documents. We propose a question answering system that can (a) deconstruct domain-specific vocabulary for better retrieval from source documents, (b) parse complex document layouts while isolating sections and footnotes and linking them appropriately, and (c) generate comprehensive answers using precise domain-specific vocabulary. We also introduce a coverage metric that classifies the performance into recall-based coverage categories, allowing human users to evaluate the recall with ease. We curate a QA dataset by leveraging the expertise of professionals from fields such as law and corporate tax. Through comprehensive experiments and ablation studies, we demonstrate the usability and merit of the proposed system.
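Illustrative note: the abstract mentions a coverage metric that sorts performance into recall-based categories but does not define it. The following is only a minimal sketch of how such a metric might look, assuming recall is measured as the fraction of expert-listed key points that an answer covers. The function names, the category labels ("complete", "partial", "minimal", "none"), and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
def recall(expected_points, covered_points):
    """Fraction of expected key points that appear among the covered points."""
    expected = set(expected_points)
    if not expected:
        raise ValueError("expected_points must not be empty")
    return len(expected & set(covered_points)) / len(expected)


def coverage_category(recall_value):
    """Map a recall value in [0, 1] to a coarse, human-readable category.

    The labels and the 0.5 boundary are assumptions made for illustration.
    """
    if not 0.0 <= recall_value <= 1.0:
        raise ValueError("recall_value must be between 0 and 1")
    if recall_value == 1.0:
        return "complete"
    if recall_value >= 0.5:
        return "partial"
    if recall_value > 0.0:
        return "minimal"
    return "none"


# Usage: a gold answer with four key points, of which the system covers two.
gold = ["definition", "exception", "filing deadline", "penalty"]
found = ["definition", "penalty", "unrelated remark"]
category = coverage_category(recall(gold, found))
```

Reporting a category instead of a raw score trades precision for readability: a reviewer can see at a glance whether an answer is complete, without interpreting a decimal value.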
