Fugu-MT Paper Translation (Abstract): A New HOPE: Domain-agnostic Automatic Evaluation of Text Chunking

Paper overview: A New HOPE: Domain-agnostic Automatic Evaluation of Text Chunking

arxiv url: http://arxiv.org/abs/2505.02171v1
Date: Sun, 04 May 2025 16:22:27 GMT
Status: Translation complete
Last updated in system: 2025-05-06 18:49:35.463292
Title: A New HOPE: Domain-agnostic Automatic Evaluation of Text Chunking
Title (reference translation): A New HOPE: Domain-Independent Automatic Evaluation of Text Chunking
Authors: Henrik Brådland, Morten Goodwin, Per-Arne Andersen, Alexander S. Nossum, Aditya Gupta
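Abstract summary: Document chunking fundamentally affects retrieval-augmented generation (RAG), yet there is currently no framework for analyzing the impact of different chunking methods. This paper proposes a novel methodology that defines the essential characteristics of the chunking process at three levels.
Reference score (independently computed attention score): 44.47350338664039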
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Document chunking fundamentally impacts Retrieval-Augmented Generation (RAG) by determining how source materials are segmented before indexing. Despite evidence that Large Language Models (LLMs) are sensitive to the layout and structure of retrieved data, there is currently no framework to analyze the impact of different chunking methods. In this paper, we introduce a novel methodology that defines essential characteristics of the chunking process at three levels: intrinsic passage properties, extrinsic passage properties, and passages-document coherence. We propose HOPE (Holistic Passage Evaluation), a domain-agnostic, automatic evaluation metric that quantifies and aggregates these characteristics. Our empirical evaluations across seven domains demonstrate that the HOPE metric correlates significantly (p > 0.13) with various RAG performance indicators, revealing contrasts between the importance of extrinsic and intrinsic properties of passages. Semantic independence between passages proves essential for system performance with a performance gain of up to 56.2% in factual correctness and 21.1% in answer correctness. On the contrary, traditional assumptions about maintaining concept unity within passages show minimal impact. These findings provide actionable insights for optimizing chunking strategies, thus improving RAG system design to produce more factually correct responses.
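Abstract (reference translation): Document chunking fundamentally affects retrieval-augmented generation (RAG) by determining how source materials are segmented before indexing. Although there is evidence that large language models (LLMs) are sensitive to the layout and structure of retrieved data, there is currently no framework for analyzing the impact of different chunking methods. This paper proposes a methodology that defines the essential characteristics of the chunking process at three levels: intrinsic passage properties, extrinsic passage properties, and passages-document coherence. We propose HOPE (Holistic Passage Evaluation), a domain-agnostic automatic evaluation metric that quantifies and aggregates these characteristics. Empirical evaluation across seven domains shows that the HOPE metric correlates significantly with various RAG performance indicators and reveals a contrast between the extrinsic and intrinsic properties of passages. Semantic independence between passages proves essential for system performance, with gains of up to 56.2% in factual correctness and 21.1% in answer correctness. In contrast, the traditional assumption of maintaining concept unity within passages shows minimal impact. These results provide actionable insights for optimizing chunking strategies and improve RAG system design to produce more factually correct responses.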

Related papers

Dynamic Context Selection for Retrieval-Augmented Generation: Mitigating Distractors and Positional Bias [1.7674345486888503]
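Retrieval-Augmented Generation (RAG) improves language model performance by incorporating external knowledge retrieved from large corpora. Standard RAG systems rely on a fixed top-k retrieval strategy, which can miss relevant information or introduce semantically irrelevant passages. This paper proposes a context-size classifier that dynamically predicts the optimal number of documents to retrieve based on the information needs of each query.
Paper reference translation (metadata) (2025-12-16T11:30:40Z)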
Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation [57.97548022208733]
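We show that superficial choices in key-value extraction cause shifts in accuracy and stability. We introduce contextual normalization, a strategy that adaptively standardizes context representations before generation.
Paper reference translation (metadata) (2025-10-15T06:28:25Z)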
Evaluating the Robustness of Dense Retrievers in Interdisciplinary Domains [0.6432265982168868]
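The characteristics of evaluation benchmarks can distort the true benefits of domain adaptation in retrieval models. We show that two benchmarks with markedly different characteristics, such as topic diversity, boundary overlap, and semantic complexity, can influence the perceived benefit of fine-tuning.
Paper reference translation (metadata) (2025-06-16T23:54:08Z)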
AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset [95.45316956434608]
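Preference learning is essential for aligning large language models with human values. Our work shifts preference dataset design from ad hoc scaling to component-aware optimization.
Paper reference translation (metadata) (2025-04-04T17:33:07Z)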
Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability [1.8274323268621635]
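Real Explainer (RealExp) is an interpretability method that decomposes Shapley values into the importance of individual features and of feature correlations. By accurately quantifying individual features and their interactions, RealExp improves interpretability.
Paper reference translation (metadata) (2024-12-02T10:50:50Z)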
Exploring Information Retrieval Landscapes: An Investigation of a Novel Evaluation Techniques and Comparative Document Splitting Methods [0.0]
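This study shows that the structured nature of textbooks, the conciseness of articles, and the narrative complexity of novels each require distinct retrieval strategies. We propose a novel evaluation technique that uses open-source models to generate a comprehensive dataset of question-answer pairs. The evaluation uses weighted scores, including SequenceMatcher, BLEU, METEOR, and BERT Score, to assess system accuracy and relevance.
Paper reference translation (metadata) (2024-09-13T02:08:47Z)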
Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models [0.8602553195689513]
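Entity-Aspect Sentiment Triplet Extraction (EASTE) is an Aspect-Based Sentiment Analysis task. This study targets high performance on the EASTE task and examines how model size, type, and adaptation techniques affect task performance. Ultimately, it provides detailed insights and state-of-the-art results in complex sentiment analysis.
Paper reference translation (metadata) (2024-07-04T16:48:14Z)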
Keypoint Description by Symmetry Assessment -- Applications in Biometrics [49.547569925407814]
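We propose a model-based feature extractor that describes the neighborhood around a keypoint by a finite expansion. The iso-curves of such functions are highly symmetric with respect to the origin (the keypoint), and the estimated parameters have a well-defined geometric interpretation.
Paper reference translation (metadata) (2023-11-03T00:49:25Z)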
TransFA: Transformer-based Representation for Face Attribute Evaluation [87.09529826340304]
We propose TransFA, a novel Transformer-based representation for attribute evaluation. The proposed TransFA achieves superior performance compared with state-of-the-art methods.
Paper reference translation (metadata) (2022-07-12T10:58:06Z)
GO FIGURE: A Meta Evaluation of Factuality in Summarization [131.1087461486504]
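This paper introduces GO FIGURE, a meta-evaluation framework for evaluating factuality evaluation metrics. A benchmark analysis of ten factuality metrics shows that our framework provides a robust and efficient evaluation. It also reveals that QA metrics generally improve over standard metrics for measuring factuality across domains, but that their performance depends heavily on how the questions are generated.
Paper reference translation (metadata) (2020-10-24T08:30:20Z)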
Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
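We propose a weakly supervised approach to aspect-based sentiment analysis. We learn <sentiment, aspect> joint topic embeddings in the word embedding space. We then use neural networks to generalize the word-level discriminative information.
Paper reference translation (metadata) (2020-10-13T21:33:24Z)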

The related papers list is generated automatically from the titles and abstracts of papers on this site.

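This is the information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from use of this site (including all information and translations).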