Fugu-MT Paper Translation (Abstract): MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval

Paper abstract: MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval

arxiv url: http://arxiv.org/abs/2605.11254v1
Date: Mon, 11 May 2026 21:26:33 GMT
Status: Translation complete
Last updated in system: 2026-05-13 21:48:56.437422
Title: MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval
Authors: Mehmet Deniz Türkmen, Suchana Datta, Dwaipayan Roy, Daniel Hienert, Philipp Mayr, Derek Greene
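Abstract summary: MIRA is a new benchmark based on a large-scale social science search platform. It is designed for category-aware ranking across heterogeneous categories. It covers scholarly items from four distinct categories, enabling multi-faceted evaluation.
Reference score (site-computed attention score): 7.510578759254574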
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Users increasingly expect modern search systems to offer a unified interface that seamlessly retrieves information from diverse data sources and formats. However, current information retrieval (IR) evaluation benchmarks have not kept pace with this development, primarily due to the lack of test collections that represent the diversity of contemporary search domains. We address this critical gap with MIRA, a novel benchmark based on a large-scale social science search platform. MIRA is designed for category-aware ranking across heterogeneous categories - Publications, Research Data, Variables, and Instruments & Tools - within a single, unified evaluation framework. The proposed collection is distinctive in several ways: (1) it is built upon real user queries, providing a more realistic basis for evaluation; (2) it covers scholarly items from four distinct categories, enabling multi-faceted evaluation; and (3) it leverages a Large Language Model to generate topic descriptions and narratives, as well as for relevance assessment with respect to these topics, substantially reducing the labor and cost of test collection generation. We release this resource to benefit the community by providing a foundational testbed for the research on multi-faceted, category-aware, integrated, or cross-category information retrieval.

Related papers