Fugu-MT Paper Translation (Abstract): Lighting the Way for BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM

Paper abstract: Lighting the Way for BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM

arxiv url: http://arxiv.org/abs/2509.02558v1
Date: Tue, 02 Sep 2025 17:53:57 GMT
Status: Translation complete
System update date: 2025-09-04 15:17:04.136538
Title: Lighting the Way for BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM
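Title (reference translation): BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM
Authors: Yijun Ge, Sahel Sharifymoghaddam, Jimmy Lin
Abstract summary: The BRIGHT benchmark is a dataset consisting of reasoning-intensive queries over diverse domains. This paper applies listwise reranking with large language models to further investigate the impact of reranking on reasoning-intensive queries. The resulting baselines are integrated into the popular retrieval and reranking toolkits Anserini, Pyserini, and RankLLM.
Reference score (independently computed attention score): 44.67715098747863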
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The BRIGHT benchmark is a dataset consisting of reasoning-intensive queries over diverse domains. We explore retrieval results on BRIGHT using a range of retrieval techniques, including sparse, dense, and fusion methods, and establish reproducible baselines. We then apply listwise reranking with large language models (LLMs) to further investigate the impact of reranking on reasoning-intensive queries. These baselines are integrated into popular retrieval and reranking toolkits Anserini, Pyserini, and RankLLM, with two-click reproducibility that makes them easy to build upon and convenient for further development. While attempting to reproduce the results reported in the original BRIGHT paper, we find that the provided BM25 scores differ notably from those that we obtain using Anserini and Pyserini. We discover that this difference is due to BRIGHT's implementation of BM25, which applies BM25 on the query rather than using the standard bag-of-words approach, as in Anserini, to construct query vectors. This difference has become increasingly relevant due to the rise of longer queries, with BRIGHT's lengthy reasoning-intensive queries being a prime example, and further accentuated by the increasing usage of retrieval-augmented generation, where LLM prompts can grow to be much longer than "traditional" search engine queries. Our observation signifies that it may be time to reconsider BM25 approaches going forward in order to better accommodate emerging applications. To facilitate this, we integrate query-side BM25 into both Anserini and Pyserini.
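Abstract (reference translation): The BRIGHT benchmark is a dataset consisting of reasoning-intensive queries over diverse domains. We explore retrieval results on BRIGHT using a range of retrieval techniques, including sparse, dense, and fusion methods, and establish reproducible baselines. We then apply listwise reranking with large language models (LLMs) to investigate the impact of reranking on reasoning-intensive queries. These baselines are integrated into the popular retrieval and reranking toolkits Anserini, Pyserini, and RankLLM. While attempting to reproduce the results reported in the BRIGHT paper, we find that the provided BM25 scores differ notably from those obtained with Anserini and Pyserini. This difference stems from BRIGHT's implementation of BM25, which applies BM25 to the query rather than using the standard bag-of-words approach to construct query vectors. The difference has become increasingly relevant with the rise of longer queries, of which BRIGHT's lengthy reasoning-intensive queries are a prime example, and is further accentuated by the growing use of retrieval-augmented generation, where LLM prompts can grow much longer than "traditional" search engine queries. Our observation suggests that it may be time to reconsider BM25 approaches going forward in order to accommodate emerging applications. To facilitate this, we integrate query-side BM25 into both Anserini and Pyserini.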
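
The contrast between a bag-of-words query vector and query-side BM25 can be illustrated with a small sketch. It is not the implementation used in BRIGHT, Anserini, or Pyserini. It assumes that "query-side BM25" means replacing raw query term counts with the same saturating, length-normalized term-frequency function that BM25 applies to documents; the exact weighting in those systems may differ. The function names (`saturate`, `idf`, `query_weights`, `score`), the toy documents, and the parameter values `K1 = 0.9` and `B = 0.4` are all illustrative assumptions.

```python
import math
from collections import Counter

K1, B = 0.9, 0.4  # illustrative BM25 parameters (assumed, not taken from the paper)


def saturate(tf, length, avg_len, k1=K1, b=B):
    """BM25 term-frequency saturation with length normalization."""
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * length / avg_len))


def idf(term, docs):
    """Smoothed inverse document frequency over a list of tokenized docs."""
    n = sum(1 for d in docs if term in d)
    return math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))


def query_weights(query, avg_len, query_side):
    """Raw counts (bag of words) or saturated counts (assumed query-side BM25)."""
    counts = Counter(query)
    if not query_side:
        return {t: float(c) for t, c in counts.items()}
    return {t: saturate(c, len(query), avg_len) for t, c in counts.items()}


def score(query, doc, docs, query_side=False):
    """Dot product of the query weights and the document's BM25 weights."""
    avg_len = sum(len(d) for d in docs) / len(docs)
    doc_tf = Counter(doc)
    weights = query_weights(query, avg_len, query_side)
    return sum(
        w * idf(t, docs) * saturate(doc_tf[t], len(doc), avg_len)
        for t, w in weights.items()
        if t in doc_tf
    )


# Toy corpus and a "long" query that repeats one term several times.
docs = [
    "bm25 ranks documents by term overlap".split(),
    "long prompts repeat the same term many times".split(),
    "dense retrieval encodes queries as vectors".split(),
]
long_query = "term term term term overlap".split()

for query_side in (False, True):
    scores = [round(score(long_query, d, docs, query_side), 3) for d in docs]
    print("query-side BM25" if query_side else "bag of words", scores)
```

Under the bag-of-words weighting, a term repeated four times in the query contributes four times as much as a single occurrence. Under the saturated weighting, its contribution is bounded by `K1 + 1`, so repetition in a long query or prompt has far less influence on the final ranking.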

Related papers