Fugu-MT Paper Translation (Abstract): HAKARI-Bench: A Lightweight Benchmark for Comparing Retrieval Architectures and Efficiency Settings under Unified Conditions

Paper overview: HAKARI-Bench: A Lightweight Benchmark for Comparing Retrieval Architectures and Efficiency Settings under Unified Conditions

arxiv url: http://arxiv.org/abs/2606.22778v1
Date: Mon, 22 Jun 2026 02:42:06 GMT
Status: translation complete
Last updated in system: 2026-06-25 04:47:12.244754
Title: HAKARI-Bench: A Lightweight Benchmark for Comparing Retrieval Architectures and Efficiency Settings under Unified Conditions
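Title (reference translation): HAKARI-Bench: A lightweight benchmark for comparing retrieval architectures and efficiency settings under unified conditions
Authors: Yuichi Tateno
Abstract summary: We introduce HAKARI-Bench, a benchmark that reconstructs existing retrieval suites into small datasets (Nano-sets). It enables same-condition, model-agnostic comparison of five retrieval families (BM25, dense, sparse, late interaction, rerankers) and their efficiency variants. Its overall ranking reproduces the official MTEB retrieval v2, MMTEB v2 retrieval, and English BEIR (full) rankings at Spearman >0.97.
Reference score (independently computed attention score): 0.0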
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the rapid spread of retrieval-augmented generation and semantic search, choosing the right embedding and retrieval configuration is increasingly hard. Large retrieval benchmarks are comprehensive but too heavy to rerun during development, and there is little infrastructure for comparing production settings--dimensionality reduction, quantization, reranking--across many models under identical conditions. We present HAKARI-Bench, a lightweight benchmark that reconstructs existing retrieval suites into small datasets (Nano-sets): 35 benchmarks and 551 tasks across 43 languages in a unified format, enabling same-condition, model-agnostic comparison of five retrieval families (BM25, dense, sparse, late interaction, rerankers) and their efficiency variants. Across 55 models, its overall ranking reproduces the official MTEB retrieval v2, MMTEB v2 retrieval, and English BEIR (full) at Spearman >0.97. HAKARI-Bench does not replace full evaluation; it enables rapid model selection, regression detection, and reading the quality-efficiency Pareto frontier. Code, data, and leaderboard are released under the MIT license.
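Abstract (reference translation): As retrieval-augmented generation and semantic search spread rapidly, choosing an appropriate embedding and retrieval configuration is becoming increasingly difficult. Large retrieval benchmarks are comprehensive but too heavy to rerun during development, and there is little infrastructure for comparing production settings (dimensionality reduction, quantization, reranking) across many models under identical conditions. We present HAKARI-Bench, a lightweight benchmark that reconstructs existing retrieval suites into small datasets (Nano-sets). It provides 35 benchmarks and 551 tasks across 43 languages in a unified format, enabling same-condition, model-agnostic comparison of five retrieval families (BM25, dense, sparse, late interaction, rerankers) and their efficiency variants. Across 55 models, its overall ranking reproduces the official MTEB retrieval v2, MMTEB v2 retrieval, and English BEIR (full) rankings at Spearman >0.97. HAKARI-Bench does not replace full evaluation; it enables rapid model selection, regression detection, and reading the quality-efficiency Pareto frontier. Code, data, and leaderboard are released under the MIT license.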
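
The abstract relies on two standard tools: Spearman rank correlation, to check that a small benchmark orders models the same way as a full one, and the quality-efficiency Pareto frontier. The sketch below illustrates both using only the Python standard library. It is not the paper's code: the function names, model names, scores, and byte costs are all hypothetical, chosen only to make the computation concrete.

```python
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def pareto_frontier(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return names of non-dominated settings.

    points maps a setting name to (quality, cost); higher quality and
    lower cost are better. A setting is dominated if another one is at
    least as good on both axes and strictly better on one.
    """
    frontier = []
    for name, (q, c) in points.items():
        dominated = any(
            q2 >= q and c2 <= c and (q2 > q or c2 < c)
            for other, (q2, c2) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier)


if __name__ == "__main__":
    # Hypothetical scores for five models on a small and a full benchmark.
    small = [0.61, 0.55, 0.48, 0.52, 0.40]
    full = [0.58, 0.54, 0.47, 0.45, 0.39]
    print("Spearman rho:", spearman(small, full))

    # Hypothetical (quality, bytes per vector) for four efficiency settings.
    settings = {
        "fp32_1024d": (0.60, 4096),
        "int8_1024d": (0.59, 1024),
        "fp32_256d": (0.55, 1024),
        "binary_1024d": (0.52, 128),
    }
    print("Pareto frontier:", pareto_frontier(settings))
```

In this toy data two adjacent models swap places between the small and full rankings, so the correlation falls below 1. The `fp32_256d` setting drops off the frontier because `int8_1024d` reaches higher quality at the same cost.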
