Fugu-MT Paper Translation (Abstract): KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking

Paper overview: KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking

arxiv url: http://arxiv.org/abs/2606.22807v1
Date: Mon, 22 Jun 2026 03:36:31 GMT
Status: translation complete
Last updated in system: 2026-06-25 04:25:43.34708
Title: KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking
Title (reference translation): KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking
Authors: Xinping Zhao, Jiaxin Xu, Ziqi Dai, Xin Zhang, Shouzheng Huang, Danyu Tang, Xinshuo Hu, Meishan Zhang, Baotian Hu, Min Zhang
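Abstract summary: We propose KaLM-Reranker-V1, a fast but not late-interaction (FBNL) reranker that decouples query and passage computation. Built on an encoder-decoder architecture, KaLM-Reranker-V1 uses the encoder to pre-encode passages with Matryoshka embedding pooling. We instantiate KaLM-Reranker-V1 in three sizes, Nano, Small, and Large, with 0.27B, 1B, and 4B activated parameters, respectively.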
Reference score (independently computed attention score): 45.466940160087866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As retrieval systems scale, high-quality reranking becomes increasingly important. However, most existing rerankers, whether encoder-based or decoder-based, jointly encode the query and passage, tightly coupling their computation and limiting deployment efficiency as well as flexibility. We present KaLM-Reranker-V1, a fast but not late-interaction (FBNL) reranker that decouples query and passage computation while retaining expressive relevance modeling. Built on an encoder-decoder architecture, KaLM-Reranker-V1 uses the encoder to pre-encode passages with Matryoshka embedding pooling, while the decoder models the system instruction, user instruction, and query intent; cross-attention then captures relevance between the query context and passage representations. This design makes KaLM-Reranker-V1 efficient through decoupled passage encoding, yet not late interaction, by preserving rich relevance modeling through cross-attention. We instantiate KaLM-Reranker-V1 in three sizes, Nano, Small, and Large, with 0.27B, 1B, and 4B activated parameters, respectively. Extensive experiments on BEIR, MIRACL, and LMEB demonstrate that KaLM-Reranker-V1 achieves strong reranking performance with superior efficiency. On BEIR, KaLM-Reranker-V1 achieves state-of-the-art performance, on par with strong industrial models such as the Qwen3-Reranker series; on MIRACL, despite not being extensively trained on multilingual data, KaLM-Reranker-V1 still shows excellent reranking performance. Moreover, on LMEB, reranking models demonstrate a clear advantage, with even the 0.27B Nano model remaining competitive with 7-12B embedding models.
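Abstract (reference translation): As retrieval systems scale, high-quality reranking becomes increasingly important. However, most existing rerankers, whether encoder-based or decoder-based, jointly encode the query and passage, tightly coupling their computation and limiting deployment efficiency and flexibility. This paper presents KaLM-Reranker-V1, an FBNL (fast but not late-interaction) reranker. Built on an encoder-decoder architecture, KaLM-Reranker-V1 uses the encoder to pre-encode passages with Matryoshka embedding pooling, while the decoder models the system instruction, user instruction, and query intent. This design makes KaLM-Reranker-V1 efficient through decoupled passage encoding while preserving rich relevance modeling through cross-attention. We instantiate KaLM-Reranker-V1 in three sizes, Nano, Small, and Large, with 0.27B, 1B, and 4B activated parameters, respectively. Extensive experiments on BEIR, MIRACL, and LMEB show that KaLM-Reranker-V1 achieves strong reranking performance with high efficiency. On BEIR, KaLM-Reranker-V1 achieves state-of-the-art performance, on par with strong industrial models such as the Qwen3-Reranker series. Moreover, on LMEB, reranking models show a clear advantage, with even the 0.27B Nano model remaining competitive with 7-12B embedding models.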
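
The decoupled design described in the abstract (passages pre-encoded and compressed offline, then scored against the query through cross-attention) can be illustrated with a toy sketch. Everything below is hypothetical and is not the authors' implementation: `toy_encode` is a deterministic stand-in for a learned encoder, `pool_and_truncate` is one assumed reading of "Matryoshka embedding pooling" (mean-pool tokens into a few slots and keep a prefix of the dimensions), and `cross_attention_score` is a simplified single-head readout, so the resulting ranking carries no real relevance signal.

```python
import math

def toy_encode(text, dim=8):
    """Stand-in for a learned encoder: one deterministic vector per token."""
    vecs = []
    for tok in text.lower().split():
        seed = sum(ord(c) for c in tok)
        vecs.append([math.sin(seed * (i + 1)) for i in range(dim)])
    return vecs

def pool_and_truncate(vecs, n_slots=2, keep_dims=4):
    """Assumed compression: mean-pool tokens into at most n_slots vectors,
    then keep only the first keep_dims dimensions (Matryoshka-style prefix)."""
    if not vecs:
        return []
    n_slots = min(n_slots, len(vecs))
    size = math.ceil(len(vecs) / n_slots)
    slots = []
    for start in range(0, len(vecs), size):
        chunk = vecs[start:start + size]
        mean = [sum(col) / len(chunk) for col in zip(*chunk)]
        slots.append(mean[:keep_dims])
    return slots

def cross_attention_score(query_vecs, passage_slots):
    """Each query vector attends over the cached passage slots; the score is
    the attention-weighted similarity, averaged over query vectors."""
    total = 0.0
    for q in query_vecs:
        logits = []
        for p in passage_slots:
            q_cut = q[:len(p)]  # match the truncated passage dimensionality
            logits.append(sum(a * b for a, b in zip(q_cut, p)) / math.sqrt(len(p)))
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]  # stable softmax
        norm = sum(weights)
        total += sum(w / norm * l for w, l in zip(weights, logits))
    return total / len(query_vecs)

# Offline step: passages are encoded and compressed once, independent of any query.
passages = {
    "p1": "cross attention captures relevance between query and passage",
    "p2": "matryoshka embeddings keep nested prefixes of a vector",
    "p3": "the weather was pleasant on monday",
}
cache = {pid: pool_and_truncate(toy_encode(text)) for pid, text in passages.items()}

def rerank(query, cache):
    """Online step: encode only the query, then score it against cached passages."""
    q = toy_encode(query)
    scores = {pid: cross_attention_score(q, slots) for pid, slots in cache.items()}
    return sorted(scores, key=scores.get, reverse=True)

ranking = rerank("query passage relevance", cache)
```

The point of the sketch is the split between the two steps: `cache` is built without seeing a query, so its cost can be paid once per passage, while `rerank` touches only the query and the compressed cache at request time.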

Related papers