Fugu-MT Paper Translation (Abstract): Granite Embedding R2 Models

Paper summary: Granite Embedding R2 Models

arxiv url: http://arxiv.org/abs/2508.21085v1
Date: Tue, 26 Aug 2025 19:06:29 GMT
Status: Translation complete
Last updated in system: 2025-09-01 19:45:10.811162
Title: Granite Embedding R2 Models
Title (reference translation): Granite Embedding R2 Models
Authors: Parul Awasthy, Aashka Trivedi, Yulong Li, Meet Doshi, Riyaz Bhat, Vignesh P, Vishwajeet Kumar, Yushu Yang, Bhavani Iyer, Abraham Daniels, Rudra Murthy, Ken Barker, Martin Franz, Madison Lee, Todd Ward, Salim Roukos, David Cox, Luis Lastras, Jaydeep Sen, Radu Florian
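Abstract summary: The Granite Embedding R2 models are high-performance English encoder-based embedding models developed for enterprise-scale dense retrieval applications. They demonstrate exceptional versatility across standard benchmarks, IBM-developed evaluation suites, and real-world enterprise use cases. All models are publicly available under the Apache 2.0 license, enabling unrestricted research and commercial use.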
Reference score (attention metric computed by this site): 19.202549503734428
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce the Granite Embedding R2 models, a comprehensive family of high-performance English encoder-based embedding models engineered for enterprise-scale dense retrieval applications. Building upon our first-generation release, these models deliver substantial improvements, including 16x expanded context length (8,192 tokens), state-of-the-art performance across diverse retrieval domains - text, code, long-document search, multi-turn conversational, and tabular data - and measurable speed advantages of 19-44% over leading competitors while maintaining superior accuracy. Our release encompasses both bi-encoder and cross-encoder architectures, featuring a highly effective 22-layer retriever model and its efficient 12-layer counterpart, alongside a high-quality reranker model, all trained exclusively on enterprise-appropriate data with comprehensive governance oversight. The models demonstrate exceptional versatility across standard benchmarks, IBM-developed evaluation suites, and real-world enterprise use cases, establishing new performance standards for open-source embedding models. In an era where retrieval speed and accuracy are paramount for competitive advantage, the Granite R2 models deliver a compelling combination of cutting-edge performance, enterprise-ready licensing, and transparent data provenance that organizations require for mission-critical deployments. All models are publicly available under the Apache 2.0 license at https://huggingface.co/collections/ibm-granite, enabling unrestricted research and commercial use.
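Abstract (reference translation): This paper introduces the Granite Embedding R2 models, a comprehensive family of high-performance English encoder-based embedding models for enterprise-scale dense retrieval. Building on the first-generation release, these models deliver substantial improvements, including a 16x longer context length (8,192 tokens), state-of-the-art performance across diverse retrieval domains (text, code, long-document search, multi-turn conversational data, and tabular data), and speed advantages of 19-44% over leading competitors. The release covers both bi-encoder and cross-encoder architectures, featuring a 22-layer retriever model, a 12-layer retriever model, and a high-quality reranker model. The models demonstrate exceptional versatility across standard benchmarks, IBM-developed evaluation suites, and real-world enterprise use cases, establishing new performance standards for open-source embedding models. In an era where retrieval speed and accuracy are paramount for competitive advantage, the Granite R2 models offer a compelling combination of cutting-edge performance, enterprise-ready licensing, and the transparent data provenance that organizations require for mission-critical deployments. All models are publicly available under the Apache 2.0 license at https://huggingface.co/collections/ibm-granite.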
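
The abstract mentions two architectures, a bi-encoder retriever and a cross-encoder reranker. The following is a minimal sketch of how such a two-stage pipeline is commonly wired together, not the authors' implementation. It loads no Granite model: the 3-dimensional vectors are hand-written stand-ins for embeddings, and `overlap_score` is a hypothetical placeholder for a cross-encoder's joint scoring function. All names and data here are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, doc_vecs, k):
    """Bi-encoder stage: rank precomputed document vectors against the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


def rerank(query, candidates, texts, score_pair):
    """Cross-encoder stage: rescore each (query, document text) pair jointly."""
    return sorted(
        candidates,
        key=lambda doc_id: score_pair(query, texts[doc_id]),
        reverse=True,
    )


def overlap_score(query, text):
    """Hypothetical stand-in for a cross-encoder: fraction of query words found in the text."""
    query_words = query.split()
    text_words = set(text.split())
    return sum(w in text_words for w in query_words) / len(query_words)


# Toy corpus. The vectors are invented for illustration, not model output.
texts = {
    "d1": "granite embedding models for retrieval",
    "d2": "cooking pasta at home",
    "d3": "dense retrieval with rerankers",
}
doc_vecs = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.0, 0.2, 0.9],
    "d3": [0.7, 0.6, 0.1],
}
query = "dense retrieval"
query_vec = [0.9, 0.2, 0.0]

candidates = retrieve(query_vec, doc_vecs, k=2)  # cheap vector comparison over the corpus
final = rerank(query, candidates, texts, overlap_score)  # costlier pairwise scoring on the shortlist
print(candidates, final)
```

In this toy run the first stage shortlists `d1` and `d3`, and the second stage reverses their order. This reflects the usual division of labor: the retriever narrows the corpus using precomputed vectors, and the reranker spends more computation on the few remaining candidates.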

Related papers