Granite Embedding Models
- URL: http://arxiv.org/abs/2502.20204v1
- Date: Thu, 27 Feb 2025 15:45:16 GMT
- Title: Granite Embedding Models
- Authors: Parul Awasthy, Aashka Trivedi, Yulong Li, Mihaela Bornea, David Cox, Abraham Daniels, Martin Franz, Gabe Goodhart, Bhavani Iyer, Vishwajeet Kumar, Luis Lastras, Scott McCarley, Rudra Murthy, Vignesh P, Sara Rosenthal, Salim Roukos, Jaydeep Sen, Sukriti Sharma, Avirup Sil, Kate Soule, Arafat Sultan, Radu Florian
- Abstract summary: We introduce the Granite Embedding models, a family of encoder-based embedding models designed for retrieval tasks. This report provides the technical details of training these highly effective 12-layer embedding models, along with their efficient 6-layer distilled counterparts. We publicly release all our Granite Embedding models under the Apache 2.0 license, allowing both research and commercial use at https://huggingface.co/collections/ibm-granite.
- Score: 26.86244952892162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce the Granite Embedding models, a family of encoder-based embedding models designed for retrieval tasks, spanning dense- and sparse-retrieval architectures, with both English and multilingual capabilities. This report provides the technical details of training these highly effective 12-layer embedding models, along with their efficient 6-layer distilled counterparts. Extensive evaluations show that the models, developed with techniques such as retrieval-oriented pretraining, contrastive finetuning, knowledge distillation, and model merging, significantly outperform publicly available models of similar sizes on both internal IBM retrieval and search tasks, and match them on widely used information retrieval benchmarks, while being trained on high-quality data suitable for enterprise use. We publicly release all our Granite Embedding models under the Apache 2.0 license, allowing both research and commercial use at https://huggingface.co/collections/ibm-granite.
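As a hedged illustration of the contrastive finetuning named in the abstract, the sketch below implements a standard in-batch-negatives InfoNCE objective for a bi-encoder retriever. The batch size, embedding dimension, and temperature here are illustrative assumptions, not Granite's actual training configuration.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor,
                  passage_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """In-batch-negatives contrastive loss: row i of passage_emb is the
    positive passage for query i; every other row acts as a negative."""
    q = F.normalize(query_emb, dim=-1)   # unit-normalize so dot product = cosine
    p = F.normalize(passage_emb, dim=-1)
    logits = q @ p.T / temperature       # (batch, batch) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: random tensors stand in for encoder outputs.
queries = torch.randn(8, 384, requires_grad=True)
passages = torch.randn(8, 384, requires_grad=True)
loss = info_nce_loss(queries, passages)
loss.backward()
print(float(loss))
```

Because every other passage in the batch serves as a free negative for each query, larger batches sharpen the contrastive signal, which is one reason retrieval-oriented embedding models are commonly trained with large batch sizes.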
Related papers
- Command A: An Enterprise-Ready Large Language Model [180.18356391290172]
Command A is an agent-optimised and multilingual-capable model.
It offers best-in-class Retrieval Augmented Generation capabilities.
arXiv Detail & Related papers (2025-04-01T12:08:07Z)
- Granite Vision: a lightweight, open-source multimodal model for enterprise intelligence [88.74800617923083]
We introduce Granite Vision, a lightweight large language model with vision capabilities.
Our model is trained on a comprehensive instruction-following dataset.
Granite Vision achieves strong results in standard benchmarks related to visual document understanding.
arXiv Detail & Related papers (2025-02-14T05:36:32Z)
- KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model [27.25688303240741]
KaLM-Embedding is a general multilingual embedding model that leverages a large quantity of cleaner, more diverse, and domain-specific training data.
Our model has been trained with key techniques proven to enhance performance.
arXiv Detail & Related papers (2025-01-02T03:17:51Z)
- MERLOT: A Distilled LLM-based Mixture-of-Experts Framework for Scalable Encrypted Traffic Classification [19.476061046309052]
We present a scalable mixture-of-experts (MoE) based refinement of a distilled large language model, optimized for encrypted traffic classification.
Experiments on 10 datasets show performance superior or competitive to state-of-the-art models.
arXiv Detail & Related papers (2024-11-20T03:01:41Z)
- VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks [60.5257456681402]
We study the potential for building universal embeddings capable of handling a wide range of downstream tasks.
We build a series of VLM2Vec models on SoTA VLMs like Phi-3.5-V and LLaVA-1.6 and evaluate them on MMEB's evaluation split.
Our results show that VLM2Vec achieves an absolute average improvement of 10% to 20% over existing multimodal embedding models.
arXiv Detail & Related papers (2024-10-07T16:14:05Z)
- EmbedLLM: Learning Compact Representations of Large Language Models [28.49433308281983]
We propose EmbedLLM, a framework designed to learn compact vector representations of Large Language Models.
We introduce an encoder-decoder approach for learning such embeddings, along with a systematic framework to evaluate their effectiveness.
Empirical results show that EmbedLLM outperforms prior methods in model routing, in both accuracy and latency; a toy sketch of embedding-based routing follows this entry.
arXiv Detail & Related papers (2024-10-03T05:43:24Z)
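Purely as a hedged illustration of what embedding-based model routing can look like, and not EmbedLLM's actual encoder-decoder method, the toy sketch below scores hypothetical model embeddings against a query embedding and routes to the best match; every name and vector here is an assumption.

```python
import torch

model_names = ["model-a", "model-b", "model-c"]  # hypothetical models
model_emb = torch.randn(3, 64)                   # one learned vector per model (random stand-ins)
query_emb = torch.randn(64)                      # embedding of an incoming query

scores = model_emb @ query_emb                   # affinity of each model to the query
best = model_names[int(scores.argmax())]         # route to the highest-scoring model
print(f"route query to: {best}")
```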
- Compact Language Models via Pruning and Knowledge Distillation [61.56557874432008]
Minitron models exhibit up to a 16% improvement in MMLU scores compared to training from scratch.
Deriving 8B and 4B models from an already pretrained 15B model using our approach requires up to 40x fewer training tokens per model compared to training from scratch.
arXiv Detail & Related papers (2024-07-19T21:47:57Z)
- Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a new sandbox suite tailored for integrated data-model co-development.
This sandbox provides a feedback-driven experimental platform, enabling cost-effective and guided refinement of both data and models.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models [5.2094499417507105]
This report describes the training dataset creation and recipe behind the family of arctic-embed text embedding models.
At the time of their release, each model achieved state-of-the-art retrieval accuracy for models of their size on the MTEB Retrieval leaderboard.
arXiv Detail & Related papers (2024-05-08T19:05:18Z)
- Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling vision-language models; a toy sketch of a sparse MoE layer follows this entry.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
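For readers unfamiliar with the MoE building block the entry above refers to, here is a minimal toy sketch of a sparse mixture-of-experts feed-forward layer with top-1 routing. The layer sizes and expert count are arbitrary assumptions; this is a generic illustration of the technique, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    def __init__(self, dim: int = 256, hidden: int = 512, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Route every token to its single best expert,
        # so per-token compute stays constant as experts are added.
        gate = self.router(x).softmax(dim=-1)   # (tokens, num_experts)
        weight, expert_idx = gate.max(dim=-1)   # top-1 gate weight and expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # scale each token's expert output by its gate weight
                out[mask] = weight[mask].unsqueeze(1) * expert(x[mask])
        return out

moe = SparseMoE()
tokens = torch.randn(10, 256)
print(moe(tokens).shape)  # torch.Size([10, 256])
```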