GENIUS: A Generative Framework for Universal Multimodal Search
- URL: http://arxiv.org/abs/2503.19868v1
- Date: Tue, 25 Mar 2025 17:32:31 GMT
- Title: GENIUS: A Generative Framework for Universal Multimodal Search
- Authors: Sungyeon Kim, Xinliang Zhu, Xiaofan Lin, Muhammet Bastan, Douglas Gray, Suha Kwak,
- Abstract summary: This paper proposes GENIUS, a universal generative retrieval framework supporting diverse tasks across multiple modalities and domains.<n>At its core, GENIUS introduces modality-decoupled semantic quantization, transforming multimodal data into discrete IDs encoding both modality and semantics.<n>To enhance generalization, we propose a query augmentation that interpolates between a query and its target, allowing GENIUS to adapt to varied query forms.
- Score: 26.494338650656594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative retrieval is an emerging approach in information retrieval that generates identifiers (IDs) of target data based on a query, providing an efficient alternative to traditional embedding-based retrieval methods. However, existing models are task-specific and fall short of embedding-based retrieval in performance. This paper proposes GENIUS, a universal generative retrieval framework supporting diverse tasks across multiple modalities and domains. At its core, GENIUS introduces modality-decoupled semantic quantization, transforming multimodal data into discrete IDs encoding both modality and semantics. Moreover, to enhance generalization, we propose a query augmentation that interpolates between a query and its target, allowing GENIUS to adapt to varied query forms. Evaluated on the M-BEIR benchmark, it surpasses prior generative methods by a clear margin. Unlike embedding-based retrieval, GENIUS consistently maintains high retrieval speed across database size, with competitive performance across multiple benchmarks. With additional re-ranking, GENIUS often achieves results close to those of embedding-based methods while preserving efficiency.
Related papers
- UniversalRAG: Retrieval-Augmented Generation over Multiple Corpora with Diverse Modalities and Granularities [53.76854299076118]
UniversalRAG is a novel RAG framework designed to retrieve and integrate knowledge from heterogeneous sources with diverse modalities and granularities.
We propose a modality-aware routing mechanism that dynamically identifies the most appropriate modality-specific corpus and performs targeted retrieval within it.
We validate UniversalRAG on 8 benchmarks spanning multiple modalities, showing its superiority over modality-specific and unified baselines.
arXiv Detail & Related papers (2025-04-29T13:18:58Z) - SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs [70.79124435220695]
We propose a novel unified Semantic-enhanced generative Cross-mOdal REtrieval framework (SemCORE)
We first construct a Structured natural language IDentifier (SID) that effectively aligns target identifiers with generative models optimized for natural language comprehension and generation.
We then introduce a Generative Semantic Verification (GSV) strategy enabling fine-grained target discrimination.
arXiv Detail & Related papers (2025-04-17T17:59:27Z) - Unifying Generative and Dense Retrieval for Sequential Recommendation [37.402860622707244]
We propose LIGER, a hybrid model that combines the strengths of sequential dense retrieval and generative retrieval.<n> LIGER integrates sequential dense retrieval into generative retrieval, mitigating performance differences and enhancing cold-start item recommendation.<n>This hybrid approach provides insights into the trade-offs between these approaches and demonstrates improvements in efficiency and effectiveness for recommendation systems in small-scale benchmarks.
arXiv Detail & Related papers (2024-11-27T23:36:59Z) - GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval [20.807374287510623]
We propose GenCRF: a Generative Clustering and Reformulation Framework to capture diverse intentions adaptively.
We show that GenCRF achieves state-of-the-art performance, surpassing previous query reformulation SOTAs by up to 12% on nDCG@10.
arXiv Detail & Related papers (2024-09-17T05:59:32Z) - Better RAG using Relevant Information Gain [1.5604249682593647]
A common way to extend the memory of large language models (LLMs) is by retrieval augmented generation (RAG)<n>We propose a novel simple optimization metric based on relevant information gain, a probabilistic measure of the total information relevant to a query for a set of retrieved results.<n>When used as a drop-in replacement for the retrieval component of a RAG system, this method yields state-of-the-art performance on question answering tasks.
arXiv Detail & Related papers (2024-07-16T18:09:21Z) - ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
We propose a pioneering generAtive Cross-modal rEtrieval framework (ACE) for end-to-end cross-modal retrieval.
ACE achieves state-of-the-art performance in cross-modal retrieval and outperforms the strong baselines on Recall@1 by 15.27% on average.
arXiv Detail & Related papers (2024-06-25T12:47:04Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu)
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - Generative Retrieval as Multi-Vector Dense Retrieval [71.75503049199897]
Generative retrieval generates identifiers of relevant documents in an end-to-end manner.
Prior work has demonstrated that generative retrieval with atomic identifiers is equivalent to single-vector dense retrieval.
We show that generative retrieval and multi-vector dense retrieval share the same framework for measuring the relevance to a query of a document.
arXiv Detail & Related papers (2024-03-31T13:29:43Z) - How Does Generative Retrieval Scale to Millions of Passages? [68.98628807288972]
We conduct the first empirical study of generative retrieval techniques across various corpus scales.
We scale generative retrieval to millions of passages with a corpus of 8.8M passages and evaluating model sizes up to 11B parameters.
While generative retrieval is competitive with state-of-the-art dual encoders on small corpora, scaling to millions of passages remains an important and unsolved challenge.
arXiv Detail & Related papers (2023-05-19T17:33:38Z) - Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create semantically meaningful of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.