Constrained Auto-Regressive Decoding Constrains Generative Retrieval
- URL: http://arxiv.org/abs/2504.09935v1
- Date: Mon, 14 Apr 2025 06:54:49 GMT
- Title: Constrained Auto-Regressive Decoding Constrains Generative Retrieval
- Authors: Shiguang Wu, Zhaochun Ren, Xin Xin, Jiyuan Yang, Mengqi Zhang, Zhumin Chen, Maarten de Rijke, Pengjie Ren
- Abstract summary: Generative retrieval seeks to replace traditional search index data structures with a single large-scale neural network. In this paper, we examine the inherent limitations of constrained auto-regressive generation from two essential perspectives: constraints and beam search.
- Score: 71.71161220261655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative retrieval seeks to replace traditional search index data structures with a single large-scale neural network, offering the potential for improved efficiency and seamless integration with generative large language models. As an end-to-end paradigm, generative retrieval adopts a learned differentiable search index to conduct retrieval by directly generating document identifiers through corpus-specific constrained decoding. The generalization capabilities of generative retrieval on out-of-distribution corpora have gathered significant attention. In this paper, we examine the inherent limitations of constrained auto-regressive generation from two essential perspectives: constraints and beam search. We begin with the Bayes-optimal setting where the generative retrieval model exactly captures the underlying relevance distribution of all possible documents. Then we apply the model to specific corpora by simply adding corpus-specific constraints. Our main findings are two-fold: (i) For the effect of constraints, we derive a lower bound of the error, in terms of the KL divergence between the ground-truth and the model-predicted step-wise marginal distributions. (ii) For the beam search algorithm used during generation, we reveal that the usage of marginal distributions may not be an ideal approach. This paper aims to improve our theoretical understanding of the generalization capabilities of the auto-regressive decoding retrieval paradigm, laying a foundation for its limitations and inspiring future advancements toward more robust and generalizable generative retrieval.
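To make the abstract's setting concrete, here is a minimal sketch of corpus-specific constrained decoding: document identifiers are confined to a prefix trie built from the target corpus, beam search only expands continuations the trie allows, and a helper computes the KL divergence between a ground-truth and a model-predicted step-wise marginal distribution, the quantity the error lower bound above is expressed in terms of. Everything below (the trie representation, `step_probs`, the sentinel `END`) is an illustrative assumption, not the authors' implementation.
```python
# Minimal, illustrative sketch of constrained auto-regressive decoding over
# document identifiers (docids). Assumptions not taken from the paper: docids
# are short tuples of token ids, `step_probs` is any callable returning the
# model's next-token distribution for a given prefix, and END marks docid end.
import math
from collections import defaultdict

END = -1  # sentinel token id marking the end of a docid

def build_prefix_trie(docids):
    """Map each docid prefix to the set of tokens the corpus allows next."""
    allowed = defaultdict(set)
    for seq in docids:
        for i, tok in enumerate(seq):
            allowed[tuple(seq[:i])].add(tok)
        allowed[tuple(seq)].add(END)
    return allowed

def constrained_beam_search(step_probs, allowed, beam_size=2, max_len=8):
    """Beam search that only expands continuations permitted by the corpus trie."""
    beams = [((), 0.0)]                      # (prefix, accumulated log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            probs = step_probs(prefix)            # model's step-wise distribution
            for tok in allowed.get(prefix, ()):   # corpus-specific constraint
                new_logp = logp + math.log(probs.get(tok, 1e-12))
                if tok == END:
                    finished.append((prefix, new_logp))
                else:
                    candidates.append((prefix + (tok,), new_logp))
        if not candidates:
            break
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return sorted(finished, key=lambda c: c[1], reverse=True)

def stepwise_marginal_kl(p_marginal, q_marginal):
    """KL(p || q) between a ground-truth and a model-predicted step-wise
    marginal distribution -- the quantity the error lower bound refers to."""
    return sum(p * math.log(p / q_marginal.get(tok, 1e-12))
               for tok, p in p_marginal.items() if p > 0)

# Toy usage: three docids and a deliberately crude uniform step distribution.
docids = [(7, 1), (7, 2), (9,)]
trie = build_prefix_trie(docids)
uniform = lambda prefix: {t: 0.2 for t in (7, 1, 2, 9, END)}
print(constrained_beam_search(uniform, trie))  # [((9,), ...), ((7, 1), ...), ((7, 2), ...)]
```
In practice `step_probs` would wrap the retriever's softmax over its vocabulary; the point of the sketch is that the corpus constraint, not the model, decides which branches beam search is allowed to keep.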
Related papers
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance. Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws. We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z) - RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation [21.764973680014368]
RetroLLM is a unified framework that integrates retrieval and generation into a single, cohesive process. To mitigate false pruning in the process of constrained evidence generation, we introduce hierarchical FM-Index constraints. Experiments on five open-domain QA datasets demonstrate RetroLLM's superior performance across both in-domain and out-of-domain tasks.
arXiv Detail & Related papers (2024-12-16T16:03:25Z) - Learning to Rank in Generative Retrieval [62.91492903161522]
Generative retrieval aims to generate identifier strings of relevant passages as the retrieval target.
We propose a learning-to-rank framework for generative retrieval, dubbed LTRGR.
This framework only requires an additional learning-to-rank training phase to enhance current generative retrieval systems (see the margin-ranking sketch after this list).
arXiv Detail & Related papers (2023-06-27T05:48:14Z) - How Does Generative Retrieval Scale to Millions of Passages? [68.98628807288972]
We conduct the first empirical study of generative retrieval techniques across various corpus scales.
We scale generative retrieval to millions of passages, using a corpus of 8.8M passages and evaluating model sizes up to 11B parameters.
While generative retrieval is competitive with state-of-the-art dual encoders on small corpora, scaling to millions of passages remains an important and unsolved challenge.
arXiv Detail & Related papers (2023-05-19T17:33:38Z) - CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, since all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z) - Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work, we propose an alternative that does not force any structure in the search space: using all n-grams in a passage as its possible identifiers (see the n-gram sketch after this list).
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
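As a companion to the Autoregressive Search Engines entry above, the toy sketch below illustrates using all n-grams of a passage as its identifiers: every corpus n-gram is a valid decoding target, and generation is restricted to word continuations that still occur somewhere in the corpus. The actual system relies on an FM-index for this; the plain dictionary here is an illustrative stand-in, and the passages and function names are hypothetical.
```python
from collections import defaultdict

def index_ngrams(passages, max_n=3):
    """Map every word n-gram (n <= max_n) to the ids of passages containing it."""
    postings = defaultdict(set)
    for pid, text in passages.items():
        words = text.split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                postings[tuple(words[i:i + n])].add(pid)
    return postings

def allowed_next_words(postings, prefix):
    """Words that extend `prefix` into an n-gram still present in the corpus;
    constrained decoding would only allow these continuations."""
    return {gram[len(prefix)] for gram in postings
            if len(gram) > len(prefix) and gram[:len(prefix)] == prefix}

# Toy usage with two hypothetical passages.
passages = {"d1": "generative retrieval replaces the index",
            "d2": "constrained decoding restricts the index"}
postings = index_ngrams(passages)
print(allowed_next_words(postings, ("the",)))   # {'index'}
print(postings[("the", "index")])               # ids of passages containing this n-gram
```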
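The Learning to Rank in Generative Retrieval entry mentions an additional learning-to-rank training phase. The snippet below shows a generic pairwise margin-ranking objective of the kind such a phase could apply to the sequence scores a generative retriever assigns to passage identifiers; it is a hedged sketch under that assumption, not the published LTRGR loss.
```python
def margin_rank_loss(pos_scores, neg_scores, margin=1.0):
    """Generic pairwise margin loss: penalize negatives scored within `margin`
    of a relevant passage. Scores stand in for identifier log-likelihoods."""
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)

# Toy usage with hypothetical identifier log-likelihoods.
print(margin_rank_loss([-2.1, -1.5], [-2.0, -4.0]))  # 0.55
```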