GripRank: Bridging the Gap between Retrieval and Generation via the
Generative Knowledge Improved Passage Ranking
- URL: http://arxiv.org/abs/2305.18144v2
- Date: Tue, 15 Aug 2023 17:10:57 GMT
- Authors: Jiaqi Bai, Hongcheng Guo, Jiaheng Liu, Jian Yang, Xinnian Liang, Zhao
Yan and Zhoujun Li
- Abstract summary: We propose the GeneRative Knowledge Improved Passage Ranking (GripRank) approach for knowledge-intensive language tasks.
GripRank distills knowledge from a generative passage estimator (GPE), a generative language model that measures how likely each candidate passage is to generate the proper answer, into a passage ranker.
We conduct experiments on four datasets across three knowledge-intensive language tasks.
- Score: 42.98064495920065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-enhanced text generation has shown remarkable progress on
knowledge-intensive language tasks, such as open-domain question answering and
knowledge-enhanced dialogue generation, by leveraging passages retrieved from a
large passage corpus for delivering a proper answer given the input query.
However, the retrieved passages are not ideal for guiding answer generation
because of the discrepancy between retrieval and generation, i.e., the
candidate passages are all treated equally during the retrieval procedure
without considering their potential to generate a proper answer. This
discrepancy makes a passage retriever deliver a sub-optimal collection of
candidate passages to generate the answer. In this paper, we propose the
GeneRative Knowledge Improved Passage Ranking (GripRank) approach, addressing
the above challenge by distilling knowledge from a generative passage estimator
(GPE) to a passage ranker, where the GPE is a generative language model used to
measure how likely the candidate passages are to generate the proper answer. We
realize the distillation procedure by teaching the passage ranker to
rank the passages in the order given by the GPE. Furthermore, we improve the distillation
quality by devising a curriculum knowledge distillation mechanism, which allows
the knowledge provided by the GPE to be progressively distilled to the ranker
through an easy-to-hard curriculum, enabling the passage ranker to correctly
recognize the provenance of the answer from many plausible candidates. We
conduct extensive experiments on four datasets across three knowledge-intensive
language tasks. Experimental results show advantages over the state-of-the-art
methods for both passage ranking and answer generation on the KILT benchmark.
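The distillation idea in the abstract can be illustrated with a small sketch. The abstract does not state the loss function or how the curriculum is built, so everything below is an assumption made for illustration: a ListNet-style cross-entropy stands in for "learning to rank the passages ordered by the GPE", "easy" is taken to mean negatives that the GPE scores far below the top passage, and the names `listwise_distillation_loss` and `curriculum_candidates` are hypothetical. It is not the authors' implementation.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_distillation_loss(
    teacher_scores: Sequence[float],
    ranker_scores: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Cross-entropy between the teacher (GPE) and student (ranker) distributions.

    Assumed ListNet-style objective; the abstract does not name the loss.
    """
    target = softmax(teacher_scores, temperature)
    predicted = softmax(ranker_scores, temperature)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def curriculum_candidates(
    teacher_scores: Sequence[float], step: int, total_steps: int
) -> List[int]:
    """Pick candidate indices for one training step, easy to hard.

    Assumption: the top-scored passage is always kept. Early steps pair it
    with the lowest-scored (easiest) negatives; later steps admit negatives
    that the teacher scores closer to the top passage.
    """
    order = sorted(
        range(len(teacher_scores)), key=lambda i: teacher_scores[i], reverse=True
    )
    best, negatives = order[0], order[1:]  # negatives run from hardest to easiest
    progress = step / max(total_steps - 1, 1)
    n_neg = min(len(negatives), max(1, math.ceil(progress * len(negatives))))
    return [best] + negatives[len(negatives) - n_neg:]


# Hypothetical teacher scores, e.g. answer log-likelihoods from a GPE.
gpe_scores = [-0.5, -3.0, -1.0, -6.0]
for step in range(3):
    subset = curriculum_candidates(gpe_scores, step, total_steps=3)
    teacher = [gpe_scores[i] for i in subset]
    student = [0.0 for _ in subset]  # an untrained ranker scoring all passages equally
    print(step, subset, round(listwise_distillation_loss(teacher, student), 4))
```

In this sketch the candidate set grows from one easy negative to the full list, so the ranker first separates the best passage from clearly irrelevant ones before it has to discriminate among plausible candidates.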
Related papers
- Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
Retrieval-Augmented Generation (RAG) has been widely adopted to enhance Large Language Models (LLMs).
Attributed Text Generation (ATG), which provides citations to support the model's responses in RAG, has attracted growing attention.
This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
- Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models [11.716595438057997]
We propose passage-specific prompt tuning (PSPT) for reranking in open-domain question answering.
PSPT is a parameter-efficient method that fine-tunes learnable passage-specific soft prompts.
We conducted extensive experiments utilizing the Llama-2-chat-7B model across three publicly available open-domain question answering datasets.
arXiv Detail & Related papers (2024-05-31T07:43:42Z)
- Distillation Enhanced Generative Retrieval [96.69326099136289]
Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target.
In this work, we identify a viable direction to further enhance generative retrieval via distillation and propose a feasible framework, named DGR.
We conduct experiments on four public datasets, and the results indicate that DGR achieves state-of-the-art performance among the generative retrieval methods.
arXiv Detail & Related papers (2024-02-16T15:48:24Z)
- Learning to Rank in Generative Retrieval [62.91492903161522]
Generative retrieval aims to generate identifier strings of relevant passages as the retrieval target.
We propose a learning-to-rank framework for generative retrieval, dubbed LTRGR.
This framework only requires an additional learning-to-rank training phase to enhance current generative retrieval systems.
arXiv Detail & Related papers (2023-06-27T05:48:14Z)
- Hindsight: Posterior-guided training of retrievers for improved open-ended generation [41.59136233128446]
We propose an additional guide retriever that is allowed to use the target output and "in hindsight" retrieve relevant passages during training.
For informative conversations from the Wizard of Wikipedia dataset, with posterior-guided training, the retriever finds passages with higher relevance in the top-10.
arXiv Detail & Related papers (2021-10-14T22:24:57Z)
- Phrase Retrieval Learns Passage Retrieval, Too [77.57208968326422]
We study whether phrase retrieval can serve as the basis for coarse-level retrieval including passages and documents.
We show that a dense phrase-retrieval system, without any retraining, already achieves better passage retrieval accuracy than passage retrievers.
We also show that phrase filtering and vector quantization can reduce the size of our index by 4-10x.
arXiv Detail & Related papers (2021-09-16T17:42:45Z)
- Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters [52.725200145600624]
We propose KnowExpert to bypass the retrieval process by injecting prior knowledge into the pre-trained language models with lightweight adapters.
Experimental results show that KnowExpert performs comparably with the retrieval-based baselines.
arXiv Detail & Related papers (2021-05-13T12:33:23Z)