FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation
- URL: http://arxiv.org/abs/2209.14290v1
- Date: Wed, 28 Sep 2022 17:54:55 GMT
- Title: FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation
- Authors: Sebastian Hofstätter, Jiecao Chen, Karthik Raman, Hamed Zamani
- Abstract summary: We introduce FiD-Light to increase the efficiency of the state-of-the-art retrieval-augmented FiD model.
We adapt FiD-Light with re-ranking capabilities through textual source pointers to improve top-ranked provenance precision.
- Score: 19.17759446168802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented generation models offer many benefits over standalone
language models: besides a textual answer to a given query they provide
provenance items retrieved from an updateable knowledge base. However, they are
also more complex systems and need to handle long inputs. In this work, we
introduce FiD-Light to substantially increase the efficiency of the state-of-the-art
retrieval-augmented FiD model, while maintaining the same level of
effectiveness. Our FiD-Light model constrains the information flow from the
encoder (which encodes passages separately) to the decoder (using concatenated
encoded representations). Furthermore, we adapt FiD-Light with re-ranking
capabilities through textual source pointers, to improve the top-ranked
provenance precision. Our experiments on a diverse set of seven
knowledge-intensive tasks (KILT) show FiD-Light consistently improves the Pareto frontier
between query latency and effectiveness. FiD-Light with source pointing sets
substantial new state-of-the-art results on six KILT tasks for combined text
generation and provenance retrieval evaluation, while maintaining reasonable
efficiency.
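The abstract names two mechanisms without spelling out their details: compressing the per-passage encoder output before the decoder's cross-attention, and emitting textual source pointers alongside the answer. The sketch below illustrates both on top of a vanilla Hugging Face T5 checkpoint; it is a minimal illustration under stated assumptions, not the paper's implementation. In particular, keeping the first FIRST_K vectors per passage and using plain passage indices as pointers are assumptions here, and fid_light_generate is a hypothetical helper name.

```python
# Minimal sketch of the FiD-Light idea (assumptions flagged inline).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
from transformers.modeling_outputs import BaseModelOutput

FIRST_K = 8  # ASSUMPTION: encoder vectors kept per passage; the efficiency knob

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def fid_light_generate(query: str, passages: list[str]) -> str:
    # FiD step: encode each (query, passage) pair separately.
    inputs = tok(
        [f"question: {query} context: {p}" for p in passages],
        return_tensors="pt", padding=True, truncation=True,
    )
    enc = model.encoder(**inputs).last_hidden_state  # (n_passages, seq_len, d)

    # FiD-Light step: keep only the first FIRST_K vectors per passage and
    # concatenate, so the decoder cross-attends over n_passages * FIRST_K
    # positions instead of n_passages * seq_len.
    compressed = enc[:, :FIRST_K, :].reshape(1, -1, enc.size(-1))
    mask = inputs.attention_mask[:, :FIRST_K].reshape(1, -1)

    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=compressed),
        attention_mask=mask,
        max_new_tokens=64,
    )
    # A trained FiD-Light model with source pointing would prefix the answer
    # with the indices of supporting passages (e.g. "2 | Frank Herbert"),
    # enabling re-ranking of the retrieved provenance; the untuned t5-small
    # checkpoint used here will not produce that format.
    return tok.decode(out[0], skip_special_tokens=True)

print(fid_light_generate(
    "who wrote the novel Dune?",
    ["Dune is a 1965 novel by Frank Herbert.", "Arrakis is a desert planet."],
))
```

The design intuition: decoder cross-attention cost grows with the concatenated encoder length, so shrinking each passage's representation from seq_len to FIRST_K vectors cuts decoder-side work roughly by that ratio, which is how the latency/effectiveness Pareto improvement described in the abstract arises.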
Related papers
- Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement [22.386864304549285]
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving relevant documents from external sources and incorporating them into the context.
We propose Dynamic Parametric RAG (DyPRAG), a novel framework that leverages a lightweight parameter translator model to efficiently convert documents into parametric knowledge.
arXiv Detail & Related papers (2025-03-31T09:46:35Z) - LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation [41.97593224447291]
This paper presents a systematic method to enhance visual grounding by utilizing decoder layers of Large Language Models (LLMs).
We demonstrate that intermediate hidden states from early LLM layers retain strong spatial-semantic correlations that are beneficial to grounding tasks.
Experiments show that our adaptation strategy significantly enhances the performance on complex free-form text queries.
arXiv Detail & Related papers (2025-03-18T00:50:40Z) - Odysseus Navigates the Sirens' Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation [18.835969818281125]
Large Language Models (LLMs) are increasingly required to generate text that is both factually accurate and diverse across various open-ended applications.
We introduce Dynamic Focus Decoding (DFD), a novel plug-and-play approach that resolves this trade-off without requiring additional data, knowledge, or models.
DFD adaptively adjusts the decoding focus based on distributional differences across layers, leveraging the modular and hierarchical nature of factual knowledge within LLMs.
arXiv Detail & Related papers (2025-03-11T05:27:28Z) - AnyRefill: A Unified, Data-Efficient Framework for Left-Prompt-Guided Vision Tasks [116.8706375364465]
We present a novel Left-Prompt-Guided (LPG) paradigm to address a diverse range of reference-based vision tasks.
We propose AnyRefill, which effectively adapts Text-to-Image (T2I) models to various vision tasks.
arXiv Detail & Related papers (2025-02-16T15:12:40Z) - SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models [34.3296459569307]
Large language models (LLMs) have demonstrated remarkable capabilities, but their outputs can sometimes be unreliable or factually incorrect.
We introduce Self Logits Evolution Decoding (SLED), a novel decoding framework that enhances the truthfulness of LLMs.
We show that SLED consistently improves factual accuracy by up to 20% compared to existing decoding methods.
arXiv Detail & Related papers (2024-11-01T17:33:34Z) - Unleashing the Power of LLMs as Multi-Modal Encoders for Text and Graph-Structured Data [42.18348019901044]
Graph-structured information offers rich contextual information that can enhance language models.
Existing methods for integrating graph and text embeddings are limited in their ability to fully exploit the heterogeneous nature of these modalities.
We propose Janus, a framework that leverages Large Language Models (LLMs) to jointly encode text and graph data.
arXiv Detail & Related papers (2024-10-15T03:40:20Z) - TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings [61.9257731511557]
We propose Text Guided LLaVA (TG-LLaVA) to optimize vision-language models (VLMs).
We use learnable latent embeddings as a bridge to analyze textual instruction and add the analysis results to the vision encoder as guidance.
With the guidance of text, the vision encoder can extract text-related features, similar to how humans focus on the most relevant parts of an image when considering a question.
arXiv Detail & Related papers (2024-09-15T00:38:34Z) - Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation [96.78845113346809]
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks.
This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics to detect unfaithful sentences.
We also introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation.
arXiv Detail & Related papers (2024-06-19T16:42:57Z) - Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT [120.39362661689333]
We present an improved version of Lumina-T2X, showcasing stronger generation performance with increased training and inference efficiency.
Thanks to these improvements, Lumina-Next not only improves the quality and efficiency of basic text-to-image generation but also demonstrates superior resolution extrapolation capabilities.
arXiv Detail & Related papers (2024-06-05T17:53:26Z) - CELA: Cost-Efficient Language Model Alignment for CTR Prediction [71.85120354973073]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.
Recent efforts have sought to mitigate the challenges of CTR prediction by integrating Pre-trained Language Models (PLMs).
We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z) - Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and insufficient annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z) - Improving Language Models via Plug-and-Play Retrieval Feedback [42.786225163763376]
Large language models (LLMs) exhibit remarkable performance across various NLP tasks.
They often generate incorrect or hallucinated information, which hinders their practical applicability in real-world scenarios.
We introduce ReFeed, a novel pipeline designed to enhance LLMs by providing automatic retrieval feedback in a plug-and-play framework.
arXiv Detail & Related papers (2023-05-23T12:29:44Z) - Controllable Data Augmentation Through Deep Relighting [75.96144853354362]
We explore how to augment a varied set of image datasets through relighting, so as to make existing models more invariant to illumination changes.
We develop a tool, based on an encoder-decoder network, that is able to quickly generate multiple variations of the illumination of various input scenes.
We demonstrate that by training models on datasets that have been augmented with our pipeline, it is possible to achieve higher performance on localization benchmarks.
arXiv Detail & Related papers (2021-10-26T20:02:51Z)