Retrieval augmentation of large language models for lay language
generation
- URL: http://arxiv.org/abs/2211.03818v2
- Date: Thu, 25 Jan 2024 09:30:11 GMT
- Title: Retrieval augmentation of large language models for lay language
generation
- Authors: Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Abstract summary: We introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation.
The abstract and the corresponding lay language summary are written by domain experts, ensuring the quality of our dataset.
We derive two specialized paired corpora from CELLS to address key challenges in lay language generation: generating background explanations and simplifying the original abstract.
- Score: 12.686922203465896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent lay language generation systems have used Transformer models trained
on a parallel corpus to increase health information accessibility. However, the
applicability of these models is constrained by the limited size and topical
breadth of available corpora. We introduce CELLS, the largest (63k pairs) and
broadest-ranging (12 journals) parallel corpus for lay language generation. The
abstract and the corresponding lay language summary are written by domain
experts, ensuring the quality of our dataset. Furthermore, qualitative
evaluation of expert-authored plain language summaries has revealed background
explanation as a key strategy to increase accessibility. Such explanation is
challenging for neural models to generate because it goes beyond simplification
by adding content absent from the source. We derive two specialized paired
corpora from CELLS to address key challenges in lay language generation:
generating background explanations and simplifying the original abstract. We
adopt retrieval-augmented models as an intuitive fit for the task of background
explanation generation, and show improvements in summary quality and simplicity
while maintaining factual correctness. Taken together, this work presents the
first comprehensive study of background explanation for lay language
generation, paving the path for disseminating scientific knowledge to a broader
audience. CELLS is publicly available at:
https://github.com/LinguisticAnomalies/pls_retrieval.
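As a concrete illustration of the retrieval-augmented setup described above, here is a minimal sketch: retrieve background passages relevant to an abstract, then condition a seq2seq generator on the retrieved text plus the abstract. The model names, the toy background corpus, and the helper functions are illustrative assumptions, not the authors' pipeline (their code is available at the repository above).

```python
# Minimal sketch of retrieval-augmented lay language generation (assumed setup,
# not the CELLS authors' exact pipeline).
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Toy background corpus; in practice this could be definition-style passages
# mined from an external resource such as Wikipedia.
BACKGROUND = [
    "Angioplasty is a procedure to widen narrowed or blocked blood vessels.",
    "A randomized controlled trial assigns participants to groups by chance.",
    "Hypertension means abnormally high blood pressure.",
]

retriever = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = retriever.encode(BACKGROUND, convert_to_tensor=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k background passages most similar to the query."""
    q_emb = retriever.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, doc_emb, top_k=k)[0]
    return [BACKGROUND[h["corpus_id"]] for h in hits]

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
generator = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

def lay_summary(abstract: str) -> str:
    """Condition generation on retrieved background plus the source abstract."""
    context = " ".join(retrieve(abstract))
    inputs = tokenizer(context + " " + abstract, return_tensors="pt",
                       truncation=True, max_length=1024)
    output_ids = generator.generate(**inputs, num_beams=4, max_new_tokens=200)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(lay_summary("We conducted a randomized controlled trial of renal "
                  "denervation in patients with treatment-resistant "
                  "hypertension."))
```

In practice the generator would be fine-tuned on the CELLS pairs so that it learns to weave the retrieved background into the simplified text rather than merely summarizing.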
Related papers
- Boosting the Capabilities of Compact Models in Low-Data Contexts with Large Language Models and Retrieval-Augmented Generation [2.9921619703037274]
We propose a retrieval augmented generation (RAG) framework backed by a large language model (LLM) to correct the output of a smaller model for the linguistic task of morphological glossing.
We leverage linguistic information to compensate for the lack of data and trainable parameters, while allowing inputs from written descriptive grammars that are interpreted and distilled by an LLM.
We show that a compact, RAG-supported model is highly effective in data-scarce settings, achieving a new state-of-the-art for this task and our target languages.
arXiv Detail & Related papers (2024-10-01T04:20:14Z) - Improving Long Text Understanding with Knowledge Distilled from Summarization Model [17.39913210351487]
We propose our Gist Detector to leverage the gist detection ability of a summarization model.
Gist Detector first learns the gist detection knowledge distilled from a summarization model, and then produces gist-aware representations.
We evaluate our method on three different tasks: long document classification, distantly supervised open-domain question answering, and non-parallel text style transfer.
arXiv Detail & Related papers (2024-05-08T10:49:39Z) - Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z) - RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - Large Language Models Meet Knowledge Graphs to Answer Factoid Questions [57.47634017738877]
We propose a method for exploring pre-trained Text-to-Text Language Models enriched with additional information from Knowledge Graphs.
We obtain easily interpretable information for Transformer-based models by linearizing the extracted subgraphs.
Final re-ranking of the answer candidates with the extracted information boosts Hits@1 scores of the pre-trained text-to-text language models by 4-6%.
arXiv Detail & Related papers (2023-10-03T15:57:00Z) - Enhancing Retrieval-Augmented Large Language Models with Iterative
Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model's output reveals what might still be needed to finish a task, and thus provides informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves flexibility in generation without structural constraints (a generic sketch of this loop appears at the end of the related papers list).
arXiv Detail & Related papers (2023-05-24T16:17:36Z) - BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer [1.911678487931003]
Retrieval-based language models are increasingly employed in question-answering tasks.
We develop the first Norwegian retrieval-based model by adapting the REALM framework.
We show that this type of training improves the reader's performance on extractive question-answering.
arXiv Detail & Related papers (2023-04-19T13:40:47Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - Incorporating Linguistic Knowledge for Abstractive Multi-document
Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model.
We incorporate the dependency information into a linguistic-guided attention mechanism.
With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z) - Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese
Pre-trained Language Models [62.41139712595334]
We propose a novel pre-training paradigm for Chinese -- Lattice-BERT.
We construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers.
We show that our model can bring an average increase of 1.5% under the 12-layer setting.
arXiv Detail & Related papers (2021-04-15T02:36:49Z)
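The Iter-RetGen entry above alternates retrieval and generation, using each draft output to guide the next retrieval step; as noted there, a generic sketch of that loop follows. The toy corpus, the lexical retriever, and the stand-in generate() function are illustrative assumptions, not the paper's implementation.

```python
# Generic sketch of an iterative retrieval-generation loop (assumed toy setup).

CORPUS = [
    "Statins lower LDL cholesterol by inhibiting HMG-CoA reductase.",
    "HMG-CoA reductase catalyzes a rate-limiting step of cholesterol synthesis.",
    "LDL is often called 'bad' cholesterol.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy lexical retriever: rank passages by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda p: -len(q & set(p.lower().split())))
    return ranked[:k]

def generate(question: str, evidence: list[str]) -> str:
    """Stand-in for an LLM call; a real system would prompt a model with the
    question plus the retrieved evidence and return its answer draft."""
    return question + " Evidence: " + " ".join(evidence)

def iter_retgen(question: str, iterations: int = 2) -> str:
    """Alternate retrieval and generation; the current draft steers retrieval."""
    draft = question
    for _ in range(iterations):
        evidence = retrieve(draft)   # the draft hints at what is still needed
        draft = generate(question, evidence)
    return draft

print(iter_retgen("How do statins reduce cholesterol?"))
```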
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.