Retrieval Augmented Generation and Representative Vector Summarization
for large unstructured textual data in Medical Education
- URL: http://arxiv.org/abs/2308.00479v1
- Date: Tue, 1 Aug 2023 12:04:50 GMT
- Title: Retrieval Augmented Generation and Representative Vector Summarization
for large unstructured textual data in Medical Education
- Authors: S. S. Manathunga and Y. A. Illangasekara
- Abstract summary: Retrieval Augmented Generation (RAG) makes it easy to attach non-parametric knowledge bases to Large Language Models and to manipulate them.
A combined extractive and abstractive summarization method for large unstructured textual data using representative vectors is proposed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models are increasingly being used for various tasks,
including content generation and as chatbots. Despite their impressive
performance on general tasks, LLMs need to be aligned when applied to
domain-specific tasks to mitigate the problems of hallucination and harmful
answers. Retrieval Augmented Generation (RAG) makes it easy to attach
non-parametric knowledge bases to LLMs and to manipulate them. Applications of
RAG in the field of medical education are discussed in this paper. A combined
extractive and abstractive summarization method for large unstructured textual
data using representative vectors is proposed.
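The retrieval step that attaches such a non-parametric knowledge base can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: TF-IDF vectors from scikit-learn stand in for the dense sentence embeddings and vector store a production RAG pipeline would use, and the example passages and the `build_prompt` helper are hypothetical.

```python
# Minimal retrieval-augmented prompting sketch (illustrative, not the paper's code).
# TF-IDF similarity stands in for dense embeddings plus a vector store.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical knowledge-base chunks, e.g. split from lecture notes or guidelines.
chunks = [
    "Beta-blockers reduce myocardial oxygen demand by lowering heart rate.",
    "Metformin is a first-line oral agent for type 2 diabetes mellitus.",
    "The Krebs cycle oxidises acetyl-CoA to CO2, generating NADH and FADH2.",
]

vectorizer = TfidfVectorizer()
chunk_vectors = vectorizer.fit_transform(chunks)  # one row per chunk

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, chunk_vectors).ravel()
    top = scores.argsort()[::-1][:k]
    return [chunks[i] for i in top]

def build_prompt(query: str) -> str:
    """Augment the user question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

print(build_prompt("How do beta-blockers help in angina?"))
# The augmented prompt would then be sent to an LLM; grounding the answer in
# retrieved context is what mitigates hallucination in domain-specific use.
```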
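The combined extractive and abstractive idea can likewise be sketched: cluster the chunk embeddings, treat the chunk nearest each cluster centroid as a representative extract, and hand only those representatives to an LLM for abstractive summarization. The snippet below is a sketch under those assumptions, again using TF-IDF in place of dense embeddings; the cluster count, chunking, and prompt wording are placeholders rather than the paper's exact settings, and `call_llm` is hypothetical.

```python
# Representative-vector summarization sketch (illustrative, not the paper's code).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def representative_chunks(chunks: list[str], n_clusters: int = 3) -> list[str]:
    """Extractive step: pick the chunk closest to each cluster centroid."""
    vectors = TfidfVectorizer().fit_transform(chunks).toarray()
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(vectors)
    picks = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        # distance of each cluster member to its centroid; keep the nearest one
        dists = np.linalg.norm(vectors[members] - km.cluster_centers_[c], axis=1)
        picks.append(members[int(dists.argmin())])
    return [chunks[i] for i in sorted(picks)]  # keep original document order

def abstractive_prompt(chunks: list[str]) -> str:
    """Abstractive step: only the representatives are sent to the LLM, so a
    large document fits inside a fixed context window."""
    extracts = "\n\n".join(representative_chunks(chunks))
    return f"Summarise the following extracts into concise study notes:\n\n{extracts}"

# Usage: split a large document into chunks, then
# summary = call_llm(abstractive_prompt(chunks))   # call_llm is hypothetical
```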
Related papers
- Enhancing RAG with Active Learning on Conversation Records: Reject Incapables and Answer Capables [17.76687504479359]
Retrieval-augmented generation (RAG) is a key technique for leveraging external knowledge and reducing hallucinations in large language models (LLMs).
This paper proposes using the vast amount of conversations from widespread LLM usage to build high-quality datasets.
We introduce AL4RAG, which uses active learning to select the most suitable conversation samples for annotation.
arXiv Detail & Related papers (2025-02-13T08:42:29Z) - Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis [0.1984949535188529]
Vision language models (VLMs) extend LLMs by incorporating a pretrained vision backbone for processing images and a cross-modal projector.
We developed intelligent assistants finetuned from LLaVA models to enhance multimodal understanding in low-dose radiation therapy.
arXiv Detail & Related papers (2025-01-26T02:48:01Z) - Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation [81.18701211912779]
We introduce an Adaptive Multi-Aspect Retrieval-augmented over KGs (Amar) framework. This method retrieves knowledge including entities, relations, and subgraphs, and converts each piece of retrieved text into prompt embeddings. Our method has achieved state-of-the-art performance on two common datasets.
arXiv Detail & Related papers (2024-12-24T16:38:04Z) - RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization [0.27624021966289597]
This paper introduces EYEGLAXS, a framework that leverages Large Language Models (LLMs) for extractive summarization.
EYEGLAXS focuses on extractive summarization to ensure factual and grammatical integrity.
The system sets new performance benchmarks on well-known datasets like PubMed and ArXiv.
arXiv Detail & Related papers (2024-08-28T13:52:19Z) - Retrieval-Augmented Generation for Natural Language Processing: A Survey [25.11304732038443]
Retrieval-augmented generation (RAG) leverages an external knowledge database to augment large language models.
This paper reviews the significant techniques of RAG, especially the retriever and retrieval-fusion components.
RAG is used in representative natural language processing tasks and industrial scenarios.
arXiv Detail & Related papers (2024-07-18T06:06:53Z) - Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs).
Attributed Text Generation (ATG), which provides citations to support the model's responses in RAG, has attracted growing attention.
This paper proposes a fine-grained ATG method called ReClaim(Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z) - Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
arXiv Detail & Related papers (2024-05-26T22:30:29Z) - Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models [10.04914417538886]
Large-scale language models (LLMs) have achieved remarkable success across various language tasks but suffer from hallucinations and temporal misalignment.
We propose a new Distill-Retrieve-Read framework instead of the previous Retrieve-then-Read framework.
arXiv Detail & Related papers (2024-04-27T13:11:42Z) - Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation [128.01050030936028]
We propose an information refinement training method named InFO-RAG.
InFO-RAG is low-cost and general across various tasks.
It improves the performance of LLaMA2 by an average of 9.39% relative points.
arXiv Detail & Related papers (2024-02-28T08:24:38Z) - Question-Answering Based Summarization of Electronic Health Records
using Retrieval Augmented Generation [0.0]
We propose a method that mitigates shortcomings by combining semantic search, retrieval augmented generation and question-answering.
Our approach is efficient, requires minimal to no training, and does not suffer from the 'hallucination' problem of LLMs.
It ensures diversity, since the summary contains no repeated content but rather diverse answers to specific questions.
arXiv Detail & Related papers (2024-01-03T00:09:34Z) - Local Large Language Models for Complex Structured Medical Tasks [0.0]
This paper introduces an approach that combines the language reasoning capabilities of large language models with the benefits of local training to tackle complex, domain-specific tasks.
Specifically, the authors demonstrate their approach by extracting structured condition codes from pathology reports.
arXiv Detail & Related papers (2023-08-03T12:36:13Z) - An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians.
Recent studies have achieved promising results in automatic impression generation using large-scale medical text data.
These models often require substantial amounts of medical text data and have poor generalization performance.
arXiv Detail & Related papers (2023-04-17T17:13:42Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.