Retrieve to Explain: Evidence-driven Predictions with Language Models
- URL: http://arxiv.org/abs/2402.04068v3
- Date: Tue, 18 Jun 2024 10:42:54 GMT
- Title: Retrieve to Explain: Evidence-driven Predictions with Language Models
- Authors: Ravi Patel, Angus Brayne, Rogier Hintzen, Daniel Jaroslawicz, Georgiana Neculae, Dane Corneil,
- Abstract summary: We introduce Retrieve to Explain (R2E), a retrieval-based language model.
R2E scores and ranks all possible answers to a research question based on evidence retrieved from a document corpus.
We assess R2E on the challenging task of drug target identification from scientific literature.
- Score: 0.791663505497707
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Language models hold incredible promise for enabling scientific discovery by synthesizing massive research corpora. Many complex scientific research questions have multiple plausible answers, each supported by evidence of varying strength. However, existing language models lack the capability to quantitatively and faithfully compare answer plausibility in terms of supporting evidence. To address this issue, we introduce Retrieve to Explain (R2E), a retrieval-based language model. R2E scores and ranks all possible answers to a research question based on evidence retrieved from a document corpus. The architecture represents each answer only in terms of its supporting evidence, with the answer itself masked. This allows us to extend feature attribution methods, such as Shapley values, to transparently attribute each answer's score back to its supporting evidence at inference time. The architecture also allows R2E to incorporate new evidence without retraining, including non-textual data modalities templated into natural language. We assess on the challenging task of drug target identification from scientific literature, a human-in-the-loop process where failures are extremely costly and explainability is paramount. When predicting whether drug targets will subsequently be confirmed as efficacious in clinical trials, R2E not only matches non-explainable literature-based models but also surpasses a genetics-based target identification approach used throughout the pharmaceutical industry.
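The attribution mechanism the abstract describes, masking the answer and attributing its score back to retrieved evidence via Shapley values, can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: the toy `score` function and the evidence strengths are invented for illustration and do not reflect R2E's actual masked-answer scorer.

```python
from itertools import combinations
from math import factorial

def shapley_values(n_items, score):
    """Exact Shapley attribution of a score over n_items evidence snippets.
    `score` maps a frozenset of evidence indices to a float."""
    values = [0.0] * n_items
    for i in range(n_items):
        others = [j for j in range(n_items) if j != i]
        for k in range(n_items):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Standard Shapley weight for a coalition of size k
                weight = factorial(k) * factorial(n_items - k - 1) / factorial(n_items)
                # Marginal contribution of evidence i to this coalition
                values[i] += weight * (score(s | {i}) - score(s))
    return values

# Hypothetical scorer: the answer's score is the strength of the
# strongest single piece of supporting evidence in the subset.
strengths = {0: 0.9, 1: 0.4, 2: 0.4}
score = lambda s: max((strengths[j] for j in s), default=0.0)

attributions = shapley_values(3, score)
```

By the efficiency property, the attributions sum to the full-evidence score (0.9 here), and the strongest snippet receives the largest share, which is the kind of transparent per-evidence breakdown the abstract claims at inference time. Exact computation is exponential in the number of evidence items; practical systems typically use sampling approximations.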
Related papers
- Introducing Answered with Evidence -- a framework for evaluating whether LLM responses to biomedical questions are founded in evidence [1.3250161978024673]
Large language models (LLMs) for biomedical question answering raise concerns about the accuracy and evidentiary support of their responses. We analyzed thousands of physician-submitted questions using a comparative pipeline that included: (1) Alexandria, formerly known as the Atropos Evidence Library, a retrieval-augmented generation (RAG) system based on novel observational studies, and (2) two PubMed-based retrieval-augmented systems (System and Perplexity). We found that PubMed-based systems provided evidence-supported answers for approximately 44% of questions, while the novel evidence source did so for about 50%.
arXiv Detail & Related papers (2025-06-30T18:00:52Z) - Enhancing LLM Generation with Knowledge Hypergraph for Evidence-Based Medicine [22.983780823136925]
Evidence-based medicine (EBM) plays a crucial role in the application of large language models (LLMs) in healthcare. We propose using LLMs to gather scattered evidence from multiple sources and present a knowledge hypergraph-based evidence management model. Our approach outperforms existing RAG techniques in application domains of interest to EBM, such as medical quizzing, hallucination detection, and decision support.
arXiv Detail & Related papers (2025-03-18T09:17:31Z) - Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z) - A generative framework to bridge data-driven models and scientific theories in language neuroscience [84.76462599023802]
We present generative explanation-mediated validation, a framework for generating concise explanations of language selectivity in the brain.
We show that explanatory accuracy is closely related to the predictive power and stability of the underlying statistical models.
arXiv Detail & Related papers (2024-10-01T15:57:48Z) - Evidence-Enhanced Triplet Generation Framework for Hallucination Alleviation in Generative Question Answering [41.990482015732574]
We propose a novel evidence-enhanced triplet generation framework, EATQA, to predict all the combinations of (Question, Evidence, Answer) triplet.
We bridge the distribution gap to distill knowledge from the evidence at the inference stage.
Our framework ensures that the model learns the logical relations between query, evidence, and answer, simultaneously improving evidence generation and query answering.
arXiv Detail & Related papers (2024-08-27T13:07:07Z) - Explainable Biomedical Hypothesis Generation via Retrieval Augmented Generation enabled Large Language Models [46.05020842978823]
Large Language Models (LLMs) have emerged as powerful tools to navigate this complex data landscape.
RAGGED is a comprehensive workflow designed to support investigators with knowledge integration and hypothesis generation.
arXiv Detail & Related papers (2024-07-17T07:44:18Z) - Uncertainty Estimation of Large Language Models in Medical Question Answering [60.72223137560633]
Large Language Models (LLMs) show promise for natural language generation in healthcare, but risk hallucinating factually incorrect information.
We benchmark popular uncertainty estimation (UE) methods with different model sizes on medical question-answering datasets.
Our results show that current approaches generally perform poorly in this domain, highlighting the challenge of UE for medical applications.
arXiv Detail & Related papers (2024-07-11T16:51:33Z) - Answering real-world clinical questions using large language model based systems [2.2605659089865355]
Large language models (LLMs) could potentially address both challenges by either summarizing published literature or generating new studies based on real-world data (RWD).
We evaluated the ability of five LLM-based systems to answer 50 clinical questions and had nine independent physicians review the responses for relevance, reliability, and actionability.
arXiv Detail & Related papers (2024-06-29T22:39:20Z) - Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study [61.74571814707054]
We evaluate whether every generated sentence is grounded in retrieved documents or the model's pre-training data.
Across 3 datasets and 4 model families, our findings reveal that a significant fraction of generated sentences are consistently ungrounded.
Our results show that while larger models tend to ground their outputs more effectively, a significant portion of correct answers remains compromised by hallucinations.
arXiv Detail & Related papers (2024-04-10T14:50:10Z) - Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables [22.18384189336634]
HeterFC is a word-level Heterogeneous-graph-based model for Fact Checking over unstructured and structured information.
We perform information propagation via a relational graph neural network, modeling interactions between claims and evidence.
We introduce a multitask loss function to account for potential inaccuracies in evidence retrieval.
arXiv Detail & Related papers (2024-02-20T14:10:40Z) - InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification [60.10193972862099]
This work proposes a framework to characterize and recover simplification-induced information loss in the form of question-and-answer pairs.
QA pairs are designed to help readers deepen their knowledge of a text.
arXiv Detail & Related papers (2024-01-29T19:00:01Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Grow-and-Clip: Informative-yet-Concise Evidence Distillation for Answer Explanation [22.20733260041759]
We argue that the evidence for an answer is critical to enhancing the interpretability of QA models.
We are the first to explicitly define the concept of evidence as the supporting facts in a context which are informative, concise, and readable.
We propose the Grow-and-Clip Evidence Distillation (GCED) algorithm to extract evidence from contexts by trading off informativeness, conciseness, and readability.
arXiv Detail & Related papers (2022-01-13T17:18:17Z) - Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z) - Commonsense Evidence Generation and Injection in Reading Comprehension [57.31927095547153]
We propose a Commonsense Evidence Generation and Injection framework in reading comprehension, named CEGI.
The framework injects two kinds of auxiliary commonsense evidence into comprehensive reading to equip the machine with the ability of rational thinking.
Experiments on the CosmosQA dataset demonstrate that the proposed CEGI model outperforms the current state-of-the-art approaches.
arXiv Detail & Related papers (2020-05-11T16:31:08Z) - Evidence Inference 2.0: More Data, Better Models [22.53884716373888]
The Evidence Inference dataset was recently released to facilitate research toward this end.
This paper collects additional annotations to expand the Evidence Inference dataset by 25%.
The updated corpus, documentation, and code for new baselines and evaluations are available at http://evidence-inference.ebm-nlp.com/.
arXiv Detail & Related papers (2020-05-08T17:16:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.