Verif.ai: Towards an Open-Source Scientific Generative
Question-Answering System with Referenced and Verifiable Answers
- URL: http://arxiv.org/abs/2402.18589v1
- Date: Fri, 9 Feb 2024 10:25:01 GMT
- Title: Verif.ai: Towards an Open-Source Scientific Generative
Question-Answering System with Referenced and Verifiable Answers
- Authors: Miloš Košprdić, Adela Ljajić, Bojana Bašaragin, Darija Medvecki,
Nikola Milošević
- Abstract summary: We present the current progress of the project Verif.ai, an open-source scientific generative question-answering system with referenced and verified answers.
The components of the system are (1) an information retrieval system combining semantic and lexical search techniques over scientific papers (PubMed), (2) a fine-tuned generative model (Mistral 7B) taking the top retrieved results and generating answers with references to the papers from which each claim was derived, and (3) a verification engine that cross-checks the generated claim against the abstract or paper from which it was derived.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present the current progress of the project Verif.ai, an
open-source scientific generative question-answering system with referenced and
verified answers. The components of the system are (1) an information retrieval
system combining semantic and lexical search techniques over scientific papers
(PubMed), (2) a fine-tuned generative model (Mistral 7B) taking top answers and
generating answers with references to the papers from which the claim was
derived, and (3) a verification engine that cross-checks the generated claim
and the abstract or paper from which the claim was derived, verifying whether
there may have been any hallucinations in generating the claim. We are
reinforcing the generative model by providing the abstract in context, but in
addition, an independent set of methods and models verifies the answer and
checks for hallucinations. Therefore, we believe that by using our method, we
can make scientists more productive, while building trust in the use of
generative language models in scientific environments, where hallucinations and
misinformation cannot be tolerated.
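The abstract above describes a three-stage pipeline. As a rough illustration only, the sketch below wires together stage (1), hybrid lexical and semantic retrieval, and the reference-citing prompt for stage (2); the toy corpus, the embedding model, the score-fusion weight, and all helper names are assumptions made for this sketch rather than the authors' implementation, and stage (3) is sketched separately after the related-papers list.

```python
# Minimal sketch of a Verif.ai-style pipeline (stages 1 and 2).
# Toy corpus and helper names are hypothetical, not from the paper.
from rank_bm25 import BM25Okapi                               # lexical search
from sentence_transformers import SentenceTransformer, util   # semantic search

abstracts = [
    "Abstract of PubMed paper 1 ...",
    "Abstract of PubMed paper 2 ...",
]

def hybrid_retrieve(question: str, k: int = 2, alpha: float = 0.5):
    """Blend min-max-normalized BM25 and cosine-similarity scores (alpha assumed)."""
    bm25 = BM25Okapi([a.lower().split() for a in abstracts])
    lexical = bm25.get_scores(question.lower().split())
    encoder = SentenceTransformer("all-MiniLM-L6-v2")          # stand-in embedder
    semantic = util.cos_sim(
        encoder.encode(question, convert_to_tensor=True),
        encoder.encode(abstracts, convert_to_tensor=True),
    )[0].tolist()
    norm = lambda xs: [(x - min(xs)) / (max(xs) - min(xs) + 1e-9) for x in xs]
    fused = [alpha * l + (1 - alpha) * s
             for l, s in zip(norm(lexical), norm(semantic))]
    top = sorted(range(len(abstracts)), key=lambda i: fused[i], reverse=True)[:k]
    return [(i, abstracts[i]) for i in top]

def build_prompt(question: str, retrieved) -> str:
    """Ask the generator to cite the numbered abstracts it was given."""
    context = "\n".join(f"[{i}] {text}" for i, text in retrieved)
    return ("Answer the question using only the abstracts below and cite them "
            f"as [id] after each claim.\n{context}\n\nQuestion: {question}\nAnswer:")

# answer = mistral_generate(build_prompt(q, hybrid_retrieve(q)))  # fine-tuned Mistral 7B in the paper
```

Min-max normalization merely keeps the BM25 and cosine-similarity scales comparable before fusing; the abstract does not specify how the lexical and semantic scores are actually combined.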
Related papers
- Explainable Artifacts for Synthetic Western Blot Source Attribution [18.798003207293746]
Recent advancements in artificial intelligence have enabled generative models to produce synthetic scientific images that are indistinguishable from pristine ones.
This study aims to identify explainable artifacts generated by state-of-the-art generative models and leverage them for open-set identification and source attribution.
arXiv Detail & Related papers (2024-09-27T16:18:13Z) - Analysis of Plan-based Retrieval for Grounded Text Generation [78.89478272104739]
Hallucinations occur when a language model is given a generation task outside its parametric knowledge.
A common strategy to address this limitation is to infuse the language models with retrieval mechanisms.
We analyze how planning can be used to guide retrieval to further reduce the frequency of hallucinations.
arXiv Detail & Related papers (2024-08-20T02:19:35Z) - Scientific QA System with Verifiable Answers [0.0]
We introduce the VerifAI project, a pioneering open-source scientific question-answering system.
The components of the system are (1) an Information Retrieval system combining semantic and lexical search techniques over scientific papers (PubMed), (2) a Retrieval-Augmented Generation (RAG) module using a fine-tuned generative model (Mistral 7B) and the retrieved articles to generate claims with references to the articles from which they were derived, and (3) a Verification engine based on a fine-tuned DeBERTa model; an entailment-style check of this kind is sketched after this list.
arXiv Detail & Related papers (2024-07-16T08:21:02Z) - Retrieve to Explain: Evidence-driven Predictions with Language Models [0.791663505497707]
We introduce Retrieve to Explain (R2E), a retrieval-based language model.
R2E scores and ranks all possible answers to a research question based on evidence retrieved from a document corpus.
We assess R2E on the challenging task of drug target identification from scientific literature.
arXiv Detail & Related papers (2024-02-06T15:13:17Z) - NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning [59.16962123636579]
This paper proposes a new take on Prolog-based inference engines.
We replace handcrafted rules with a combination of neural language modeling, guided generation, and semiparametric dense retrieval.
Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA.
arXiv Detail & Related papers (2022-09-16T00:54:44Z) - Science Checker: Extractive-Boolean Question Answering For Scientific
Fact Checking [0.0]
We propose a multi-task approach for verifying scientific questions based on joint reasoning from facts and evidence in research articles.
With our lightweight and fast proposed architecture, we achieved an average error rate of 4% and an F1-score of 95.6%.
arXiv Detail & Related papers (2022-04-26T12:35:23Z) - Don't Say What You Don't Know: Improving the Consistency of Abstractive
Summarization by Constraining Beam Search [54.286450484332505]
We analyze the connection between hallucinations and training data, and find evidence that models hallucinate because they train on target summaries that are unsupported by the source.
We present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
arXiv Detail & Related papers (2022-03-16T07:13:52Z) - RerrFact: Reduced Evidence Retrieval Representations for Scientific
Claim Verification [4.052777228128475]
We propose a modular approach that sequentially carries out binary classification for every prediction subtask.
We carry out two-step stance predictions that first differentiate non-relevant rationales and then identify supporting or refuting rationales for a given claim.
Experimentally, our system RerrFact, with no fine-tuning, a simple design, and a fraction of the model parameters, fares competitively on the leaderboard.
arXiv Detail & Related papers (2022-02-05T21:52:45Z) - Improving Faithfulness in Abstractive Summarization with Contrast
Candidate Generation and Selection [54.38512834521367]
We study contrast candidate generation and selection as a model-agnostic post-processing technique.
We learn a discriminative correction model by generating alternative candidate summaries.
This model is then used to select the best candidate as the final output summary.
arXiv Detail & Related papers (2021-04-19T05:39:24Z) - Commonsense Evidence Generation and Injection in Reading Comprehension [57.31927095547153]
We propose a Commonsense Evidence Generation and Injection framework in reading comprehension, named CEGI.
The framework injects two kinds of auxiliary commonsense evidence into comprehensive reading to equip the machine with the ability of rational thinking.
Experiments on the CosmosQA dataset demonstrate that the proposed CEGI model outperforms the current state-of-the-art approaches.
arXiv Detail & Related papers (2020-05-11T16:31:08Z) - A Controllable Model of Grounded Response Generation [122.7121624884747]
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process.
We propose a framework that we call controllable grounded response generation (CGRG).
We show that using this framework, a transformer-based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines.
arXiv Detail & Related papers (2020-05-01T21:22:08Z)
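Both the Verif.ai abstract above and the "Scientific QA System with Verifiable Answers" entry mention a verification engine built on a fine-tuned DeBERTa model. The snippet below is a minimal, hedged sketch of that kind of entailment check using a publicly available NLI checkpoint; the model name, threshold, and example texts are assumptions, not the fine-tuned verifier from those papers.

```python
# Sketch of stage (3): flag a generated claim as unsupported when the cited
# abstract does not entail it. Uses an off-the-shelf NLI model as a stand-in.
from transformers import pipeline

nli = pipeline("text-classification", model="microsoft/deberta-large-mnli")

def verify(claim: str, abstract: str, threshold: float = 0.5) -> bool:
    """Return True if the abstract (premise) entails the claim (hypothesis)."""
    results = nli({"text": abstract, "text_pair": claim}, top_k=None)
    scores = {r["label"].lower(): r["score"] for r in results}
    return scores.get("entailment", 0.0) >= threshold

print(verify("Aspirin inhibits platelet aggregation.",
             "We show that aspirin irreversibly inhibits platelet aggregation."))
```

In a full pipeline this check would run once per generated claim, and any claim whose entailment score against its cited abstract falls below the threshold would be flagged as a possible hallucination.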
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.