Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
- URL: http://arxiv.org/abs/2406.20079v1
- Date: Fri, 28 Jun 2024 17:43:48 GMT
- Title: Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
- Authors: Anisha Gunjal, Greg Durrett
- Abstract summary: We argue that fully atomic facts are not the right representation, and define two criteria for molecular facts: decontextuality, or how well they can stand alone, and minimality.
We present a baseline methodology for generating molecular facts automatically, aiming to add the right amount of information.
- Score: 56.39904484784127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic factuality verification of large language model (LLM) generations is becoming more and more widely used to combat hallucinations. A major point of tension in the literature is the granularity of this fact-checking: larger chunks of text are hard to fact-check, but more atomic facts like propositions may lack context to interpret correctly. In this work, we assess the role of context in these atomic facts. We argue that fully atomic facts are not the right representation, and define two criteria for molecular facts: decontextuality, or how well they can stand alone, and minimality, or how little extra information is added to achieve decontextuality. We quantify the impact of decontextualization on minimality, then present a baseline methodology for generating molecular facts automatically, aiming to add the right amount of information. We compare against various methods of decontextualization and find that molecular facts balance minimality with fact verification accuracy in ambiguous settings.
 
      
Related papers
        - Atomic Reasoning for Scientific Table Claim Verification [83.14588611859826]
 Non-experts are susceptible to misleading claims based on scientific tables due to their high information density and perceived credibility.
Existing table claim verification models, including state-of-the-art large language models (LLMs), often struggle with precise fine-grained reasoning.
Inspired by Cognitive Load Theory, we propose that enhancing a model's ability to interpret table-based claims involves reducing cognitive load.
 arXiv  Detail & Related papers  (2025-06-08T02:46:22Z)
- MedScore: Factuality Evaluation of Free-Form Medical Answers [54.722181966548895]
 We propose MedScore, a new approach to decomposing medical answers into condition-aware valid facts.
Our method extracts up to three times more valid facts than existing methods.
 arXiv  Detail & Related papers  (2025-05-24T01:23:09Z)
- FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models [59.171510592986735]
 We propose FactReasoner, a new factuality assessor that relies on probabilistic reasoning to assess the factuality of a long-form generated response.
Our experiments on labeled and unlabeled benchmark datasets demonstrate clearly that FactReasoner improves considerably over state-of-the-art prompt-based approaches.
 arXiv  Detail & Related papers  (2025-02-25T19:01:48Z)
- From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering [86.36792996924244]
 Microtheories are sentences encapsulating an LM's core knowledge about a topic.
We show that, when added to a general corpus (e.g., Wikipedia), microtheories can supply critical, topical information not necessarily present in the corpus.
We also show that, in a human evaluation in the medical domain, our distilled microtheories contain a significantly higher concentration of topically critical facts.
 arXiv  Detail & Related papers  (2024-12-23T16:32:55Z)
- DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation [48.134780006638984]
 Decomposition and decontextualization have been explored independently, but their interactions in a complete system have not been investigated.
We conduct an evaluation of different decomposition, decontextualization, and verification strategies and find that the choice of strategy matters in the resulting factuality scores.
We introduce DnDScore, a decontextualization-aware verification method which validates subclaims in the context of contextual information.
 arXiv  Detail & Related papers  (2024-12-17T18:54:01Z)
- Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation [21.650619533772232]
 This work investigates whether superficial attributes of summary texts suffice to predict "factuality".
We then evaluate how factuality metrics respond to factual corrections in inconsistent summaries and find that only a few show meaningful improvements.
Motivated by these insights, we show that one can "game" (most) automatic factuality metrics, i.e., reliably inflate "factuality" scores by appending innocuous sentences to generated summaries.
 arXiv  Detail & Related papers  (2024-11-25T18:15:15Z)
- FactLens: Benchmarking Fine-Grained Fact Verification [6.814173254027381]
 We advocate for a shift toward fine-grained verification, where complex claims are broken down into smaller sub-claims for individual verification.
We introduce FactLens, a benchmark for evaluating fine-grained fact verification, with metrics and automated evaluators of sub-claim quality.
Our results show alignment between automated FactLens evaluators and human judgments, and we discuss the impact of sub-claim characteristics on the overall verification performance.
 arXiv  Detail & Related papers  (2024-11-08T21:26:57Z)
- FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs [0.0]
 We present FactGenius, a novel method that enhances fact-checking by combining zero-shot prompting of large language models with fuzzy text matching on knowledge graphs.
The evaluation of FactGenius on FactKG, a benchmark dataset for fact verification, demonstrates that it significantly outperforms existing baselines.
 arXiv  Detail & Related papers  (2024-06-03T13:24:37Z)
- Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation [42.08917809689811]
 We propose Atomas, a multi-modal molecular representation learning framework to jointly learn representations from SMILES string and text.
In the retrieval task, Atomas exhibits robust generalization ability and outperforms the baseline by 30.8% of recall@1 on average.
In the generation task, Atomas achieves state-of-the-art results in both molecule captioning task and molecule generation task.
 arXiv  Detail & Related papers  (2024-04-23T12:35:44Z)
- Linking Surface Facts to Large-Scale Knowledge Graphs [23.380979397966286]
 Open Information Extraction (OIE) methods extract facts from natural language text in the form of ("subject"; "relation"; "object") triples.
Knowledge Graphs (KGs) contain facts in a canonical (i.e., unambiguous) form, but their coverage is limited by a static schema.
We propose a new benchmark with novel evaluation protocols that can, for example, measure fact linking performance on a granular triple slot level.
 arXiv  Detail & Related papers  (2023-10-23T13:18:49Z)
- The Perils & Promises of Fact-checking with Large Language Models [55.869584426820715]
 Large Language Models (LLMs) are increasingly trusted to write academic papers, lawsuits, and news articles.
We evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions.
Our results show the enhanced prowess of LLMs when equipped with contextual information.
While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy.
 arXiv  Detail & Related papers  (2023-10-20T14:49:47Z)
- MolXPT: Wrapping Molecules with Text for Generative Pre-training [141.0924452870112]
 MolXPT is a unified language model of text and molecules pre-trained on SMILES wrapped by text.
MolXPT outperforms strong baselines of molecular property prediction on MoleculeNet.
 arXiv  Detail & Related papers  (2023-05-18T03:58:19Z)
- Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation [67.69725605939315]
 Misinformation emerges in times of uncertainty when credible information is limited.
This is challenging for NLP-based fact-checking as it relies on counter-evidence, which may not yet be available.
 arXiv  Detail & Related papers  (2022-10-25T09:40:48Z)
- A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
 We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
 arXiv  Detail & Related papers  (2022-09-12T00:56:57Z)
- Factuality Enhanced Language Models for Open-Ended Text Generation [60.27166549575472]
 We design the FactualityPrompts test set and metrics to measure the factuality of LM generations.
We find that larger LMs are more factual than smaller ones, although a previous study suggests that larger LMs can be less truthful in terms of misconceptions.
We propose a factuality-enhanced training method that uses TopicPrefix for better awareness of facts and sentence completion.
 arXiv  Detail & Related papers  (2022-06-09T17:16:43Z)
- The Role of Context in Detecting Previously Fact-Checked Claims [27.076320857009655]
 We focus on claims made in a political debate, where context really matters.
We study the impact of modeling the context of the claim both on the source side, as well as on the target side, in the fact-checking explanation document.
 arXiv  Detail & Related papers  (2021-04-15T12:39:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     