CoP: Factual Inconsistency Detection by Controlling the Preference
- URL: http://arxiv.org/abs/2212.01611v2
- Date: Fri, 31 Mar 2023 03:39:01 GMT
- Title: CoP: Factual Inconsistency Detection by Controlling the Preference
- Authors: Shuaijie She, Xiang Geng, Shujian Huang, Jiajun Chen
- Abstract summary: We propose CoP, an unsupervised framework that controls the preference of the generation model with the help of a prompt.
With a properly designed prompt, our framework can evaluate specific preferences and serve as a measurement for fine-grained categories of inconsistency.
Experiments show that our framework achieves new SOTA results on three factual inconsistency detection tasks.
- Score: 45.4045488637761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstractive summarization is the process of generating a summary given a
document as input. Although significant progress has been made, the factual
inconsistency between the document and the generated summary still limits its
practical applications. Previous work found that the probabilities assigned by
the generation model reflect its preferences for the generated summary,
including the preference for factual consistency, and the preference for the
language or knowledge prior as well. To separate the preference for factual
consistency, we propose CoP, an unsupervised framework that controls the
preference of the generation model with the help of a prompt. More specifically,
the framework performs an extra inference step in which a text prompt is
introduced as an additional input. In this way, another preference is described
by the generation probability of this extra inference process. The difference
between the above two preferences, i.e., the difference between the
probabilities, can be used as a measurement for detecting factual
inconsistencies. Interestingly, we found that with a properly designed
prompt, our framework could evaluate specific preferences and serve as
measurements for fine-grained categories of inconsistency, such as
entity-related inconsistency, coreference-related inconsistency, etc. Moreover,
our framework can also be extended to the supervised setting to learn a better
prompt from labeled data. Experiments show that our framework
achieves new SOTA results on three factual inconsistency detection tasks.
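The two-pass comparison described in the abstract can be illustrated with a minimal sketch. It assumes the per-token generation probabilities of the summary are already available from both inference passes (one on the document alone, one with the text prompt added). The function names `preference_difference` and `mean_difference`, the subtraction order, and the averaging step are hypothetical choices made for illustration; the abstract only states that the difference between the probabilities is used, and it does not specify a sign convention, an aggregation, or a decision threshold.

```python
from typing import List, Sequence


def preference_difference(
    base_probs: Sequence[float], prompted_probs: Sequence[float]
) -> List[float]:
    """Per-token difference between the prompted and unprompted passes.

    The order (prompted minus base) is an assumption of this sketch.
    """
    if len(base_probs) != len(prompted_probs):
        raise ValueError("both passes must score the same summary tokens")
    return [p - b for b, p in zip(base_probs, prompted_probs)]


def mean_difference(diffs: Sequence[float]) -> float:
    """Average the per-token differences into one summary-level number.

    Averaging is an assumption; other aggregations are equally possible.
    """
    if not diffs:
        raise ValueError("need at least one token")
    return sum(diffs) / len(diffs)


# Made-up probabilities for a two-token summary (illustration only).
base = [0.5, 0.25]       # inference on the document alone
prompted = [0.75, 0.25]  # extra inference with the text prompt added
diffs = preference_difference(base, prompted)
print(diffs, mean_difference(diffs))
```

In practice the two probability lists would come from scoring the same summary tokens with a summarization model twice; how the resulting differences map to an inconsistency decision is left to the paper itself.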
Related papers
- A novel framework for MCDM based on Z numbers and soft likelihood function [0.0]
This paper devises a novel framework of soft likelihood function based on information volume of fuzzy membership and credibility measure.
An application is provided to verify the validity and correctness of the proposed framework.
arXiv Detail & Related papers (2024-12-26T18:47:19Z)
- Using Similarity to Evaluate Factual Consistency in Summaries [2.7595794227140056]
Abstractive summarisers generate fluent summaries, but the factuality of the generated text is not guaranteed.
We propose a new zero-shot factuality evaluation metric, Sentence-BERTScore (SBERTScore), which compares sentences between the summary and the source document.
Our experiments indicate that each technique has different strengths, with SBERTScore particularly effective in identifying correct summaries.
arXiv Detail & Related papers (2024-09-23T15:02:38Z)
- The Penalized Inverse Probability Measure for Conformal Classification [0.5172964916120902]
The work introduces the Penalized Inverse Probability (PIP) nonconformity score, and its regularized version RePIP, that allow the joint optimization of both efficiency and informativeness.
The work shows how PIP-based conformal classifiers exhibit precisely the desired behavior in comparison with other nonconformity measures and strike a good balance between informativeness and efficiency.
arXiv Detail & Related papers (2024-06-13T07:37:16Z)
- AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs).
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z)
- Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarization [56.94741578760294]
We propose the task of fine-grained inconsistency detection, the goal of which is to predict the fine-grained types of factual errors in a summary.
Motivated by how humans inspect factual inconsistency in summaries, we propose an interpretable fine-grained inconsistency detection model, FineGrainFact.
arXiv Detail & Related papers (2023-05-23T22:11:47Z)
- SWING: Balancing Coverage and Faithfulness for Dialogue Summarization [67.76393867114923]
We propose to utilize natural language inference (NLI) models to improve coverage while avoiding factual inconsistencies.
We use NLI to compute fine-grained training signals to encourage the model to generate content in the reference summaries that have not been covered.
Experiments on the DialogSum and SAMSum datasets confirm the effectiveness of the proposed approach.
arXiv Detail & Related papers (2023-01-25T09:33:11Z)
- Revisiting text decomposition methods for NLI-based factuality scoring of summaries [9.044665059626958]
We show that fine-grained decomposition is not always a winning strategy for factuality scoring.
We also show that small changes to previously proposed entailment-based scoring methods can result in better performance.
arXiv Detail & Related papers (2022-11-30T09:54:37Z)
- Document-Level Relation Extraction with Sentences Importance Estimation and Focusing [52.069206266557266]
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
We propose a Sentence Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss.
Experimental results on two domains show that our SIEF not only improves overall performance, but also makes DocRE models more robust.
arXiv Detail & Related papers (2022-04-27T03:20:07Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.