FacTool: Factuality Detection in Generative AI -- A Tool Augmented
Framework for Multi-Task and Multi-Domain Scenarios
- URL: http://arxiv.org/abs/2307.13528v2
- Date: Wed, 26 Jul 2023 15:17:49 GMT
- Title: FacTool: Factuality Detection in Generative AI -- A Tool Augmented
Framework for Multi-Task and Multi-Domain Scenarios
- Authors: I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng,
Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu
- Abstract summary: A wider range of tasks now face an increasing risk of containing factual errors when handled by generative models.
We propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models.
- Score: 87.12753459582116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emergence of generative pre-trained models has facilitated the synthesis
of high-quality text, but it has also posed challenges in identifying factual
errors in the generated text. In particular: (1) A wider range of tasks now
face an increasing risk of containing factual errors when handled by generative
models. (2) Generated texts tend to be lengthy and lack a clearly defined
granularity for individual facts. (3) There is a scarcity of explicit evidence
available during the process of fact checking. With the above challenges in
mind, in this paper, we propose FacTool, a task and domain agnostic framework
for detecting factual errors of texts generated by large language models (e.g.,
ChatGPT). Experiments on four different tasks (knowledge-based QA, code
generation, mathematical reasoning, and scientific literature review) show the
efficacy of the proposed method. We release the code of FacTool associated with
ChatGPT plugin interface at https://github.com/GAIR-NLP/factool .
Related papers
- DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning [24.99797253885887]
We argue that the key to accomplishing this task lies in distinguishing writing styles of different authors.
We propose DeTeCtive, a multi-task auxiliary, multi-level contrastive learning framework.
Our method is compatible with a range of text encoders.
arXiv Detail & Related papers (2024-10-28T12:34:49Z) - FactCheck Editor: Multilingual Text Editor with End-to-End fact-checking [1.985242455423935]
'FactCheck Editor' is an advanced text editor designed to automate fact-checking and correct factual inaccuracies.
It supports over 90 languages and utilizes transformer models to assist humans in the labor-intensive process of fact verification.
arXiv Detail & Related papers (2024-04-30T11:55:20Z) - Successor Features for Efficient Multisubject Controlled Text Generation [48.37713738712319]
We introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) and language model rectification.
SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters.
To the best of our knowledge, our research represents the first application of successor features in text generation.
arXiv Detail & Related papers (2023-11-03T00:17:08Z) - Deliberate then Generate: Enhanced Prompting Framework for Text
Generation [70.10319005141888]
Deliberate then Generate (DTG) prompting framework consists of error detection instructions and candidates that may contain errors.
We conduct extensive experiments on 20+ datasets across 7 text generation tasks, including summarization, translation, dialogue, and more.
We show that DTG consistently outperforms existing prompting methods and achieves state-of-the-art performance on multiple text generation tasks.
arXiv Detail & Related papers (2023-05-31T13:23:04Z) - MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including oBERTa-Large/Base-Detector, GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z) - Event Transition Planning for Open-ended Text Generation [55.729259805477376]
Open-ended text generation tasks require models to generate a coherent continuation given limited preceding context.
We propose a novel two-stage method which explicitly arranges the ensuing events in open-ended text generation.
Our approach can be understood as a specially-trained coarse-to-fine algorithm.
arXiv Detail & Related papers (2022-04-20T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.