Fact-Checking Complex Claims with Program-Guided Reasoning
- URL: http://arxiv.org/abs/2305.12744v1
- Date: Mon, 22 May 2023 06:11:15 GMT
- Title: Fact-Checking Complex Claims with Program-Guided Reasoning
- Authors: Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov
- Abstract summary: Program-Guided Fact-Checking (ProgramFC) is a novel fact-checking model that decomposes complex claims into simpler sub-tasks.
We first leverage the in-context learning ability of large language models to generate reasoning programs.
We execute the program by delegating each sub-task to the corresponding sub-task handler.
- Score: 99.7212240712869
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fact-checking real-world claims often requires collecting multiple pieces of
evidence and applying complex multi-step reasoning. In this paper, we present
Program-Guided Fact-Checking (ProgramFC), a novel fact-checking model that
decomposes complex claims into simpler sub-tasks that can be solved using a
shared library of specialized functions. We first leverage the in-context
learning ability of large language models to generate reasoning programs to
guide the verification process. Afterward, we execute the program by delegating
each sub-task to the corresponding sub-task handler. This process makes our
model both explanatory and data-efficient, providing clear explanations of its
reasoning process and requiring minimal training data. We evaluate ProgramFC on
two challenging fact-checking datasets and show that it outperforms seven
fact-checking baselines across different settings of evidence availability,
with explicit output programs that benefit human debugging. Our code and data
are publicly available at https://github.com/mbzuai-nlp/ProgramFC.
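
Below is a minimal Python sketch of the program-then-execute idea described in the abstract: a reasoning program is a list of sub-tasks, each routed to a handler from a small shared library, and the results are aggregated into a veracity label. The handler names, the toy claim, and the string-matching "verification" are illustrative assumptions; in ProgramFC the program is generated by an LLM via in-context learning and the handlers are specialized models.

```python
# Minimal sketch of program-guided fact-checking in the spirit of ProgramFC.
# The handler names (verify, predict) and the toy claim are illustrative
# assumptions, not the paper's exact function library.

from typing import Callable, Dict, List


def verify(sub_claim: str, evidence: str) -> bool:
    """Toy sub-task handler: check whether the evidence mentions the sub-claim."""
    return sub_claim.lower() in evidence.lower()


def predict(facts: List[bool]) -> str:
    """Aggregate sub-task results into a final veracity label."""
    return "SUPPORTS" if all(facts) else "REFUTES"


def run_program(program: List[Dict], evidence: str) -> str:
    """Execute a reasoning program: a list of sub-tasks, each routed to a handler."""
    handlers: Dict[str, Callable] = {"verify": verify}
    facts: List[bool] = []
    for step in program:  # each step is one decomposed sub-task
        handler = handlers[step["op"]]
        facts.append(handler(step["arg"], evidence))
    return predict(facts)


if __name__ == "__main__":
    # In ProgramFC the program itself would be generated by a large language
    # model; here it is hard-coded for illustration.
    program = [
        {"op": "verify", "arg": "released in 2019"},
        {"op": "verify", "arg": "directed by Bong Joon-ho"},
    ]
    evidence = "Parasite, directed by Bong Joon-ho, was released in 2019."
    print(run_program(program, evidence))  # -> SUPPORTS
```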
Related papers
- Towards Neural Synthesis for SMT-Assisted Proof-Oriented Programming [8.34623776815378]
We curate a dataset of 600K lines of open-source F* programs and proofs.
This dataset includes software used in production systems ranging from Windows and Linux to Python and Firefox.
We investigate the use of AI to synthesize programs and their proofs in F*, with promising results.
arXiv Detail & Related papers (2024-05-03T00:14:33Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
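
The following is an illustrative Python sketch of that two-level curriculum: choose a task order that minimizes adjacent cross-task interference, then chunk each task's instances into easy-to-difficult mini-batches. The interference costs and difficulty scores are made-up inputs, not the paper's actual measures.

```python
# Two-level data curriculum sketch in the spirit of Data-CUBE (illustrative only).

from itertools import permutations
from typing import Dict, List, Tuple


def best_task_order(tasks: List[str],
                    interference: Dict[Tuple[str, str], float]) -> List[str]:
    """Brute-force the order that minimizes summed adjacent cross-task interference."""
    def cost(order):
        return sum(interference.get((a, b), 0.0) for a, b in zip(order, order[1:]))
    return list(min(permutations(tasks), key=cost))


def easy_to_difficult_batches(instances: List[Tuple[str, float]],
                              batch_size: int) -> List[List[str]]:
    """Sort instances by a difficulty score and chunk them into mini-batches."""
    ranked = [x for x, _ in sorted(instances, key=lambda item: item[1])]
    return [ranked[i:i + batch_size] for i in range(0, len(ranked), batch_size)]


if __name__ == "__main__":
    order = best_task_order(
        ["sts", "nli", "qa"],
        {("sts", "nli"): 0.2, ("nli", "qa"): 0.5, ("sts", "qa"): 0.9},
    )
    batches = easy_to_difficult_batches(
        [("ex1", 0.9), ("ex2", 0.1), ("ex3", 0.5), ("ex4", 0.3)], batch_size=2
    )
    print(order, batches)
```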
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking [10.046323978189847]
We propose combining the power of instruction-following language models with external evidence retrieval to enhance fact-checking performance.
Our approach involves leveraging search engines to retrieve relevant evidence for a given input claim.
Then, we instruct-tune an open-source language model, LLaMA, using this evidence, enabling it to predict the veracity of the input claim more accurately.
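
A minimal sketch of how such a retrieval-augmented instruction-tuning example might be assembled is shown below. The retrieval function is a stub and the prompt template is an assumption, not the paper's exact format.

```python
# Assembling one instruction-tuning record with retrieved evidence (illustrative).

from typing import List


def retrieve_evidence(claim: str, top_k: int = 3) -> List[str]:
    """Stand-in for a search-engine call that returns relevant snippets."""
    return [f"(snippet {i} about: {claim})" for i in range(1, top_k + 1)]


def build_example(claim: str, label: str) -> dict:
    """Format one (instruction, input, output) record for instruction tuning."""
    evidence = "\n".join(retrieve_evidence(claim))
    return {
        "instruction": "Given the claim and the retrieved evidence, "
                       "predict whether the claim is SUPPORTED or REFUTED.",
        "input": f"Claim: {claim}\nEvidence:\n{evidence}",
        "output": label,
    }


if __name__ == "__main__":
    print(build_example("The Eiffel Tower is in Berlin.", "REFUTED"))
```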
arXiv Detail & Related papers (2023-09-01T04:14:39Z)
- When Do Program-of-Thoughts Work for Reasoning? [51.2699797837818]
We propose the complexity-impacted reasoning score (CIRS) to measure the correlation between code and reasoning abilities.
Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity.
Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
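
As a simplified stand-in for an AST-based complexity measure of the kind described above, the sketch below parses code with Python's ast module and combines tree depth with a count of control-flow nodes. The weighting is invented for illustration and differs from the paper's actual score.

```python
# Toy AST-based code complexity score (illustrative, not the paper's CIRS formula).

import ast


def ast_complexity(source: str) -> float:
    """Return a toy complexity score: max tree depth plus control-flow node count."""
    tree = ast.parse(source)

    def depth(node: ast.AST) -> int:
        children = list(ast.iter_child_nodes(node))
        return 1 + max((depth(c) for c in children), default=0)

    control_flow = sum(
        isinstance(n, (ast.If, ast.For, ast.While, ast.FunctionDef))
        for n in ast.walk(tree)
    )
    return depth(tree) + control_flow


if __name__ == "__main__":
    code = (
        "def f(xs):\n"
        "    total = 0\n"
        "    for x in xs:\n"
        "        if x > 0:\n"
        "            total += x\n"
        "    return total\n"
    )
    print(ast_complexity(code))
```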
arXiv Detail & Related papers (2023-08-29T17:22:39Z)
- Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion [15.068025336990287]
We show that existing code completion models do not yield good results on our completion task.
We query a program analyzer for information relevant to a given function call, and consider ways to provide the analyzer results to different code completion models during inference and training.
Our experiments show that providing access to the function implementation and function usages greatly improves the argument completion performance.
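
A rough sketch of the idea of feeding analyzer-derived context to a completion model follows: the callee's implementation and example usages are concatenated into the prompt ahead of the call site. The analyzer here is a hard-coded stub; the prompt layout is an assumption, not the paper's exact format.

```python
# Building a completion prompt that includes program-analyzer context (illustrative).

def build_completion_prompt(call_site: str, implementation: str, usages: list) -> str:
    """Prepend the callee's implementation and example usages to the call site."""
    context = ["# Function implementation:", implementation, "# Example usages:"]
    context += usages
    return "\n".join(context) + "\n" + call_site


if __name__ == "__main__":
    prompt = build_completion_prompt(
        call_site="result = resize_image(",
        implementation="def resize_image(img, width, height, keep_aspect=True): ...",
        usages=["resize_image(photo, 640, 480)",
                "resize_image(icon, 32, 32, keep_aspect=False)"],
    )
    print(prompt)
```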
arXiv Detail & Related papers (2023-06-01T06:25:58Z)
- GPT is becoming a Turing machine: Here are some ways to program it [16.169056235216576]
We show that GPT-3 models can be triggered to execute programs that involve loops.
We show that prompts that may not even cover one full task example can trigger algorithmic behaviour.
arXiv Detail & Related papers (2023-03-25T00:43:41Z)
- Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
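
A schematic sketch of that training idea is given below: sample candidate programs during training, keep the fully-correct ones plus those that pass enough test cases to count as partially correct, and add them to the training pool alongside the reference program. Every component here is a stub, assumed for illustration.

```python
# Collecting self-sampled fully- and partially-correct programs (schematic sketch).

import random
from typing import Callable, List, Tuple


def execute_passes(program: str, tests: List[Tuple[int, int]]) -> float:
    """Stub executor: return the fraction of test cases the program passes."""
    return random.random()  # a real system would actually run the program


def collect_self_samples(sample_fn: Callable[[], str],
                         tests: List[Tuple[int, int]],
                         n_samples: int = 8,
                         partial_threshold: float = 0.5) -> List[str]:
    """Keep sampled programs that are fully or sufficiently partially correct."""
    kept = []
    for _ in range(n_samples):
        candidate = sample_fn()
        pass_rate = execute_passes(candidate, tests)
        if pass_rate >= partial_threshold:  # 1.0 means fully correct
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    random.seed(0)
    samples = collect_self_samples(lambda: "def add(a, b): return a + b",
                                   tests=[(1, 2), (3, 4)])
    print(len(samples), "programs added to the training pool")
```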
arXiv Detail & Related papers (2022-05-28T03:31:07Z)
- Exploring Decomposition for Table-based Fact Verification [18.584226291619217]
We improve fact verification by decomposing complex statements into simpler subproblems.
Our proposed approach achieves the new state-of-the-art performance, an 82.7% accuracy, on the TabFact benchmark.
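
A toy illustration of decomposition for table-based verification: a compound statement is split into simpler column-level checks that are each verified against the table, and the claim holds only if all checks pass. The decomposition is hand-written here; in the paper it is produced by a learned model.

```python
# Verifying a decomposed statement against a small table (toy example).

TABLE = [
    {"player": "Ann", "goals": 12, "team": "Reds"},
    {"player": "Bo", "goals": 7, "team": "Blues"},
]


def check(rows, column, op, value) -> bool:
    """Verify one simple subproblem over the given rows, e.g. 'goals > 10'."""
    ops = {">": lambda a, b: a > b, "==": lambda a, b: a == b}
    return any(ops[op](row[column], value) for row in rows)


if __name__ == "__main__":
    # "Ann scored more than 10 goals and plays for the Reds",
    # decomposed into two simpler checks over Ann's row.
    ann_rows = [row for row in TABLE if row["player"] == "Ann"]
    subproblems = [("goals", ">", 10), ("team", "==", "Reds")]
    print(all(check(ann_rows, *sp) for sp in subproblems))  # -> True
```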
arXiv Detail & Related papers (2021-09-22T20:15:05Z)
- CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
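
The sketch below realizes the (definition, constraint, prompt) instruction schema mentioned above for one task-oriented dialog task. The concrete wording is invented for illustration, not copied from the paper.

```python
# Minimal (definition, constraint, prompt) instruction schema (illustrative).

from dataclasses import dataclass


@dataclass
class Instruction:
    definition: str   # what the task is
    constraint: str   # task-specific restrictions on the output
    prompt: str       # the cue appended before the model generates


def render(instr: Instruction, dialog: str) -> str:
    """Concatenate the schema fields with the dialog into one model input."""
    return "\n".join([instr.definition, instr.constraint, dialog, instr.prompt])


if __name__ == "__main__":
    intent_instruction = Instruction(
        definition="Intent classification: identify what the user wants to do.",
        constraint="Answer with exactly one intent label from the schema.",
        prompt="The user's intent is:",
    )
    print(render(intent_instruction,
                 "User: I'd like to book a table for two tonight."))
```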
arXiv Detail & Related papers (2021-09-10T03:23:06Z)
- Generating Fact Checking Explanations [52.879658637466605]
A crucial piece of the puzzle that is still missing is to understand how to automate the most elaborate part of the process.
This paper provides the first study of how these explanations can be generated automatically based on available claim context.
Our results indicate that optimising both objectives at the same time, rather than training them separately, improves the performance of a fact checking system.
arXiv Detail & Related papers (2020-04-13T05:23:25Z)