NL2FOL: Translating Natural Language to First-Order Logic for Logical Fallacy Detection
- URL: http://arxiv.org/abs/2405.02318v1
- Date: Thu, 18 Apr 2024 00:20:48 GMT
- Title: NL2FOL: Translating Natural Language to First-Order Logic for Logical Fallacy Detection
- Authors: Abhinav Lalwani, Lovish Chopra, Christopher Hahn, Caroline Trippel, Zhijing Jin, Mrinmaya Sachan,
- Abstract summary: We design a process to reliably detect logical fallacies by translating natural language to First-order Logic.
We then utilize Satisfiability Modulo Theory (SMT) solvers to reason about the validity of the formula.
Our approach is robust, interpretable and does not require training data or fine-tuning.
- Score: 45.28949266878263
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Logical fallacies are common errors in reasoning that undermine the logic of an argument. Automatically detecting logical fallacies has important applications in tracking misinformation and validating claims. In this paper, we design a process to reliably detect logical fallacies by translating natural language to First-order Logic (FOL) step-by-step using Large Language Models (LLMs). We then utilize Satisfiability Modulo Theory (SMT) solvers to reason about the validity of the formula and classify inputs as either a fallacy or valid statement. Our model also provides a novel means of utilizing LLMs to interpret the output of the SMT solver, offering insights into the counter-examples that illustrate why a given sentence is considered a logical fallacy. Our approach is robust, interpretable and does not require training data or fine-tuning. We evaluate our model on a mixed dataset of fallacies and valid sentences. The results demonstrate improved performance compared to end-to-end LLMs, with our classifier achieving an F1-score of 71\% on the Logic dataset. The approach is able to generalize effectively, achieving an F1-score of 73% on the challenge set, LogicClimate, outperforming state-of-the-art models by 21% despite its much smaller size.
Related papers
- LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models [52.03659714625452]
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks.
But, can they really "reason" over the natural language?
This question has been receiving significant research attention and many reasoning skills such as commonsense, numerical, and qualitative have been studied.
arXiv Detail & Related papers (2024-04-23T21:08:49Z) - A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning [73.77088902676306]
We take a closer look at the self-verification abilities of large language models (LLMs) in the context of logical reasoning.
Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods.
arXiv Detail & Related papers (2023-11-14T07:13:10Z) - Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers.
LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z) - LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulating such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - RobustLR: Evaluating Robustness to Logical Perturbation in Deductive
Reasoning [25.319674132967553]
Transformers have been shown to be able to perform deductive reasoning on a logical rulebase containing rules and statements written in English natural language.
We propose RobustLR to evaluate the robustness of these models to minimal logical edits in rulebases.
We find that the models trained in prior works do not perform consistently on the different perturbations in RobustLR.
arXiv Detail & Related papers (2022-05-25T09:23:50Z) - On the Paradox of Learning to Reason from Data [86.13662838603761]
We show that BERT can attain near-perfect accuracy on in-distribution test examples while failing to generalize to other data distributions over the exact same problem space.
Our study provides an explanation for this paradox: instead of learning to emulate the correct reasoning function, BERT has in fact learned statistical features that inherently exist in logical reasoning problems.
arXiv Detail & Related papers (2022-05-23T17:56:48Z) - Logical Fallacy Detection [40.06349885733248]
We propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text.
We show that a simple structure-aware classifier outperforms the best language model by 5.46% on Logic and 4.51% on LogicClimate.
arXiv Detail & Related papers (2022-02-28T13:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.