Related papers: Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

URL: http://arxiv.org/abs/2407.00075v2
Date: Tue, 01 Oct 2024 20:42:41 GMT
Title: Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Authors: Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong,
Abstract summary: We study how to subvert large language models (LLMs) from following prompt-specified rules. We prove that although LLMs can faithfully follow such rules, maliciously crafted prompts can mislead even idealized, theoretically constructed models.
Score: 20.057611113206324
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study how to subvert large language models (LLMs) from following prompt-specified rules. We model rule-following as inference in propositional Horn logic, a mathematical system in which rules have the form ``if $P$ and $Q$, then $R$'' for some propositions $P$, $Q$, and $R$. We prove that although LLMs can faithfully follow such rules, maliciously crafted prompts can mislead even idealized, theoretically constructed models. Empirically, we find that the reasoning behavior of LLMs aligns with that of our theoretical constructions, and popular attack algorithms find adversarial prompts with characteristics predicted by our theory. Our logic-based framework provides a novel perspective for mechanistically understanding the behavior of LLMs in rule-based settings such as jailbreak attacks.

Related papers

Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus [13.276829763453433]
Large language models (LLMs) are capable of solving a wide range of tasks, yet they have struggled with reasoning. We propose $textbfAdditional Logic Training (ALT)$, which aims to enhance LLMs' reasoning capabilities by program-generated logical reasoning samples.
arXiv Detail & Related papers (2024-11-19T13:31:53Z)
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models [87.49676980090555]
Large Language Models (LLMs) have demonstrated notable capabilities across various tasks, showcasing complex problem-solving abilities. We introduce LogicGame, a novel benchmark designed to evaluate the comprehensive rule understanding, execution, and planning capabilities of LLMs.
arXiv Detail & Related papers (2024-08-28T13:16:41Z)
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models [52.03659714625452]
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks. But, can they really "reason" over the natural language? This question has been receiving significant research attention and many reasoning skills such as commonsense, numerical, and qualitative have been studied.
arXiv Detail & Related papers (2024-04-23T21:08:49Z)
Do Large Language Models Understand Logic or Just Mimick Context? [14.081178100662163]
This paper investigates the reasoning capabilities of large language models (LLMs) on two logical reasoning datasets. It is found that LLMs do not truly understand logical rules; rather, in-context learning has simply enhanced the likelihood of these models arriving at the correct answers.
arXiv Detail & Related papers (2024-02-19T12:12:35Z)
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. We propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z)
Chain of Logic: Rule-Based Reasoning with Large Language Models [10.017812995997753]
Rule-based reasoning enables us to draw conclusions by accurately applying a rule to a set of facts. We introduce a new prompting method, Chain of Logic, which elicits rule-based reasoning through decomposition and recomposition. We evaluate chain of logic across eight rule-based reasoning tasks involving three distinct compositional rules from the LegalBench benchmark.
arXiv Detail & Related papers (2024-02-16T01:54:43Z)
Can LLMs Follow Simple Rules? [28.73820874333199]
Rule-following Language Evaluation Scenarios (RuLES) is a framework for measuring rule-following ability in Large Language Models. RuLES consists of 14 simple text scenarios in which the model is instructed to obey various rules while interacting with the user. We show that almost all current models struggle to follow scenario rules, even on straightforward test cases.
arXiv Detail & Related papers (2023-11-06T08:50:29Z)
Large Language Models can Learn Rules [106.40747309894236]
We present Hypotheses-to-Theories (HtT), a framework that learns a rule library for reasoning with large language models (LLMs) Experiments on relational reasoning, numerical reasoning and concept learning problems show that HtT improves existing prompting methods. The learned rules are also transferable to different models and to different forms of the same problem.
arXiv Detail & Related papers (2023-10-10T23:07:01Z)
ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning [107.61997887260056]
We propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs. To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs.
arXiv Detail & Related papers (2023-09-04T11:38:02Z)
Reasoning with Language Model is Planning with World Model [27.24144881796878]
Large language models (LLMs) have shown remarkable reasoning capabilities. LLMs lack an internal $textitworld model$ to predict the world. We propose a new LLM reasoning framework, $underlineR$easoning vi$underlinea$ $underlineP$lanning $textbf(RAP)$.
arXiv Detail & Related papers (2023-05-24T10:28:28Z)
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning [101.26814728062065]
Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems. This paper introduces a novel framework, Logic-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving.
arXiv Detail & Related papers (2023-05-20T22:25:38Z)
Automating Defeasible Reasoning in Law [0.0]
We study defeasible reasoning in rule-based systems, in particular about legal norms and contracts. We identify rule modifier that specify how rules interact and how they can be overridden. We then define rule transformations that eliminate these modifier, leading to a translation of rules to formulas.
arXiv Detail & Related papers (2022-05-15T17:14:15Z)
Learning Symbolic Rules for Reasoning in Quasi-Natural Language [74.96601852906328]
We build a rule-based system that can reason with natural language input but without the manual construction of rules. We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences. Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs [91.71504177786792]
This paper studies learning logic rules for reasoning on knowledge graphs. Logic rules provide interpretable explanations when used for prediction as well as being able to generalize to other tasks. Existing methods either suffer from the problem of searching in a large search space or ineffective optimization due to sparse rewards.
arXiv Detail & Related papers (2020-10-08T14:47:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.