Rule-Guided Feedback: Enhancing Reasoning by Enforcing Rule Adherence in Large Language Models
- URL: http://arxiv.org/abs/2503.11336v1
- Date: Fri, 14 Mar 2025 12:05:06 GMT
- Title: Rule-Guided Feedback: Enhancing Reasoning by Enforcing Rule Adherence in Large Language Models
- Authors: Aissatou Diallo, Antonis Bikakis, Luke Dickens, Anthony Hunter, Rob Miller
- Abstract summary: Rule-Guided Feedback (RGF) is a framework designed to enhance Large Language Model (LLM) performance. RGF implements a teacher-student paradigm where rule-following is enforced through established guidelines.
- Score: 7.839338724237275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce Rule-Guided Feedback (RGF), a framework designed to enhance Large Language Model (LLM) performance through structured rule adherence and strategic information seeking. RGF implements a teacher-student paradigm where rule-following is enforced through established guidelines. Our framework employs a Teacher model that rigorously evaluates each student output against task-specific rules, providing constructive guidance rather than direct answers when detecting deviations. This iterative feedback loop serves two crucial purposes: maintaining solutions within defined constraints and encouraging proactive information seeking to resolve uncertainties. We evaluate RGF on diverse tasks including Checkmate-in-One puzzles, Sonnet Writing, Penguins-In-a-Table classification, GSM8k, and StrategyQA. Our findings suggest that structured feedback mechanisms can significantly enhance LLMs' performance across various domains.
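As a rough illustration of the loop described in the abstract, the sketch below shows a Teacher model checking each Student output against task-specific rules and returning hints rather than answers until the rules are satisfied. This is a minimal sketch under assumptions: the `call_llm` helper, the prompt wording, and the PASS-based stopping criterion are hypothetical placeholders, not the authors' implementation.

```python
# Minimal sketch of a Rule-Guided Feedback (RGF) style loop.
# `call_llm` stands in for any chat-completion API; prompts and the
# stopping criterion are illustrative assumptions.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Placeholder for a single LLM call."""
    raise NotImplementedError

def rgf_solve(task: str, rules: list[str], max_rounds: int = 5) -> str:
    rule_text = "\n".join(f"- {r}" for r in rules)
    # Initial Student attempt, constrained by the task-specific rules.
    answer = call_llm(
        "You are the Student. Follow every rule exactly.",
        f"Task: {task}\nRules:\n{rule_text}",
    )
    for _ in range(max_rounds):
        # The Teacher evaluates the output against each rule and gives
        # constructive guidance (not the answer) when it detects deviations.
        feedback = call_llm(
            "You are the Teacher. Check the answer against each rule. "
            "Reply 'PASS' if all rules are satisfied; otherwise name the "
            "violated rules and give hints without revealing the solution.",
            f"Task: {task}\nRules:\n{rule_text}\nStudent answer:\n{answer}",
        )
        if feedback.strip() == "PASS":
            break
        # The Student revises, staying within the defined constraints and
        # asking clarifying questions if needed to resolve uncertainties.
        answer = call_llm(
            "You are the Student. Revise your answer using the feedback, "
            "staying within the rules.",
            f"Task: {task}\nRules:\n{rule_text}\nPrevious answer:\n{answer}\n"
            f"Teacher feedback:\n{feedback}",
        )
    return answer
```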
Related papers
- GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models [0.17265013728931003]
GRPO-LEAD is a suite of novel enhancements tailored for mathematical reasoning.
It introduces (1) a length-dependent accuracy reward to encourage concise and precise solutions, (2) an explicit penalty mechanism for incorrect answers to sharpen decision boundaries, and (3) a difficulty-aware advantage reweighting strategy that amplifies learning signals for challenging problems.
arXiv Detail & Related papers (2025-04-13T19:07:45Z)
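As a rough sketch of how those three components could combine, the snippet below pairs a length-dependent accuracy reward and an explicit wrong-answer penalty with group-relative advantages rescaled by problem difficulty. The functional forms, constants, and the pass-rate-based reweighting are illustrative assumptions, not GRPO-LEAD's exact formulation.

```python
import math

def lead_style_reward(correct: bool, length: int,
                      alpha: float = 0.001, penalty: float = 1.0) -> float:
    """Length-dependent accuracy reward plus an explicit penalty for wrong
    answers; the exponential decay and constants are assumed for illustration."""
    if correct:
        return math.exp(-alpha * length)   # shorter correct solutions score higher
    return -penalty                        # explicit penalty sharpens the boundary

def difficulty_weighted_advantages(rewards: list[float], pass_rate: float) -> list[float]:
    """Group-relative advantages (reward minus group mean), amplified for
    harder problems (low pass rate); the weighting form is an assumption."""
    mean_r = sum(rewards) / len(rewards)
    weight = 1.0 + (1.0 - pass_rate)
    return [(r - mean_r) * weight for r in rewards]

# Example: four sampled solutions to one hard problem (pass rate 0.25).
rewards = [lead_style_reward(c, n)
           for c, n in [(True, 120), (False, 300), (False, 150), (False, 90)]]
print(difficulty_weighted_advantages(rewards, pass_rate=0.25))
```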
- Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations [65.11348389219887]
We introduce Dialectic-RAG (DRAG), a modular approach that evaluates retrieved information by comparing, contrasting, and resolving conflicting perspectives.
We show the impact of our framework both as an in-context learning strategy and for constructing demonstrations to instruct smaller models.
arXiv Detail & Related papers (2025-04-07T06:55:15Z)
- StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization [94.31508613367296]
Retrieval-augmented generation (RAG) is a key means of effectively enhancing large language models (LLMs).
We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure.
Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
arXiv Detail & Related papers (2024-10-11T13:52:44Z)
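The summary above describes a three-stage inference-time pipeline (choose a structure type, restructure the retrieved documents into it, then answer over the structure). A minimal sketch under assumptions: the structure types, prompts, and the `ask_llm` helper are hypothetical, not the paper's implementation.

```python
# Rough sketch of a StructRAG-style pipeline as summarized above.
# `ask_llm` is a hypothetical single-call LLM helper; structure types and
# prompts are illustrative assumptions.

STRUCTURE_TYPES = ["table", "graph", "algorithm", "catalogue", "chunk"]

def ask_llm(prompt: str) -> str:
    """Placeholder for a single LLM call."""
    raise NotImplementedError

def structrag_answer(question: str, documents: list[str]) -> str:
    # 1) Pick the structure type best suited to the question.
    choice = ask_llm(
        f"Question: {question}\nChoose the best knowledge structure from "
        f"{STRUCTURE_TYPES} for answering it. Reply with one word."
    ).strip().lower()
    structure_type = choice if choice in STRUCTURE_TYPES else "chunk"
    # 2) Reconstruct the retrieved documents into that structured format.
    structured = ask_llm(
        f"Convert the following documents into a {structure_type} that keeps "
        f"the facts needed to answer '{question}':\n" + "\n---\n".join(documents)
    )
    # 3) Infer the answer from the resulting structure.
    return ask_llm(
        f"Using this {structure_type}:\n{structured}\n"
        f"Answer the question: {question}"
    )
```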
- Retrieved In-Context Principles from Previous Mistakes [55.109234526031884]
In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples.
Recent advances have attempted to improve model performance through principles derived from mistakes.
We propose Retrieved In-Context Principles (RICP), a novel teacher-student framework.
arXiv Detail & Related papers (2024-07-08T07:32:26Z)
- Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z)
- SCREWS: A Modular Framework for Reasoning with Revisions [58.698199183147935]
We present SCREWS, a modular framework for reasoning with revisions.
We show that SCREWS unifies several previous approaches under a common framework.
We evaluate our framework with state-of-the-art LLMs on a diverse set of reasoning tasks.
arXiv Detail & Related papers (2023-09-20T15:59:54Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- Option-Aware Adversarial Inverse Reinforcement Learning for Robotic Control [44.77500987121531]
Hierarchical Imitation Learning (HIL) has been proposed to recover highly complex behaviors in long-horizon tasks from expert demonstrations.
We develop a novel HIL algorithm based on Adversarial Inverse Reinforcement Learning.
We also propose a Variational Autoencoder framework for learning with our objectives in an end-to-end fashion.
arXiv Detail & Related papers (2022-10-05T00:28:26Z)
- Small Changes Make Big Differences: Improving Multi-turn Response Selection in Dialogue Systems via Fine-Grained Contrastive Learning [27.914380392295815]
Retrieval-based dialogue response selection aims to find a proper response from a candidate set given a multi-turn context.
We propose a novel Fine-Grained Contrastive (FGC) learning method for the response selection task based on PLMs.
arXiv Detail & Related papers (2021-11-19T11:07:07Z)