PSSD: Making Large Language Models Self-denial via Human Psyche Structure
- URL: http://arxiv.org/abs/2502.01344v1
- Date: Mon, 03 Feb 2025 13:37:21 GMT
- Title: PSSD: Making Large Language Models Self-denial via Human Psyche Structure
- Authors: Jinzhi Liao, Zenghua Liao, Xiang Zhao,
- Abstract summary: We present PSSD, which refers to and implements the human psyche structure such that three distinct and interconnected roles contribute to human reasoning.<n>Extensive experiments demonstrate that the proposed design not only better enhance reasoning capabilities, but also seamlessly integrate with current models.
- Score: 5.057375783924452
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The enhance of accuracy in reasoning results of LLMs arouses the community's interests, wherein pioneering studies investigate post-hoc strategies to rectify potential mistakes. Despite extensive efforts, they are all stuck in a state of resource competition demanding significant time and computing expenses. The cause of the situation lies in the failure of identifying the fundamental feature of the solutions in this line, coined as the self-denial of LLMs. In other words, LLMs should confidently determine the potential existence of mistakes and carefully execute the targeted correction. As the whole procedure conducts within LLMs, supporting and persuasive references are hard to acquire, while the absence of specific steps towards refining hidden mistakes persists even when errors are acknowledged. In response to the challenges, we present PSSD, which refers to and implements the human psyche structure such that three distinct and interconnected roles contribute to human reasoning. Specifically, PSSD leverages the recent multi-agent paradigm, and is further enhanced with three innovatively conceived roles: (1) the intuition-based id role that provides initial attempts based on benign LLMs; (2) the rule-driven superego role that summarizes rules to regulate the above attempts, and returns specific key points as guidance; and (3) the script-centric ego role that absorbs all procedural information to generate executable script for the final answer prediction. Extensive experiments demonstrate that the proposed design not only better enhance reasoning capabilities, but also seamlessly integrate with current models, leading to superior performance.
Related papers
- An Empirical Study of Reasoning Steps in Thinking Code LLMs [8.653365851909745]
Thinking Large Language Models generate explicit intermediate reasoning traces before final answers.<n>This study examines the reasoning process and quality of thinking LLMs for code generation.
arXiv Detail & Related papers (2025-11-08T06:18:48Z) - Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs [57.10533368622962]
Self-correction of large language models (LLMs) emerges as a critical component for enhancing their reasoning performance.<n>This study introduces CorrectBench, a benchmark developed to evaluate the effectiveness of self-correction strategies.<n>Our findings reveal that: 1) Self-correction methods can improve accuracy, especially for complex reasoning tasks; 2) Mixing different self-correction strategies yields further improvements, though it reduces efficiency; and 3) Reasoning LLMs (e.g., DeepSeek-R1) have limited optimization under additional self-correction methods and have high time costs.
arXiv Detail & Related papers (2025-10-17T02:40:19Z) - From <Answer> to <Think>: Multidimensional Supervision of Reasoning Process for LLM Optimization [62.07990937720985]
Dimension-level Reward Model (DRM) is a new supervision framework for Large Language Models.<n>DRM evaluates the quality of a reasoning process along three fundamental, complementary, and interpretable dimensions.<n> Experimental results show that DRM provides effective supervision signals, guides the optimization of LLMs and enhances their reasoning ability.
arXiv Detail & Related papers (2025-10-13T14:29:15Z) - ReaLM: Reflection-Enhanced Autonomous Reasoning with Small Language Models [76.28894983518164]
Small Language Models (SLMs) are a cost-effective alternative to Large Language Models (LLMs)<n>They often struggle with complex reasoning due to their limited capacity and a tendency to produce mistakes or inconsistent answers.<n>We introduce ReaLM, a reinforcement learning framework for robust and self-sufficient reasoning in vertical domains.
arXiv Detail & Related papers (2025-08-17T14:50:23Z) - Improving Fairness in LLMs Through Testing-Time Adversaries [1.7811840395202343]
Large Language Models (LLMs) push the bound-aries in natural language processing and generative AI.<n>In this work, we propose a straightforward, user-friendly and practical method to mitigate such biases.<n>Our method creates multiple variations of a given sentence by modifying specific attributes and evaluates the corresponding prediction behavior.
arXiv Detail & Related papers (2025-05-17T17:56:53Z) - Self-Corrective Task Planning by Inverse Prompting with Large Language Models [9.283971287618261]
We introduce InversePrompt, a novel self-corrective task planning approach.
Our method incorporates reasoning steps to provide clear, interpretable feedback.
Results on benchmark datasets show an average 16.3% higher success rate over existing LLM-based task planning methods.
arXiv Detail & Related papers (2025-03-10T13:35:51Z) - ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification [53.80183105328448]
Refine via Intrinsic Self-Verification (ReVISE) is an efficient framework that enables LLMs to self-correct their outputs through self-verification.
Our experiments on various reasoning tasks demonstrate that ReVISE achieves efficient self-correction and significantly improves reasoning performance.
arXiv Detail & Related papers (2025-02-20T13:50:02Z) - Investigating the Robustness of Deductive Reasoning with Large Language Models [7.494617747914778]
Large Language Models (LLMs) have been shown to achieve impressive results for many reasoning-based Natural Language Processing (NLP) tasks.
It remains unclear to which extent LLMs, in both informal and autoformalisation methods, are robust on logical deduction tasks.
arXiv Detail & Related papers (2025-02-04T17:16:51Z) - Deception in LLMs: Self-Preservation and Autonomous Goals in Large Language Models [0.0]
Recent advances in Large Language Models have incorporated planning and reasoning capabilities.<n>This has reduced errors in mathematical and logical tasks while improving accuracy.<n>Our study examines DeepSeek R1, a model trained to output reasoning tokens similar to OpenAI's o1.
arXiv Detail & Related papers (2025-01-27T21:26:37Z) - Reinforcing Thinking through Reasoning-Enhanced Reward Models [6.636512424910708]
Large Language Models (LLMs) exhibit great potential in complex multi-step reasoning through inference-time thinking.<n>LLMs struggle with deciding when to stop thinking due to limited self-awareness about their knowledge boundaries.<n>This work addresses these challenges by distilling the LLM's own reasoning processes into synthetic behavioral data.
arXiv Detail & Related papers (2024-12-31T04:50:15Z) - Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks.<n>In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.<n>We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types.<n>These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from their ability to memorize facts or find other shortcuts.
arXiv Detail & Related papers (2024-04-08T14:15:56Z) - Tuning-Free Accountable Intervention for LLM Deployment -- A
Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks.
We propose an innovative textitmetacognitive approach, dubbed textbfCLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z) - A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning [73.77088902676306]
We take a closer look at the self-verification abilities of large language models (LLMs) in the context of logical reasoning.
Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods.
arXiv Detail & Related papers (2023-11-14T07:13:10Z) - Concise and Organized Perception Facilitates Reasoning in Large Language Models [32.71672086718057]
We show that large language models (LLMs) exhibit failure patterns akin to human-like cognitive biases when dealing with disordered and irrelevant content in reasoning tasks.
We propose a novel reasoning approach named Concise and Organized Perception (COP)
COP carefully analyzes the given statements to identify the most pertinent information while eliminating redundancy efficiently.
arXiv Detail & Related papers (2023-10-05T04:47:49Z) - Large Language Models Cannot Self-Correct Reasoning Yet [78.16697476530994]
Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities.
Concerns persist regarding the accuracy and appropriateness of their generated content.
A contemporary methodology, self-correction, has been proposed as a remedy to these issues.
arXiv Detail & Related papers (2023-10-03T04:56:12Z) - Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z) - Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple, yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs)
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.