Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and Generalization
- URL: http://arxiv.org/abs/2502.16600v4
- Date: Thu, 06 Mar 2025 17:56:40 GMT
- Title: Diagnosing Moral Reasoning Acquisition in Language Models: Pragmatics and Generalization
- Authors: Guangliang Liu, Lei Jiang, Xitong Zhang, Kristen Marie Johnson,
- Abstract summary: Large Language Models (LLMs) often fail to perform satisfactorily on tasks requiring moral cognizance.<n>Can current learning paradigms enable LLMs to acquire sufficient moral reasoning capabilities?<n>We show that performance improvements follow a mechanism similar to that of semantic-level tasks, and therefore remain affected by the pragmatic nature of latent morals in discourse.
- Score: 9.650922370722476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring that Large Language Models (LLMs) return just responses which adhere to societal values is crucial for their broader application. Prior research has shown that LLMs often fail to perform satisfactorily on tasks requiring moral cognizance, such as ethics-based judgments. While current approaches have focused on fine-tuning LLMs with curated datasets to improve their capabilities on such tasks, choosing the optimal learning paradigm to enhance the ethical responses of LLMs remains an open research debate. In this work, we aim to address this fundamental question: can current learning paradigms enable LLMs to acquire sufficient moral reasoning capabilities? Drawing from distributional semantics theory and the pragmatic nature of moral discourse, our analysis indicates that performance improvements follow a mechanism similar to that of semantic-level tasks, and therefore remain affected by the pragmatic nature of morals latent in discourse, a phenomenon we name the pragmatic dilemma. We conclude that this pragmatic dilemma imposes significant limitations on the generalization ability of current learning paradigms, making it the primary bottleneck for moral reasoning acquisition in LLMs.
Related papers
- The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach [6.0972634521845475]
This paper introduces the Priorities in Reasoning and Intrinsic Moral Evaluation (PRIME) framework.
PRIME is a comprehensive methodology for analyzing moral priorities across foundational ethical dimensions.
We apply this framework to six leading large language models (LLMs) through a dual-protocol approach.
arXiv Detail & Related papers (2025-04-27T14:26:48Z) - Addressing Moral Uncertainty using Large Language Models for Ethical Decision-Making [0.0]
We present an ethical decision-making framework that refines a pre-trained reinforcement learning (RL) model using a task-agnostic ethical layer.
An ethical layer aggregates belief scores from multiple moral perspectives using Belief Jensen-Shannon Divergence and Dempster-Shafer Theory into probability scores that also serve as the shaping reward.
This integrated learning framework helps the RL agent navigate moral uncertainty in complex environments and enables it to make morally sound decisions across diverse tasks.
arXiv Detail & Related papers (2025-02-17T19:05:55Z) - Adversarial Alignment for LLMs Requires Simpler, Reproducible, and More Measurable Objectives [52.863024096759816]
Misaligned research objectives have hindered progress in adversarial robustness research over the past decade.
We argue that realigned objectives are necessary for meaningful progress in adversarial alignment.
arXiv Detail & Related papers (2025-02-17T15:28:40Z) - LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning [49.58786377307728]
This paper adopts an exploratory approach by introducing a controlled evaluation environment for analogical reasoning.<n>We analyze the comparative dynamics of inductive, abductive, and deductive inference pipelines.<n>We investigate advanced paradigms such as hypothesis selection, verification, and refinement, revealing their potential to scale up logical inference.
arXiv Detail & Related papers (2025-02-16T15:54:53Z) - Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values [13.798198972161657]
A number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, play a critical role in the desirability of outcomes.
This paper examines whether large language models (LLMs) adhere to fundamental fairness concepts and investigate their alignment with human preferences.
arXiv Detail & Related papers (2025-02-01T04:24:47Z) - On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models [25.029579061612456]
Large Language Models (LLMs) are increasingly being employed in real-world applications in critical domains such as healthcare.
It is important to ensure that the Chain-of-Thought (CoT) reasoning generated by these models faithfully captures their underlying behavior.
arXiv Detail & Related papers (2024-06-15T13:16:44Z) - MoralBench: Moral Evaluation of LLMs [34.43699121838648]
This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of large language models (LLMs)
We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs.
Our methodology involves a multi-faceted approach, combining quantitative analysis with qualitative insights from ethics scholars to ensure a thorough evaluation of model performance.
arXiv Detail & Related papers (2024-06-06T18:15:01Z) - A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics for LLM applications in these fields, pointing out the existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z) - Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models [26.11408084129897]
Large Language Models (LLMs) are deployed as powerful tools for several natural language processing (NLP) applications.
Recent works show that modern LLMs can generate self-explanations (SEs), which elicit their intermediate reasoning steps for explaining their behavior.
We discuss the dichotomy between faithfulness and plausibility in SEs generated by LLMs.
arXiv Detail & Related papers (2024-02-07T06:32:50Z) - Enhancing Ethical Explanations of Large Language Models through
Iterative Symbolic Refinement [5.108863224378874]
This paper investigates how hybrid neuro-symbolic techniques can enhance the logical validity and alignment of ethical explanations.
We present an abductive-deductive framework named Logic-Explainer, which integrates Large Language Models with an external backward-chaining solver.
An empirical analysis demonstrates that Logic-Explainer can improve explanations generated via in-context learning methods and Chain-of-Thought.
arXiv Detail & Related papers (2024-02-01T16:39:51Z) - CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z) - A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning [73.77088902676306]
We take a closer look at the self-verification abilities of large language models (LLMs) in the context of logical reasoning.
Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods.
arXiv Detail & Related papers (2023-11-14T07:13:10Z) - Large Language Models are In-Context Semantic Reasoners rather than
Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years.
Despite of numerous successful applications, the underlying mechanism of such in-context capabilities still remains unclear.
In this work, we hypothesize that the learned textitsemantics of language tokens do the most heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.