Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making
- URL: http://arxiv.org/abs/2310.13610v1
- Date: Fri, 20 Oct 2023 15:59:57 GMT
- Title: Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making
- Authors: Yanrui Du, Sendong Zhao, Haochun Wang, Yuhan Chen, Rui Bai, Zewen Qiang, Muzhen Cai, Bing Qin
- Abstract summary: We propose a unified two-stage framework known as Self-Attribution and Decision-Making (SADM).
We demonstrate that our framework not only establishes a more reliable link between the generated rationale and model decision but also achieves competitive results in task performance and the quality of rationale.
- Score: 24.906886146275127
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Explaining black-box model behavior with natural language has achieved
impressive results in various NLP tasks. Recent research has explored the
utilization of subsequences from the input text as a rationale, providing users
with evidence to support the model decision. Although existing frameworks excel
in generating high-quality rationales while achieving high task performance,
they neglect to account for the unreliable link between the generated rationale
and model decision. In simpler terms, a model may make correct decisions while
attributing wrong rationales, or make poor decisions while attributing correct
rationales. To mitigate this issue, we propose a unified two-stage framework
known as Self-Attribution and Decision-Making (SADM). Through extensive
experiments on five reasoning datasets from the ERASER benchmark, we
demonstrate that our framework not only establishes a more reliable link
between the generated rationale and model decision but also achieves
competitive results in task performance and the quality of rationale.
Furthermore, we explore the potential of our framework in semi-supervised
scenarios.
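To make the two-stage idea concrete, the following is a minimal sketch of an attribute-then-decide pipeline: stage one extracts a rationale from the input, and stage two conditions the final decision on that rationale, so the stated evidence is the evidence the decision actually uses. The model choice (t5-base), prompt formats, and helper names are illustrative assumptions, not the authors' released SADM implementation.

```python
# Minimal attribute-then-decide sketch (NOT the authors' code).
# Assumptions: an off-the-shelf seq2seq model and ad-hoc prompt formats.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

def self_attribution(document: str, question: str) -> str:
    # Stage 1: extract a rationale (supporting subsequence) from the input.
    prompt = f"extract rationale: question: {question} document: {document}"
    return generate(prompt)

def decision_making(rationale: str, question: str) -> str:
    # Stage 2: condition the decision on the generated rationale only,
    # tightening the link between stated evidence and the final label.
    prompt = f"answer: question: {question} rationale: {rationale}"
    return generate(prompt, max_new_tokens=16)

doc = "..."   # passage from an ERASER-style dataset
q = "..."     # task question
rationale = self_attribution(doc, q)
answer = decision_making(rationale, q)
print(rationale, answer)
```

In this sketch the decision model never sees the full document at stage two; that design choice is what makes a wrong rationale more likely to surface as a wrong decision, rather than being silently ignored.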
Related papers
- On the Reasoning Capacity of AI Models and How to Quantify It [0.0]
Large Language Models (LLMs) have intensified the debate surrounding the fundamental nature of their reasoning capabilities.
While achieving high performance on benchmarks such as GPQA and MMLU, these models exhibit limitations in more complex reasoning tasks.
We propose a novel phenomenological approach that goes beyond traditional accuracy metrics to probe the underlying mechanisms of model behavior.
arXiv Detail & Related papers (2025-01-23T16:58:18Z)
- Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling [9.44858963874474]
Self-Consistency mitigates hallucinations in Large Language Models (LLMs) by sampling multiple reasoning paths.
We introduce Reasoning-Aware Self-Consistency (RASC), a novel framework that enhances sampling efficiency and reasoning faithfulness.
arXiv Detail & Related papers (2024-08-30T05:14:59Z)
- Making Large Language Models Better Planners with Reasoning-Decision Alignment [70.5381163219608]
We motivate an end-to-end decision-making model based on multimodality-augmented LLM.
We propose a reasoning-decision alignment constraint between the paired CoTs and planning results.
We dub our proposed large language planners with reasoning-decision alignment as RDA-Driver.
arXiv Detail & Related papers (2024-08-25T16:43:47Z)
- Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z)
- Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training [49.3242278912771]
Multimodal reasoning is a challenging task that requires models to reason across multiple modalities to answer questions.
Existing approaches have made progress by incorporating language and visual modalities into a two-stage reasoning framework.
We propose MC-CoT, a self-consistency training strategy that generates multiple rationales and answers, subsequently selecting the most accurate through a voting process (see the voting sketch after this list).
arXiv Detail & Related papers (2023-11-23T17:09:48Z)
- Rational Decision-Making Agent with Internalized Utility Judgment [91.80700126895927]
Large language models (LLMs) have demonstrated remarkable advancements and have attracted significant efforts to develop LLMs into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications.
This paper proposes RadAgent, which fosters the development of its rationality through an iterative framework involving Experience Exploration and Utility Learning.
Experimental results on the ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks.
arXiv Detail & Related papers (2023-08-24T03:11:45Z)
- CREST: A Joint Framework for Rationalization and Counterfactual Text Generation [5.606679908174783]
We introduce CREST (ContRastive Edits with Sparse raTionalization), a framework for selective rationalization and counterfactual text generation.
CREST generates valid counterfactuals that are more natural than those produced by previous methods.
New loss function that leverages CREST counterfactuals to regularize selective rationales improves both model robustness and rationale quality.
arXiv Detail & Related papers (2023-05-26T16:34:58Z)
- Can Language Representation Models Think in Bets? [8.185725740857594]
Transformer-based language representation models (LRMs) have achieved state-of-the-art results on difficult natural language understanding problems.
This article investigates LRMs' rational decision-making ability through a carefully designed set of decision-making benchmarks and experiments.
arXiv Detail & Related papers (2022-10-14T05:01:04Z)
- Why do you think that? Exploring Faithful Sentence-Level Rationales Without Supervision [60.62434362997016]
We propose a differentiable training-framework to create models which output faithful rationales on a sentence level.
Our model solves the task based on each rationale individually and learns to assign high scores to those which solved the task best.
arXiv Detail & Related papers (2020-10-07T12:54:28Z)
- Beyond Individual and Group Fairness [90.4666341812857]
We present a new data-driven model of fairness that is guided by the unfairness complaints received by the system.
Our model supports multiple fairness criteria and takes into account their potential incompatibilities.
arXiv Detail & Related papers (2020-08-21T14:14:44Z)
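Several of the related papers above (RASC and MC-CoT) rely on self-consistency: sampling multiple reasoning chains and aggregating their final answers by majority vote. Below is a minimal sketch of that voting step, assuming the chains have already been sampled and that each ends with an "Answer:" line; both the sampling interface and the answer format are assumptions for demonstration, not any paper's released code.

```python
# Illustrative self-consistency voting step (assumed answer format).
from collections import Counter

def extract_answer(chain: str) -> str:
    # Assume each sampled chain ends with a line like "Answer: <label>".
    for line in reversed(chain.strip().splitlines()):
        if line.lower().startswith("answer:"):
            return line.split(":", 1)[1].strip()
    # Fall back to the last line if no explicit answer marker is found.
    return chain.strip().splitlines()[-1].strip()

def self_consistency_vote(chains: list[str]) -> str:
    # Keep the most frequent final answer across all sampled chains.
    answers = [extract_answer(c) for c in chains]
    return Counter(answers).most_common(1)[0][0]

# Usage with hypothetical sampled chains:
chains = [
    "The passage supports the claim.\nAnswer: entailment",
    "The evidence is insufficient.\nAnswer: neutral",
    "The rationale matches the claim.\nAnswer: entailment",
]
print(self_consistency_vote(chains))  # -> "entailment"
```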
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.