ClarifyDelphi: Reinforced Clarification Questions with Defeasibility
Rewards for Social and Moral Situations
- URL: http://arxiv.org/abs/2212.10409v3
- Date: Tue, 30 May 2023 18:59:41 GMT
- Authors: Valentina Pyatkin, Jena D. Hwang, Vivek Srikumar, Ximing Lu, Liwei
Jiang, Yejin Choi, Chandra Bhagavatula
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context is everything, even in commonsense moral reasoning. Changing contexts
can flip the moral judgment of an action; "Lying to a friend" is wrong in
general, but may be morally acceptable if it is intended to protect their life.
We present ClarifyDelphi, an interactive system that learns to ask
clarification questions (e.g., why did you lie to your friend?) in order to
elicit additional salient contexts of a social or moral situation. We posit
that questions whose potential answers lead to diverging moral judgments are
the most informative. Thus, we propose a reinforcement learning framework with
a defeasibility reward that aims to maximize the divergence between moral
judgments of hypothetical answers to a question. Human evaluation demonstrates
that our system generates more relevant, informative and defeasible questions
compared to competitive baselines. Our work is ultimately inspired by studies
in cognitive science that have investigated the flexibility in moral cognition
(i.e., the diverse contexts in which moral rules can be bent), and we hope that
research in this direction can assist both cognitive and computational
investigations of moral judgments.
Related papers
- Decoding moral judgement from text: a pilot study
  Moral judgement is a complex human reaction that engages cognitive and emotional dimensions.
  We explore the feasibility of moral judgement decoding from text stimuli with passive brain-computer interfaces.
  arXiv Detail & Related papers (2024-05-28T20:31:59Z)
- What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
  Moral or ethical judgments rely heavily on the specific contexts in which they occur.
  We introduce defeasible moral reasoning: a task to provide grounded contexts that make an action more or less morally acceptable.
  We distill a high-quality dataset of 1.2M entries of contextualizations and rationales for 115K defeasible moral actions.
  arXiv Detail & Related papers (2023-10-24T00:51:29Z)
- Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
  Making moral judgments is an essential step toward developing ethical AI systems.
  Prevalent approaches are mostly bottom-up, training models on large sets of crowd-sourced opinions about morality.
  This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
  arXiv Detail & Related papers (2023-08-29T15:57:32Z)
- Ethical Frameworks and Computer Security Trolley Problems: Foundations for Conversations
  We make and explore connections between moral questions in computer security research and ethics / moral philosophy.
  We do not seek to define what is morally right or wrong, nor do we argue for one framework over another.
  arXiv Detail & Related papers (2023-02-28T05:39:17Z)
- MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions
  A moral dialogue system aligned with users' values could enhance conversation engagement and user connections.
  We propose a framework, MoralDial, to train and evaluate moral dialogue systems.
  arXiv Detail & Related papers (2022-12-21T02:21:37Z)
- When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
  AI systems need to be able to understand, interpret, and predict human moral judgments and decisions.
  A central challenge for AI safety is capturing the flexibility of the human moral mind.
  We present a novel challenge set consisting of rule-breaking question answering.
  arXiv Detail & Related papers (2022-10-04T09:04:27Z)
- AiSocrates: Towards Answering Ethical Quandary Questions
  AiSocrates is a system for deliberative exchange of different perspectives on an ethical quandary.
  We show that AiSocrates generates promising answers to ethical quandary questions with multiple perspectives.
  We argue that AiSocrates is a promising step toward developing an NLP system that incorporates human values explicitly via prompt instructions.
  arXiv Detail & Related papers (2022-05-12T09:52:59Z)
- The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
  Moral deviations are difficult to mitigate because moral judgments are not universal.
  The Moral Integrity Corpus captures the moral assumptions of 38k prompt-reply pairs.
  We show that current neural language models can automatically generate new RoTs (rules of thumb) that reasonably describe previously unseen interactions.
  arXiv Detail & Related papers (2022-04-06T18:10:53Z)
- Delphi: Towards Machine Ethics and Norms
  We identify four underlying challenges towards machine ethics and norms.
  Our prototype model, Delphi, demonstrates strong promise of language-based commonsense moral reasoning.
  We present Commonsense Norm Bank, a moral textbook customized for machines.
  arXiv Detail & Related papers (2021-10-14T17:38:12Z)
- Text-based inference of moral sentiment change
  We present a text-based framework for investigating moral sentiment change of the public via longitudinal corpora.
  We build our methodology by exploring moral biases learned from diachronic word embeddings.
  Our work offers opportunities for applying natural language processing toward characterizing moral sentiment change in society.
  arXiv Detail & Related papers (2020-01-20T18:52:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.