Case Repositories: Towards Case-Based Reasoning for AI Alignment
- URL: http://arxiv.org/abs/2311.10934v3
- Date: Sun, 26 Nov 2023 21:07:10 GMT
- Title: Case Repositories: Towards Case-Based Reasoning for AI Alignment
- Authors: K. J. Kevin Feng, Quan Ze Chen, Inyoung Cheong, King Xia, Amy X. Zhang
- Abstract summary: Case studies commonly form the pedagogical backbone in law, ethics, and many other domains.
We propose a complementary approach to constitutional AI alignment that focuses on the construction of policies through judgments on a set of cases.
- Score: 9.097877374792576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Case studies commonly form the pedagogical backbone in law, ethics, and many
other domains that face complex and ambiguous societal questions informed by
human values. Similar complexities and ambiguities arise when we consider how
AI should be aligned in practice: when faced with vast quantities of diverse
(and sometimes conflicting) values from different individuals and communities,
with whose values is AI to align, and how should AI do so? We propose a
complementary approach to constitutional AI alignment, grounded in ideas from
case-based reasoning (CBR), that focuses on the construction of policies
through judgments on a set of cases. We present a process to assemble such a
case repository by: 1) gathering a set of "seed" cases -- questions one may
ask an AI system -- in a particular domain, 2) eliciting domain-specific key
dimensions for cases through workshops with domain experts, 3) using LLMs to
generate variations of cases not seen in the wild, and 4) engaging with the
public to judge and improve cases. We then discuss how such a case repository
could assist in AI alignment, both through directly acting as precedents to
ground acceptable behaviors, and as a medium for individuals and communities to
engage in moral reasoning around AI.
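The four-step assembly process above can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the `Case` structure, the "stakes" dimension, and the example question are all hypothetical, and step 3's LLM call is replaced by direct substitution of dimension values.

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    """A question one might ask an AI system, plus elicited metadata."""
    question: str
    dimensions: dict                              # domain-specific key dimensions (step 2)
    judgments: list = field(default_factory=list) # public judgments (step 4)

def generate_variations(seed: Case, dimension: str, values: list) -> list:
    """Step 3 (sketched): produce variations of a seed case along one key
    dimension. A real pipeline would prompt an LLM here; we substitute
    candidate values directly for illustration."""
    variations = []
    for value in values:
        dims = dict(seed.dimensions, **{dimension: value})
        variations.append(Case(question=seed.question, dimensions=dims))
    return variations

def assemble_repository(seeds: list, dimension_values: dict) -> list:
    """Steps 1-3: start from seed cases, expand along each expert-elicited
    dimension; the result is ready for public judgment (step 4)."""
    repo = list(seeds)
    for seed in seeds:
        for dim, values in dimension_values.items():
            repo.extend(generate_variations(seed, dim, values))
    return repo

# Example: one hypothetical legal-domain seed, varied along a "stakes" dimension.
seed = Case("Can I draft my own will?", {"stakes": "medium"})
repo = assemble_repository([seed], {"stakes": ["low", "high"]})
print(len(repo))  # 1 seed + 2 variations = 3
```

The key design choice reflected here is that cases are not authored one by one: a small seed set combined with expert-identified dimensions lets the repository grow combinatorially, with human judgment applied afterward.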
Related papers
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Quantifying Misalignment Between Agents [2.619545850602691]
Growing concerns about the AI alignment problem have emerged in recent years.
We show how misalignment can vary depending on the population of agents being observed.
Our model departs from value specification approaches and focuses instead on the morass of complex, interlocking, sometimes contradictory goals that agents may have in practice.
arXiv Detail & Related papers (2024-06-06T16:31:22Z)
- Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits [54.648819983899614]
Particip-AI is a framework to gather current and future AI use cases, along with their harms and benefits, from the non-expert public.
We gather responses from 295 demographically diverse participants.
arXiv Detail & Related papers (2024-03-21T19:12:37Z)
- Human Values in Multiagent Systems [3.5027291542274357]
This paper presents a formal representation of values, grounded in the social sciences.
We use this formal representation to articulate the key challenges for achieving value-aligned behaviour in multiagent systems.
arXiv Detail & Related papers (2023-05-04T11:23:59Z)
- Factoring the Matrix of Domination: A Critical Review and Reimagination of Intersectionality in AI Fairness [55.037030060643126]
Intersectionality is a critical framework that allows us to examine how social inequalities persist.
We argue that adopting intersectionality as an analytical framework is pivotal to effectively operationalizing fairness.
arXiv Detail & Related papers (2023-03-16T21:02:09Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Metaethical Perspectives on 'Benchmarking' AI Ethics [81.65697003067841]
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research.
An increasingly prominent research area in AI is ethics, which currently has no set of benchmarks nor commonly accepted way for measuring the 'ethicality' of an AI system.
We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
- Relational Artificial Intelligence [5.5586788751870175]
Even though AI is traditionally associated with rational decision making, understanding and shaping the societal impact of AI in all its facets requires a relational perspective.
A rational approach to AI, in which computational algorithms drive decision making independent of human intervention, has been shown to result in bias and exclusion.
A relational approach, one that focuses on the relational nature of things, is needed to deal with the ethical, legal, societal, cultural, and environmental implications of AI.
arXiv Detail & Related papers (2022-02-04T15:29:57Z)
- Descriptive AI Ethics: Collecting and Understanding the Public Opinion [10.26464021472619]
This work proposes a mixed AI ethics model that allows normative and descriptive research to complement each other.
We discuss its implications on bridging the gap between optimistic and pessimistic views towards AI systems' deployment.
arXiv Detail & Related papers (2021-01-15T03:46:27Z)
- Learning from Learning Machines: Optimisation, Rules, and Social Norms [91.3755431537592]
It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making.
Recent successes of deep learning for AI suggest that more implicit specifications work better than explicit ones for solving such problems.
arXiv Detail & Related papers (2019-12-29T17:42:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.