Case Repositories: Towards Case-Based Reasoning for AI Alignment
- URL: http://arxiv.org/abs/2311.10934v3
- Date: Sun, 26 Nov 2023 21:07:10 GMT
- Title: Case Repositories: Towards Case-Based Reasoning for AI Alignment
- Authors: K. J. Kevin Feng, Quan Ze Chen, Inyoung Cheong, King Xia, Amy X. Zhang
- Abstract summary: Case studies commonly form the pedagogical backbone in law, ethics, and many other domains.
We propose a complementary approach to constitutional AI alignment that focuses on the construction of policies through judgments on a set of cases.
- Score: 9.097877374792576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Case studies commonly form the pedagogical backbone in law, ethics, and many
other domains that face complex and ambiguous societal questions informed by
human values. Similar complexities and ambiguities arise when we consider how
AI should be aligned in practice: when faced with vast quantities of diverse
(and sometimes conflicting) values from different individuals and communities,
with whose values is AI to align, and how should AI do so? We propose a
complementary approach to constitutional AI alignment, grounded in ideas from
case-based reasoning (CBR), that focuses on the construction of policies
through judgments on a set of cases. We present a process to assemble such a
case repository by: 1) gathering a set of "seed" cases -- questions one may
ask an AI system -- in a particular domain, 2) eliciting domain-specific key
dimensions for cases through workshops with domain experts, 3) using LLMs to
generate variations of cases not seen in the wild, and 4) engaging with the
public to judge and improve cases. We then discuss how such a case repository
could assist in AI alignment, both through directly acting as precedents to
ground acceptable behaviors, and as a medium for individuals and communities to
engage in moral reasoning around AI.
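The four-step assembly process above can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the `Case` structure, the "stakes" dimension, and the example question are all hypothetical, and step 3's LLM call is replaced by direct substitution of dimension values.

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    """A question one might ask an AI system, plus elicited metadata."""
    question: str
    dimensions: dict                              # domain-specific key dimensions (step 2)
    judgments: list = field(default_factory=list) # public judgments (step 4)

def generate_variations(seed: Case, dimension: str, values: list) -> list:
    """Step 3 (sketched): produce variations of a seed case along one key
    dimension. A real pipeline would prompt an LLM here; we substitute
    candidate values directly for illustration."""
    variations = []
    for value in values:
        dims = dict(seed.dimensions, **{dimension: value})
        variations.append(Case(question=seed.question, dimensions=dims))
    return variations

def assemble_repository(seeds: list, dimension_values: dict) -> list:
    """Steps 1-3: start from seed cases, expand along each expert-elicited
    dimension; the result is ready for public judgment (step 4)."""
    repo = list(seeds)
    for seed in seeds:
        for dim, values in dimension_values.items():
            repo.extend(generate_variations(seed, dim, values))
    return repo

# Example: one hypothetical legal-domain seed, varied along a "stakes" dimension.
seed = Case("Can I draft my own will?", {"stakes": "medium"})
repo = assemble_repository([seed], {"stakes": ["low", "high"]})
print(len(repo))  # 1 seed + 2 variations = 3
```

The key design choice reflected here is that cases are not authored one by one: a small seed set combined with expert-identified dimensions lets the repository grow combinatorially, with human judgment applied afterward.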
Related papers
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Quantifying Misalignment Between Agents [2.619545850602691]
Growing concerns about the AI alignment problem have emerged in recent years.
We show how misalignment can vary depending on the population of agents being observed.
Our model departs from value specification approaches and focuses instead on the morass of complex, interlocking, sometimes contradictory goals that agents may have in practice.
arXiv Detail & Related papers (2024-06-06T16:31:22Z)
- Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits [54.648819983899614]
Particip-AI is a framework to gather current and future AI use cases, along with their harms and benefits, from the non-expert public.
We gather responses from 295 demographically diverse participants.
arXiv Detail & Related papers (2024-03-21T19:12:37Z)
- Human Values in Multiagent Systems [3.5027291542274357]
This paper presents a formal representation of values, grounded in the social sciences.
We use this formal representation to articulate the key challenges for achieving value-aligned behaviour in multiagent systems.
arXiv Detail & Related papers (2023-05-04T11:23:59Z)
- Factoring the Matrix of Domination: A Critical Review and Reimagination of Intersectionality in AI Fairness [55.037030060643126]
Intersectionality is a critical framework that allows us to examine how social inequalities persist.
We argue that adopting intersectionality as an analytical framework is pivotal to effectively operationalizing fairness.
arXiv Detail & Related papers (2023-03-16T21:02:09Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Metaethical Perspectives on 'Benchmarking' AI Ethics [81.65697003067841]
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research.
An increasingly prominent research area in AI is ethics, which currently has no set of benchmarks nor commonly accepted way for measuring the 'ethicality' of an AI system.
We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
- Relational Artificial Intelligence [5.5586788751870175]
Even though AI is traditionally associated with rational decision making, understanding and shaping the societal impact of AI in all its facets requires a relational perspective.
A rational approach to AI, in which computational algorithms drive decision making independent of human intervention, has been shown to result in bias and exclusion.
A relational approach, one that focuses on the relational nature of things, is needed to deal with the ethical, legal, societal, cultural, and environmental implications of AI.
arXiv Detail & Related papers (2022-02-04T15:29:57Z)
- Descriptive AI Ethics: Collecting and Understanding the Public Opinion [10.26464021472619]
This work proposes a mixed AI ethics model that allows normative and descriptive research to complement each other.
We discuss its implications on bridging the gap between optimistic and pessimistic views towards AI systems' deployment.
arXiv Detail & Related papers (2021-01-15T03:46:27Z)
- Learning from Learning Machines: Optimisation, Rules, and Social Norms [91.3755431537592]
It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making.
Recent successes of deep learning for AI suggest that more implicit specifications work better than explicit ones for solving such problems.
arXiv Detail & Related papers (2019-12-29T17:42:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.