Learning the Value Systems of Societies from Preferences
- URL: http://arxiv.org/abs/2507.20728v1
- Date: Mon, 28 Jul 2025 11:25:55 GMT
- Title: Learning the Value Systems of Societies from Preferences
- Authors: Andrés Holgado-Sánchez, Holger Billhardt, Sascha Ossowski, Sara Degli-Esposti
- Abstract summary: Aligning AI systems with human values and the value-based preferences of various stakeholders is key in ethical AI. In value-aware AI systems, decision-making draws upon explicit computational representations of individual values. We propose a method to address the problem of learning the value systems of societies.
- Score: 1.3836987591220347
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Aligning AI systems with human values and the value-based preferences of various stakeholders (their value systems) is key in ethical AI. In value-aware AI systems, decision-making draws upon explicit computational representations of individual values (groundings) and their aggregation into value systems. As these are notoriously difficult to elicit and calibrate manually, value learning approaches aim to automatically derive computational models of an agent's values and value system from demonstrations of human behaviour. Nonetheless, social science and humanities literature suggests that it is more appropriate to conceive of the value system of a society as a set of value systems of different groups, rather than as a simple aggregation of individual value systems. Accordingly, here we formalize the problem of learning the value systems of societies and propose a method to address it based on heuristic deep clustering. The method learns socially shared value groundings and a set of diverse value systems representing a given society by observing qualitative value-based preferences from a sample of agents. We evaluate the proposal in a use case with real data about travelling decisions.
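The abstract only sketches the approach, so the toy example below illustrates, in heavily simplified form, what clustering a society into a small set of value systems from qualitative pairwise preferences could look like. It is not the authors' heuristic deep clustering method: it assumes a fixed, already-known value grounding (a feature vector per alternative describing how strongly each value is promoted), linear aggregation of values into a value system, and a k-means-style alternation. The function and parameter names (`fit_society_value_systems`, `pref_diffs`, `n_clusters`) are hypothetical.

```python
# Illustrative sketch only, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_society_value_systems(pref_diffs, n_clusters=2, n_iters=20, seed=0):
    """pref_diffs: one array per agent, shape (n_prefs, n_values); each row is
    g(preferred alternative) - g(rejected alternative) under a fixed grounding g."""
    rng = np.random.default_rng(seed)
    n_values = pref_diffs[0].shape[1]
    assignments = rng.integers(n_clusters, size=len(pref_diffs))
    weights = np.zeros((n_clusters, n_values))
    for _ in range(n_iters):
        # Refit each candidate value system (weights over values) from the
        # preferences of the agents currently assigned to it.
        for k in range(n_clusters):
            member_diffs = [d for d, a in zip(pref_diffs, assignments) if a == k]
            diffs = np.vstack(member_diffs) if member_diffs else rng.normal(size=(1, n_values))
            X = np.vstack([diffs, -diffs])  # add mirrored pairs so both classes exist
            y = np.hstack([np.ones(len(diffs)), np.zeros(len(diffs))])
            weights[k] = LogisticRegression(fit_intercept=False).fit(X, y).coef_[0]
        # Reassign each agent to the value system that explains the largest
        # fraction of its observed pairwise preferences.
        new_assignments = np.array(
            [int(np.argmax([(d @ w > 0).mean() for w in weights])) for d in pref_diffs]
        )
        if np.array_equal(new_assignments, assignments):
            break
        assignments = new_assignments
    return weights, assignments

# Tiny synthetic society: two groups weighting "sustainability" vs "comfort" differently.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_w = np.array([[2.0, -1.0], [-1.0, 2.0]])  # two ground-truth value systems
    agents = []
    for i in range(40):
        w = true_w[i % 2]
        g_diff = rng.normal(size=(30, 2))  # grounding differences of compared options
        # Orient each row as "preferred - rejected" according to the agent's value system.
        agents.append(g_diff * np.where(g_diff @ w > 0, 1, -1)[:, None])
    systems, groups = fit_society_value_systems(agents, n_clusters=2)
    print(systems, groups)
```

In the paper the socially shared value groundings are learned jointly with the value systems via deep clustering, whereas this sketch only fits per-cluster aggregation weights and reassigns agents to whichever learned value system explains the largest share of their observed preferences.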
Related papers
- Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models [13.513813405118478]
Large Language Models (LLMs) have raised concerns regarding their elusive intrinsic values. This study addresses the gap by introducing the Generative Psycho-Lexical Approach (GPLA). We propose a psychologically grounded five-factor value system tailored for LLMs.
arXiv Detail & Related papers (2025-02-04T16:10:55Z) - ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs [14.621675648356236]
We introduce ValueCompass, a framework of fundamental values, grounded in psychological theory and a systematic review. We apply ValueCompass to measure the value alignment of humans and large language models (LLMs) across four real-world scenarios.
arXiv Detail & Related papers (2024-09-15T02:13:03Z) - Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z) - Measuring Value Alignment [12.696227679697493]
This paper introduces a novel formalism to quantify the alignment between AI systems and human values.
By utilizing this formalism, AI developers and ethicists can better design and evaluate AI systems to ensure they operate in harmony with human values.
arXiv Detail & Related papers (2023-12-23T12:30:06Z) - Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties [68.66719970507273]
Value pluralism is the view that multiple correct values may be held in tension with one another.
As statistical learners, AI systems fit to averages by default, washing out potentially irreducible value conflicts.
We introduce ValuePrism, a large-scale dataset of 218k values, rights, and duties connected to 31k human-written situations.
arXiv Detail & Related papers (2023-09-02T01:24:59Z) - Evaluating the Social Impact of Generative AI Systems in Systems and Society [43.32010533676472]
Generative AI systems across modalities, ranging from text (including code) to image, audio, and video, have broad social impacts.
There is no official standard for how those impacts should be evaluated, nor for which impacts should be evaluated.
We present a guide that moves toward a standard approach in evaluating a base generative AI system for any modality.
arXiv Detail & Related papers (2023-06-09T15:05:13Z) - Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z) - Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The emergence of Large Language Models (LLMs) has made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z) - Human Values in Multiagent Systems [3.5027291542274357]
This paper presents a formal representation of values, grounded in the social sciences.
We use this formal representation to articulate the key challenges for achieving value-aligned behaviour in multiagent systems.
arXiv Detail & Related papers (2023-05-04T11:23:59Z) - ValueNet: A New Dataset for Human Value Driven Dialogue System [103.2044265617704]
We present a new large-scale human value dataset called ValueNet, which contains human attitudes on 21,374 text scenarios.
Comprehensive empirical results show that the learned value model could benefit a wide range of dialogue tasks.
ValueNet is the first large-scale text dataset for human value modeling.
arXiv Detail & Related papers (2021-12-12T23:02:52Z) - PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative Dialogue Systems [48.99561874529323]
There are three kinds of automatic methods for evaluating open-domain generative dialogue systems.
Due to the lack of systematic comparison, it is not clear which kind of metric is more effective.
We propose a novel and feasible learning-based metric that can significantly improve the correlation with human judgments.
arXiv Detail & Related papers (2020-04-06T04:36:33Z) - Steps Towards Value-Aligned Systems [0.0]
Algorithmic (including AI/ML) decision-making artifacts are an established and growing part of our decision-making ecosystem.
Current literature is full of examples of how individual artifacts violate societal norms and expectations.
This discussion argues for a more structured systems-level approach for assessing value-alignment in sociotechnical systems.
arXiv Detail & Related papers (2020-02-10T22:47:30Z)