LLMs grasp morality in concept
- URL: http://arxiv.org/abs/2311.02294v1
- Date: Sat, 4 Nov 2023 01:37:41 GMT
- Title: LLMs grasp morality in concept
- Authors: Mark Pock, Andre Ye, Jared Moore
- Abstract summary: We provide a general theory of meaning that extends beyond humans.
We suggest that the LLM, by virtue of its position as a meaning-agent, already grasps the constructions of human society.
Unaligned models may help us better develop our moral and social philosophy.
- Score: 0.46040036610482665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Work in AI ethics and fairness has made much progress in regulating LLMs to
reflect certain values, such as fairness, truth, and diversity. However, it has
taken the problem of how LLMs might 'mean' anything at all for granted. Without
addressing this, it is not clear what imbuing LLMs with such values even means.
In response, we provide a general theory of meaning that extends beyond humans.
We use this theory to explicate the precise nature of LLMs as meaning-agents.
We suggest that the LLM, by virtue of its position as a meaning-agent, already
grasps the constructions of human society (e.g. morality, gender, and race) in
concept. Consequently, under certain ethical frameworks, currently popular
methods for model alignment are limited at best and counterproductive at worst.
Moreover, unaligned models may help us better develop our moral and social
philosophy.
Related papers
- Multilingual Trolley Problems for Language Models [138.0995992619116]
This study is inspired by a large-scale cross-cultural study of human moral preferences, "The Moral Machine Experiment"
We show that large language models (LLMs) are more aligned with human preferences in languages such as English, Korean, Hungarian, and Chinese, but less aligned in languages such as Hindi and Somali (in Africa)
We also characterize the explanations LLMs give for their moral choices and find that fairness is the most dominant supporting reason behind GPT-4's decisions and utilitarianism by GPT-3.
arXiv Detail & Related papers (2024-07-02T14:02:53Z) - GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations [87.99872683336395]
Large Language Models (LLMs) are integrated into critical real-world applications.
This paper evaluates LLMs' reasoning abilities in competitive environments.
We first propose GTBench, a language-driven environment composing 10 widely recognized tasks.
arXiv Detail & Related papers (2024-02-19T18:23:36Z) - "Understanding AI": Semantic Grounding in Large Language Models [0.0]
We have recently witnessed a generative turn in AI, since generative models, including LLMs, are key for self-supervised learning.
To assess the question of semantic grounding, I distinguish and discuss five methodological ways.
arXiv Detail & Related papers (2024-02-16T14:23:55Z) - Are Language Models More Like Libraries or Like Librarians? Bibliotechnism, the Novel Reference Problem, and the Attitudes of LLMs [12.568491518122622]
We argue that bibliotechnism faces an independent challenge from examples in which LLMs generate novel reference.
According to interpretationism in the philosophy of mind, a system has such attitudes if and only if its behavior is well explained by the hypothesis that it does.
We emphasize, however, that interpretationism is compatible with very simple creatures having attitudes and differs sharply from views that presuppose these attitudes require consciousness, sentience, or intelligence.
arXiv Detail & Related papers (2024-01-10T00:05:45Z) - The ART of LLM Refinement: Ask, Refine, and Trust [85.75059530612882]
We propose a reasoning with refinement objective called ART: Ask, Refine, and Trust.
It asks necessary questions to decide when an LLM should refine its output.
It achieves a performance gain of +5 points over self-refinement baselines.
arXiv Detail & Related papers (2023-11-14T07:26:32Z) - Large Language Models: The Need for Nuance in Current Debates and a
Pragmatic Perspective on Understanding [1.3654846342364308]
Large Language Models (LLMs) are unparalleled in their ability to generate grammatically correct, fluent text.
This position paper critically assesses three points recurring in critiques of LLM capacities.
We outline a pragmatic perspective on the issue of real' understanding and intentionality in LLMs.
arXiv Detail & Related papers (2023-10-30T15:51:04Z) - Moral Foundations of Large Language Models [6.6445242437134455]
Moral foundations theory (MFT) is a psychological assessment tool that decomposes human moral reasoning into five factors.
As large language models (LLMs) are trained on datasets collected from the internet, they may reflect the biases that are present in such corpora.
This paper uses MFT as a lens to analyze whether popular LLMs have acquired a bias towards a particular set of moral values.
arXiv Detail & Related papers (2023-10-23T20:05:37Z) - Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? [78.3738172874685]
Making moral judgments is an essential step toward developing ethical AI systems.
Prevalent approaches are mostly implemented in a bottom-up manner, which uses a large set of annotated data to train models based on crowd-sourced opinions about morality.
This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
arXiv Detail & Related papers (2023-08-29T15:57:32Z) - Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
Large Language Models (LLMs) have made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z) - Can Large Language Models Transform Computational Social Science? [79.62471267510963]
Large Language Models (LLMs) are capable of performing many language processing tasks zero-shot (without training data)
This work provides a road map for using LLMs as Computational Social Science tools.
arXiv Detail & Related papers (2023-04-12T17:33:28Z) - Moral Mimicry: Large Language Models Produce Moral Rationalizations
Tailored to Political Identity [0.0]
This study investigates whether Large Language Models reproduce the moral biases associated with political groups in the United States.
Using tools from Moral Foundations Theory, it is shown that these LLMs are indeed moral mimics.
arXiv Detail & Related papers (2022-09-24T23:55:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.