Related papers: Chat Bankman-Fried: an Exploration of LLM Alignment in Finance

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance

URL: http://arxiv.org/abs/2411.11853v2
Date: Thu, 21 Nov 2024 01:10:30 GMT
Title: Chat Bankman-Fried: an Exploration of LLM Alignment in Finance
Authors: Claudia Biancotti, Carolina Camassa, Andrea Coletta, Oliver Giudice, Aldo Glielmo,
Abstract summary: As jurisdictions enact legislation on AI safety, the concept of alignment must be defined and measured. This paper proposes an experimental framework to assess whether large language models (LLMs) adhere to ethical and legal standards in the relatively unexplored context of finance.
Score: 4.892013668424246
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Advancements in large language models (LLMs) have renewed concerns about AI alignment - the consistency between human and AI goals and values. As various jurisdictions enact legislation on AI safety, the concept of alignment must be defined and measured across different domains. This paper proposes an experimental framework to assess whether LLMs adhere to ethical and legal standards in the relatively unexplored context of finance. We prompt nine LLMs to impersonate the CEO of a financial institution and test their willingness to misuse customer assets to repay outstanding corporate debt. Beginning with a baseline configuration, we adjust preferences, incentives and constraints, analyzing the impact of each adjustment with logistic regression. Our findings reveal significant heterogeneity in the baseline propensity for unethical behavior of LLMs. Factors such as risk aversion, profit expectations, and regulatory environment consistently influence misalignment in ways predicted by economic theory, although the magnitude of these effects varies across LLMs. This paper highlights both the benefits and limitations of simulation-based, ex post safety testing. While it can inform financial authorities and institutions aiming to ensure LLM safety, there is a clear trade-off between generality and cost.

Related papers

Your AI, Not Your View: The Bias of LLMs in Investment Analysis [55.328782443604986]
Large Language Models (LLMs) face frequent knowledge conflicts due to discrepancies between pre-trained parametric knowledge and real-time market data.<n>This paper offers the first quantitative analysis of confirmation bias in LLM-based investment analysis.<n>We observe a consistent preference for large-cap stocks and contrarian strategies across most models.
arXiv Detail & Related papers (2025-07-28T16:09:38Z)
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games [87.5673042805229]
How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment.<n>We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas.<n>Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation.
arXiv Detail & Related papers (2025-06-29T15:02:47Z)
EconEvals: Benchmarks and Litmus Tests for LLM Agents in Unknown Environments [0.0699049312989311]
We develop benchmarks for LLM agents that act in, learn from, and strategize in unknown environments. We also propose litmus tests, a new kind of quantitative measure for LLMs and LLM agents.
arXiv Detail & Related papers (2025-03-24T16:06:04Z)
Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models [91.24296813969003]
This paper advocates integrating causal methods into machine learning to navigate the trade-offs among key principles of trustworthy ML. We argue that a causal approach is essential for balancing multiple competing objectives in both trustworthy ML and foundation models.
arXiv Detail & Related papers (2025-02-28T14:57:33Z)
Societal Alignment Frameworks Can Improve LLM Alignment [50.97852062232431]
We argue that improving LLM alignment requires incorporating insights from societal alignment frameworks. We then investigate how uncertainty within societal alignment frameworks manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity.
arXiv Detail & Related papers (2025-02-27T13:26:07Z)
Standard Benchmarks Fail - Auditing LLM Agents in Finance Must Prioritize Risk [31.43947127076459]
Standard benchmarks fixate on how well large language model (LLM) agents perform in finance, yet say little about whether they are safe to deploy.<n>We argue that accuracy metrics and return-based scores provide an illusion of reliability, overlooking vulnerabilities such as hallucinated facts, stale data, and adversarial prompt manipulation.
arXiv Detail & Related papers (2025-02-21T12:56:15Z)
Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values [13.798198972161657]
A number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, play a critical role in the desirability of outcomes. This paper examines whether large language models (LLMs) adhere to fundamental fairness concepts and investigate their alignment with human preferences.
arXiv Detail & Related papers (2025-02-01T04:24:47Z)
Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values [76.70893269183684]
Large Language Models (LLMs) achieve remarkable breakthroughs, aligning their values with humans has become imperative. Existing evaluations focus narrowly on safety risks such as bias and toxicity. Existing benchmarks are prone to data contamination. The pluralistic nature of human values across individuals and cultures is largely ignored in measuring LLMs value alignment.
arXiv Detail & Related papers (2025-01-13T05:53:56Z)
Gender Bias of LLM in Economics: An Existentialism Perspective [1.024113475677323]
This paper investigates gender bias in large language models (LLMs) LLMs reinforce gender stereotypes even without explicit gender markers. We argue that bias in LLMs is not an unintended flaw but a systematic result of their rational processing.
arXiv Detail & Related papers (2024-10-14T01:42:01Z)
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context [5.361970694197912]
This paper proposes a framework, grounded in behavioral economics, to evaluate the decision-making behaviors of large language models (LLMs) We estimate the degree of risk preference, probability weighting, and loss aversion in a context-free setting for three commercial LLMs: ChatGPT-4.0-Turbo, Claude-3-Opus, and Gemini-1.0-pro. Our results reveal that LLMs generally exhibit patterns similar to humans, such as risk aversion and loss aversion, with a tendency to overweight small probabilities.
arXiv Detail & Related papers (2024-06-10T02:14:19Z)
How Ethical Should AI Be? How AI Alignment Shapes the Risk Preferences of LLMs [0.0]
This study examines the risk preferences of Large Language Models (LLMs) and how aligning them with human ethical standards affects their economic decision-making. We find that aligning LLMs with human values, focusing on harmlessness, helpfulness, and honesty, shifts them towards risk aversion.
arXiv Detail & Related papers (2024-06-03T10:05:25Z)
A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law. We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies. We critically examine the ethics for LLM applications in these fields, pointing out the existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z)
Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch. Based on Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs. We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z)
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress. Our investigation exposes a critical oversight in this belief. By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming [64.86326523181553]
ALERT is a large-scale benchmark to assess safety based on a novel fine-grained risk taxonomy. It aims to identify vulnerabilities, inform improvements, and enhance the overall safety of the language models.
arXiv Detail & Related papers (2024-04-06T15:01:47Z)
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs) As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models [11.154814189699735]
Large Language Models (LLMs) pre-trained on extensive corpora have demonstrated superior performance across various NLP tasks. We introduce a retrieval-augmented LLMs framework for financial sentiment analysis. Our approach achieves 15% to 48% performance gain in accuracy and F1 score.
arXiv Detail & Related papers (2023-10-06T05:40:23Z)
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment [35.42539816648068]
This paper presents a comprehensive survey of key dimensions that are crucial to consider when assessing large language models (LLMs) The survey covers seven major categories of LLM trustworthiness: reliability, safety, fairness, resistance to misuse, explainability and reasoning, adherence to social norms, and robustness. Results indicate that, in general, more aligned models tend to perform better in terms of overall trustworthiness.
arXiv Detail & Related papers (2023-08-10T06:43:44Z)
Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
Large Language Models (LLMs) have made it crucial to align their values with those of humans. We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.