Towards Sustainable Investment Policies Informed by Opponent Shaping
- URL: http://arxiv.org/abs/2602.11829v1
- Date: Thu, 12 Feb 2026 11:16:28 GMT
- Title: Towards Sustainable Investment Policies Informed by Opponent Shaping
- Authors: Juan Agustin Duque, Razvan Ciuca, Ayoub Echchahed, Hugo Larochelle, Aaron Courville,
- Abstract summary: InvestESG is a proposed multi-agent simulation that captures the dynamic interplay between investors and companies under climate risk. We provide a formal characterization of the conditions under which InvestESG exhibits an intertemporal social dilemma, deriving theoretical thresholds at which individual incentives diverge from collective welfare.
- Score: 11.460024756505293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Addressing climate change requires global coordination, yet rational economic actors often prioritize immediate gains over collective welfare, resulting in social dilemmas. InvestESG is a recently proposed multi-agent simulation that captures the dynamic interplay between investors and companies under climate risk. We provide a formal characterization of the conditions under which InvestESG exhibits an intertemporal social dilemma, deriving theoretical thresholds at which individual incentives diverge from collective welfare. Building on this, we apply Advantage Alignment, a scalable opponent shaping algorithm shown to be effective in general-sum games, to influence agent learning in InvestESG. We offer theoretical insights into why Advantage Alignment systematically favors socially beneficial equilibria by biasing learning dynamics toward cooperative outcomes. Our results demonstrate that strategically shaping the learning processes of economic agents can result in better outcomes that could inform policy mechanisms to better align market incentives with long-term sustainability goals.
Related papers
- Provable and Practical In-Context Policy Optimization for Self-Improvement [49.670847804409874]
We study test-time scaling, where a model improves its answer through multi-round self-reflection at inference. We introduce In-Context Policy Optimization (ICPO), in which an agent optimizes its response in context using self-assessed or externally observed rewards without modifying its parameters. We propose Minimum-Entropy ICPO (ME-ICPO), a practical algorithm that iteratively uses its response and self-assessed reward to refine its response in-context at inference time.
arXiv Detail & Related papers (2026-03-02T00:21:50Z)
- Toward a Sustainable Federated Learning Ecosystem: A Practical Least Core Mechanism for Payoff Allocation [71.86087908416255]
We introduce a payoff allocation framework based on the least core (LC) concept. Unlike traditional methods, the LC prioritizes the cohesion of the federation by minimizing the maximum dissatisfaction. Case studies in federated intrusion detection demonstrate that our mechanism correctly identifies pivotal contributors and strategic alliances.
arXiv Detail & Related papers (2026-02-03T11:10:50Z)
- Microeconomic Foundations of Multi-Agent Learning [0.0]
Modern AI systems operate inside markets and institutions where data, behavior, and incentives are endogenous. This paper develops an economic foundation for multi-agent learning by studying a principal-agent interaction in a Markov decision process with strategic externalities.
arXiv Detail & Related papers (2026-01-06T22:37:47Z)
- The Role of Social Learning and Collective Norm Formation in Fostering Cooperation in LLM Multi-Agent Systems [13.628908663240564]
We introduce a CPR simulation framework that removes explicit reward signals and embeds cultural-evolutionary mechanisms. We examine norm evolution across a $2\times2$ grid of environmental and social initialisations. Our results reveal systematic model differences in sustaining cooperation and norm formation.
arXiv Detail & Related papers (2025-10-16T07:59:31Z)
- An Explainable Equity-Aware P2P Energy Trading Framework for Socio-Economically Diverse Microgrid [0.0]
This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL). The framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains.
arXiv Detail & Related papers (2025-07-24T18:38:51Z)
- Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games [87.5673042805229]
How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas. Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation.
arXiv Detail & Related papers (2025-06-29T15:02:47Z)
- InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma [8.867831781244575]
InvestESG is a novel multi-agent reinforcement learning benchmark to study the impact of ESG disclosure mandates on corporate climate investments. Our experiments show that without ESG-conscious investors with sufficient capital, corporate mitigation efforts remain limited under the disclosure mandate. Providing more information about global climate risks encourages companies to invest more in mitigation, even without investor involvement.
arXiv Detail & Related papers (2024-11-15T00:31:45Z)
- Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents [101.17919953243107]
GovSim is a generative simulation platform designed to study strategic interactions and cooperative decision-making in large language models (LLMs). We find that all but the most powerful LLM agents fail to achieve a sustainable equilibrium in GovSim, with the highest survival rate below 54%. We show that agents that leverage "Universalization"-based reasoning, a theory of moral thinking, are able to achieve significantly better sustainability.
arXiv Detail & Related papers (2024-04-25T15:59:16Z)
- Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist [67.08543240320756]
We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
arXiv Detail & Related papers (2021-08-06T01:30:41Z)
- Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z)