Related papers: Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics

URL: http://arxiv.org/abs/2509.20412v1
Date: Wed, 24 Sep 2025 08:26:56 GMT
Title: Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics
Authors: Kevin Bradley Dsouza, Graham Alexander Watt, Yuri Leonenko, Juan Moreno-Cruz,
Abstract summary: Collective action problems, which require aligning individual incentives with collective goals, are classic examples of Ill-Structured Problems (ISPs). We present ECHO-MIMIC, a computational framework that converts this global complexity into a tractable, Well-Structured Problem (WSP) for each agent. By coupling algorithmic discovery with tailored communication, ECHO-MIMIC transforms the cognitive burden of collective action into a simple set of agent-level instructions.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Collective action problems, which require aligning individual incentives with collective goals, are classic examples of Ill-Structured Problems (ISPs). For an individual agent, the causal links between local actions and global outcomes are unclear, stakeholder objectives often conflict, and no single, clear algorithm can bridge micro-level choices with macro-level welfare. We present ECHO-MIMIC, a computational framework that converts this global complexity into a tractable, Well-Structured Problem (WSP) for each agent by discovering compact, executable heuristics and persuasive rationales. The framework operates in two stages: ECHO (Evolutionary Crafting of Heuristics from Outcomes) evolves snippets of Python code that encode candidate behavioral policies, while MIMIC (Mechanism Inference & Messaging for Individual-to-Collective Alignment) evolves companion natural language messages that motivate agents to adopt those policies. Both phases employ a large-language-model-driven evolutionary search: the LLM proposes diverse and context-aware code or text variants, while population-level selection retains those that maximize collective performance in a simulated environment. We demonstrate this framework on a canonical ISP in agricultural landscape management, where local farming decisions impact global ecological connectivity. Results show that ECHO-MIMIC discovers high-performing heuristics compared to baselines and crafts tailored messages that successfully align simulated farmer behavior with landscape-level ecological goals. By coupling algorithmic rule discovery with tailored communication, ECHO-MIMIC transforms the cognitive burden of collective action into a simple set of agent-level instructions, making previously ill-structured problems solvable in practice and opening a new path toward scalable, adaptive policy design.
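As a rough illustration of the propose-and-select loop the abstract describes, the following minimal Python sketch may help. It is not the authors' implementation: `propose_variant` is a hypothetical stand-in for the LLM proposal step (here, a random numeric perturbation), `collective_score` is a toy objective standing in for the simulated environment, and candidates are plain parameter lists rather than Python code snippets or messages.

```python
import random


def propose_variant(parent, rng):
    """Hypothetical stand-in for the LLM proposal step: perturb one parameter."""
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, 0.3)
    return child


def collective_score(candidate):
    """Toy stand-in for simulated collective performance (higher is better)."""
    target = [1.0, -2.0, 0.5]  # assumed optimum, for illustration only
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def evolve(pop_size=20, generations=50, keep=5, seed=0):
    """Elitist evolutionary search: keep the top `keep` candidates each round."""
    rng = random.Random(seed)
    population = [[rng.uniform(-3, 3) for _ in range(3)] for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        # Population-level selection: retain the highest-scoring candidates.
        survivors = sorted(population, key=collective_score, reverse=True)[:keep]
        history.append(collective_score(survivors[0]))
        # Proposal step: generate new variants from randomly chosen survivors.
        children = [
            propose_variant(rng.choice(survivors), rng)
            for _ in range(pop_size - keep)
        ]
        population = survivors + children
    best = max(population, key=collective_score)
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best candidate:", best)
    print("best score:", collective_score(best))
```

Because the survivors are carried over unchanged, the best score in this sketch can never decrease from one generation to the next. In the framework described above, the perturbation would presumably be replaced by LLM-generated code or text variants and the toy objective by the simulated landscape.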

Related papers

MOSAIC: A Unified Platform for Cross-Paradigm Comparison and Evaluation of Homogeneous and Heterogeneous Multi-Agent RL, LLM, VLM, and Human Decision-Makers [8.910641383873353]
Reinforcement learning (RL), large language models (LLMs), and vision-language models (VLMs) have been widely studied in isolation. Existing infrastructure lacks the ability to deploy agents from different decision-making paradigms within the same environment. We present MOSAIC, an open-source platform that bridges this gap by incorporating a diverse set of existing reinforcement learning environments.
arXiv Detail & Related papers (2026-03-01T20:33:19Z)
PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution [64.15555230987222]
PACEvolve is a framework designed to robustly govern the agent's context and search dynamics. We demonstrate that PACEvolve provides a systematic path to consistent, long-horizon self-improvement.
arXiv Detail & Related papers (2026-01-15T18:25:23Z)
Grounded Test-Time Adaptation for LLM Agents [75.62784644919803]
Large language model (LLM)-based agents struggle to generalize to novel and complex environments. We propose two strategies for adapting LLM agents by leveraging environment-specific information available during deployment.
arXiv Detail & Related papers (2025-11-06T22:24:35Z)
Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting [92.57796055887995]
We introduce ECHO, a prompting framework that adapts hindsight experience replay from reinforcement learning for language model agents. ECHO generates optimized trajectories for alternative goals that could have been achieved during failed attempts. We evaluate ECHO on stateful versions of XMiniGrid, a text-based navigation and planning benchmark, and PeopleJoinQA, a collaborative information-gathering enterprise simulation.
arXiv Detail & Related papers (2025-10-11T18:11:09Z)
LUCIFER: Language Understanding and Context-Infused Framework for Exploration and Behavior Refinement [5.522800137785975]
In dynamic environments, the rapid obsolescence of pre-existing environmental knowledge creates a gap between an agent's internal model and its operational context. We propose LUCIFER, a domain-agnostic framework that integrates a hierarchical decision-making architecture with reinforcement learning. We show that LUCIFER improves exploration efficiency and decision quality, outperforming flat, goal-conditioned policies.
arXiv Detail & Related papers (2025-06-09T16:30:05Z)
COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping [56.907940167333656]
Occluded robot grasping is where the desired grasp poses are kinematically infeasible due to environmental constraints such as surface collisions. Traditional robot manipulation approaches struggle with the complexity of non-prehensile or bimanual strategies commonly used by humans. We introduce Constraint-based Manipulation for Bimanual Occluded Grasping (COMBO-Grasp), a learning-based approach which leverages two coordinated policies.
arXiv Detail & Related papers (2025-02-12T01:31:01Z)
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm. HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies. HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
LLM-POET: Evolving Complex Environments using Large Language Models [0.0]
We propose LLM-POET, a modification of the POET algorithm where the environment is both created and mutated using a Large Language Model (LLM). We found that not only could the LLM produce a diverse range of environments, but compared to the CPPNs used in Enhanced-POET for environment generation, the LLM allowed for a 34% increase in the performance gain of co-evolution.
arXiv Detail & Related papers (2024-06-07T06:23:07Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization [30.456180468318305]
In the sequential decision making setting, an agent aims to achieve systematic generalization over a large, possibly infinite, set of environments. In this paper, we provide a tractable formulation of systematic generalization by employing a causal viewpoint. Under specific structural assumptions, we provide a simple learning algorithm that guarantees any desired planning error up to an unavoidable sub-optimality term.
arXiv Detail & Related papers (2022-02-14T08:34:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
