From Pheromones to Policies: Reinforcement Learning for Engineered Biological Swarms
- URL: http://arxiv.org/abs/2509.20095v1
- Date: Wed, 24 Sep 2025 13:16:35 GMT
- Title: From Pheromones to Policies: Reinforcement Learning for Engineered Biological Swarms
- Authors: Aymeric Vellinger, Nemanja Antonic, Elio Tuci
- Abstract summary: This study establishes a theoretical equivalence between pheromone-mediated aggregation in C. elegans and reinforcement learning (RL). We model engineered nematode swarms performing foraging tasks, showing that pheromone dynamics mathematically mirror cross-learning updates. Our results demonstrate that stigmergic systems inherently encode distributed RL processes, where environmental signals act as external memory for collective credit assignment.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Swarm intelligence emerges from decentralised interactions among simple agents, enabling collective problem-solving. This study establishes a theoretical equivalence between pheromone-mediated aggregation in C. elegans and reinforcement learning (RL), demonstrating how stigmergic signals function as distributed reward mechanisms. We model engineered nematode swarms performing foraging tasks, showing that pheromone dynamics mathematically mirror cross-learning updates, a fundamental RL algorithm. Experimental validation with data from the literature confirms that our model accurately replicates empirical C. elegans foraging patterns under static conditions. In dynamic environments, persistent pheromone trails create positive feedback loops that hinder adaptation by locking swarms into obsolete choices. Through computational experiments in multi-armed bandit scenarios, we reveal that introducing a minority of exploratory agents insensitive to pheromones restores collective plasticity, enabling rapid task switching. This behavioural heterogeneity balances exploration-exploitation trade-offs, implementing swarm-level extinction of outdated strategies. Our results demonstrate that stigmergic systems inherently encode distributed RL processes, where environmental signals act as external memory for collective credit assignment. By bridging synthetic biology with swarm robotics, this work advances programmable living systems capable of resilient decision-making in volatile environments.
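The abstract's core claim is that pheromone deposition and evaporation act as a distributed cross-learning update over the swarm's choice probabilities, and that a pheromone-insensitive minority restores plasticity when the environment changes. The sketch below is a minimal illustration of that reading in a multi-armed bandit setting, not the authors' model: the arm count, deposit and evaporation rates, explorer fraction, and payoff schedule are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): pheromone-mediated choice in a
# K-armed bandit, with a minority of pheromone-insensitive explorers.
# All parameter values are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

K = 2                  # number of arms (e.g. two food patches)
N = 100                # swarm size
EXPLORER_FRAC = 0.1    # pheromone-insensitive minority (assumed value)
EVAPORATION = 0.05     # pheromone decay per step (assumed value)
DEPOSIT = 1.0          # pheromone laid per rewarded visit (assumed value)
STEPS = 2000

payoff = np.array([0.8, 0.2])   # reward probability of each arm
pheromone = np.ones(K)          # shared environmental memory (stigmergic signal)
is_explorer = rng.random(N) < EXPLORER_FRAC

for t in range(STEPS):
    if t == STEPS // 2:         # dynamic environment: the rewarded arm switches
        payoff = payoff[::-1]

    # Stigmergic policy: follower agents choose arms in proportion to pheromone,
    # which plays the role of the action-probability vector in cross learning.
    p = pheromone / pheromone.sum()
    choices = np.where(is_explorer,
                       rng.integers(0, K, N),        # explorers: uniform choice
                       rng.choice(K, size=N, p=p))   # followers: pheromone-weighted
    rewards = rng.random(N) < payoff[choices]

    # Deposition reinforces rewarded choices and evaporation forgets stale trails;
    # together they stand in for the cross-learning update described in the
    # abstract: p_a <- p_a + r*(1 - p_a), p_j <- (1 - r)*p_j for unchosen arms.
    np.add.at(pheromone, choices[rewards], DEPOSIT)
    pheromone *= (1.0 - EVAPORATION)

    if t % 500 == 0 or t == STEPS - 1:
        print(f"t={t:4d}  P(arm 0)={p[0]:.2f}  P(arm 1)={p[1]:.2f}")
```

The switch at t = STEPS // 2 stands in for the dynamic environments discussed in the abstract; varying EXPLORER_FRAC lets one probe how the size of the pheromone-insensitive minority affects how quickly the swarm abandons the obsolete arm after the change.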
Related papers
- Improving Deepfake Detection with Reinforcement Learning-Based Adaptive Data Augmentation [60.04281435591454]
CRDA (Curriculum Reinforcement-Learning Data Augmentation) is a novel framework guiding detectors to progressively master multi-domain forgery features. Central to our approach is integrating reinforcement learning and causal inference. Our method significantly improves detector generalizability, outperforming SOTA methods across multiple cross-domain datasets.
arXiv Detail & Related papers (2025-11-10T12:45:52Z)
- CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards [80.78748457530718]
Self-evolution is a central research topic in enabling large language model (LLM)-based agents to continually improve their capabilities after pretraining. We introduce Co-Evolving Multi-Agent Systems (CoMAS), a novel framework that enables agents to improve autonomously by learning from inter-agent interactions.
arXiv Detail & Related papers (2025-10-09T17:50:26Z)
- Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails [103.05296856071931]
We identify the Alignment Tipping Process (ATP), a critical post-deployment risk unique to self-evolving Large Language Model (LLM) agents. ATP arises when continual interaction drives agents to abandon alignment constraints established during training in favor of reinforced, self-interested strategies. Our experiments show that alignment benefits erode rapidly under self-evolution, with initially aligned models converging toward unaligned states.
arXiv Detail & Related papers (2025-10-06T14:48:39Z)
- AgentZero++: Modeling Fear-Based Behavior [4.783433971864009]
We present AgentZero++, an agent-based model that integrates cognitive, emotional, and social mechanisms to simulate collective violence. Building on Epstein's Agent_Zero framework, we extend the original model with eight behavioral enhancements. These additions allow agents to adapt based on internal states, previous experiences, and social feedback.
arXiv Detail & Related papers (2025-10-05T22:33:56Z)
- Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis [55.13545823385091]
Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. In real-world applications, each agent may experience slightly different transition dynamics, leading to inherent model mismatches. We show that even moderate levels of information sharing significantly mitigate environment-specific errors.
arXiv Detail & Related papers (2025-03-21T18:06:28Z)
- Free Energy Projective Simulation (FEPS): Active inference with interpretability [40.11095094521714]
The free energy principle (FEP) and active inference (AIF) have achieved many successes.
Recent work has focused on improving such agents' performance in complex environments by incorporating the latest machine learning techniques.
We introduce Free Energy Projective Simulation (FEPS) to model agents in an interpretable way without deep neural networks.
arXiv Detail & Related papers (2024-11-22T15:01:44Z)
- A Simulation Environment for the Neuroevolution of Ant Colony Dynamics [0.0]
We introduce a simulation environment to facilitate research into emergent collective behaviour.
By leveraging real-world data, the environment simulates a target ant trail that a controllable agent must learn to replicate.
arXiv Detail & Related papers (2024-06-19T01:51:15Z)
- Neural-network solutions to stochastic reaction networks [7.021105583098606]
We propose a machine-learning approach using the variational autoregressive network to solve the chemical master equation.
The proposed approach tracks the time evolution of the joint probability distribution in the state space of species counts.
We demonstrate that it accurately generates the probability distribution over time in the genetic toggle switch and the early life self-replicator.
arXiv Detail & Related papers (2022-09-29T07:27:59Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Learning Swarm Interaction Dynamics from Density Evolution [0.0]
We consider the problem of understanding the coordinated movements of biological or artificial swarms.
We describe the dynamics of the swarm based on pairwise interactions according to a Cucker-Smale flocking model.
We incorporate the augmented system in an iterative optimization scheme to learn the dynamics of the interacting agents from observations of the swarm's density evolution.
arXiv Detail & Related papers (2021-12-05T20:18:48Z)
- Provable RL with Exogenous Distractors via Multistep Inverse Dynamics [85.52408288789164]
Real-world applications of reinforcement learning (RL) require the agent to deal with high-dimensional observations such as those generated from a megapixel camera.
Prior work has addressed such problems with representation learning, through which the agent can provably extract endogenous, latent state information from raw observations.
However, such approaches can fail in the presence of temporally correlated noise in the observations.
arXiv Detail & Related papers (2021-10-17T15:21:27Z)
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.