Related papers: Dissecting Larval Zebrafish Hunting using Deep Reinforcement Learning Trained RNN Agents

URL: http://arxiv.org/abs/2510.03699v1
Date: Sat, 04 Oct 2025 06:40:32 GMT
Title: Dissecting Larval Zebrafish Hunting using Deep Reinforcement Learning Trained RNN Agents
Authors: Raaghav Malik, Satpreet H. Singh, Sonja Johnson-Yu, Nathan Wu, Roy Harpaz, Florian Engert, Kanaka Rajan
Abstract summary: Larval zebrafish hunting provides a tractable setting to study how ecological and energetic constraints shape adaptive behavior. We develop a minimal agent-based model, training recurrent policies with deep reinforcement learning in a bout-based zebrafish simulator. Despite its simplicity, the model reproduces hallmark hunting behaviors that closely match real larval zebrafish.
Score: 1.8853228540913756
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Larval zebrafish hunting provides a tractable setting to study how ecological and energetic constraints shape adaptive behavior in both biological brains and artificial agents. Here we develop a minimal agent-based model, training recurrent policies with deep reinforcement learning in a bout-based zebrafish simulator. Despite its simplicity, the model reproduces hallmark hunting behaviors -- including eye vergence-linked pursuit, speed modulation, and stereotyped approach trajectories -- that closely match real larval zebrafish. Quantitative trajectory analyses show that pursuit bouts systematically reduce prey angle by roughly half before strike, consistent with measurements. Virtual experiments and parameter sweeps vary ecological and energetic constraints, bout kinematics (coupled vs. uncoupled turns and forward motion), and environmental factors such as food density, food speed, and vergence limits. These manipulations reveal how constraints and environments shape pursuit dynamics, strike success, and abort rates, yielding falsifiable predictions for neuroscience experiments. These sweeps identify a compact set of constraints -- binocular sensing, the coupling of forward speed and turning in bout kinematics, and modest energetic costs on locomotion and vergence -- that are sufficient for zebrafish-like hunting to emerge. Strikingly, these behaviors arise in minimal agents without detailed biomechanics, fluid dynamics, circuit realism, or imitation learning from real zebrafish data. Taken together, this work provides a normative account of zebrafish hunting as the optimal balance between energetic cost and sensory benefit, highlighting the trade-offs that structure vergence and trajectory dynamics. We establish a virtual lab that narrows the experimental search space and generates falsifiable predictions about behavior and neural coding.

Related papers

Optimization-Guided Diffusion for Interactive Scene Generation [52.23368750264419]
We present OMEGA, an optimization-guided, training-free framework that enforces structural consistency and interaction awareness during diffusion-based sampling. We show that OMEGA improves generation realism, consistency, and controllability, increasing the ratio of physically and behaviorally valid scenes. Our approach can also generate $5\times$ more near-collision frames with a time-to-collision under three seconds.
arXiv Detail & Related papers (2025-12-08T15:56:18Z)
Data-driven simulator of multi-animal behavior with unknown dynamics via offline and online reinforcement learning [8.835312357110618]
A key challenge for realistic multi-animal simulation in biology is bridging the gap between unknown real-world transition models and their simulated counterparts. We introduce a data-driven simulator for multi-animal behavior based on deep reinforcement learning and counterfactual simulation.
arXiv Detail & Related papers (2025-10-12T05:08:26Z)
Weak Form Learning for Mean-Field Partial Differential Equations: an Application to Insect Movement [0.0]
Insect species subject to infection, predation, and anisotropic environmental conditions may exhibit preferential movement patterns. Data-driven modeling approaches designed to learn the underlying Fokker-Planck equations serve as ideal tools for understanding and predicting such behavior.
arXiv Detail & Related papers (2025-10-09T05:04:32Z)
Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
Context drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns. Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics. We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z)
From Pheromones to Policies: Reinforcement Learning for Engineered Biological Swarms [0.0]
This study establishes a theoretical equivalence between pheromone-mediated aggregation in C. elegans and reinforcement learning (RL). We model engineered nematode swarms performing foraging tasks, showing that pheromone dynamics mathematically mirror cross-learning updates. Our results demonstrate that stigmergic systems inherently encode distributed RL processes, where environmental signals act as external memory for collective credit assignment.
arXiv Detail & Related papers (2025-09-24T13:16:35Z)
Langevin Flows for Modeling Neural Latent Dynamics [81.81271685018284]
We introduce LangevinFlow, a sequential Variational Auto-Encoder where the time evolution of latent variables is governed by the underdamped Langevin equation. Our approach incorporates physical priors -- such as inertia, damping, a learned potential function, and forces -- to represent both autonomous and non-autonomous processes in neural systems. Our method outperforms state-of-the-art baselines on synthetic neural populations generated by a Lorenz attractor.
arXiv Detail & Related papers (2025-07-15T17:57:48Z)
Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding [12.25140375320834]
BEAST (BEhavioral Analysis via Self-supervised pretraining of Transformers) is a novel framework that pretrains experiment-specific vision transformers for diverse neuro-behavior analyses. Our method establishes a powerful and versatile backbone model that accelerates behavioral analysis in scenarios where labeled data remains scarce.
arXiv Detail & Related papers (2025-07-13T06:43:05Z)
Coordinating Spinal and Limb Dynamics for Enhanced Sprawling Robot Mobility [0.047116288835793156]
A flexible spine enables undulation of the body through a wavelike motion along the spine, aiding navigation over uneven terrains and obstacles. Environmental uncertainties, such as surface irregularities and variations in friction, can significantly disrupt body-limb coordination. Deep reinforcement learning offers a promising framework for handling non-deterministic environments. We comparatively examine learning-based control strategies and biologically inspired gait design methods on a salamander-like robot.
arXiv Detail & Related papers (2025-04-18T23:08:48Z)
Predator-prey survival pressure is sufficient to evolve swarming behaviors [22.69193229479221]
We propose a minimal predator-prey coevolution framework based on mixed cooperative-competitive multiagent reinforcement learning. Surprisingly, our analysis of this approach reveals an unexpectedly rich diversity of emergent behaviors for both prey and predators.
arXiv Detail & Related papers (2023-08-24T08:03:11Z)
From Data-Fitting to Discovery: Interpreting the Neural Dynamics of Motor Control through Reinforcement Learning [3.6159844753873087]
We study structured neural activity of a virtual robot performing legged locomotion. We find that embodied agents trained to walk exhibit smooth dynamics that avoid tangling -- or opposing neural trajectories in neighboring neural space.
arXiv Detail & Related papers (2023-05-18T16:52:27Z)
Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way. Our approach, DINOSAUR, significantly out-performs existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z)
Hybrid Physics and Deep Learning Model for Interpretable Vehicle State Prediction [75.1213178617367]
We propose a hybrid approach combining deep learning and physical motion models. We achieve interpretability by restricting the output range of the deep neural network as part of the hybrid model. The results show that our hybrid model can improve model interpretability with no decrease in accuracy compared to existing deep learning approaches.
arXiv Detail & Related papers (2021-03-11T15:21:08Z)
Social NCE: Contrastive Learning of Socially-aware Motion Representations [87.82126838588279]
Experimental results show that the proposed method dramatically reduces the collision rates of recent trajectory forecasting, behavioral cloning and reinforcement learning algorithms. Our method makes few assumptions about neural architecture designs, and hence can be used as a generic way to promote the robustness of neural motion models.
arXiv Detail & Related papers (2020-12-21T22:25:06Z)
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning [12.76337275628074]
In this work, we propose a variational dynamic model based on the conditional variational inference to model the multimodality and stochasticity. We derive an upper bound of the negative log-likelihood of the environmental transition and use such an upper bound as the intrinsic reward for exploration. Our method outperforms several state-of-the-art environment model-based exploration approaches.
arXiv Detail & Related papers (2020-10-17T09:54:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.