Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease
- URL: http://arxiv.org/abs/2507.06326v1
- Date: Tue, 08 Jul 2025 18:30:26 GMT
- Title: Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease
- Authors: Harsh Ravivarapu, Gaurav Bagwe, Xiaoyong Yuan, Chunxiu Yu, Lan Zhang
- Abstract summary: We propose SEA-DBS, a sample-efficient actor-critic framework for RL-based adaptive neurostimulation. SEA-DBS integrates a predictive reward model to reduce reliance on real-time feedback and employs Gumbel Softmax-based exploration for stable, differentiable policy updates. Our results show that SEA-DBS offers a practical and effective RL-based aDBS framework for real-time, resource-constrained neuromodulation.
- Score: 6.443133356814665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep brain stimulation (DBS) is an established intervention for Parkinson's disease (PD), but conventional open-loop systems lack adaptability, are energy-inefficient due to continuous stimulation, and provide limited personalization to individual neural dynamics. Adaptive DBS (aDBS) offers a closed-loop alternative, using biomarkers such as beta-band oscillations to dynamically modulate stimulation. While reinforcement learning (RL) holds promise for personalized aDBS control, existing methods suffer from high sample complexity, unstable exploration in binary action spaces, and limited deployability on resource-constrained hardware. We propose SEA-DBS, a sample-efficient actor-critic framework that addresses the core challenges of RL-based adaptive neurostimulation. SEA-DBS integrates a predictive reward model to reduce reliance on real-time feedback and employs Gumbel Softmax-based exploration for stable, differentiable policy updates in binary action spaces. Together, these components improve sample efficiency, exploration robustness, and compatibility with resource-constrained neuromodulatory hardware. We evaluate SEA-DBS on a biologically realistic simulation of Parkinsonian basal ganglia activity, demonstrating faster convergence, stronger suppression of pathological beta-band power, and resilience to post-training FP16 quantization. Our results show that SEA-DBS offers a practical and effective RL-based aDBS framework for real-time, resource-constrained neuromodulation.
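The Gumbel Softmax-based exploration mentioned in the abstract can be illustrated with a minimal sketch. This is not the SEA-DBS implementation: the function names, the temperature value, and the mapping of action 0 to "no stimulation" and action 1 to "stimulate" are assumptions made for illustration only. The sketch adds Gumbel(0, 1) noise to the policy logits and applies a temperature-scaled softmax, which yields a relaxed (soft) one-hot sample over the two actions; in an autodiff framework the same computation is differentiable with respect to the logits.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over actions from unnormalized logits."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_binary_action(logits, temperature=1.0, rng=random):
    """Hypothetical helper: pick a binary action (0 = off, 1 = stimulate).

    Returns the hard action together with the soft sample it came from.
    """
    probs = gumbel_softmax(logits, temperature, rng)
    action = 1 if probs[1] > probs[0] else 0
    return action, probs


if __name__ == "__main__":
    rng = random.Random(0)
    action, probs = sample_binary_action([0.2, -0.1], temperature=0.5, rng=rng)
    print(action, probs)
```

Lower temperatures push the soft sample toward a hard one-hot vector, while higher temperatures keep it closer to uniform; how SEA-DBS schedules the temperature is not stated in the abstract.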
Related papers
- General Self-Prediction Enhancement for Spiking Neurons [71.01912385372577]
Spiking Neural Networks (SNNs) are highly energy-efficient due to event-driven, sparse computation, but their training is challenged by spike non-differentiability and trade-offs among performance, efficiency, and biological plausibility. We propose a self-prediction enhanced spiking neuron method that generates an internal prediction current from its input-output history to modulate membrane potential. This design offers dual advantages: it creates a continuous gradient path that alleviates vanishing gradients and boosts training stability and accuracy, and it aligns with biological principles, resembling distal dendritic modulation and error-driven synaptic plasticity.
arXiv Detail & Related papers (2026-01-29T15:08:48Z) - Resource-Conscious RL Algorithms for Deep Brain Stimulation [1.1242503819703258]
Deep Brain Stimulation (DBS) has proven to be a promising treatment of Parkinson's Disease (PD). DBS involves stimulating specific regions of the brain's Basal Ganglia (BG) using electric impulses to alleviate symptoms of PD such as tremors, rigidity, and bradykinesia. Because most clinical DBS approaches today use a fixed frequency and amplitude, they suffer from side effects (such as slurring of speech) and shortened battery life of the implant. We propose a new Time & Threshold-Triggered Multi-Armed Bandit (T3P MAB) RL approach for DBS that is more effective than
arXiv Detail & Related papers (2026-01-19T03:45:08Z) - DeepBlip: Estimating Conditional Average Treatment Effects Over Time [48.20988325299593]
We propose DeepBlip, the first neural framework for structural nested mean models (SNMMs). Our method correctly adjusts for time-varying confounding to produce unbiased estimates, and its Neyman-orthogonal loss function ensures robustness to nuisance model misspecification.
arXiv Detail & Related papers (2025-11-18T14:49:03Z) - In-Vivo Training for Deep Brain Stimulation [0.9543943371833464]
Deep Brain Stimulation (DBS) is a highly effective treatment for Parkinson's Disease (PD). Recent research uses reinforcement learning (RL) for DBS, with RL agents modulating the stimulation frequency and amplitude. We present an RL-based DBS approach that adapts these stimulation parameters according to brain activity measurable in vivo.
arXiv Detail & Related papers (2025-10-04T03:14:34Z) - Training Deep Normalization-Free Spiking Neural Networks with Lateral Inhibition [52.59263087086756]
Training deep spiking neural networks (SNNs) has critically depended on explicit normalization schemes, such as batch normalization. We propose a normalization-free learning framework that incorporates lateral inhibition inspired by cortical circuits. We show that our framework enables stable training of deep SNNs with biological realism and achieves competitive performance without resorting to explicit normalizations.
arXiv Detail & Related papers (2025-09-27T11:11:30Z) - Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling [66.0841376808143]
Resonate-and-Fire (RF) neurons can efficiently extract frequency components from input signals and encode them into spike trains. However, RF neurons exhibit limited effective memory capacity and a trade-off between energy efficiency and training speed on complex tasks. We propose a Dendritic Resonate-and-Fire (D-RF) model, which explicitly incorporates a multi-dendritic and soma architecture.
arXiv Detail & Related papers (2025-09-21T18:15:45Z) - Noradrenergic-inspired gain modulation attenuates the stability gap in joint training [44.99833362998488]
Studies in continual learning have identified a transient drop in performance on mastered tasks when assimilating new ones, known as the stability gap. We argue that it reflects an imbalance between rapid adaptation and robust retention at task boundaries. Inspired by locus coeruleus mediated noradrenergic bursts, we propose uncertainty-modulated gain dynamics.
arXiv Detail & Related papers (2025-07-18T16:34:06Z) - Neurophysiologically Realistic Environment for Comparing Adaptive Deep Brain Stimulation Algorithms in Parkinson Disease [1.45543311565555]
In aDBS, a surgically placed electrode sends dynamically altered stimuli to the brain based on neurophysiological feedback. We introduce the first neurophysiologically realistic benchmark for comparing said models. We purposely built our framework as a structured environment for training and evaluating deep reinforcement learning (RL) algorithms.
arXiv Detail & Related papers (2025-04-26T09:44:44Z) - Deep Learning Model Predictive Control for Deep Brain Stimulation in Parkinson's Disease [0.552480439325792]
We present a data-driven computation algorithm for DBS for the treatment of Parkinson's disease (PD). In tests using a simulated model of beta-band activity response, we achieve a reduction of more than 20% in both tracking error and control activity. The proposed control strategy provides a generalizable data-driven technique that can be applied to the treatment of PD and other diseases targeted by CLDBS.
arXiv Detail & Related papers (2025-04-01T10:16:49Z) - Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks [69.2642802272367]
Brain-inspired neuromorphic computing with spiking neural networks (SNNs) is a promising energy-efficient computational approach.
Most recent methods leverage spatial and temporal backpropagation (BP), not adhering to neuromorphic properties.
We propose a novel method, online pseudo-zeroth-order (OPZO) training.
arXiv Detail & Related papers (2024-07-17T12:09:00Z) - Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks. In this paper, we propose that the intrinsic stochasticity in the DBP process is the primary factor driving robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z) - ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment [15.303196613362099]
We propose a contextual multi-armed bandits (CMAB) solution for a Deep Brain Stimulation (DBS) device.
We define the context as the signals capturing irregular neuronal firing activities in the basal ganglia (BG) regions.
An epsilon-exploring strategy is introduced on top of the classic Thompson sampling method, leading to an algorithm called epsilon-NeuralTS.
arXiv Detail & Related papers (2024-03-11T15:33:40Z) - The Neuron as a Direct Data-Driven Controller [43.8450722109081]
This study extends the current normative models, which primarily optimize prediction, by conceptualizing neurons as optimal feedback controllers.
We model neurons as biologically feasible controllers which implicitly identify loop dynamics, infer latent states and optimize control.
Our model presents a significant departure from the traditional, feedforward, instant-response McCulloch-Pitts-Rosenblatt neuron, offering a novel and biologically-informed fundamental unit for constructing neural networks.
arXiv Detail & Related papers (2024-01-03T01:24:10Z) - Learning Control Policies of Hodgkin-Huxley Neuronal Dynamics [1.629803445577911]
We approximate the value function offline using a neural network to enable generating controls (stimuli) in real time via the feedback form.
Our numerical experiments illustrate the accuracy of our approach for out-of-distribution samples and the robustness to moderate shocks and disturbances in the system.
arXiv Detail & Related papers (2023-11-13T18:53:50Z) - Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment [6.576864734526406]
Deep brain stimulation (DBS) has shown great promise toward treating motor symptoms caused by Parkinson's disease (PD).
DBS devices approved by the U.S. Food and Drug Administration (FDA) can only deliver continuous DBS (cDBS) stimuli at a fixed amplitude.
This energy inefficient operation reduces battery lifetime of the device, cannot adapt treatment dynamically for activity, and may cause significant side-effects.
arXiv Detail & Related papers (2023-02-05T20:29:53Z) - PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation [54.839600943189915]
Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions.
Despite a rich imputation literature, existing techniques are ineffective for the pulsative signals which comprise many mHealth applications.
We address this gap with PulseImpute, the first large-scale pulsative signal imputation challenge which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks.
arXiv Detail & Related papers (2022-12-14T21:39:15Z) - Minimizing Control for Credit Assignment with Strong Feedback [65.59995261310529]
Current methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals.
We combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization.
We show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using a learning rule fully local in space and time.
arXiv Detail & Related papers (2022-04-14T22:06:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.