Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease
- URL: http://arxiv.org/abs/2507.06326v1
- Date: Tue, 08 Jul 2025 18:30:26 GMT
- Title: Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease
- Authors: Harsh Ravivarapu, Gaurav Bagwe, Xiaoyong Yuan, Chunxiu Yu, Lan Zhang
- Abstract summary: We propose SEA-DBS, a sample-efficient actor-critic framework for RL-based adaptive neurostimulation. SEA-DBS integrates a predictive reward model to reduce reliance on real-time feedback and employs Gumbel Softmax-based exploration for stable, differentiable policy updates. Our results show that SEA-DBS offers a practical and effective RL-based aDBS framework for real-time, resource-constrained neuromodulation.
- Score: 6.443133356814665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep brain stimulation (DBS) is an established intervention for Parkinson's disease (PD), but conventional open-loop systems lack adaptability, are energy-inefficient due to continuous stimulation, and provide limited personalization to individual neural dynamics. Adaptive DBS (aDBS) offers a closed-loop alternative, using biomarkers such as beta-band oscillations to dynamically modulate stimulation. While reinforcement learning (RL) holds promise for personalized aDBS control, existing methods suffer from high sample complexity, unstable exploration in binary action spaces, and limited deployability on resource-constrained hardware. We propose SEA-DBS, a sample-efficient actor-critic framework that addresses the core challenges of RL-based adaptive neurostimulation. SEA-DBS integrates a predictive reward model to reduce reliance on real-time feedback and employs Gumbel Softmax-based exploration for stable, differentiable policy updates in binary action spaces. Together, these components improve sample efficiency, exploration robustness, and compatibility with resource-constrained neuromodulatory hardware. We evaluate SEA-DBS on a biologically realistic simulation of Parkinsonian basal ganglia activity, demonstrating faster convergence, stronger suppression of pathological beta-band power, and resilience to post-training FP16 quantization. Our results show that SEA-DBS offers a practical and effective RL-based aDBS framework for real-time, resource-constrained neuromodulation.
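The Gumbel Softmax-based exploration mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the temperature `tau`, and the mapping of action 0 to "no stimulation" and action 1 to "stimulate" are assumptions made for illustration only.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return a relaxed (soft) one-hot sample over the last axis of `logits`."""
    rng = np.random.default_rng() if rng is None else rng
    # Uniform noise, clipped away from 0 and 1 so both logs stay finite.
    u = np.clip(rng.uniform(size=logits.shape), 1e-10, 1.0 - 1e-10)
    gumbel = -np.log(-np.log(u))            # Gumbel(0, 1) noise
    y = (logits + gumbel) / tau             # lower tau -> closer to one-hot
    y = y - y.max(axis=-1, keepdims=True)   # numerically stable softmax
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

def sample_binary_action(logits, tau=1.0, rng=None):
    """Sample a binary action (assumed: 0 = no stimulation, 1 = stimulate).

    Returns the hard action together with the soft sample; the soft sample
    is a smooth function of the logits, which is what a gradient-based
    policy update would differentiate through.
    """
    soft = gumbel_softmax(np.asarray(logits, dtype=np.float64), tau, rng)
    return int(np.argmax(soft)), soft

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action, soft = sample_binary_action([0.0, 1.0], tau=0.5, rng=rng)
    print(action, soft)
```

Taking the argmax of the perturbed logits yields an exact sample from the categorical policy, while the soft vector provides the relaxed surrogate. In an automatic-differentiation framework the two are typically combined with a straight-through estimator, which this NumPy sketch does not attempt.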
Related papers
- Noradrenergic-inspired gain modulation attenuates the stability gap in joint training [44.99833362998488]
Studies in continual learning have identified a transient drop in performance on mastered tasks when assimilating new ones, known as the stability gap. We argue that it reflects an imbalance between rapid adaptation and robust retention at task boundaries. Inspired by locus coeruleus mediated noradrenergic bursts, we propose uncertainty-modulated gain dynamics.
arXiv Detail & Related papers (2025-07-18T16:34:06Z)
- Neurophysiologically Realistic Environment for Comparing Adaptive Deep Brain Stimulation Algorithms in Parkinson Disease [1.45543311565555]
In aDBS, a surgically placed electrode sends dynamically altered stimuli to the brain based on neurophysiological feedback. We introduce the first neurophysiologically realistic benchmark for comparing said models. We purposely built our framework as a structured environment for training and evaluating deep reinforcement learning (RL) algorithms.
arXiv Detail & Related papers (2025-04-26T09:44:44Z)
- Deep Learning Model Predictive Control for Deep Brain Stimulation in Parkinson's Disease [0.552480439325792]
We present a data-driven computation algorithm for DBS for the treatment of Parkinson's disease (PD). In tests using a simulated model of beta-band activity response, we achieve a reduction of more than 20% in both tracking error and control activity. The proposed control strategy provides a generalizable data-driven technique that can be applied to the treatment of PD and other diseases targeted by closed-loop DBS (CLDBS).
arXiv Detail & Related papers (2025-04-01T10:16:49Z)
- Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks [69.2642802272367]
Brain-inspired neuromorphic computing with spiking neural networks (SNNs) is a promising energy-efficient computational approach.
Most recent methods leverage spatial and temporal backpropagation (BP), not adhering to neuromorphic properties.
We propose a novel method, online pseudo-zeroth-order (OPZO) training.
arXiv Detail & Related papers (2024-07-17T12:09:00Z)
- Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks. In this paper, we propose that the intrinsic stochasticity in the DBP process is the primary factor driving robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z)
- ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment [15.303196613362099]
We propose a contextual multi-armed bandits (CMAB) solution for a Deep Brain Stimulation (DBS) device.
We define the context as the signals capturing irregular neuronal firing activities in the basal ganglia (BG) regions.
An epsilon-exploring strategy is introduced on top of the classic Thompson sampling method, leading to an algorithm called epsilon-NeuralTS.
arXiv Detail & Related papers (2024-03-11T15:33:40Z)
- The Neuron as a Direct Data-Driven Controller [43.8450722109081]
This study extends the current normative models, which primarily optimize prediction, by conceptualizing neurons as optimal feedback controllers.
We model neurons as biologically feasible controllers which implicitly identify loop dynamics, infer latent states and optimize control.
Our model presents a significant departure from the traditional, feedforward, instant-response McCulloch-Pitts-Rosenblatt neuron, offering a novel and biologically-informed fundamental unit for constructing neural networks.
arXiv Detail & Related papers (2024-01-03T01:24:10Z)
- Learning Control Policies of Hodgkin-Huxley Neuronal Dynamics [1.629803445577911]
We approximate the value function offline using a neural network to enable generating controls (stimuli) in real time via the feedback form.
Our numerical experiments illustrate the accuracy of our approach for out-of-distribution samples and the robustness to moderate shocks and disturbances in the system.
arXiv Detail & Related papers (2023-11-13T18:53:50Z)
- Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment [6.576864734526406]
Deep brain stimulation (DBS) has shown great promise toward treating motor symptoms caused by Parkinson's disease (PD).
DBS devices approved by the U.S. Food and Drug Administration (FDA) can only deliver continuous DBS (cDBS) stimuli at a fixed amplitude.
This energy-inefficient operation reduces the battery lifetime of the device, cannot adapt treatment dynamically for activity, and may cause significant side effects.
arXiv Detail & Related papers (2023-02-05T20:29:53Z)
- PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation [54.839600943189915]
Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions.
Despite a rich imputation literature, existing techniques are ineffective for the pulsative signals which comprise many mHealth applications.
We address this gap with PulseImpute, the first large-scale pulsative signal imputation challenge which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks.
arXiv Detail & Related papers (2022-12-14T21:39:15Z)
- Minimizing Control for Credit Assignment with Strong Feedback [65.59995261310529]
Current methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals.
We combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization.
We show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using a learning rule fully local in space and time.
arXiv Detail & Related papers (2022-04-14T22:06:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.