Resource-Conscious RL Algorithms for Deep Brain Stimulation
- URL: http://arxiv.org/abs/2601.12699v1
- Date: Mon, 19 Jan 2026 03:45:08 GMT
- Title: Resource-Conscious RL Algorithms for Deep Brain Stimulation
- Authors: Arkaprava Gupta, Nicholas Carter, William Zellers, Prateek Ganguli, Benedikt Dietrich, Vibhor Krishna, Parasara Sridhar Duggirala, Samarjit Chakraborty
- Abstract summary: Deep Brain Stimulation (DBS) has proven to be a promising treatment for Parkinson's Disease (PD). DBS involves stimulating specific regions of the brain's Basal Ganglia (BG) with electric impulses to alleviate PD symptoms such as tremors, rigidity, and bradykinesia. Although most clinical DBS approaches today use a fixed frequency and amplitude, they suffer from side effects (such as slurring of speech) and shortened battery life of the implant. We propose a new Time & Threshold-Triggered Multi-Armed Bandit (T3P MAB) RL approach for DBS that is more effective than existing algorithms.
- Score: 1.1242503819703258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Brain Stimulation (DBS) has proven to be a promising treatment for Parkinson's Disease (PD). DBS involves stimulating specific regions of the brain's Basal Ganglia (BG) using electric impulses to alleviate symptoms of PD such as tremors, rigidity, and bradykinesia. Although most clinical DBS approaches today use a fixed frequency and amplitude, they suffer from side effects (such as slurring of speech) and shortened battery life of the implant. Reinforcement learning (RL) approaches have been used in recent research to perform DBS in a more adaptive manner and improve overall patient outcomes. These RL algorithms are, however, too complex to be trained in vivo due to their long convergence time and high computational-resource requirements. We propose a new Time & Threshold-Triggered Multi-Armed Bandit (T3P MAB) RL approach for DBS that is more effective than existing algorithms. Further, our T3P agent is lightweight enough to be deployed in the implant, unlike current deep-RL strategies, and even forgoes the need for an offline training phase. Additionally, most existing RL approaches have focused on modulating only frequency or amplitude; the possibility of tuning them together remains largely unexplored in the literature. Our RL agent can tune both the frequency and amplitude of DBS signals to the brain with better sample efficiency and requires minimal time to converge. We are the first to implement an MAB agent for DBS on hardware, reporting energy measurements and demonstrating its suitability for resource-constrained platforms. Our T3P MAB algorithm is deployed on a variety of microcontroller unit (MCU) setups to show its efficiency in terms of power consumption compared with other RL approaches used in recent work.
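The abstract describes a bandit agent whose actions are joint (frequency, amplitude) settings. As an illustration of that idea only — not the authors' T3P algorithm — the following is a minimal epsilon-greedy multi-armed bandit sketch in which each arm is a hypothetical (frequency, amplitude) pair and the reward function is a made-up stand-in for a tremor-suppression signal:

```python
import random

random.seed(0)

# Hypothetical arms: (frequency in Hz, amplitude in mA). Values are illustrative.
ARMS = [(f, a) for f in (60, 130, 180) for a in (1.0, 2.0, 3.0)]

class EpsilonGreedyBandit:
    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))  # explore
        return max(range(len(self.counts)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental mean keeps memory constant -- relevant for MCU deployment.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def simulated_reward(freq, amp):
    # Stand-in for measured symptom suppression; higher is better.
    return 1.0 - abs(freq - 130) / 200 - abs(amp - 2.0) / 10

bandit = EpsilonGreedyBandit(len(ARMS), epsilon=0.1)
for _ in range(2000):
    arm = bandit.select()
    freq, amp = ARMS[arm]
    bandit.update(arm, simulated_reward(freq, amp))

best = ARMS[max(range(len(ARMS)), key=lambda i: bandit.values[i])]
print(best)
```

The per-arm state here is two small arrays, which is why tabular bandits, unlike deep-RL policies, can plausibly fit on an implant-class MCU.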
Related papers
- In-Vivo Training for Deep Brain Stimulation [0.9543943371833464]
Deep Brain Stimulation (DBS) is a highly effective treatment for Parkinson's Disease (PD). Recent research uses reinforcement learning (RL) for DBS, with RL agents modulating the stimulation frequency and amplitude. We present an RL-based DBS approach that adapts these stimulation parameters according to brain activity measurable in vivo.
arXiv Detail & Related papers (2025-10-04T03:14:34Z) - Multi-Agent Reinforcement Learning for Sample-Efficient Deep Neural Network Mapping [54.65536245955678]
We present a decentralized multi-agent reinforcement learning (MARL) framework designed to overcome the challenge of sample inefficiency. We introduce an agent clustering algorithm that assigns similar mapping parameters to the same agents based on correlation analysis. Experimental results show our MARL approach improves sample efficiency by 30-300x over standard single-agent RL.
arXiv Detail & Related papers (2025-07-22T05:51:07Z) - Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease [6.443133356814665]
We propose SEA-DBS, a sample-efficient actor-critic framework for RL-based adaptive neurostimulation. SEA-DBS integrates a predictive reward model to reduce reliance on real-time feedback and employs Gumbel Softmax-based exploration for stable, differentiable policy updates. Our results show that SEA-DBS offers a practical and effective RL-based aDBS framework for real-time, resource-constrained neuromodulation.
arXiv Detail & Related papers (2025-07-08T18:30:26Z) - Deep Learning Model Predictive Control for Deep Brain Stimulation in Parkinson's Disease [0.17188280334580194]
We present a nonlinear data-driven Model Predictive Control (MPC) algorithm for deep brain stimulation (DBS) for the treatment of Parkinson's disease (PD). We achieve reductions of more than 20% in both tracking error and control activity compared with existing CLDBS algorithms. The proposed control strategy provides a generalizable data-driven technique that can be applied to the treatment of PD and other diseases targeted by CLDBS.
arXiv Detail & Related papers (2025-04-01T10:16:49Z) - Adaptive Data Exploitation in Deep Reinforcement Learning [50.53705050673944]
We introduce ADEPT, a powerful framework to enhance data efficiency and generalization in deep reinforcement learning (RL). Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms. We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet.
arXiv Detail & Related papers (2025-01-22T04:01:17Z) - Dynamic Spectrum Access for Ambient Backscatter Communication-assisted D2D Systems with Quantum Reinforcement Learning [68.63990729719369]
The wireless spectrum is becoming scarce, resulting in low spectral efficiency for D2D communications. This paper aims to integrate ambient backscatter communication technology into D2D devices to allow them to backscatter ambient RF signals. We develop a novel quantum reinforcement learning (RL) algorithm that can achieve a faster convergence rate with fewer training parameters.
arXiv Detail & Related papers (2024-10-23T15:36:43Z) - ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment [15.303196613362099]
We propose a contextual multi-armed bandit (CMAB) solution for a Deep Brain Stimulation (DBS) device.
We define the context as the signals capturing irregular neuronal firing activities in the basal ganglia (BG) regions.
An epsilon-exploring strategy is introduced on top of the classic Thompson sampling method, leading to an algorithm called epsilon-NeuralTS.
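The epsilon-NeuralTS idea of layering forced exploration on top of Thompson sampling can be sketched in miniature. The paper uses a neural contextual bandit; this toy instead uses non-contextual Beta-Bernoulli arms purely to illustrate the epsilon-exploring mechanism, and all success probabilities below are invented:

```python
import random

random.seed(0)

class EpsilonThompson:
    def __init__(self, n_arms, epsilon=0.05):
        self.epsilon = epsilon
        self.alpha = [1] * n_arms  # Beta posterior: successes + 1
        self.beta = [1] * n_arms   # Beta posterior: failures + 1

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.alpha))  # forced epsilon-exploration
        # Classic Thompson step: sample one draw per arm, play the argmax.
        samples = [random.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):  # reward in {0, 1}
        if reward:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1

true_p = [0.2, 0.5, 0.8]  # invented Bernoulli success rates, unknown to the agent
agent = EpsilonThompson(len(true_p))
pulls = [0] * len(true_p)
for _ in range(3000):
    arm = agent.select()
    pulls[arm] += 1
    agent.update(arm, 1 if random.random() < true_p[arm] else 0)

best_arm = pulls.index(max(pulls))
print(best_arm)
```

The epsilon step guarantees a floor on exploration even when the posterior has (perhaps wrongly) concentrated, which is the motivation the blurb above attributes to epsilon-NeuralTS.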
arXiv Detail & Related papers (2024-03-11T15:33:40Z) - Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving [63.155562267383864]
Deep reinforcement learning (DRL) has shown remarkable success in complex autonomous driving scenarios.
DRL models inevitably bring high memory consumption and computation, which hinders their wide deployment in resource-limited autonomous driving devices.
We introduce a novel dynamic structured pruning approach that gradually removes a DRL model's unimportant neurons during the training stage.
arXiv Detail & Related papers (2024-02-07T09:00:30Z) - Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment [6.576864734526406]
Deep brain stimulation (DBS) has shown great promise toward treating motor symptoms caused by Parkinson's disease (PD).
DBS devices approved by the U.S. Food and Drug Administration (FDA) can only deliver continuous DBS (cDBS) stimuli at a fixed amplitude.
This energy inefficient operation reduces battery lifetime of the device, cannot adapt treatment dynamically for activity, and may cause significant side-effects.
arXiv Detail & Related papers (2023-02-05T20:29:53Z) - Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation [48.821062916381685]
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing.
In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL.
The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset and two real-world medical image segmentation datasets.
arXiv Detail & Related papers (2022-03-12T04:11:42Z) - Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
Combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency.
In this paper, we study the spectrum, energy, and time resource management for deterministic-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.