Continuum Dropout for Neural Differential Equations
- URL: http://arxiv.org/abs/2511.10446v2
- Date: Tue, 18 Nov 2025 08:29:11 GMT
- Title: Continuum Dropout for Neural Differential Equations
- Authors: Jonghun Lee, YongKyung Oh, Sungil Kim, Dong-Young Lim
- Abstract summary: We introduce Continuum Dropout, a universally applicable regularization technique for Neural Differential Equations (NDEs). Continuum Dropout formulates the on-off mechanism of dropout as a stochastic process that alternates between active (evolution) and inactive (paused) states in continuous time. We demonstrate that Continuum Dropout outperforms existing regularization methods for NDEs, achieving superior performance on various time series and image classification tasks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Differential Equations (NDEs) excel at modeling continuous-time dynamics, effectively handling challenges such as irregular observations, missing values, and noise. Despite their advantages, NDEs face a fundamental challenge in adopting dropout, a cornerstone of deep learning regularization, making them susceptible to overfitting. To address this research gap, we introduce Continuum Dropout, a universally applicable regularization technique for NDEs built upon the theory of alternating renewal processes. Continuum Dropout formulates the on-off mechanism of dropout as a stochastic process that alternates between active (evolution) and inactive (paused) states in continuous time. This provides a principled approach to prevent overfitting and enhance the generalization capabilities of NDEs. Moreover, Continuum Dropout offers a structured framework to quantify predictive uncertainty via Monte Carlo sampling at test time. Through extensive experiments, we demonstrate that Continuum Dropout outperforms existing regularization methods for NDEs, achieving superior performance on various time series and image classification tasks. It also yields better-calibrated and more trustworthy probability estimates, highlighting its effectiveness for uncertainty-aware modeling.
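The on-off mechanism described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the exponential holding times (the simplest alternating renewal process), the Euler solver, and all function and parameter names (`continuum_dropout_mask`, `rate_off`, `rate_on`, etc.) are assumptions made for the sketch. Each hidden unit alternates between an active state, in which it evolves under the vector field, and a paused state, in which it holds its value; Monte Carlo sampling of on/off paths at test time yields a predictive mean and a spread.

```python
import numpy as np

def continuum_dropout_mask(T, dt, dim, rate_off=1.0, rate_on=1.0, rng=None):
    """Sample per-unit on/off paths from an alternating renewal process.

    Each unit alternates between an active (evolving) and a paused state;
    the holding time in each state is drawn i.i.d. Exponential, a simple
    special case of an alternating renewal process.
    Returns a (steps, dim) 0/1 mask over the time grid.
    """
    rng = np.random.default_rng() if rng is None else rng
    steps = round(T / dt)
    mask = np.ones((steps, dim))
    for j in range(dim):
        t, active = 0.0, True
        while t < T:
            # rate_off governs how quickly an active unit pauses,
            # rate_on how quickly a paused unit resumes evolving
            hold = rng.exponential(1.0 / (rate_off if active else rate_on))
            i0, i1 = int(t / dt), min(int((t + hold) / dt), steps)
            mask[i0:i1, j] = 1.0 if active else 0.0
            t, active = t + hold, not active
    return mask

def euler_node(h0, f, T=1.0, dt=0.01, mask=None):
    """Euler-integrate dh/dt = f(h); units with mask == 0 are paused."""
    h = np.array(h0, dtype=float)
    for k in range(round(T / dt)):
        dh = f(h)
        if mask is not None:
            dh = dh * mask[k]  # paused units keep their current value
        h = h + dt * dh
    return h

# Monte Carlo prediction at test time: average over sampled on/off paths.
rng = np.random.default_rng(0)
f = lambda h: -h                # toy vector field: exponential decay
h0 = np.ones(4)
samples = [euler_node(h0, f, mask=continuum_dropout_mask(1.0, 0.01, 4, rng=rng))
           for _ in range(32)]
mean, std = np.mean(samples, axis=0), np.std(samples, axis=0)
```

The per-sample spread `std` is the uncertainty estimate the abstract refers to: units that spent more time paused decay less, so different on/off paths yield different trajectories.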
Related papers
- Adaptive Partitioning and Learning for Stochastic Control of Diffusion Processes [3.058685580689604]
We study reinforcement learning for controlled diffusion processes with unbounded continuous state spaces. We introduce a model-based algorithm that adaptively partitions the joint state-action space. This adaptive scheme balances exploration and approximation, enabling efficient learning in unbounded domains.
arXiv Detail & Related papers (2025-12-17T00:52:19Z) - Modeling Uncertainty Trends for Timely Retrieval in Dynamic RAG [35.96258615258145]
We introduce Entropy-Trend Constraint (ETC), a training-free method that determines optimal retrieval timing by modeling the dynamics of token-level uncertainty. ETC consistently outperforms strong baselines while reducing retrieval frequency. It is plug-and-play, model-agnostic, and readily integrable into existing decoding pipelines.
arXiv Detail & Related papers (2025-11-13T05:28:02Z) - Adaptive Variance-Penalized Continual Learning with Fisher Regularization [0.0]
This work presents a novel continual learning framework that integrates Fisher-weighted asymmetric regularization of parameter variances. Our method dynamically modulates regularization intensity according to parameter uncertainty, achieving enhanced stability and performance.
arXiv Detail & Related papers (2025-08-15T21:49:28Z) - Data-Driven Exploration for a Class of Continuous-Time Indefinite Linear--Quadratic Reinforcement Learning Problems [6.859965454961918]
We study reinforcement learning for continuous-time linear-quadratic (LQ) control problems. We propose a model-free, data-driven exploration mechanism that adaptively adjusts entropy regularization by the critic. Our method achieves a sublinear regret bound that matches the best-known model-free results for this class of LQ problems.
arXiv Detail & Related papers (2025-07-01T01:09:06Z) - Global Convergence of Continual Learning on Non-IID Data [51.99584235667152]
We provide a general and comprehensive theoretical analysis for continual learning of regression models. We establish the almost sure convergence results of continual learning under a general data condition for the first time.
arXiv Detail & Related papers (2025-03-24T10:06:07Z) - Temporal-Difference Variational Continual Learning [77.92320830700797]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations. Our approach effectively mitigates catastrophic forgetting, outperforming strong variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z) - Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between predicted confidence and actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z) - Model-Based Uncertainty in Value Functions [89.31922008981735]
We focus on characterizing the variance over values induced by a distribution over MDPs.
Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation.
We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values.
arXiv Detail & Related papers (2023-02-24T09:18:27Z) - Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations [84.42837346400151]
Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare.
Existing causal inference approaches consider regular, discrete-time intervals between observations and treatment decisions.
We propose a controllable simulation environment based on a model of tumor growth for a range of scenarios.
arXiv Detail & Related papers (2022-06-16T17:15:15Z) - Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.