Adaptive Normalization Mamba with Multi Scale Trend Decomposition and Patch MoE Encoding
- URL: http://arxiv.org/abs/2512.06929v1
- Date: Sun, 07 Dec 2025 17:14:32 GMT
- Title: Adaptive Normalization Mamba with Multi Scale Trend Decomposition and Patch MoE Encoding
- Authors: MinCheol Jeon
- Abstract summary: AdaMamba is a unified forecasting architecture that integrates adaptive normalization, multi-scale trend extraction, and contextual sequence modeling. AdaMamba's combination of adaptive normalization and expert-augmented contextual modeling yields consistent improvements in stability and accuracy over conventional Transformer-based baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting in real-world environments faces significant challenges: non-stationarity, multi-scale temporal patterns, and distributional shifts that degrade model stability and accuracy. This study proposes AdaMamba, a unified forecasting architecture that integrates adaptive normalization, multi-scale trend extraction, and contextual sequence modeling to address these challenges. AdaMamba begins with an Adaptive Normalization Block that removes non-stationary components through multi-scale convolutional trend extraction and channel-wise recalibration, enabling consistent detrending and variance stabilization. The normalized sequence is then processed by a Context Encoder that combines patch-wise embeddings, positional encoding, and a Mamba-enhanced Transformer layer with a mixture-of-experts feed-forward module, allowing efficient modeling of both long-range dependencies and local temporal dynamics. A lightweight prediction head generates multi-horizon forecasts, and a denormalization mechanism reconstructs outputs by reintegrating local trends to ensure robustness under varying temporal conditions. AdaMamba provides strong representational capacity with modular extensibility, supporting deterministic prediction and compatibility with probabilistic extensions. Its design effectively mitigates covariate shift and enhances predictive reliability across heterogeneous datasets. Experimental evaluations demonstrate that AdaMamba's combination of adaptive normalization and expert-augmented contextual modeling yields consistent improvements in stability and accuracy over conventional Transformer-based baselines.
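Since no code is linked here, below is a minimal PyTorch sketch of the two mechanisms the abstract describes: the Adaptive Normalization Block (multi-scale convolutional trend extraction with channel-wise recalibration) and the mixture-of-experts feed-forward module in the Context Encoder. The kernel sizes, the sigmoid gating scheme, and the top-1 routing rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AdaptiveNormBlock(nn.Module):
    """Hypothetical adaptive normalization: estimate a trend with depthwise
    convolutions at several scales, detrend, stabilize variance, and apply a
    channel-wise recalibration gate. Statistics are returned so the forecast
    can later be denormalized by reintegrating the local trend."""

    def __init__(self, n_channels: int, kernel_sizes=(5, 25, 49)):
        super().__init__()
        self.trend_convs = nn.ModuleList(
            [nn.Conv1d(n_channels, n_channels, k, padding=k // 2, groups=n_channels)
             for k in kernel_sizes]
        )
        self.gate = nn.Sequential(nn.Linear(n_channels, n_channels), nn.Sigmoid())

    def forward(self, x):                      # x: (batch, channels, time)
        trend = torch.stack([c(x) for c in self.trend_convs]).mean(dim=0)
        detrended = x - trend                  # remove non-stationary component
        std = detrended.std(dim=-1, keepdim=True) + 1e-5
        normed = detrended / std               # variance stabilization
        w = self.gate(normed.mean(dim=-1)).unsqueeze(-1)
        return normed * w, trend, std          # keep stats for denormalization


class MoEFeedForward(nn.Module):
    """Hypothetical top-1 mixture-of-experts feed-forward layer for the
    Context Encoder; the abstract does not specify the routing rule."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                           nn.Linear(d_hidden, d_model))
             for _ in range(n_experts)]
        )

    def forward(self, x):                      # x: (batch, tokens, d_model)
        gates = self.router(x).softmax(dim=-1)
        top1 = gates.argmax(dim=-1)            # one expert per patch token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top1 == e
            if mask.any():                     # route tokens, weight by gate
                out[mask] = gates[..., e][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

Under this reading, denormalization is simply `y_hat * std + trend_horizon`, where the trend over the forecast horizon might be held at its last value or extrapolated; the abstract only states that local trends are reintegrated.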
Related papers
- DiTS: Multimodal Diffusion Transformers Are Time Series Forecasters [50.43534351968113]
Existing generative time series models do not address the multi-dimensional properties of time series data well. Inspired by Multimodal Diffusion Transformers that integrate textual guidance into video generation, we propose Diffusion Transformers for Time Series (DiTS).
arXiv Detail & Related papers (2026-02-06T10:48:13Z) - SpanNorm: Reconciling Training Stability and Performance in Deep Transformers [55.100133502295996]
We propose SpanNorm, a novel technique designed to resolve the dilemma between training stability and performance by integrating the strengths of both paradigms. We provide a theoretical analysis demonstrating that SpanNorm, combined with a principled scaling strategy, maintains bounded signal variance throughout the network. Empirically, SpanNorm consistently outperforms standard normalization schemes in both dense and Mixture-of-Experts (MoE) scenarios.
arXiv Detail & Related papers (2026-01-30T05:21:57Z) - Solving Inverse Problems with FLAIR [68.87167940623318]
We present FLAIR, a training-free variational framework that leverages flow-based generative models as a prior for inverse problems. Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z) - ss-Mamba: Semantic-Spline Selective State-Space Model [0.0]
ss-Mamba is a novel foundation model that enhances time series forecasting by integrating semantic-aware embeddings and adaptive spline-based temporal encoding. We show that ss-Mamba delivers superior accuracy, robustness, and interpretability, demonstrating its capability as a versatile and computationally efficient alternative to traditional Transformer-based models in time-series forecasting.
arXiv Detail & Related papers (2025-06-03T03:26:57Z) - Latent Convergence Modulation in Large Language Models: A Novel Approach to Iterative Contextual Realignment [0.0]
A structured modulation mechanism was introduced to regulate hidden state transitions. Lattice adjustments contributed to reductions in perplexity fluctuations, entropy variance, and lexical instability.
arXiv Detail & Related papers (2025-02-10T09:46:33Z) - Ister: Inverted Seasonal-Trend Decomposition Transformer for Explainable Multivariate Time Series Forecasting [10.32586981170693]
We present the Inverted Seasonal-Trend Decomposition Transformer (Ister). We introduce a novel Dot-attention mechanism that improves interpretability, computational efficiency, and predictive accuracy. Ister enables intuitive visualization of component contributions, shedding light on the model's decision process and enhancing transparency in prediction results.
arXiv Detail & Related papers (2024-12-25T06:37:19Z) - Neural Conformal Control for Time Series Forecasting [54.96087475179419]
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments. Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders. We empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
arXiv Detail & Related papers (2024-12-24T03:56:25Z) - Variational Bayesian Bow tie Neural Networks with Shrinkage [0.276240219662896]
We develop a fast, approximate variational inference algorithm that avoids distributional assumptions and independence across layers. We use Polya-Gamma data augmentation tricks, which render a conditionally linear and Gaussian model.
arXiv Detail & Related papers (2024-11-17T17:36:30Z) - Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a causal Transformer for unified time series forecasting. Based on large-scale pre-training, Timer-XL achieves state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - Evolving Multi-Scale Normalization for Time Series Forecasting under Distribution Shifts [20.02869280775877]
We propose a novel model-agnostic Evolving Multi-Scale Normalization (EvoMSN) framework to tackle the distribution shift problem.
We evaluate the effectiveness of EvoMSN in improving the performance of five mainstream forecasting methods on benchmark datasets.
arXiv Detail & Related papers (2024-09-29T14:26:22Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the self-attention mechanism, despite its high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)