Related papers: HierCVAE: Hierarchical Attention-Driven Conditional Variational Autoencoders for Multi-Scale Temporal Modeling

HierCVAE: Hierarchical Attention-Driven Conditional Variational Autoencoders for Multi-Scale Temporal Modeling

URL: http://arxiv.org/abs/2508.18922v1
Date: Tue, 26 Aug 2025 10:55:35 GMT
Title: HierCVAE: Hierarchical Attention-Driven Conditional Variational Autoencoders for Multi-Scale Temporal Modeling
Authors: Yao Wu,
Abstract summary: Temporal modeling in complex systems requires capturing dependencies across multiple time scales.<n>We propose HierCVAE, a novel architecture that integrates hierarchical attention mechanisms with conditional variational autoencoders.
Score: 7.900277891102576
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Temporal modeling in complex systems requires capturing dependencies across multiple time scales while managing inherent uncertainties. We propose HierCVAE, a novel architecture that integrates hierarchical attention mechanisms with conditional variational autoencoders to address these challenges. HierCVAE employs a three-tier attention structure (local, global, cross-temporal) combined with multi-modal condition encoding to capture temporal, statistical, and trend information. The approach incorporates ResFormer blocks in the latent space and provides explicit uncertainty quantification via prediction heads. Through evaluations on energy consumption datasets, HierCVAE demonstrates a 15-40% improvement in prediction accuracy and superior uncertainty calibration compared to state-of-the-art methods, excelling in long-term forecasting and complex multi-variate dependencies.

Related papers

RainDiff: End-to-end Precipitation Nowcasting Via Token-wise Attention Diffusion [64.49056527678606]
We propose a Token-wise Attention integrated into not only the U-Net diffusion model but also the radar-temporal encoder.<n>Unlike prior approaches, our method integrates attention into the architecture without incurring the high resource cost typical of pixel-space diffusion.<n>Our experiments and evaluations demonstrate that the proposed method significantly outperforms state-of-the-art approaches, robustness local fidelity, generalization, and superior in complex precipitation forecasting scenarios.
arXiv Detail & Related papers (2025-10-16T17:59:13Z)
Multiresolution Analysis and Statistical Thresholding on Dynamic Networks [49.09073800467438]
ANIE (Adaptive Network Intensity Estimation) is a multi-resolution framework designed to automatically identify the time scales at which network structure evolves.<n>We show that ANIE adapts to the appropriate time resolution and is able to capture sharp structural changes while remaining robust to noise.
arXiv Detail & Related papers (2025-06-01T22:55:55Z)
SCENT: Robust Spatiotemporal Learning for Continuous Scientific Data via Scalable Conditioned Neural Fields [11.872753517172555]
We present SCENT, a novel framework for scalable and continuity-informed modeling learning.<n>SCENT unifies representation, reconstruction, and forecasting within a single architecture.<n>We validate SCENT through extensive simulations and real-world experiments, demonstrating state-of-the-art performance.
arXiv Detail & Related papers (2025-04-16T17:17:31Z)
Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for the multi-round denoising.<n>We introduce a novel quantization framework that includes three strategies.<n>This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data. Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction. Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z)
FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task. It exhibits three aspects of merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tacking the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning. Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism. We propose an efficient Transformerbased model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
Temporal Saliency Detection Towards Explainable Transformer-based Timeseries Forecasting [3.046315755726937]
This paper introduces Temporal Saliency Detection (TSD), an effective approach that builds upon the attention mechanism and applies it to multi-horizon time series prediction. The TSD approach facilitates the multiresolution analysis of saliency patterns by condensing multi-heads, thereby progressively enhancing the forecasting of complex time series data.
arXiv Detail & Related papers (2022-12-15T12:47:59Z)
An Attention-based ConvLSTM Autoencoder with Dynamic Thresholding for Unsupervised Anomaly Detection in Multivariate Time Series [2.9685635948299995]
We propose an unsupervised Attention-based Convolutional Long Short-Term Memory (ConvLSTM) Autoencoder with Dynamic Thresholding (ACLAE-DT) framework for anomaly detection and diagnosis. The framework starts by pre-processing and enriching the data, before constructing feature images to characterize the system statuses. The constructed feature images are fed into an attention-based ConvLSTM autoencoder, which aims to encode the constructed feature images and capture the temporal behavior. The reconstruction errors are then computed and subjected to a statistical-based, dynamic thresholding mechanism to detect and diagnose the anomalies
arXiv Detail & Related papers (2022-01-23T04:01:43Z)
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting [68.86835407617778]
Autoformer is a novel decomposition architecture with an Auto-Correlation mechanism. In long-term forecasting, Autoformer yields state-of-the-art accuracy, with a relative improvement on six benchmarks.
arXiv Detail & Related papers (2021-06-24T13:43:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.