MOMEMTO: Patch-based Memory Gate Model in Time Series Foundation Model
- URL: http://arxiv.org/abs/2509.18751v1
- Date: Tue, 23 Sep 2025 07:48:25 GMT
- Title: MOMEMTO: Patch-based Memory Gate Model in Time Series Foundation Model
- Authors: Samuel Yoon, Jongwon Kim, Juyoung Ha, Young Myoung Ko
- Abstract summary: We propose MOMEMTO, a TFM for anomaly detection enhanced with a patch-based memory module. The memory module is designed to capture representative normal patterns from multiple domains and enables a single model to be jointly fine-tuned. Experimental results demonstrate that MOMEMTO, as a single model, achieves higher scores on AUC and VUS metrics compared to baseline methods.
- Score: 0.07777489763207261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, reconstruction-based deep models have been widely used for time series anomaly detection, but as their capacity and representation capability increase, these models tend to over-generalize, often reconstructing unseen anomalies accurately. Prior works have attempted to mitigate this by incorporating a memory architecture that stores prototypes of normal patterns. Nevertheless, these approaches suffer from high training costs and have yet to be effectively integrated with time series foundation models (TFMs). To address these challenges, we propose MOMEMTO, a TFM for anomaly detection, enhanced with a patch-based memory module to mitigate over-generalization. The memory module is designed to capture representative normal patterns from multiple domains and enables a single model to be jointly fine-tuned across multiple datasets through a multi-domain training strategy. MOMEMTO initializes memory items with latent representations from a pre-trained encoder, organizes them into patch-level units, and updates them via an attention mechanism. We evaluate our method using 23 univariate benchmark datasets. Experimental results demonstrate that MOMEMTO, as a single model, achieves higher scores on AUC and VUS metrics compared to baseline methods, and further enhances the performance of its backbone TFM, particularly in few-shot learning scenarios.
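The abstract describes patch-level memory items queried and recombined via attention. As a rough illustrative sketch (not the paper's actual implementation; all function and variable names here are hypothetical), an attention read over a memory bank can be written as a softmax-weighted convex combination of memory items:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def memory_read(patch_emb, memory):
    """Attention read: each patch embedding queries the memory bank and is
    re-expressed as a convex combination of stored normal-pattern items."""
    # patch_emb: (num_patches, d), memory: (num_items, d)
    attn = softmax(patch_emb @ memory.T, axis=-1)  # (num_patches, num_items)
    return attn @ memory                           # (num_patches, d)

rng = np.random.default_rng(0)
memory = rng.normal(size=(8, 16))   # toy memory bank of 8 items
patches = rng.normal(size=(4, 16))  # toy patch embeddings
recon = memory_read(patches, memory)
print(recon.shape)  # (4, 16)
```

Because the output is a convex combination of normal prototypes, reconstructions of anomalous patches are pulled toward normal patterns, which is the intuition behind memory-based mitigation of over-generalization.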
Related papers
- MEMTS: Internalizing Domain Knowledge via Parameterized Memory for Retrieval-Free Domain Adaptation of Time Series Foundation Models [51.506429027626005]
Memory for Time Series (MEMTS) is a lightweight and plug-and-play method for retrieval-free domain adaptation in time series forecasting. A key component of MEMTS is a Knowledge Persistence Module (KPM), which internalizes domain-specific temporal dynamics. This paradigm shift enables MEMTS to achieve accurate domain adaptation with constant-time inference and near-zero latency.
arXiv Detail & Related papers (2026-02-14T14:00:06Z) - Time Series Foundation Models for Process Model Forecasting [8.339024524110828]
Process Model Forecasting aims to predict how the control-flow structure of a process evolves over time. Machine learning and deep learning models provide only modest gains over statistical baselines. We investigate Time Series Foundation Models (TSFMs) as an alternative for PMF.
arXiv Detail & Related papers (2025-12-08T15:08:50Z) - TSGym: Design Choices for Deep Multivariate Time-Series Forecasting [38.12202305030755]
This work bridges these gaps by decomposing deep MTSF methods into their core, fine-grained components. We propose a novel automated solution called TSGym for MTSF tasks. Extensive experiments indicate that TSGym significantly outperforms existing state-of-the-art MTSF and AutoML methods.
arXiv Detail & Related papers (2025-09-21T12:49:31Z) - VARMA-Enhanced Transformer for Time Series Forecasting [4.982130518684668]
VARMAformer is a novel architecture that synergizes the efficiency of a cross-attention-only framework with the principles of classical time series analysis. By fusing these classical insights into a modern backbone, VARMAformer captures both global, long-range dependencies and local, statistical structures.
arXiv Detail & Related papers (2025-09-05T03:32:51Z) - Few-Shot Pattern Detection via Template Matching and Regression [52.79291493477272]
We propose a simple yet effective detector based on template matching and regression, dubbed TMR. It effectively preserves and leverages the spatial layout of exemplars through a minimalistic structure with a small number of learnable convolutional or projection layers on top of a frozen backbone. Our method outperforms the state-of-the-art methods on the three benchmarks, RPINE, FSCD-147, and FSCD-LVIS, and demonstrates strong generalization in cross-dataset evaluation.
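The core idea of template matching, which this abstract builds on, is sliding an exemplar over a feature map and scoring similarity at each position. A minimal sketch (a toy cosine-similarity matcher, not the TMR architecture itself; names are hypothetical):

```python
import numpy as np

def match_template(feat, templ):
    """Slide a template over a 2-D feature map and return the
    cosine-similarity response map (the peak marks the best match)."""
    H, W = feat.shape
    h, w = templ.shape
    t = templ / (np.linalg.norm(templ) + 1e-8)
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = feat[i:i + h, j:j + w]
            out[i, j] = (patch * t).sum() / (np.linalg.norm(patch) + 1e-8)
    return out

feat = np.zeros((6, 6))
feat[2:4, 3:5] = 1.0            # toy map with one bright 2x2 region
templ = np.ones((2, 2))
resp = match_template(feat, templ)
peak = np.unravel_index(resp.argmax(), resp.shape)
print(peak)  # (2, 3)
```

In a detector like the one described, a small regression head would then refine the box around such a peak; here only the matching step is sketched.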
arXiv Detail & Related papers (2025-08-25T03:52:42Z) - Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
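Offline, task-aware expert pruning as described here can be approximated by ranking experts by how often the router selects them on task data and discarding the rest. A hedged sketch under that assumption (not the actual UNCURL algorithm; `prune_experts` and its inputs are illustrative):

```python
import numpy as np

def prune_experts(router_logits, keep):
    """Rank experts by top-1 routing frequency on task data and keep
    only the `keep` most-used ones (offline, post-training)."""
    top1 = router_logits.argmax(axis=-1)  # winning expert per token
    counts = np.bincount(top1, minlength=router_logits.shape[-1])
    kept = np.argsort(counts)[::-1][:keep]  # most-frequently routed experts
    return sorted(kept.tolist())

rng = np.random.default_rng(1)
logits = rng.normal(size=(1000, 8))  # toy router logits: tokens x 8 experts
kept = prune_experts(logits, keep=4)
print(kept)
```

Shrinking each MoE layer to its task-relevant experts reduces memory and compute at inference while, ideally, preserving task accuracy.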
arXiv Detail & Related papers (2024-09-02T22:35:03Z) - GM-DF: Generalized Multi-Scenario Deepfake Detection [49.072106087564144]
Existing face forgery detection usually follows the paradigm of training models in a single domain.
In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets.
arXiv Detail & Related papers (2024-06-28T17:42:08Z) - UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens.
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
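The "unified attention on flattened patch tokens" idea amounts to reshaping a multivariate series so that patches from all variables form one token sequence, letting a single attention mix inter-series and intra-series dependencies. A minimal sketch of that flattening step (illustrative only, not the UniTST code):

```python
import numpy as np

def flatten_patches(x, patch_len):
    """Turn a (C variables, T steps) series into C * (T // patch_len)
    patch tokens so one attention can attend across variables and time."""
    C, T = x.shape
    n = T // patch_len
    patches = x[:, :n * patch_len].reshape(C, n, patch_len)
    return patches.reshape(C * n, patch_len)  # one flat token sequence

x = np.arange(24, dtype=float).reshape(2, 12)  # 2 variables, 12 steps
tokens = flatten_patches(x, patch_len=4)
print(tokens.shape)  # (6, 4)
```

A standard transformer encoder applied to `tokens` would then see cross-variable and within-variable patches in the same attention map.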
arXiv Detail & Related papers (2024-06-07T14:39:28Z) - ConvTimeNet: A Deep Hierarchical Fully Convolutional Model for Multivariate Time Series Analysis [7.979501926410114]
ConvTimeNet is a hierarchical pure convolutional model designed for time series analysis. It adaptively perceives local patterns of temporally dependent basic units in a data-driven manner. A large kernel mechanism is employed to ensure that convolutional blocks can be deeply stacked.
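The large-kernel mechanism mentioned here typically means wide 1-D convolutions applied per channel, giving each block a large receptive field. A toy depthwise sketch (assumed structure for illustration, not ConvTimeNet's actual block):

```python
import numpy as np

def depthwise_conv1d(x, kernels):
    """Depthwise 1-D convolution: one (large) kernel per channel,
    'same' padding, as used in large-kernel convolutional blocks."""
    C, T = x.shape
    out = np.empty_like(x)
    for c in range(C):
        out[c] = np.convolve(x[c], kernels[c], mode="same")
    return out

x = np.ones((2, 16))           # 2 channels, 16 time steps
k = np.full((2, 7), 1 / 7)     # large (length-7) averaging kernels
y = depthwise_conv1d(x, k)
print(y.shape)  # (2, 16)
```

With a length-7 kernel, each output step already summarizes a 7-step window; stacking such blocks grows the receptive field quickly without attention.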
arXiv Detail & Related papers (2024-03-03T12:05:49Z) - MEMTO: Memory-guided Transformer for Multivariate Time Series Anomaly Detection [6.16984478518058]
MEMTO is a memory-guided Transformer that learns the degree to which each memory item should be updated in response to the input data.
We evaluate our proposed method on five real-world datasets from diverse domains.
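MEMTO's key idea, per this summary, is learning the degree to which each memory item is updated by the input. A gated update of that flavor can be sketched as follows (a simplified illustration with hypothetical names and an arbitrary gate parameterization, not MEMTO's exact update rule):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_memory_update(memory, query, U):
    """Per-item gated update: a gate in [0, 1] decides how far each
    memory item moves toward the current input representation."""
    # memory: (M, d), query: (d,), U: (d, d) toy gate weights
    gate = sigmoid(memory @ U @ query)  # (M,) update degree per item
    return (1 - gate)[:, None] * memory + gate[:, None] * query

rng = np.random.default_rng(2)
mem = rng.normal(size=(4, 8))
q = rng.normal(size=8)
new_mem = gated_memory_update(mem, q, np.eye(8) * 0.1)
print(new_mem.shape)  # (4, 8)
```

Items whose gate is near zero stay fixed, so frequently confirmed normal prototypes remain stable while others adapt to new inputs.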
arXiv Detail & Related papers (2023-12-05T06:28:19Z) - Adaptive Memory Networks with Self-supervised Learning for Unsupervised Anomaly Detection [54.76993389109327]
Unsupervised anomaly detection aims to build models to detect unseen anomalies by only training on the normal data.
We propose a novel approach called Adaptive Memory Network with Self-supervised Learning (AMSL) to address these challenges.
AMSL incorporates a self-supervised learning module to learn general normal patterns and an adaptive memory fusion module to learn rich feature representations.
arXiv Detail & Related papers (2022-01-03T03:40:21Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.