Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation
- URL: http://arxiv.org/abs/2405.17508v2
- Date: Tue, 26 Nov 2024 13:26:58 GMT
- Title: Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation
- Authors: Linglong Qian, Yiyuan Yang, Wenjie Du, Jun Wang, Zina Ibrahim,
- Abstract summary: Time series imputation is a critical challenge in data mining, particularly in domains like healthcare and environmental monitoring, where missing data can compromise analytical outcomes.
This study investigates the influence of diverse masking strategies, normalization timing, and missingness patterns on the performance of eleven state-of-the-art imputation models across three diverse datasets.
- Score: 7.650009336768971
- Abstract: Time series imputation is a critical challenge in data mining, particularly in domains like healthcare and environmental monitoring, where missing data can compromise analytical outcomes. This study investigates the influence of diverse masking strategies, normalization timing, and missingness patterns on the performance of eleven state-of-the-art imputation models across three diverse datasets. Specifically, we evaluate the effects of pre-masking versus in-mini-batch masking, augmentation versus overlaying of artificial missingness, and pre-normalization versus post-normalization. Our findings reveal that masking strategies profoundly affect imputation accuracy, with dynamic masking providing robust augmentation benefits and overlay masking better simulating real-world missingness patterns. Sophisticated models, such as CSDI, exhibited sensitivity to preprocessing configurations, while simpler models like BRITS delivered consistent and efficient performance. We highlight the importance of aligning preprocessing pipelines and masking strategies with dataset characteristics to improve robustness under diverse conditions, including high missing rates. This study provides actionable insights for designing imputation pipelines and underscores the need for transparent and comprehensive experimental designs.
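The contrast between pre-masking and in-mini-batch masking can be illustrated with a minimal sketch. It is not the authors' code. It assumes that pre-masking means the artificial mask is sampled once before training and reused every epoch, and that in-mini-batch masking means a fresh mask is sampled each time a batch is assembled. The function names, the `rate` parameter, and the use of `None` for missing values are hypothetical choices made for this example.

```python
import random

def sample_mask(series, rate, rng):
    """Mark roughly `rate` of the observed (non-None) values as artificial targets."""
    return [value is not None and rng.random() < rate for value in series]

def apply_mask(series, mask):
    """Hide the selected values so a model would have to impute them."""
    return [None if hidden else value for value, hidden in zip(series, mask)]

def pre_masked_batches(dataset, rate, batch_size, epochs, seed=0):
    """Assumed pre-masking: masks are sampled once and reused in every epoch."""
    rng = random.Random(seed)
    masked = []
    for series in dataset:
        mask = sample_mask(series, rate, rng)
        masked.append((apply_mask(series, mask), mask))
    for _ in range(epochs):
        for start in range(0, len(masked), batch_size):
            yield masked[start:start + batch_size]

def in_batch_masked_batches(dataset, rate, batch_size, epochs, seed=0):
    """Assumed in-mini-batch masking: masks are resampled for every batch."""
    rng = random.Random(seed)
    for _ in range(epochs):
        for start in range(0, len(dataset), batch_size):
            batch = []
            for series in dataset[start:start + batch_size]:
                mask = sample_mask(series, rate, rng)
                batch.append((apply_mask(series, mask), mask))
            yield batch
```

Under these assumptions, the pre-masked loader shows the model the same held-out positions in every epoch, while the in-batch loader changes them each time, which is one way to read the abstract's remark that dynamic masking acts as augmentation.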
Related papers
- Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
Segment Anything Model (SAM) has made great progress in anomaly segmentation tasks due to its impressive generalization ability.
Existing methods that directly apply SAM through prompting often overlook the domain shift issue.
We propose a novel Self-Perception Tuning (SPT) method, aiming to enhance SAM's perception capability for anomaly segmentation.
arXiv Detail & Related papers (2024-11-26T08:33:25Z) - EMIT- Event-Based Masked Auto Encoding for Irregular Time Series [9.903108445512576]
Irregular time series, where data points are recorded at uneven intervals, are prevalent in healthcare settings.
This variability, which reflects critical fluctuations in patient health, is essential for informed clinical decision-making.
Existing self-supervised learning research on irregular time series often relies on generic pretext tasks like forecasting.
This paper proposes a novel pretraining framework, EMIT, an event-based masking for irregular time series.
arXiv Detail & Related papers (2024-09-25T02:05:32Z) - Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z) - Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers to attenuate the feature information of anomaly reconstruction.
arXiv Detail & Related papers (2024-07-05T15:44:53Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - Semantic-Preserving Feature Partitioning for Multi-View Ensemble Learning [11.415864885658435]
We introduce the Semantic-Preserving Feature Partitioning (SPFP) algorithm, a novel method grounded in information theory.
The SPFP algorithm effectively partitions datasets into multiple semantically consistent views, enhancing the multi-view ensemble learning process.
It maintains model accuracy while significantly improving uncertainty measures in scenarios where high generalization performance is achievable.
arXiv Detail & Related papers (2024-01-11T20:44:45Z) - Representation Learning for Wearable-Based Applications in the Case of Missing Data [20.37256375888501]
Modeling multimodal sensor data in real-world environments is still challenging due to low data quality and limited data annotations.
We investigate representation learning for imputing missing wearable data and compare it with state-of-the-art statistical approaches.
Our study provides insights for the design and development of masking-based self-supervised learning tasks.
arXiv Detail & Related papers (2024-01-08T08:21:37Z) - ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z) - RARE: Robust Masked Graph Autoencoder [45.485891794905946]
Masked graph autoencoder (MGAE) has emerged as a promising self-supervised graph pre-training (SGP) paradigm.
We propose a novel SGP method termed Robust mAsked gRaph autoEncoder (RARE) to improve the certainty in inferring masked data.
arXiv Detail & Related papers (2023-04-04T03:35:29Z) - Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.