SEMixer: Semantics Enhanced MLP-Mixer for Multiscale Mixing and Long-term Time Series Forecasting
- URL: http://arxiv.org/abs/2602.16220v1
- Date: Wed, 18 Feb 2026 06:53:32 GMT
- Title: SEMixer: Semantics Enhanced MLP-Mixer for Multiscale Mixing and Long-term Time Series Forecasting
- Authors: Xu Zhang, Qitong Wang, Peng Wang, Wei Wang
- Abstract summary: SEMixer is a lightweight model designed for long-term time series forecasting (TSF). SEMixer features a Random Attention Mechanism (RAM) and a Multiscale Progressive Mixing Chain (MPMC). MPMC stacks RAM and MLP-Mixer in a memory-efficient manner, achieving more effective temporal mixing.
- Score: 9.398256560898448
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modeling multiscale patterns is crucial for long-term time series forecasting (TSF). However, redundancy and noise in time series, together with semantic gaps between non-adjacent scales, make the efficient alignment and integration of multi-scale temporal dependencies challenging. To address this, we propose SEMixer, a lightweight multiscale model designed for long-term TSF. SEMixer features two key components: a Random Attention Mechanism (RAM) and a Multiscale Progressive Mixing Chain (MPMC). RAM captures diverse time-patch interactions during training and aggregates them via dropout ensemble at inference, enhancing patch-level semantics and enabling MLP-Mixer to better model multi-scale dependencies. MPMC further stacks RAM and MLP-Mixer in a memory-efficient manner, achieving more effective temporal mixing. It addresses semantic gaps across scales and facilitates better multiscale modeling and forecasting performance. We validate the effectiveness of SEMixer not only on 10 public datasets, but also on the 2025 CCF AIOps Challenge based on 21 GB of real wireless network data, where SEMixer achieves third place. The code is available at https://github.com/Meteor-Stars/SEMixer.
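The abstract describes RAM as sampling diverse time-patch interactions during training and averaging several dropout-masked passes at inference. The paper's exact formulation is not given here, so the following is only a minimal NumPy sketch under that reading: attention scores between patches are randomly masked per pass, and inference averages multiple masked passes as a dropout ensemble. All function and parameter names (`random_attention`, `drop_rate`, `n_samples`) are hypothetical, not taken from the SEMixer code.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def random_attention(patches, drop_rate=0.5, n_samples=8, train=True, rng=None):
    """Toy sketch of a random attention mechanism over time patches.

    patches: array of shape (num_patches, dim).
    Training: one pass with randomly dropped patch-pair interactions.
    Inference: average of n_samples such passes (dropout ensemble).
    """
    rng = np.random.default_rng(rng)
    num_patches, dim = patches.shape
    scores = patches @ patches.T / np.sqrt(dim)  # pairwise patch similarities
    keep_self = np.eye(num_patches, dtype=bool)  # never drop self-interaction

    def one_pass():
        # Randomly mask off-diagonal interactions, then attend over the rest.
        mask = (rng.random(scores.shape) < drop_rate) & ~keep_self
        masked = np.where(mask, -np.inf, scores)
        return softmax(masked, axis=-1) @ patches

    if train:
        return one_pass()
    return np.mean([one_pass() for _ in range(n_samples)], axis=0)
```

Keeping the diagonal unmasked guarantees every softmax row stays finite even at high drop rates; the inference-time average smooths over the randomly sampled interaction patterns.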
Related papers
- SDMixer: Sparse Dual-Mixer for Time Series Forecasting [8.124083509364981]
This paper proposes a dual-stream sparse prediction framework that extracts global trends and local dynamic features from sequences in both the frequency and time domains. It employs a sparsity mechanism to filter out invalid information, thereby enhancing the accuracy of cross-variable dependency modeling.
arXiv Detail & Related papers (2026-02-27T01:13:56Z) - MaD-Mix: Multi-Modal Data Mixtures via Latent Space Coupling for Vision-Language Model Training [54.78779514101305]
MaD-Mix is a principled framework that derives multi-modal data mixtures for VLM training. MaD-Mix speeds up VLM training across diverse benchmarks. In complex tri-modal video-image-text scenarios, MaD-Mix boosts average accuracy over uniform weights, with negligible mixture overhead.
arXiv Detail & Related papers (2026-02-08T03:07:36Z) - MTS-UNMixers: Multivariate Time Series Forecasting via Channel-Time Dual Unmixing [3.192685534395382]
We propose a channel-time dual unmixing network for time series forecasting, named MTS-UNMixers. MTS-UNMixers decomposes the entire series into critical bases and coefficients across both the time and channel dimensions. Results show that MTS-UNMixers significantly outperforms existing methods on multiple benchmark datasets.
arXiv Detail & Related papers (2024-11-26T08:23:42Z) - xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories [20.773694998061707]
Time series data is prevalent across numerous fields, necessitating the development of robust and accurate forecasting models.
We introduce xLSTM-Mixer, a model designed to effectively integrate temporal sequences, joint time-variable information, and multiple perspectives for robust forecasting.
Our evaluations demonstrate xLSTM-Mixer's superior long-term forecasting performance compared to recent state-of-the-art methods.
arXiv Detail & Related papers (2024-10-22T11:59:36Z) - MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding [64.65145700121442]
We introduce MM-Mixing, a multi-modal mixing alignment framework for 3D understanding.
Our proposed two-stage training pipeline combines feature-level and input-level mixing to optimize the 3D encoder.
We demonstrate that MM-Mixing significantly improves baseline performance across various learning scenarios.
arXiv Detail & Related papers (2024-05-28T18:44:15Z) - SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion [59.96233305733875]
Time series forecasting plays a crucial role in various fields such as finance, traffic management, energy, and healthcare.
Several methods utilize mechanisms like attention or mixer to address this by capturing channel correlations.
This paper presents an efficient MLP-based model, the Series-cOre Fused Time Series forecaster (SOFTS).
arXiv Detail & Related papers (2024-04-22T14:06:35Z) - PowMix: A Versatile Regularizer for Multimodal Sentiment Analysis [71.8946280170493]
This paper introduces PowMix, a versatile embedding space regularizer that builds upon the strengths of unimodal mixing-based regularization approaches.
PowMix is integrated before the fusion stage of multimodal architectures and facilitates intra-modal mixing, such as mixing text with text, to act as a regularizer.
arXiv Detail & Related papers (2023-12-19T17:01:58Z) - TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting [13.410217680999459]
Transformers have gained popularity in time series forecasting for their ability to capture long-sequence interactions.
High memory and computing requirements pose a critical bottleneck for long-term forecasting.
We propose TSMixer, a lightweight neural architecture composed of multi-layer perceptron (MLP) modules.
arXiv Detail & Related papers (2023-06-14T06:26:23Z) - TSMixer: An All-MLP Architecture for Time Series Forecasting [41.178272171720316]
Time-Series Mixer (TSMixer) is a novel architecture designed by stacking multi-layer perceptrons (MLPs)
On popular academic benchmarks, the simple-to-implement TSMixer is comparable to specialized state-of-the-art models.
We present various analyses to shed light into the capabilities of TSMixer.
arXiv Detail & Related papers (2023-03-10T16:41:24Z) - SplitMixer: Fat Trimmed From MLP-like Models [53.12472550578278]
We present SplitMixer, a simple and lightweight isotropic MLP-like architecture for visual recognition.
It contains two types of interleaving convolutional operations to mix information across locations (spatial mixing) and channels (channel mixing).
arXiv Detail & Related papers (2022-07-21T01:37:07Z) - Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer named Decoupled Mixup (DM)
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup.
arXiv Detail & Related papers (2022-03-21T07:12:18Z) - PointMixer: MLP-Mixer for Point Cloud Understanding [74.694733918351]
The concept of channel-mixing and token-mixing achieves noticeable performance in visual recognition tasks.
Unlike images, point clouds are inherently sparse, unordered, and irregular, which limits the direct use of MLP-Mixer for point cloud understanding.
We propose PointMixer, a universal point set operator that facilitates information sharing among unstructured 3D points.
arXiv Detail & Related papers (2021-11-22T13:25:54Z)
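Several of the papers above (TSMixer, SplitMixer, PointMixer, and SEMixer itself) build on MLP-Mixer-style token mixing and channel mixing. As a minimal NumPy sketch of the shared idea, not any specific paper's implementation, one mixer block applies an MLP across the token (time-patch) axis and another across the channel axis, each with a residual connection; all sizes and names below are hypothetical.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron with ReLU, applied along the last axis of x.
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

def mixer_block(x, tok_params, ch_params):
    """x: (tokens, channels). Token mixing transposes x so the MLP acts
    across tokens; channel mixing acts across channels. Each sub-block
    adds a residual connection, as in MLP-Mixer."""
    x = x + mlp(x.T, *tok_params).T  # mix information along the token axis
    x = x + mlp(x, *ch_params)       # mix information along the channel axis
    return x

# Hypothetical sizes: 8 time patches (tokens), 16 channels, hidden width 32.
rng = np.random.default_rng(0)
T, C, H = 8, 16, 32
tok_params = (rng.standard_normal((T, H)) * 0.1, np.zeros(H),
              rng.standard_normal((H, T)) * 0.1, np.zeros(T))
ch_params = (rng.standard_normal((C, H)) * 0.1, np.zeros(H),
             rng.standard_normal((H, C)) * 0.1, np.zeros(C))
y = mixer_block(rng.standard_normal((T, C)), tok_params, ch_params)
```

Because both mixing steps are plain matrix multiplications, the block's cost is linear in the number of layers, which is what makes these all-MLP forecasters lightweight compared with full attention.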