SVTime: Small Time Series Forecasting Models Informed by "Physics" of Large Vision Model Forecasters
- URL: http://arxiv.org/abs/2510.09780v2
- Date: Fri, 31 Oct 2025 02:36:51 GMT
- Title: SVTime: Small Time Series Forecasting Models Informed by "Physics" of Large Vision Model Forecasters
- Authors: ChengAo Shen, Ziming Zhao, Hanghang Tong, Dongjin Song, Dongsheng Luo, Qingsong Wen, Jingchao Ni,
- Abstract summary: Time series AI is crucial for analyzing dynamic web content. Given their energy-intensive training, inference, and hardware demands, using large models as a one-fits-all solution raises serious concerns about carbon footprint and sustainability. This paper introduces SVTime, a novel Small model inspired by large Vision model (LVM) forecasters for long-term Time series forecasting (LTSF).
- Score: 86.38433605933515
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series AI is crucial for analyzing dynamic web content, driving a surge of pre-trained large models known for their strong knowledge encoding and transfer capabilities across diverse tasks. However, given their energy-intensive training, inference, and hardware demands, using large models as a one-fits-all solution raises serious concerns about carbon footprint and sustainability. For a specific task, a compact yet specialized, high-performing model may be more practical and affordable, especially for resource-constrained users such as small businesses. This motivates the question: Can we build cost-effective lightweight models with large-model-like performance on core tasks such as forecasting? This paper addresses this question by introducing SVTime, a novel Small model inspired by large Vision model (LVM) forecasters for long-term Time series forecasting (LTSF). Recently, LVMs have been shown to be powerful tools for LTSF. We identify a set of key inductive biases of LVM forecasters -- analogous to the "physics" governing their behaviors in LTSF -- and design small models that encode these biases through meticulously crafted linear layers and constraint functions. Across 21 baselines spanning lightweight, complex, and pre-trained large models on 8 benchmark datasets, SVTime outperforms state-of-the-art (SOTA) lightweight models and rivals large models with 10^3x fewer parameters than LVMs, while enabling efficient training and inference in low-resource settings.
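The abstract's "meticulously crafted linear layers and constraint functions" suggest a small linear forecaster trained with bias-encoding penalty terms. The sketch below is only one possible reading, not SVTime's published architecture: the single linear projection, the smoothness penalty used as a stand-in constraint, and the lookback/horizon sizes are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyLinearForecaster(nn.Module):
    """Minimal linear LTSF model: maps a lookback window to a forecast horizon.
    Illustrative sketch only; not SVTime's published architecture."""
    def __init__(self, lookback: int, horizon: int):
        super().__init__()
        self.proj = nn.Linear(lookback, horizon)

    def forward(self, x):          # x: (batch, lookback)
        return self.proj(x)        # (batch, horizon)

def smoothness_constraint(model: TinyLinearForecaster) -> torch.Tensor:
    """A hypothetical constraint function: penalize abrupt changes between
    adjacent weight columns, nudging the linear map toward smooth temporal
    mixing (a stand-in for the inductive-bias constraints described above)."""
    w = model.proj.weight                          # (horizon, lookback)
    return ((w[:, 1:] - w[:, :-1]) ** 2).mean()

# Training step sketch: forecasting loss plus the constraint term.
model = TinyLinearForecaster(lookback=96, horizon=24)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 96), torch.randn(32, 24)    # toy batch
opt.zero_grad()
loss = nn.functional.mse_loss(model(x), y) + 0.1 * smoothness_constraint(model)
loss.backward()
opt.step()
```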
Related papers
- Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models [78.73992315826035]
We introduce Youtu-LLM, a lightweight language model that harmonizes high computational efficiency with native agentic intelligence. Youtu-LLM is pre-trained from scratch to systematically cultivate reasoning and planning capabilities.
arXiv Detail & Related papers (2025-12-31T04:25:11Z) - SEMPO: Lightweight Foundation Models for Time Series Forecasting [45.456949943052116]
SEMPO is a lightweight foundation model that requires pre-training on relatively small-scale data, yet exhibits strong general time series forecasting. SEMPO comprises two key modules; the first is an energy-aware SpEctral decomposition module that substantially improves the utilization of pre-training data. Experiments on two large-scale benchmarks covering 16 datasets demonstrate the superior performance of SEMPO in both zero-shot and few-shot forecasting scenarios.
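The summary names an "energy-aware SpEctral decomposition module" without detail. Below is a minimal sketch of what such a module could do, assuming a simple FFT-based selection of the highest-energy frequency components; the top_k parameter and the reconstruction step are illustrative assumptions, not SEMPO's actual design.

```python
import numpy as np

def energy_aware_decomposition(series: np.ndarray, top_k: int = 5):
    """Keep only the top-k highest-energy frequency components of a series.
    An illustrative guess at 'energy-aware spectral decomposition'; the real
    SEMPO module is not specified in the abstract."""
    spectrum = np.fft.rfft(series)
    energy = np.abs(spectrum) ** 2                    # per-frequency energy
    keep = np.argsort(energy)[-top_k:]                # indices of dominant modes
    filtered = np.zeros_like(spectrum)
    filtered[keep] = spectrum[keep]
    dominant = np.fft.irfft(filtered, n=len(series))  # high-energy spectral part
    residual = series - dominant                      # everything else
    return dominant, residual

# Toy usage: a daily + weekly periodic signal with noise.
t = np.arange(512)
x = np.sin(2 * np.pi * t / 24) + 0.3 * np.sin(2 * np.pi * t / 7) + 0.1 * np.random.randn(512)
dominant, residual = energy_aware_decomposition(x, top_k=3)
```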
arXiv Detail & Related papers (2025-10-22T15:58:44Z) - DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving [52.63591791507895]
We propose DriveVLA-W0, a training paradigm that employs world modeling to predict future images. This task generates a dense, self-supervised signal that compels the model to learn the underlying dynamics of the driving environment. Experiments on the NAVSIM v1/v2 benchmark and a 680x larger in-house dataset demonstrate that DriveVLA-W0 significantly outperforms BEV and VLA baselines.
arXiv Detail & Related papers (2025-10-14T17:59:47Z) - Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation [43.68215777330875]
We introduce a systematic post-training pipeline that efficiently enhances small model accuracy. The resulting instruction-tuned model achieves state-of-the-art performance. This work provides a practical and efficient solution for developing high-performance language models on Ascend edge devices.
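The paper's title attributes the gains to knowledge distillation, though the summary omits the objective. A minimal sketch of a standard soft-label distillation loss, assuming KL divergence against temperature-softened teacher logits mixed with the usual cross-entropy; the temperature and mixing weight are illustrative defaults, not values from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Standard soft-label knowledge distillation (illustrative defaults).
    Combines KL divergence to the teacher's softened distribution with
    cross-entropy on the ground-truth labels."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Toy usage: batch of 8 examples over a 100-way vocabulary slice.
loss = distillation_loss(torch.randn(8, 100), torch.randn(8, 100),
                         torch.randint(0, 100, (8,)))
```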
arXiv Detail & Related papers (2025-09-30T16:40:55Z) - Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts [25.503695417712997]
Time-MoE is a scalable and unified architecture designed to pre-train larger, more capable forecasting foundation models. Time-MoE enhances computational efficiency by activating only a subset of networks for each prediction. For the first time, we scaled a time series foundation model up to 2.4 billion parameters, achieving significantly improved forecasting precision.
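"Activating only a subset of networks for each prediction" is the standard sparse mixture-of-experts pattern. A minimal sketch assuming a top-k softmax gate over small feed-forward experts; the expert count, hidden sizes, and k are illustrative, not Time-MoE's configuration.

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    """Sparse mixture-of-experts layer: a gate routes each input to its
    top-k experts, so only a fraction of parameters is active per token.
    Sizes and k are illustrative, not Time-MoE's actual settings."""
    def __init__(self, dim: int = 64, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(), nn.Linear(dim * 2, dim))
             for _ in range(num_experts)]
        )
        self.gate = nn.Linear(dim, num_experts)
        self.k = k

    def forward(self, x):                          # x: (batch, dim)
        scores = self.gate(x)                      # (batch, num_experts)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        weights = torch.softmax(topk_scores, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                 # combine only the selected experts
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

# Toy usage: 16 inputs, each routed to 2 of 8 experts.
moe = SparseMoE()
y = moe(torch.randn(16, 64))
```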
arXiv Detail & Related papers (2024-09-24T12:42:18Z) - LV-UNet: A Lightweight and Vanilla Model for Medical Image Segmentation [17.839755917342362]
This paper introduces LV-UNet, a lightweight and vanilla model that leverages pre-trained MobileNetV3-Large backbones and incorporates modules. Experimental results on the ISIC 2016, BUSI, CVC-ClinicDB, CVC-ColonDB, and Kvasir-SEG datasets demonstrate a better tradeoff between performance and computational load.
arXiv Detail & Related papers (2024-08-29T20:19:10Z) - Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
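Converting forecasting, imputation, and anomaly detection into one generative task can be pictured as "generate the missing patches from the visible ones." A minimal sketch of that framing, assuming non-overlapping patch tokenization; the patch length and masking scheme are illustrative, not Timer's actual formulation.

```python
import numpy as np

def make_patches(series: np.ndarray, patch_len: int) -> np.ndarray:
    """Tokenize a 1D series into fixed-length, non-overlapping patches."""
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def as_generative_task(patches: np.ndarray, mask: np.ndarray):
    """Cast a task as 'generate the masked patches from the visible ones'.

    Forecasting       -> mask the trailing patches.
    Imputation        -> mask arbitrary interior patches.
    Anomaly detection -> generate each patch from its context and flag
                         patches with large reconstruction error.
    """
    context = patches[~mask]    # patches the model conditions on
    targets = patches[mask]     # patches the model must generate
    return context, targets

# Toy usage: forecasting framed as generating the last 4 of 20 patches.
series = np.sin(np.linspace(0, 20, 480))
patches = make_patches(series, patch_len=24)
forecast_mask = np.arange(len(patches)) >= 16
context, targets = as_generative_task(patches, forecast_mask)
```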
arXiv Detail & Related papers (2024-02-04T06:55:55Z) - Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting [46.63798583414426]
Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis.
Our study demonstrates, through both analytical and empirical evidence, that decomposition is key to containing excessive model inflation.
Remarkably, by tailoring decomposition to the intrinsic dynamics of time series data, our proposed model outperforms existing benchmarks.
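The claim that decomposition contains model inflation echoes the seasonal-trend split used by lightweight LTSF models. A minimal sketch assuming a moving-average trend/remainder decomposition, with each component handled by its own small linear head; the kernel size and head design are illustrative, not the paper's exact model.

```python
import numpy as np

def decompose(series: np.ndarray, kernel: int = 25):
    """Moving-average trend/remainder split (illustrative odd kernel size)."""
    pad = kernel // 2
    padded = np.pad(series, pad, mode="edge")          # extend edges to keep length
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    remainder = series - trend
    return trend, remainder

# In a decomposition-based linear model, each component then gets its own
# small linear forecasting head, e.g.:
#   forecast = W_trend @ trend_window + W_remainder @ remainder_window
trend, remainder = decompose(np.sin(np.linspace(0, 30, 600)))
```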
arXiv Detail & Related papers (2024-01-22T13:15:40Z) - A Momentum-Incorporated Non-Negative Latent Factorization of Tensors Model for Dynamic Network Representation [0.0]
A large-scale dynamic network (LDN) is a source of data in many big data-related applications.
A latent factorization of tensors (LFT) model efficiently extracts the time patterns in such data.
LFT models based on stochastic gradient descent (SGD) solvers are often limited by training schemes and have poor tail convergence.
This paper proposes a novel non-negative LFT model (MNNL) based on momentum-incorporated SGD to make training unconstrained and compatible with general training schemes.
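A momentum-incorporated SGD step for a non-negative latent-factor model can be sketched as a plain momentum update followed by projection onto the non-negative orthant. This is an illustrative reading of the summary, not MNNL's exact update rule; the learning rate, momentum coefficient, and projection choice are assumptions.

```python
import numpy as np

def momentum_step_nonneg(param, grad, velocity, lr=0.01, beta=0.9):
    """One momentum-incorporated SGD step with a non-negativity projection.
    Illustrative only; MNNL's actual update rule may differ."""
    velocity = beta * velocity + grad      # accumulate momentum
    param = param - lr * velocity          # standard momentum step
    param = np.maximum(param, 0.0)         # project onto the non-negative orthant
    return param, velocity

# Toy usage: one latent factor matrix and its velocity buffer.
U = np.abs(np.random.randn(10, 4))
V = np.zeros_like(U)
grad = np.random.randn(10, 4)              # gradient from the LFT loss
U, V = momentum_step_nonneg(U, grad, V)
```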
arXiv Detail & Related papers (2023-05-04T12:30:53Z) - Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.