TS-HINT: Enhancing Semiconductor Time Series Regression Using Attention Hints From Large Language Model Reasoning
- URL: http://arxiv.org/abs/2512.05419v1
- Date: Fri, 05 Dec 2025 04:35:18 GMT
- Title: TS-HINT: Enhancing Semiconductor Time Series Regression Using Attention Hints From Large Language Model Reasoning
- Authors: Jonathan Adam Rico, Nagarajan Raghavan, Senthilnath Jayavelu
- Abstract summary: Existing data-driven methods rely on the extraction of static features from time series to approximate the material removal rate (MRR) of semiconductor manufacturing processes such as chemical mechanical polishing (CMP). In this paper, we propose TS-Hint, a Time Series Foundation Model (TSFM) framework, integrated with chain-of-thought reasoning which provides attention hints during training based on attention mechanism data and saliency data.
- Score: 2.4958651162443943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing data-driven methods rely on the extraction of static features from time series to approximate the material removal rate (MRR) of semiconductor manufacturing processes such as chemical mechanical polishing (CMP). However, this leads to a loss of temporal dynamics. Moreover, these methods require a large amount of data for effective training. In this paper, we propose TS-Hint, a Time Series Foundation Model (TSFM) framework, integrated with chain-of-thought reasoning which provides attention hints during training based on attention mechanism data and saliency data. Experimental results demonstrate the effectiveness of our model in limited data settings via few-shot learning and show that it can learn directly from multivariate time series features.
Related papers
- OATS: Online Data Augmentation for Time Series Foundation Models [49.1394215208561]
Time Series Foundation Models (TSFMs) are a powerful paradigm for time series analysis and are often enhanced by synthetic data augmentation to improve the training data quality. We propose OATS (Online Data Augmentation for Time Series Foundation Models), a principled strategy that generates synthetic data tailored to different training steps.
arXiv Detail & Related papers (2026-01-26T23:51:03Z)
- Time Series Foundation Models for Process Model Forecasting [8.339024524110828]
Process Model Forecasting aims to predict how the control-flow structure of a process evolves over time. Machine learning and deep learning models provide only modest gains over statistical baselines. We investigate Time Series Foundation Models (TSFMs) as an alternative for PMF.
arXiv Detail & Related papers (2025-12-08T15:08:50Z)
- SEMPO: Lightweight Foundation Models for Time Series Forecasting [45.456949943052116]
SEMPO is a lightweight foundation model that requires pretraining on relatively small-scale data, yet exhibits strong general time series forecasting. SEMPO comprises two key modules: 1) energy-aware SpEctral decomposition module, that substantially improves the utilization of pre-training data. Experiments on two large-scale benchmarks covering 16 datasets demonstrate the superior performance of SEMPO in both zero-shot and few-shot forecasting scenarios.
arXiv Detail & Related papers (2025-10-22T15:58:44Z)
- WDformer: A Wavelet-based Differential Transformer Model for Time Series Forecasting [21.222605948133893]
Time series forecasting has various applications, such as meteorological rainfall prediction, traffic flow analysis, financial forecasting, and operational load monitoring. Due to the sparsity of time series data, relying solely on time-domain or frequency-domain modeling limits the model's ability to fully leverage multi-domain information. We proposed WDformer, a wavelet-based differential Transformer model, to conduct a multi-resolution analysis of time series data.
arXiv Detail & Related papers (2025-09-25T02:43:51Z)
- Retrieval-Augmented Diffusion Models for Time Series Forecasting [19.251274915003265]
We propose a Retrieval-Augmented Time series Diffusion model (RATD).
RATD consists of two parts: an embedding-based retrieval process and a reference-guided diffusion model.
Our approach allows leveraging meaningful samples within the database to aid in sampling, thus maximizing the utilization of datasets.
arXiv Detail & Related papers (2024-10-24T13:14:39Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data.
Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction.
Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z)
- Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
Investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
- A Continuous Time Framework for Discrete Denoising Models [43.135447812798155]
We provide the first complete continuous time framework for denoising diffusion models of discrete data.
This is achieved by formulating the forward noising process and corresponding reverse time generative process as Continuous Time Markov Chains (CTMCs).
arXiv Detail & Related papers (2022-05-30T10:37:41Z)
- Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF).
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)
- Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete with and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.