FTT-GRU: A Hybrid Fast Temporal Transformer with GRU for Remaining Useful Life Prediction
- URL: http://arxiv.org/abs/2511.00564v1
- Date: Sat, 01 Nov 2025 14:02:03 GMT
- Title: FTT-GRU: A Hybrid Fast Temporal Transformer with GRU for Remaining Useful Life Prediction
- Authors: Varun Teja Chirukiri, Udaya Bhasker Cheerala, Sandeep Kanta, Abdul Karim, Praveen Damacharla,
- Abstract summary: We propose a hybrid model, FTT-GRU, which combines a Fast Temporal Transformer (FTT) with a gated recurrent unit (GRU) layer for sequential modeling. On NASA CMAPSS FD001, FTT-GRU attains RMSE 30.76, MAE 18.97, and $R^2=0.45$, with 1.12 ms CPU latency at batch=1. These results demonstrate that a compact Transformer-RNN hybrid delivers accurate and efficient RUL predictions on CMAPSS.
- Score: 0.6421270655703623
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurate prediction of the remaining useful life (RUL) of industrial machinery is essential for reducing downtime and optimizing maintenance schedules. Existing approaches, such as long short-term memory (LSTM) networks and convolutional neural networks (CNNs), often struggle to model both global temporal dependencies and fine-grained degradation trends in multivariate sensor data. We propose a hybrid model, FTT-GRU, which combines a Fast Temporal Transformer (FTT), a lightweight Transformer variant using linearized attention via the fast Fourier transform (FFT), with a gated recurrent unit (GRU) layer for sequential modeling. To the best of our knowledge, this is the first application of an FTT with a GRU for RUL prediction on NASA CMAPSS, enabling simultaneous capture of global and local degradation patterns in a compact architecture. On CMAPSS FD001, FTT-GRU attains RMSE 30.76, MAE 18.97, and $R^2=0.45$, with 1.12 ms CPU latency at batch=1. Relative to the best published deep baseline (TCN-Attention), it improves RMSE by 1.16% and MAE by 4.00%. Training curves averaged over $k=3$ runs show smooth convergence with narrow 95% confidence bands, and ablations (GRU-only, FTT-only) support the contribution of both components. These results demonstrate that a compact Transformer-RNN hybrid delivers accurate and efficient RUL predictions on CMAPSS, making it suitable for real-time industrial prognostics.
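For readers who want a concrete picture of the architecture described above, here is a minimal PyTorch sketch of an FFT-based token-mixing block feeding a GRU regressor. It is an illustrative reconstruction, not the authors' code: the exact FTT block, layer sizes, and the 30-step, 14-sensor CMAPSS window used below are assumptions.

```python
# Minimal sketch of an FFT-mixing + GRU model for RUL regression.
# Illustrative only: the paper's exact FTT block and hyperparameters are not
# reproduced here; the 30-timestep x 14-sensor window is a common FD001 setup.
import torch
import torch.nn as nn

class FFTMixerBlock(nn.Module):
    """Mix tokens across time with an FFT (a linearized-attention stand-in), then an MLP."""
    def __init__(self, d_model: int, d_ff: int = 128):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

    def forward(self, x):                      # x: (batch, time, d_model)
        mixed = torch.fft.fft2(x.float()).real  # keep the real part of the 2-D FFT
        x = self.norm1(x + mixed)
        return self.norm2(x + self.mlp(x))

class FTTGRU(nn.Module):
    def __init__(self, n_sensors=14, d_model=64, n_blocks=2, gru_hidden=64):
        super().__init__()
        self.embed = nn.Linear(n_sensors, d_model)
        self.blocks = nn.Sequential(*[FFTMixerBlock(d_model) for _ in range(n_blocks)])
        self.gru = nn.GRU(d_model, gru_hidden, batch_first=True)
        self.head = nn.Linear(gru_hidden, 1)   # scalar RUL estimate per window

    def forward(self, x):                      # x: (batch, time, n_sensors)
        h = self.blocks(self.embed(x))
        _, h_last = self.gru(h)                # GRU summarizes the mixed sequence
        return self.head(h_last[-1]).squeeze(-1)

model = FTTGRU()
rul = model(torch.randn(8, 30, 14))            # -> tensor of shape (8,)
```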
Related papers
- Time2Vec Transformer for Robust Gesture Recognition from Low-Density sEMG [1.231764991565978]
This paper presents a novel, data-efficient deep learning framework for myoelectric prosthesis control. Our approach implements a hybrid Transformer optimized for sparse, two-channel surface electromyography (sEMG). The proposed framework offers a robust, cost-effective blueprint for next-generation prosthetic interfaces capable of rapid personalization.
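For context, the sketch below shows the Time2Vec embedding named in the title in its standard formulation (one linear component plus sinusoidal components). How the paper wires it into its sEMG Transformer is not specified here, so the dimensions and usage are illustrative assumptions.

```python
# Minimal sketch of a Time2Vec embedding layer (standard formulation).
# Illustrative only: output size and the example usage are assumptions.
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    def __init__(self, out_dim: int):
        super().__init__()
        self.w0 = nn.Parameter(torch.randn(1))          # linear (trend) component
        self.b0 = nn.Parameter(torch.zeros(1))
        self.w = nn.Parameter(torch.randn(out_dim - 1)) # periodic components
        self.b = nn.Parameter(torch.zeros(out_dim - 1))

    def forward(self, t):                               # t: (batch, time, 1)
        linear = self.w0 * t + self.b0                  # (batch, time, 1)
        periodic = torch.sin(self.w * t + self.b)       # (batch, time, out_dim-1)
        return torch.cat([linear, periodic], dim=-1)    # (batch, time, out_dim)

# Example: embed per-sample timestamps before concatenating with sEMG features.
emb = Time2Vec(out_dim=16)(torch.arange(200.).view(1, 200, 1))  # (1, 200, 16)
```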
arXiv Detail & Related papers (2026-02-02T09:28:27Z)
- MeanFlow Transformers with Representation Autoencoders [71.45823902973349]
MeanFlow (MF) is a diffusion-motivated generative model that enables efficient few-step generation by learning long jumps directly from noise to data. We develop an efficient training and sampling scheme for MF in the latent space of a Representation Autoencoder (RAE). We achieve a 1-step FID of 2.03, outperforming vanilla MF's 3.43, while reducing sampling GFLOPS by 38% and total training cost by 83% on ImageNet 256.
arXiv Detail & Related papers (2025-11-17T06:17:08Z)
- RatioWaveNet: A Learnable RDWT Front-End for Robust and Interpretable EEG Motor-Imagery Classification [1.4069478981641936]
We present RatioWaveNet, which augments a strong temporal CNN-Transformer backbone with a trainable, Rationally-Dilated Wavelet Transform front end. Our goal is to test whether this principled wavelet front end improves robustness precisely where BCIs typically fail.
arXiv Detail & Related papers (2025-10-22T14:04:03Z)
- Learning from Scratch: Structurally-masked Transformer for Next Generation Lib-free Simulation [5.564681128355971]
This paper proposes a neural framework for power and timing prediction of multi-stage data paths. To the best of our knowledge, this is the first language-based, netlist-aware neural network designed explicitly for standard cells.
arXiv Detail & Related papers (2025-07-23T10:46:25Z)
- rule4ml: An Open-Source Tool for Resource Utilization and Latency Estimation for ML Models on FPGA [0.0]
This paper introduces a novel method to predict the resource utilization and inference latency of Neural Networks (NNs) before their synthesis and implementation on FPGA.
We leverage HLS4ML, a tool-flow that helps translate NNs into high-level synthesis (HLS) code.
Our method uses trained regression models for immediate pre-synthesis predictions.
arXiv Detail & Related papers (2024-08-09T19:35:10Z)
- Semi-Federated Learning: Convergence Analysis and Optimization of A Hybrid Learning Framework [70.83511997272457]
We propose a semi-federated learning (SemiFL) paradigm to leverage both the base station (BS) and devices for a hybrid implementation of centralized learning (CL) and FL.
We propose a two-stage algorithm to solve this intractable problem, in which we provide the closed-form solutions to the beamformers.
arXiv Detail & Related papers (2023-10-04T03:32:39Z)
- CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., the Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of channel-independent (CI) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dependencies across channels.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
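As a rough illustration of the multi-resolution idea behind the token blend module, the sketch below averages neighbouring tokens to form coarser token sequences. CARD's actual module is more involved, so the function name, scales, and pooling choice here are assumptions.

```python
# Illustrative-only sketch of producing tokens at several temporal resolutions
# by blending (averaging) adjacent tokens; not CARD's actual token blend module.
import torch
import torch.nn.functional as F

def blend_tokens(x: torch.Tensor, scales=(1, 2, 4)):
    """x: (batch, n_tokens, d_model) -> list of token sequences at coarser resolutions."""
    outputs = []
    for s in scales:
        if s == 1:
            outputs.append(x)
        else:
            # Average each non-overlapping group of s adjacent tokens.
            pooled = F.avg_pool1d(x.transpose(1, 2), kernel_size=s, stride=s)
            outputs.append(pooled.transpose(1, 2))
    return outputs

coarse = blend_tokens(torch.randn(4, 96, 32))   # token counts: 96, 48, 24
```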
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity thanks to their self-attention mechanism, albeit at high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
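The sketch below illustrates the general single-transform idea: apply one forward FFT, stack learned layers entirely in the frequency domain, and apply one inverse FFT at the end. It is not the paper's model; the spectral-weighting layers and shapes are illustrative assumptions.

```python
# Minimal sketch of frequency-domain learning with a single transform pair.
# Illustrative only: in practice nonlinearities would be interleaved between layers.
import torch
import torch.nn as nn

class SpectralLayer(nn.Module):
    """Learned per-frequency complex scaling, applied entirely in the Fourier domain."""
    def __init__(self, n_freq: int, channels: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(channels, n_freq, dtype=torch.cfloat) * 0.1)

    def forward(self, z):                      # z: (batch, channels, n_freq), complex
        return z * self.weight

class TransformOnceNet(nn.Module):
    def __init__(self, length=256, channels=8, depth=3):
        super().__init__()
        n_freq = length // 2 + 1
        self.layers = nn.ModuleList([SpectralLayer(n_freq, channels) for _ in range(depth)])
        self.length = length

    def forward(self, x):                      # x: (batch, channels, length), real signal
        z = torch.fft.rfft(x, dim=-1)                       # transform once ...
        for layer in self.layers:
            z = layer(z)                                    # ... learn in frequency domain ...
        return torch.fft.irfft(z, n=self.length, dim=-1)    # ... inverse transform once

y = TransformOnceNet()(torch.randn(2, 8, 256))  # same shape as the input
```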
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Adaptive Split-Fusion Transformer [90.04885335911729]
We propose an Adaptive Split-Fusion Transformer (ASF-former) to treat convolutional and attention branches differently with adaptive weights.
Experiments on standard benchmarks, such as ImageNet-1K, show that our ASF-former outperforms its CNN and Transformer counterparts, as well as hybrid pilots, in terms of accuracy.
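Below is a minimal sketch of what input-dependent fusion of a convolutional branch and an attention branch can look like; the branch definitions and gating network are simplified assumptions, not the ASF-former's actual design.

```python
# Illustrative sketch of fusing a conv branch and an attention branch with
# adaptive, input-dependent weights; all module choices here are assumptions.
import torch
import torch.nn as nn

class AdaptiveSplitFusion(nn.Module):
    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.conv_branch = nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.attn_branch = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(dim, dim // 4), nn.GELU(),
                                  nn.Linear(dim // 4, 2))   # one weight per branch

    def forward(self, x):                      # x: (batch, tokens, dim)
        conv_out = self.conv_branch(x.transpose(1, 2)).transpose(1, 2)
        attn_out, _ = self.attn_branch(x, x, x)
        w = torch.softmax(self.gate(x.mean(dim=1)), dim=-1)  # (batch, 2) adaptive weights
        return w[:, :1, None] * conv_out + w[:, 1:, None] * attn_out

out = AdaptiveSplitFusion(dim=64)(torch.randn(2, 49, 64))     # (2, 49, 64)
```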
arXiv Detail & Related papers (2022-04-26T10:00:28Z)
- Global Vision Transformer Pruning with Hessian-Aware Saliency [93.33895899995224]
This work challenges the common design philosophy of the Vision Transformer (ViT) model with uniform dimension across all the stacked blocks in a model stage.
We derive a novel Hessian-based structural pruning criterion that is comparable across all layers and structures, with latency-aware regularization for direct latency reduction.
Performing iterative pruning on the DeiT-Base model leads to a new architecture family called NViT (Novel ViT), with a novel parameter redistribution that utilizes parameters more efficiently.
arXiv Detail & Related papers (2021-10-10T18:04:59Z)
- TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting [6.393659160890665]
We propose the concept of the tightly-coupled convolutional Transformer (TCCT) and three TCCT architectures.
Our experiments on real-world datasets show that our TCCT architectures could greatly improve the performance of existing state-of-the-art Transformer models.
arXiv Detail & Related papers (2021-08-29T08:49:31Z)