An empirical evaluation of attention-based multi-head models for
improved turbofan engine remaining useful life prediction
- URL: http://arxiv.org/abs/2109.01761v1
- Date: Sat, 4 Sep 2021 01:13:47 GMT
- Title: An empirical evaluation of attention-based multi-head models for
improved turbofan engine remaining useful life prediction
- Authors: Abiodun Ayodeji, Wenhai Wang, Jianzhong Su, Jianquan Yuan, Xinggao Liu
- Abstract summary: A single unit (head) is the conventional input feature extractor in deep learning architectures trained on multivariate time series signals.
This work extends the conventional single-head deep learning models to a more robust form by developing context-specific heads.
- Score: 9.282239595143787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A single unit (head) is the conventional input feature extractor in deep
learning architectures trained on multivariate time series signals. The
importance of the fixed-dimensional vector representation generated by the
single-head network has been demonstrated for industrial machinery condition
monitoring and predictive maintenance. However, processing heterogeneous sensor
signals with a single head may result in a model that cannot explicitly account
for the diversity in time-varying multivariate inputs. This work extends the
conventional single-head deep learning models to a more robust form by
developing context-specific heads to independently capture the inherent pattern
of each sensor reading in multivariate time series signals. Using the turbofan
aircraft engine benchmark dataset (CMAPSS), an extensive experiment is
performed to verify the effectiveness and benefits of multi-head fully
connected networks, recurrent networks, convolutional networks, the
transformer-style stand-alone attention network, and their variants for
remaining useful life estimation. Moreover, the effect of different attention
mechanisms on the multi-head models is also evaluated. In addition, each
architecture's relative advantage and computational overhead are analyzed.
Results show that the benefit of the attention layer is task-sensitive and
model-dependent, as it does not provide consistent improvement across the
models investigated. The result is further compared with five state-of-the-art
models, and the comparison shows that a relatively simple multi-head
architecture performs better than the state-of-the-art models. The results
presented in this study demonstrate the importance of multi-head models and
attention mechanisms for an improved understanding of the remaining useful life of
industrial assets.
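To make the multi-head idea concrete, the sketch below shows one way such a model could be organized: each sensor channel gets its own small feature-extractor head, and an optional attention layer weights the per-sensor representations before a final RUL regression. This is a minimal illustrative sketch, not the authors' implementation; the class name, head design, and dimensions (14 sensors, 30-step windows, head width) are assumptions chosen for readability.

```python
import torch
import torch.nn as nn

class MultiHeadRULModel(nn.Module):
    """Sketch of a multi-head RUL regressor: one feature-extractor head per
    sensor channel, followed by an optional attention layer over the
    per-sensor embeddings. All hyperparameters are illustrative."""

    def __init__(self, n_sensors=14, window=30, head_dim=16, use_attention=True):
        super().__init__()
        self.use_attention = use_attention
        # One 1-D convolutional head per sensor reading (context-specific head).
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(1, head_dim, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),   # (batch, head_dim, 1)
                nn.Flatten(),              # (batch, head_dim)
            )
            for _ in range(n_sensors)
        ])
        # Simple additive attention over the per-sensor embeddings.
        self.attn = nn.Linear(head_dim, 1)
        self.regressor = nn.Sequential(
            nn.Linear(head_dim if use_attention else head_dim * n_sensors, 32),
            nn.ReLU(),
            nn.Linear(32, 1),              # scalar RUL estimate
        )

    def forward(self, x):
        # x: (batch, n_sensors, window) multivariate time-series window
        feats = torch.stack(
            [head(x[:, i:i + 1, :]) for i, head in enumerate(self.heads)], dim=1
        )                                                       # (batch, n_sensors, head_dim)
        if self.use_attention:
            weights = torch.softmax(self.attn(feats), dim=1)    # (batch, n_sensors, 1)
            fused = (weights * feats).sum(dim=1)                # (batch, head_dim)
        else:
            fused = feats.flatten(start_dim=1)                  # (batch, n_sensors*head_dim)
        return self.regressor(fused).squeeze(-1)

# Example usage on a dummy CMAPSS-style window batch.
model = MultiHeadRULModel()
dummy = torch.randn(8, 14, 30)   # 8 engines, 14 sensors, 30 time steps
print(model(dummy).shape)        # torch.Size([8])
```

Swapping the convolutional heads for recurrent or fully connected ones, or toggling `use_attention`, mirrors the kind of architecture and attention-mechanism comparison the abstract describes.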
Related papers
- Improving satellite imagery segmentation using multiple Sentinel-2 revisits [0.0]
We explore the best way to use revisits in the framework of fine-tuning pre-trained remote sensing models.
We find that fusing representations from multiple revisits in the model latent space is superior to other methods of using revisits.
A SWIN Transformer-based architecture performs better than U-nets and ViT-based models.
arXiv Detail & Related papers (2024-09-25T21:13:33Z)
- Exploring Representations and Interventions in Time Series Foundation Models [17.224575072056627]
Time series foundation models (TSFMs) promise to be powerful tools for a wide range of applications.
Their internal representations and learned concepts are still not well understood.
This study investigates the structure and redundancy of representations across various TSFMs.
arXiv Detail & Related papers (2024-09-19T17:11:27Z)
- UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens.
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z)
- Dance of Channel and Sequence: An Efficient Attention-Based Approach for Multivariate Time Series Forecasting [3.372816393214188]
CSformer is an innovative framework characterized by a meticulously engineered two-stage self-attention mechanism.
We introduce sequence adapters and channel adapters, ensuring the model's ability to discern salient features across various dimensions.
arXiv Detail & Related papers (2023-12-11T09:10:38Z)
- Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data.
Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction.
Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Comparing Deep Learning Models for the Task of Volatility Prediction Using Multivariate Data [4.793572485305333]
The paper evaluates a range of models, starting from simpler and shallower ones and progressing to deeper and more complex architectures.
The prediction of volatility for five assets, namely S&P500, NASDAQ100, gold, silver, and oil, is specifically addressed using GARCH models.
arXiv Detail & Related papers (2023-06-20T17:10:13Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)