Related papers: RP-CATE: Recurrent Perceptron-based Channel Attention Transformer Encoder for Industrial Hybrid Modeling

URL: http://arxiv.org/abs/2512.19147v1
Date: Mon, 22 Dec 2025 08:44:58 GMT
Title: RP-CATE: Recurrent Perceptron-based Channel Attention Transformer Encoder for Industrial Hybrid Modeling
Authors: Haoran Yang, Yinan Zhang, Wenjie Zhang, Dongxia Wang, Peiyu Liu, Yuqi Ye, Kexin Chen, Wenhai Wang
Abstract summary: Industrial hybrid modeling integrates both mechanistic modeling and machine learning-based modeling techniques. The existing industrial hybrid modeling methods still face two main limitations. This paper proposes the Recurrent Perceptron-based Channel Attention Transformer Encoder (RP-CATE) to address these limitations.
Score: 38.59451477828059
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Nowadays, industrial hybrid modeling, which integrates both mechanistic modeling and machine learning-based modeling techniques, has attracted increasing interest from scholars due to its high accuracy, low computational cost, and satisfactory interpretability. Nevertheless, the existing industrial hybrid modeling methods still face two main limitations. First, current research has mainly focused on applying a single machine learning method to one specific task, failing to develop a comprehensive machine learning architecture suitable for modeling tasks, which limits their ability to effectively represent complex industrial scenarios. Second, industrial datasets often contain underlying associations (e.g., monotonicity or periodicity) that are not adequately exploited by current research, which can degrade a model's predictive performance. To address these limitations, this paper proposes the Recurrent Perceptron-based Channel Attention Transformer Encoder (RP-CATE), with three distinctive characteristics: (1) We developed a novel architecture by replacing the self-attention mechanism with channel attention and incorporating our proposed Recurrent Perceptron (RP) Module into the Transformer, achieving enhanced effectiveness for industrial modeling tasks compared to the original Transformer. (2) We proposed a new data type called Pseudo-Image Data (PID) tailored for channel attention requirements and developed a cyclic sliding window method for generating PID. (3) We introduced the concept of Pseudo-Sequential Data (PSD) and a method for converting industrial datasets into PSD, which enables the RP Module to capture the underlying associations within industrial datasets more effectively. An experiment aimed at hybrid modeling in chemical engineering was conducted using RP-CATE, and the experimental results demonstrate that RP-CATE outperforms the baseline models.
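
Illustrative sketch (not from the paper): the abstract names a cyclic sliding window method for generating PID but does not describe it. The Python snippet below shows one plausible reading, in which each window wraps around to the start of the dataset so that every sample begins exactly one window. The function name `cyclic_sliding_windows`, its parameters, and the output layout are assumptions made for illustration and may differ from the authors' actual procedure.

```python
import numpy as np


def cyclic_sliding_windows(data, window, stride=1):
    """Stack wrap-around windows taken over the rows of `data`.

    Hypothetical helper, not the authors' implementation.
    data:    array of shape (n_samples, n_features)
    returns: array of shape (n_windows, window, n_features)
    """
    data = np.asarray(data)
    n_samples = data.shape[0]
    starts = np.arange(0, n_samples, stride)
    # Row indices of every window; the modulo makes windows that run past
    # the last row continue from the first row (the "cyclic" part).
    idx = (starts[:, None] + np.arange(window)[None, :]) % n_samples
    return data[idx]


# Toy dataset: 5 samples with 2 features each.
samples = np.arange(10).reshape(5, 2)
pid = cyclic_sliding_windows(samples, window=3)
print(pid.shape)  # (5, 3, 2)
```

In this reading, the window that starts at the last sample is completed with the first two samples, so the number of windows equals the number of samples when `stride=1`.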

Related papers

Probing then Editing: A Push-Pull Framework for Retain-Free Machine Unlearning in Industrial IoT [6.973959179359068]
We propose a novel retain-free unlearning framework, referred to as Probing then Editing (PTE). PTE frames unlearning as a probe-edit process and generates corresponding editing instructions using the model's own predictions. Benefiting from this mechanism, PTE achieves efficient and balanced knowledge editing using only the to-be-forgotten data and the original model.
arXiv Detail & Related papers (2025-11-12T15:28:56Z)
DIFFUMA: High-Fidelity Spatio-Temporal Video Prediction via Dual-Path Mamba and Diffusion Enhancement [5.333662480077316]
We release the Chip Dicing Lane dataset (CHDL), the first public temporal image dataset dedicated to the semiconductor wafer dicing process. We propose DIFFUMA, an innovative dual-path prediction architecture specifically designed for such fine-grained dynamics. Experiments demonstrate that DIFFUMA significantly outperforms existing methods, reducing the Mean Squared Error (MSE) by 39% and improving the Structural Similarity (SSIM) from 0.926 to a near-perfect 0.988.
arXiv Detail & Related papers (2025-07-09T10:51:54Z)
Triple Attention Transformer Architecture for Time-Dependent Concrete Creep Prediction [0.0]
This paper presents a novel Triple Attention Transformer Architecture for predicting time-dependent concrete creep. By transforming concrete creep prediction into an autoregressive sequence modeling task similar to language processing, our architecture leverages the transformer's self-attention mechanisms. The architecture achieves exceptional performance with a mean absolute percentage error of 1.63% and R² values of 0.999 across all datasets.
arXiv Detail & Related papers (2025-05-28T22:30:35Z)
Automatically Learning Hybrid Digital Twins of Dynamical Systems [56.69628749813084]
Digital Twins (DTs) simulate the states and temporal dynamics of real-world systems. DTs often struggle to generalize to unseen conditions in data-scarce settings. In this paper, we propose an evolutionary algorithm (HDTwinGen) to autonomously propose, evaluate, and optimize HDTwins.
arXiv Detail & Related papers (2024-10-31T07:28:22Z)
Sustainable Diffusion-based Incentive Mechanism for Generative AI-driven Digital Twins in Industrial Cyber-Physical Systems [65.22300383287904]
Industrial Cyber-Physical Systems (ICPSs) are an integral component of modern manufacturing and industries. By digitizing data throughout product life cycles, Digital Twins (DTs) in ICPSs enable a shift from current industrial infrastructures to intelligent and adaptive infrastructures. GenAI can drive the construction and update of DTs to improve predictive accuracy and prepare for diverse smart manufacturing.
arXiv Detail & Related papers (2024-08-02T10:47:10Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning. Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism. We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation [65.62538699160085]
We propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation. KD-DAGFM achieves the best performance with less than 21.5% FLOPs of the state-of-the-art method on both online and offline experiments.
arXiv Detail & Related papers (2022-11-21T03:09:42Z)
A Generative Approach for Production-Aware Industrial Network Traffic Modeling [70.46446906513677]
We investigate the network traffic data generated from a laser cutting machine deployed in a Trumpf factory in Germany. We analyze the traffic statistics, capture the dependencies between the internal states of the machine, and model the network traffic as a production state dependent process. We compare the performance of various generative models including variational autoencoder (VAE), conditional variational autoencoder (CVAE), and generative adversarial network (GAN).
arXiv Detail & Related papers (2022-11-11T09:46:58Z)
Development of Deep Transformer-Based Models for Long-Term Prediction of Transient Production of Oil Wells [9.832272256738452]
We propose a novel approach to data-driven modeling of a transient production of oil wells. We apply the transformer-based neural networks trained on the multivariate time series composed of various parameters of oil wells. We generalize the single-well model based on the transformer architecture for multiple wells to simulate complex transient oilfield-level patterns.
arXiv Detail & Related papers (2021-10-12T15:00:45Z)
KNODE-MPC: A Knowledge-based Data-driven Predictive Control Framework for Aerial Robots [5.897728689802829]
We make use of a deep learning tool, knowledge-based neural ordinary differential equations (KNODE), to augment a model obtained from first principles. The resulting hybrid model encompasses both a nominal first-principle model and a neural network learnt from simulated or real-world experimental data. To improve closed-loop performance, the hybrid model is integrated into a novel MPC framework, known as KNODE-MPC.
arXiv Detail & Related papers (2021-09-10T12:09:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.