Semi-supervised CAPP Transformer Learning via Pseudo-labeling
- URL: http://arxiv.org/abs/2602.01419v1
- Date: Sun, 01 Feb 2026 19:51:39 GMT
- Title: Semi-supervised CAPP Transformer Learning via Pseudo-labeling
- Authors: Dennis Gross, Helge Spieker, Arnaud Gotlieb, Emmanuel Stathatos, Panorios Benardos, George-Christopher Vosniakos
- Abstract summary: We propose a semi-supervised learning approach to improve transformer-based CAPP transformer models without manual labeling. An oracle, trained on available transformer behaviour data, filters correct predictions from unseen parts, which are then used for one-shot retraining.
- Score: 3.6799158613885066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-level Computer-Aided Process Planning (CAPP) generates manufacturing process plans from part specifications. It suffers from limited dataset availability in industry, reducing model generalization. We propose a semi-supervised learning approach to improve transformer-based CAPP transformer models without manual labeling. An oracle, trained on available transformer behaviour data, filters correct predictions from unseen parts, which are then used for one-shot retraining. Experiments on small-scale datasets with simulated ground truth across the full data distribution show consistent accuracy gains over baselines, demonstrating the method's effectiveness in data-scarce manufacturing environments.
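
The filter-then-retrain loop described in the abstract can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the part and plan encodings, the `train` stand-in (a majority-vote lookup instead of a transformer), and the hand-written `rule_oracle` (the paper's oracle is learned from transformer behaviour data) are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, List, Tuple

Part = Tuple[str, ...]   # hypothetical part encoding: a tuple of feature names
Plan = str               # hypothetical plan encoding: a single process name
Model = Callable[[Part], Plan]
Oracle = Callable[[Part, Plan], bool]


def train(pairs: List[Tuple[Part, Plan]]) -> Model:
    """Toy stand-in for model training: majority plan per first feature,
    falling back to the overall majority plan for unknown features."""
    by_feature = {}
    for part, plan in pairs:
        by_feature.setdefault(part[0], Counter())[plan] += 1
    overall = Counter(plan for _, plan in pairs).most_common(1)[0][0]

    def predict(part: Part) -> Plan:
        counts = by_feature.get(part[0])
        return counts.most_common(1)[0][0] if counts else overall

    return predict


def filter_pseudo_labels(model: Model, oracle: Oracle,
                         unlabeled: List[Part]) -> List[Tuple[Part, Plan]]:
    """Keep only the (part, predicted plan) pairs that the oracle accepts."""
    accepted = []
    for part in unlabeled:
        plan = model(part)
        if oracle(part, plan):
            accepted.append((part, plan))
    return accepted


def one_shot_retrain(labeled, unlabeled, oracle):
    """Train once, pseudo-label the unseen parts, retrain once on the union."""
    base = train(labeled)
    pseudo = filter_pseudo_labels(base, oracle, unlabeled)
    return train(labeled + pseudo), pseudo


def rule_oracle(part: Part, plan: Plan) -> bool:
    """Hypothetical oracle: a fixed rule table used only for this example."""
    expected = {"hole": "drill", "slot": "mill", "shaft": "turn"}
    return expected.get(part[0]) == plan


labeled = [(("hole", "steel"), "drill"),
           (("hole", "brass"), "drill"),
           (("slot", "steel"), "mill")]
unlabeled = [("hole", "alu"), ("slot", "alu"), ("shaft", "steel")]

model, pseudo = one_shot_retrain(labeled, unlabeled, rule_oracle)
```

In this toy run the prediction for the unseen `shaft` part falls back to the majority plan and is rejected by the oracle, so only the two accepted predictions are added to the training set before the single retraining step.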
Related papers
- In-Context Reinforcement Learning From Suboptimal Historical Data [56.60512975858003]
Transformer models have achieved remarkable empirical successes, largely due to their in-context learning capabilities.
We propose the Decision Importance Transformer framework, which emulates the actor-critic algorithm in an in-context manner.
Our results show that DIT achieves superior performance, particularly when the offline dataset contains suboptimal historical data.
arXiv Detail & Related papers (2026-01-27T23:13:06Z)
- Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice [109.9635246405237]
We show that experimental conclusions about data quality can flip with even minor adjustments to training hyperparameters.
We introduce a simple patch to the evaluation protocol: using reduced learning rates for proxy model training.
Empirically, we validate this approach across 23 data recipes covering four critical dimensions of data curation.
arXiv Detail & Related papers (2025-12-30T23:02:44Z)
- Challenging Gradient Boosted Decision Trees with Tabular Transformers for Fraud Detection at Booking.com [3.2750365257196803]
Transformer-based neural networks, empowered by Self-Supervised Learning (SSL), have demonstrated unprecedented performance across various domains.
In this paper, we aim to challenge GBDTs with tabular Transformers on a typical task faced in e-commerce, namely fraud detection.
Our study is additionally motivated by the problem of selection bias, which often occurs in real-life fraud detection systems. It is caused by the production system affecting which subset of traffic becomes labeled.
arXiv Detail & Related papers (2024-05-22T14:38:48Z)
- FaultFormer: Pretraining Transformers for Adaptable Bearing Fault Classification [7.136205674624813]
We present a novel self-supervised pretraining and fine-tuning framework based on transformer models.
In particular, we investigate different tokenization and data augmentation strategies to reach state-of-the-art accuracies.
This introduces a new paradigm where models can be pretrained on unlabeled data from different bearings, faults, and machinery and quickly deployed to new, data-scarce applications.
arXiv Detail & Related papers (2023-12-04T22:51:02Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- Automatic Rule Induction for Efficient Semi-Supervised Learning [56.91428251227253]
Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.
Pretrained transformer models act as black-box correlation engines that are difficult to explain and sometimes behave unreliably.
We propose tackling both of these challenges via Automatic Rule Induction (ARI), a simple and general-purpose framework.
arXiv Detail & Related papers (2022-05-18T16:50:20Z)
- Towards Data-Efficient Detection Transformers [77.43470797296906]
We show most detection transformers suffer from significant performance drops on small-size datasets.
We empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR.
We introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
arXiv Detail & Related papers (2022-03-17T17:56:34Z)
- Development of Deep Transformer-Based Models for Long-Term Prediction of Transient Production of Oil Wells [9.832272256738452]
We propose a novel approach to data-driven modeling of the transient production of oil wells.
We apply transformer-based neural networks trained on multivariate time series composed of various oil-well parameters.
We generalize the single-well model based on the transformer architecture for multiple wells to simulate complex transient oilfield-level patterns.
arXiv Detail & Related papers (2021-10-12T15:00:45Z)
- Discriminative and Generative Transformer-based Models For Situation Entity Classification [8.029049649310211]
We re-examine the situation entity (SE) classification task with varying amounts of available training data.
We exploit a Transformer-based variational autoencoder to encode sentences into a lower dimensional latent space.
arXiv Detail & Related papers (2021-09-15T17:07:07Z)
- Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is the latest data augmentation technique that linearly interpolates input examples and the corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup into a transformer-based pre-trained architecture, named "mixup-transformer", for a wide range of NLP tasks.
arXiv Detail & Related papers (2020-10-05T23:37:30Z)