Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines
- URL: http://arxiv.org/abs/2403.09056v1
- Date: Thu, 14 Mar 2024 02:55:06 GMT
- Title: Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines
- Authors: Liang Wu, X.-G. Ma
- Abstract summary: We developed a strategy for expanding industrial datasets to achieve efficient, high-quality, and large-scale dataset expansion.
We also applied this strategy to video action recognition.
In the "hand movements during wire insertion" scenarios on the actual assembly line, the accuracy of hand action recognition reached 98.8%.
- Score: 3.0992677770545254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: On modern industrial assembly lines, many intelligent algorithms have been developed to replace or supervise workers. However, we found that there were bottlenecks in both training datasets and real-time performance when deploying algorithms on actual assembly lines. Therefore, we developed a promising strategy for expanding industrial datasets, which utilized large models with strong generalization abilities to achieve efficient, high-quality, and large-scale dataset expansion, solving the problem of insufficient and low-quality industrial datasets. We also applied this strategy to video action recognition. We proposed a method of converting hand action recognition problems into hand skeletal trajectory classification problems, which solved the real-time performance problem of industrial algorithms. In the "hand movements during wire insertion" scenarios on the actual assembly line, the accuracy of hand action recognition reached 98.8%. We conducted detailed experimental analysis to demonstrate the effectiveness and superiority of the method, and deployed the entire process on Midea's actual assembly line.
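The abstract describes recasting hand action recognition as classification of hand skeletal trajectories but does not name a classifier. The following is only a minimal sketch of that general idea, not the authors' method: it assumes each trajectory is a fixed-length list of frames, each frame a list of (x, y) hand keypoints with the wrist first, and it swaps in a simple nearest-centroid classifier. All names (`normalize`, `fit_centroids`, `classify`) and the toy labels are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Frame = List[Point]        # one (x, y) pair per hand keypoint; wrist assumed first
Trajectory = List[Frame]   # one frame per time step (fixed length assumed)


def normalize(traj: Trajectory) -> List[float]:
    """Flatten a trajectory into a translation- and scale-invariant vector.

    All frames are shifted by the wrist position in the first frame and
    divided by the hand size (largest keypoint distance) in the first frame.
    """
    ox, oy = traj[0][0]
    first = [(x - ox, y - oy) for x, y in traj[0]]
    scale = max(sqrt(x * x + y * y) for x, y in first) or 1.0
    features: List[float] = []
    for frame in traj:
        for x, y in frame:
            features.extend(((x - ox) / scale, (y - oy) / scale))
    return features


def fit_centroids(samples: Sequence[Tuple[Trajectory, str]]) -> Dict[str, List[float]]:
    """Average the normalized feature vectors of each action label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for traj, label in samples:
        vec = normalize(traj)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def classify(traj: Trajectory, centroids: Dict[str, List[float]]) -> str:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    vec = normalize(traj)

    def sq_dist(center: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, center))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def make(dx: float) -> Trajectory:
    """Toy data: a two-keypoint hand (wrist, fingertip) moving dx per frame."""
    return [[(dx * t, 0.0), (1.0 + dx * t, 0.0)] for t in range(3)]


centroids = fit_centroids([(make(1.0), "insert"), (make(-1.0), "retract")])
print(classify(make(0.8), centroids))  # insert
```

Classifying a short vector of keypoint coordinates is far cheaper than running a video model on raw frames, which is the intuition behind the real-time claim; the actual keypoint extractor and classifier used in the paper are not given in this abstract.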
Related papers
- Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling [39.98859285173431]
The flexible job-shop scheduling problem (FJSP) has attracted significant attention due to its complexity and strong alignment with real-world production scenarios.
Current deep reinforcement learning (DRL)-based approaches to FJSP predominantly employ constructive methods.
This paper proposes a Memory-enhanced Improvement Search framework with heterogeneous graph representation--MIStar.
arXiv Detail & Related papers (2026-03-03T10:43:01Z) - Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods [5.278929538141005]
We benchmark a range of domain randomization (DR) and domain adaptation (DA) techniques, including feature-based methods, generative AI (GenAI) and classical rendering approaches.
Our evaluation focuses on the effectiveness and efficiency of low-level and high-level feature alignment, as well as a controlled diffusion-based DA method guided by prompts generated from real-world contexts.
Results show that if render-based data with enough variability is available as seed, simpler feature-based methods, such as brightness-based and perceptual hashing filtering, outperform more complex GenAI-based approaches in both accuracy and resource efficiency.
arXiv Detail & Related papers (2025-11-28T14:51:08Z) - Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets [40.434003972007744]
We compare open-source simulation models with a real industry dataset to evaluate how optimization methods scale with different levels of complexity.
We show that our proposed Evolution Strategies-based method scales much better than a comparable policy-gradient-based approach.
We observe double-digit percentage improvement in tardiness and single-digit percentage improvement in throughput by use of Evolution Strategies.
arXiv Detail & Related papers (2025-05-16T11:32:29Z) - Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map [50.21082069320818]
We propose a novel diffusion-based pipeline for generating high-fidelity industrial datasets with minimal supervision.
Our approach conditions the diffusion model on enriched bounding box representations to produce precise segmentation masks.
Results demonstrate that diffusion-based synthesis can bridge the gap between artificial and real-world industrial data.
arXiv Detail & Related papers (2025-05-06T15:21:36Z) - Robo-taxi Fleet Coordination at Scale via Reinforcement Learning [21.266509380044912]
This work introduces a novel decision-making framework that unites mathematical modeling with data-driven techniques.
We present the AMoD coordination problem through the lens of reinforcement learning and propose a graph network-based framework.
arXiv Detail & Related papers (2025-04-08T15:19:41Z) - Robust Offline Imitation Learning Through State-level Trajectory Stitching [37.281554320048755]
Imitation learning (IL) has proven effective for enabling robots to acquire visuomotor skills through expert demonstrations.
Recent advances in offline IL have incorporated suboptimal, unlabeled datasets into the training.
We propose a novel approach to enhance policy learning from mixed-quality offline datasets by leveraging task-relevant trajectory fragments and rich environmental dynamics.
arXiv Detail & Related papers (2025-03-28T15:28:36Z) - Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift [51.24522135151649]
Anomaly detection plays a crucial role in quality control for industrial applications.
Existing methods attempt to address domain shifts by training generalizable models.
Our proposed method demonstrates superior results compared with state-of-the-art anomaly detection and domain adaptation methods.
arXiv Detail & Related papers (2025-03-19T05:25:52Z) - DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development.
We present Dynamic Action Re-Sampling (DARS), a novel inference time compute scaling approach for coding agents.
We evaluate our approach on the SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
arXiv Detail & Related papers (2025-03-18T14:02:59Z) - What Really Matters for Learning-based LiDAR-Camera Calibration [50.2608502974106]
This paper revisits the development of learning-based LiDAR-Camera calibration.
We identify the critical limitations of regression-based methods with the widely used data generation pipeline.
We also investigate how the input data format and preprocessing operations impact network performance.
arXiv Detail & Related papers (2025-01-28T14:12:32Z) - Exploring Large Vision-Language Models for Robust and Efficient Industrial Anomaly Detection [4.691083532629246]
We propose Vision-Language Anomaly Detection via Contrastive Cross-Modal Training (CLAD).
CLAD aligns visual and textual features into a shared embedding space using contrastive learning.
We demonstrate that CLAD outperforms state-of-the-art methods in both image-level anomaly detection and pixel-level anomaly localization.
arXiv Detail & Related papers (2024-12-01T17:00:43Z) - Automated Defect Detection and Grading of Piarom Dates Using Deep Learning [0.0]
We propose an innovative deep learning framework designed specifically for the real-time detection, classification, and grading of Piarom dates.
Our framework integrates state-of-the-art object detection algorithms and Convolutional Neural Networks (CNNs) to achieve high precision in defect identification.
Experimental results demonstrate that our system significantly outperforms existing methods in terms of accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-23T18:25:20Z) - VARADE: a Variational-based AutoRegressive model for Anomaly Detection on the Edge [7.4646496981460855]
This work presents a novel solution implementing a light autoregressive framework based on variational inference, which is best suited for real-time execution on the edge.
The proposed approach was validated on a robotic arm, part of a pilot production line, and compared with several state-of-the-art algorithms.
arXiv Detail & Related papers (2024-09-23T08:46:15Z) - A Low-Cost Real-Time Framework for Industrial Action Recognition Using Foundation Models [8.654703129948901]
Action recognition in industrial environments faces persistent challenges due to high deployment costs, poor cross-scenario generalization, and limited real-time performance.
We propose a low-cost real-time framework for industrial action recognition using foundation models, denoted as LRIAR, to enhance recognition accuracy and transferability.
arXiv Detail & Related papers (2024-03-13T11:11:59Z) - Machine learning for industrial sensing and control: A survey and practical perspective [7.678648424345052]
We identify key statistical and machine learning techniques that have seen practical success in the process industries.
Soft sensing contains a wealth of industrial applications of statistical and machine learning methods.
We consider two distinct flavors for data-driven optimization and control: hybrid modeling in conjunction with mathematical programming techniques and reinforcement learning.
arXiv Detail & Related papers (2024-01-24T22:27:04Z) - A Microservices Identification Method Based on Spectral Clustering for Industrial Legacy Systems [5.255685751491305]
We propose an automated microservice decomposition method for extracting microservice candidates based on spectral graph theory.
We show that our method can yield favorable results even without the involvement of domain experts.
arXiv Detail & Related papers (2023-12-20T07:47:01Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space.
This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z) - Toward Fault Detection in Industrial Welding Processes with Deep Learning and Data Augmentation [0.0]
This paper addresses the challenges in the industrial realization of AI tools.
We use object detection algorithms from the object detection API and adapt them to our use case using transfer learning.
We find that moderate scaling of the dataset via image augmentation leads to improvements in intersection over union (IoU) and recall.
arXiv Detail & Related papers (2021-06-18T14:52:49Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Anomaly Detection Based on Selection and Weighting in Latent Space [73.01328671569759]
We propose a novel selection-and-weighting-based anomaly detection framework called SWAD.
Experiments on both benchmark and real-world datasets have shown the effectiveness and superiority of SWAD.
arXiv Detail & Related papers (2021-03-08T10:56:38Z) - DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z) - A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data with comparable performance efficiently.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.