Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration
- URL: http://arxiv.org/abs/2409.19856v1
- Date: Mon, 30 Sep 2024 01:25:48 GMT
- Title: Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration
- Authors: Salaar Saraj, Gregory Shklovski, Kristopher Irizarry, Jonathan Vet, Yutian Ren,
- Abstract summary: Human-Robot Collaboration (HRC) is vital in Industry 4.0, using sensors, digital twins, collaborative robots (cobots), and intention-recognition models to enable efficient manufacturing processes.
We address concept drift by integrating Adaptive Intelligence and self-labeling to improve the resilience of intention-recognition in an HRC system.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-Robot Collaboration (HRC) is vital in Industry 4.0, using sensors, digital twins, collaborative robots (cobots), and intention-recognition models to enable efficient manufacturing processes. However, concept drift is a significant challenge, where robots struggle to adapt to new environments. We address concept drift by integrating Adaptive Intelligence and self-labeling (SLB) to improve the resilience of intention recognition in an HRC system. Our methodology begins with data collection using cameras and weight sensors, followed by annotation of intentions and state changes. We then train various deep learning models with different preprocessing techniques to recognize and predict intentions. Additionally, we developed a custom state-detection algorithm that enhances the accuracy of SLB by providing precise state-change definitions and timestamps for labeling intentions. Our results show that the MViT2 model with skeletal posture preprocessing achieves an accuracy of 83% in our data environment, compared to 79% for MViT2 without skeletal posture extraction. Additionally, our SLB mechanism achieves a labeling accuracy of 91%, substantially reducing the time that would have been spent on manual annotation. Lastly, we observe swift scaling of model performance that combats concept drift by fine-tuning on increments of self-labeled data in a shifted domain with key differences from the original training environment. This study demonstrates the potential for rapid deployment of intelligent cobots in manufacturing through the steps shown in our methodology, paving the way for more adaptive and efficient HRC systems.
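The self-labeling idea described above lends itself to a compact illustration. The Python sketch below detects state changes from a weight-sensor stream and turns each change timestamp into an intention label; every name, the threshold, and the pick/place rule are assumptions for illustration, not the paper's actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class StateChange:
    timestamp: float  # seconds into the recording
    delta: float      # signed weight change (grams)

def detect_state_changes(times, weights, threshold=50.0):
    """Flag a state change whenever consecutive weight readings differ
    by more than `threshold` grams (an assumed rule)."""
    changes = []
    for i in range(1, len(weights)):
        delta = weights[i] - weights[i - 1]
        if abs(delta) >= threshold:
            changes.append(StateChange(timestamp=times[i], delta=delta))
    return changes

def self_label(changes):
    """Map each state change to a hypothetical intention label:
    weight removed -> 'pick', weight added -> 'place'."""
    return [(c.timestamp, "pick" if c.delta < 0 else "place") for c in changes]

# Example: a ~120 g part is picked up at t=2.0 s and returned at t=5.0 s.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
weights = [500.0, 500.0, 380.0, 380.0, 380.0, 500.0]
print(self_label(detect_state_changes(times, weights)))
# -> [(2.0, 'pick'), (5.0, 'place')]
```

In the pipeline the abstract describes, timestamped labels of this kind would be paired with the surrounding video segments and used for fine-tuning in the shifted domain.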
Related papers
- In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators [11.389756788049944]
Testing autonomous robotic manipulators is challenging due to the complex software interactions between vision and control components.
A crucial element of modern robotic manipulators is the deep learning based object detection model.
We propose the MARTENS framework, which integrates a photorealistic NVIDIA Isaac Sim simulator with evolutionary search to identify critical scenarios.
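As a rough illustration of the evolutionary-search idea, the sketch below mutates scene parameters to minimize a detector's confidence, surfacing critical scenarios. The `render_and_detect` stub stands in for the NVIDIA Isaac Sim rendering plus the detection model under test, and all parameters are invented for illustration.

```python
import random

def render_and_detect(scene):
    """Stub fitness: detector confidence for a (light, occlusion) scene.
    In MARTENS this would come from the simulator and the vision model."""
    light, occlusion = scene
    return max(0.0, 1.0 - 0.8 * occlusion - 0.3 * abs(light - 0.5))

def evolve_critical_scene(generations=50, pop_size=20, mutation=0.1):
    """Evolve scene parameters toward low detector confidence (failures)."""
    pop = [(random.random(), random.random()) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=render_and_detect)          # lowest confidence first
        survivors = pop[: pop_size // 2]
        children = [tuple(min(1.0, max(0.0, g + random.gauss(0, mutation)))
                          for g in random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return min(pop, key=render_and_detect)

print(evolve_critical_scene())  # e.g. heavy occlusion, extreme lighting
```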
arXiv Detail & Related papers (2024-10-25T03:10:42Z)
- Time to Retrain? Detecting Concept Drifts in Machine Learning Systems [1.4499463058550683]
We propose a model-agnostic technique (CDSeer) for detecting concept drift in machine learning (ML) models.
Results show that CDSeer has better precision and recall compared to the state-of-the-art while requiring significantly less manual labeling.
The improved performance and ease of adoption of CDSeer are valuable in making ML systems more reliable.
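CDSeer's actual algorithm is not given in this summary; as a generic stand-in, the sketch below monitors drift in a model-agnostic way by comparing recent prediction confidences against a reference window with a two-sample Kolmogorov-Smirnov test.

```python
from collections import deque
from scipy.stats import ks_2samp

class ConfidenceDriftMonitor:
    """Flags drift when recent prediction confidences diverge from a
    reference distribution collected at training/validation time."""

    def __init__(self, reference_scores, window=200, alpha=0.01):
        self.reference = list(reference_scores)
        self.recent = deque(maxlen=window)
        self.alpha = alpha

    def observe(self, confidence):
        """Record one confidence score; return True if drift is suspected."""
        self.recent.append(confidence)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough production data yet
        _, p_value = ks_2samp(self.reference, list(self.recent))
        return p_value < self.alpha

# usage: monitor = ConfidenceDriftMonitor(validation_confidences)
#        drifted = monitor.observe(float(probs.max()))
```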
arXiv Detail & Related papers (2024-10-11T18:47:39Z)
- Scheduled Knowledge Acquisition on Lightweight Vector Symbolic Architectures for Brain-Computer Interfaces [18.75591257735207]
Classical feature engineering is computationally efficient but has low accuracy, whereas recent deep neural networks (DNNs) improve accuracy but are computationally expensive and incur high latency.
As a promising alternative, the low-dimensional computing (LDC) classifier based on vector symbolic architecture (VSA) achieves a small model size yet higher accuracy than classical feature engineering methods.
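A minimal VSA-style classifier, sketched below for intuition, encodes features with a fixed random bipolar projection, bundles encodings into class prototypes, and classifies by cosine similarity; the paper's LDC design is more refined, and the dimensionality and toy task here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 2000  # hypervector dimensionality (assumed)

def encode(x, projection):
    """Project a feature vector into a bipolar hypervector."""
    return np.sign(projection @ x)

def train_prototypes(X, y, n_classes, projection):
    """Bundle (sum) the encodings of each class into a prototype."""
    prototypes = np.zeros((n_classes, DIM))
    for xi, yi in zip(X, y):
        prototypes[yi] += encode(xi, projection)
    return prototypes

def predict(x, prototypes, projection):
    """Return the class whose prototype is most cosine-similar."""
    h = encode(x, projection)
    sims = prototypes @ h / (np.linalg.norm(prototypes, axis=1)
                             * np.linalg.norm(h) + 1e-9)
    return int(np.argmax(sims))

# Toy usage: 4-dimensional features, two classes split by the first feature's sign.
projection = rng.standard_normal((DIM, 4))
X = rng.standard_normal((40, 4))
y = (X[:, 0] > 0).astype(int)
prototypes = train_prototypes(X, y, 2, projection)
print(predict(X[0], prototypes, projection), y[0])  # usually agree on this toy task
```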
arXiv Detail & Related papers (2024-03-18T01:06:29Z)
- Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory [64.11870454160614]
We propose an efficient Adaptive HOI Detector with Concept-guided Memory (ADA-CM).
ADA-CM has two operating modes. The first mode makes it tunable without learning new parameters in a training-free paradigm.
Our proposed method achieves competitive results with state-of-the-art on the HICO-DET and V-COCO datasets with much less training time.
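The training-free mode can be pictured as a feature-memory lookup, as in the hedged sketch below: labeled embeddings are stored at setup time, and new inputs are classified by cosine similarity without updating any weights. ADA-CM's concept-guided memory is more elaborate; this shows only the general pattern.

```python
import numpy as np

class FeatureMemory:
    """Training-free classifier: store labeled embeddings, answer queries
    by nearest cosine similarity (illustrative, not ADA-CM's design)."""

    def __init__(self):
        self.keys, self.labels = [], []

    def add(self, embedding, label):
        self.keys.append(embedding / np.linalg.norm(embedding))
        self.labels.append(label)

    def query(self, embedding):
        sims = np.stack(self.keys) @ (embedding / np.linalg.norm(embedding))
        return self.labels[int(np.argmax(sims))]

# usage: memory.add(backbone(image_crop), "hold_cup"); memory.query(new_feat)
```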
arXiv Detail & Related papers (2023-09-07T13:10:06Z)
- EasyHeC: Accurate and Automatic Hand-eye Calibration via Differentiable Rendering and Space Exploration [49.90228618894857]
We introduce a new approach to hand-eye calibration called EasyHeC, which is markerless, white-box, and delivers superior accuracy and robustness.
We propose to use two key technologies: differentiable rendering-based camera pose optimization and consistency-based joint space exploration.
Our evaluation demonstrates superior performance in synthetic and real-world datasets.
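The core mechanism, optimizing a camera pose by gradient descent through a differentiable rendering step, can be reduced to a toy example: below, "rendering" is just pinhole projection of known 3D points (rotation omitted), and the focal length and data are made up, so this is a sketch of the idea rather than EasyHeC itself.

```python
import torch

# Known 3D points in front of the camera and a ground-truth translation
# that the solver does not see; all values invented for this toy.
points_3d = torch.randn(10, 3) + torch.tensor([0.0, 0.0, 5.0])
true_t = torch.tensor([0.2, -0.1, 0.3])
f = 500.0  # assumed focal length (pixels)

def project(pts, t):
    """'Render' by pinhole-projecting translated points (rotation omitted)."""
    p = pts + t
    return f * p[:, :2] / p[:, 2:3]

observed = project(points_3d, true_t)  # what the camera actually saw

t = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.Adam([t], lr=0.05)
for _ in range(500):
    optimizer.zero_grad()
    loss = ((project(points_3d, t) - observed) ** 2).mean()
    loss.backward()        # gradients flow through the 'renderer'
    optimizer.step()
print(t.detach())  # approaches true_t
```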
arXiv Detail & Related papers (2023-05-02T03:49:54Z)
- An automatic differentiation system for the age of differential privacy [65.35244647521989]
We introduce Tritium, an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML).
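Tritium's API is not shown in this summary; as a concrete illustration of autodiff-driven sensitivity control in DP ML, the sketch below implements a DP-SGD-style step in PyTorch, clipping per-sample gradient norms to a sensitivity bound and adding Gaussian noise calibrated to that bound. It is a named stand-in for the general pattern, not Tritium itself.

```python
import torch

def dp_sgd_step(model, loss_fn, xs, ys, clip=1.0, noise_mult=1.0, lr=0.1):
    """One DP-SGD step: per-sample gradients are clipped to `clip` (the
    sensitivity bound) and Gaussian noise scaled to that bound is added."""
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):                      # per-sample gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
        scale = (clip / (norm + 1e-9)).clamp(max=1.0)   # clip sensitivity
        for acc, p in zip(summed, model.parameters()):
            acc += p.grad * scale
    for acc, p in zip(summed, model.parameters()):
        noisy = (acc + torch.randn_like(acc) * noise_mult * clip) / len(xs)
        p.data -= lr * noisy
```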
arXiv Detail & Related papers (2021-09-22T08:07:42Z)
- Improving Variational Autoencoder based Out-of-Distribution Detection for Embedded Real-time Applications [2.9327503320877457]
Out-of-distribution (OoD) detection is an emerging approach to the challenge of detecting, in real time, inputs that fall outside the training distribution.
In this paper, we show how we can robustly detect hazardous motion around autonomous driving agents.
Our methods significantly improve detection of OoD factors in unique driving scenarios, performing 42% better than state-of-the-art approaches.
Our model also generalizes near-perfectly, performing 97% better than the state of the art across the real-world and simulated driving datasets tested.
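One common way a VAE is used for OoD detection, sketched below under assumed architecture and scoring choices, is to flag inputs whose reconstruction error plus KL term exceeds a threshold set on validation data; the paper's exact score is not given in this summary.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Deliberately minimal VAE for illustration (single linear layers)."""

    def __init__(self, d_in=32, d_z=4):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)  # outputs mean and log-variance
        self.dec = nn.Linear(d_z, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def ood_score(vae, x):
    """Higher score = more likely out-of-distribution (assumed rule)."""
    recon, mu, logvar = vae(x)
    rec_err = ((x - recon) ** 2).sum(dim=-1)
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=-1)
    return rec_err + kl

vae = TinyVAE()
scores = ood_score(vae, torch.randn(8, 32))
# threshold e.g. at a high percentile of validation-set scores
```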
arXiv Detail & Related papers (2021-07-25T07:52:53Z)
- Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and the inherent correlations among the modalities for gesture recognition.
Results show that our approach recovers performance with substantial gains, up to 12.91% in accuracy and 20.16% in F1 score, without using any annotations from the real robot.
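Feature alignment between simulator and real-robot domains is often expressed as an extra loss term; the sketch below uses a linear-kernel MMD between mean embeddings as a generic stand-in, omitting the paper's temporal cues and kinematic-visual specifics.

```python
import torch

def mmd_linear(f_sim: torch.Tensor, f_real: torch.Tensor) -> torch.Tensor:
    """Squared distance between the mean feature embeddings of two domains
    (linear-kernel MMD); minimizing it pulls the domains together."""
    return ((f_sim.mean(dim=0) - f_real.mean(dim=0)) ** 2).sum()

# In training: total = task_loss_on_sim + lam * mmd_linear(feat_sim, feat_real)
```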
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
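Fine-tuning Mask-RCNN for a new segmentation target typically follows torchvision's standard recipe, shown below for a hypothetical two-class (background + hand) setup; the dataset and training loop are omitted.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
num_classes = 2  # background + hand (assumed)

# Swap the box and mask heads for the new class count.
in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)
in_feats_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feats_mask, 256, num_classes)
```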
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
- Online Body Schema Adaptation through Cost-Sensitive Active Learning [63.84207660737483]
The work was implemented in a simulation environment, using the 7DoF arm of the iCub robot simulator.
A cost-sensitive active learning approach is used to select optimal joint configurations.
The results show that cost-sensitive active learning achieves accuracy similar to the standard active learning approach while roughly halving the executed movement.
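The selection rule this summary describes can be written as a simple acquisition function, sketched below with a placeholder information-gain model: each candidate joint configuration is scored by expected gain minus a weighted movement cost from the current pose. The gain model and weight are assumptions, not the paper's formulation.

```python
import numpy as np

def select_configuration(candidates, current, info_gain, cost_weight=0.5):
    """Pick the joint configuration maximizing gain minus movement cost.

    candidates: (N, DoF) array of candidate joint configurations
    current:    (DoF,) current joint configuration
    info_gain:  callable mapping a configuration to an expected-gain score
    """
    scores = [info_gain(q) - cost_weight * np.linalg.norm(q - current)
              for q in candidates]
    return candidates[int(np.argmax(scores))]
```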
arXiv Detail & Related papers (2021-01-26T16:01:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.