About an Automating Annotation Method for Robot Markers
- URL: http://arxiv.org/abs/2601.22982v1
- Date: Fri, 30 Jan 2026 13:44:56 GMT
- Title: About an Automating Annotation Method for Robot Markers
- Authors: Wataru Uemura, Takeru Nagashima
- Abstract summary: This paper proposes an automated annotation method for training deep-learning models on ArUco marker images. A YOLO-based model is trained using the automatically annotated dataset, and its performance is evaluated under various conditions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Factory automation has become increasingly important due to labor shortages, leading to the introduction of autonomous mobile robots for tasks such as material transportation. Markers are commonly used for robot self-localization and object identification. In the RoboCup Logistics League (RCLL), ArUco markers are employed both for robot localization and for identifying processing modules. Conventional recognition relies on OpenCV-based image processing, which detects black-and-white marker patterns. However, these methods often fail under noise, motion blur, defocus, or varying illumination conditions. Deep-learning-based recognition offers improved robustness under such conditions, but requires large amounts of annotated data. Annotation must typically be done manually, as the type and position of objects cannot be detected automatically, making dataset preparation a major bottleneck. In contrast, ArUco markers include built-in recognition modules that provide both ID and positional information, enabling automatic annotation. This paper proposes an automated annotation method for training deep-learning models on ArUco marker images. By leveraging marker detection results obtained from the ArUco module, the proposed approach eliminates the need for manual labeling. A YOLO-based model is trained using the automatically annotated dataset, and its performance is evaluated under various conditions. Experimental results demonstrate that the proposed method improves recognition performance compared with conventional image-processing techniques, particularly for images affected by blur or defocus. Automatic annotation also reduces human effort and ensures consistent labeling quality. Future work will investigate the relationship between confidence thresholds and recognition performance.
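Because the ArUco detector itself returns both the marker ID and the four corner coordinates of each detected marker, every successful detection already contains everything a bounding-box label needs. The sketch below illustrates this annotation-generation idea; it is a minimal example assuming OpenCV's cv2.aruco API (version 4.7 or later), and the marker dictionary, file layout, and use of the marker ID as the class index are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch: auto-generate YOLO-format labels from ArUco detections.
# Assumes OpenCV >= 4.7 (cv2.aruco.ArucoDetector); dictionary choice,
# paths, and the ID-to-class mapping are illustrative assumptions.
import glob
import os

import cv2

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

for path in glob.glob("images/*.png"):
    img = cv2.imread(path)
    h, w = img.shape[:2]
    corners, ids, _rejected = detector.detectMarkers(img)
    if ids is None:
        continue  # no marker detected; skip (or keep as a background image)
    lines = []
    for marker_corners, marker_id in zip(corners, ids.flatten()):
        xs = marker_corners[0][:, 0]  # x coordinates of the 4 corners
        ys = marker_corners[0][:, 1]  # y coordinates of the 4 corners
        # Convert the detected corners to a YOLO-format bounding box:
        # "class x_center y_center width height", normalized to [0, 1].
        x_c = (xs.min() + xs.max()) / 2 / w
        y_c = (ys.min() + ys.max()) / 2 / h
        bw = (xs.max() - xs.min()) / w
        bh = (ys.max() - ys.min()) / h
        # The marker ID is used directly as the class index (assumption).
        lines.append(f"{int(marker_id)} {x_c:.6f} {y_c:.6f} {bw:.6f} {bh:.6f}")
    label_path = os.path.splitext(path)[0] + ".txt"
    with open(label_path, "w") as f:
        f.write("\n".join(lines))
```

The emitted .txt files follow the standard YOLO annotation convention (one "class x_center y_center width height" line per object, normalized to the image size), so the resulting dataset can be fed to an off-the-shelf YOLO training pipeline without manual labeling.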
Related papers
- AutoOcc: Automatic Open-Ended Semantic Occupancy Annotation via Vision-Language Guided Gaussian Splatting
AutoOcc is a vision-centric automated pipeline for semantic occupancy annotation. We formulate the open-ended semantic 3D occupancy reconstruction task to automatically generate scene occupancy. Our framework outperforms existing automated occupancy annotation methods without human labels.
arXiv Detail & Related papers (2025-02-07T14:58:59Z) - Automatic Image Annotation for Mapped Features Detection
Road features are a key enabler for autonomous driving and localization. Modern deep learning-based perception systems need a significant amount of annotated data. In this paper, we consider the fusion of three automatic annotation methods in images.
arXiv Detail & Related papers (2024-12-11T09:06:52Z) - Feedback-driven object detection and iterative model improvement
We present the development and evaluation of a platform designed to interactively improve object detection models. The platform allows uploading and annotating images as well as fine-tuning object detection models. We show evidence for a significant time reduction of up to 53% for semi-automatic compared to manual annotation.
arXiv Detail & Related papers (2024-11-29T16:45:25Z) - Keypoint Abstraction using Large Models for Object-Relative Imitation Learning
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics.
Keypoint-based representations have been proven effective as a succinct representation capturing essential object features.
We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
arXiv Detail & Related papers (2024-10-30T17:37:31Z) - Kalib: Easy Hand-Eye Calibration with Reference Point Tracking
Kalib is an automatic hand-eye calibration method that leverages the generalizability of visual foundation models to overcome challenges. During calibration, a kinematic reference point on the robot is tracked in the camera's 3D coordinate space. Kalib's user-friendly design and minimal setup requirements make it a viable solution for continuous operation in unstructured environments.
arXiv Detail & Related papers (2024-08-20T06:03:40Z) - A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation
A key challenge for the widespread application of learning-based models for robotic perception is to significantly reduce the required amount of annotated training data. We exploit the groundwork paved by visual foundation models to train two lightweight network heads for semantic segmentation and object boundary detection. We demonstrate that PASTEL significantly outperforms previous methods for label-efficient segmentation even when using fewer annotations.
arXiv Detail & Related papers (2024-05-29T12:23:29Z) - AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z) - Repurposing SAM for User-Defined Semantics Aware Segmentation
We propose U-SAM, a novel framework that imbues SAM with semantic awareness. U-SAM provides pixel-level semantic annotations for images without requiring any labeled/unlabeled samples from the test data distribution. We evaluate U-SAM on PASCAL VOC 2012 and MSCOCO-80, achieving significant mIoU improvements of +17.95% and +520%, respectively.
arXiv Detail & Related papers (2023-12-05T01:37:18Z) - Helping Hands: An Object-Aware Ego-Centric Video Recognition Model
We introduce an object-aware decoder for improving the performance of ego-centric representations on ego-centric videos.
We show that the model can act as a drop-in replacement for an ego-centric video model to improve performance through visual-text grounding.
arXiv Detail & Related papers (2023-08-15T17:58:11Z) - EasyHeC: Accurate and Automatic Hand-eye Calibration via Differentiable Rendering and Space Exploration
We introduce a new approach to hand-eye calibration called EasyHeC, which is markerless, white-box, and delivers superior accuracy and robustness.
We propose to use two key technologies: differentiable rendering-based camera pose optimization and consistency-based joint space exploration.
Our evaluation demonstrates superior performance in synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-02T03:49:54Z) - Self-Supervised Clustering on Image-Subtracted Data with Deep-Embedded Self-Organizing Map
A self-supervised machine learning model, the deep-embedded self-organizing map (DESOM), is applied to the real-bogus classification problem.
We demonstrate different model training approaches, and find that our best DESOM classifier shows a missed detection rate of 6.6% with a false positive rate of 1.5%.
arXiv Detail & Related papers (2022-09-14T02:37:06Z) - MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition
We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework for point cloud action recognition.
In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE.
MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on the MSR-Action3D dataset.
arXiv Detail & Related papers (2022-09-01T12:32:40Z) - Learning Task Automata for Reinforcement Learning using Hidden Markov Models
This paper proposes a novel pipeline for learning non-Markovian task specifications as succinct finite-state 'task automata'.
We learn a product MDP, a model composed of the specification's automaton and the environment's MDP, by treating the product MDP as a partially observable MDP and using the well-known Baum-Welch algorithm for learning hidden Markov models.
Our learnt task automaton enables the decomposition of a task into its constituent sub-tasks, which improves the rate at which an RL agent can later synthesise an optimal policy.
arXiv Detail & Related papers (2022-08-25T02:58:23Z) - SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z) - ID-Conditioned Auto-Encoder for Unsupervised Anomaly Detection
We introduce ID-Conditioned Auto-Encoder for unsupervised anomaly detection.
Our method is an adaptation of the Class-Conditioned Auto-Encoder (C2AE) designed for open-set recognition.
We evaluate our method on the ToyADMOS and MIMII datasets from the DCASE 2020 Challenge Task 2.
arXiv Detail & Related papers (2020-07-10T11:24:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.