Related papers: DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers

URL: http://arxiv.org/abs/2206.08026v1
Date: Thu, 16 Jun 2022 09:29:26 GMT
Title: DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers
Authors: Mustafa B. Yaldiz, Andreas Meuleman, Hyeonjoong Jang, Hyunho Ha, Min H. Kim
Abstract summary: Existing detection methods assume that markers are printed on ideally planar surfaces. A fiducial marker generator creates a set of free-form color patterns to encode significantly large-scale information. A differentiable image simulator creates a training dataset of photorealistic scene images with the deformed markers. A trained marker detector seeks the regions of interest and recognizes multiple marker patterns simultaneously.
Score: 27.135078472097895
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fiducial markers have been broadly used to identify objects or embed messages that can be detected by a camera. Primarily, existing detection methods assume that markers are printed on ideally planar surfaces. Markers often fail to be recognized due to various imaging artifacts of optical/perspective distortion and motion blur. To overcome these limitations, we propose a novel deformable fiducial marker system that consists of three main parts: First, a fiducial marker generator creates a set of free-form color patterns to encode significantly large-scale information in unique visual codes. Second, a differentiable image simulator creates a training dataset of photorealistic scene images with the deformed markers, being rendered during optimization in a differentiable manner. The rendered images include realistic shading with specular reflection, optical distortion, defocus and motion blur, color alteration, imaging noise, and shape deformation of markers. Lastly, a trained marker detector seeks the regions of interest and recognizes multiple marker patterns simultaneously via inverse deformation transformation. The deformable marker creator and detector networks are jointly optimized via the differentiable photorealistic renderer in an end-to-end manner, allowing us to robustly recognize a wide range of deformable markers with high accuracy. Our deformable marker system is capable of decoding 36-bit messages successfully at ~29 fps with severe shape deformation. Results validate that our system significantly outperforms the traditional and data-driven marker methods. Our learning-based marker system opens up new interesting applications of fiducial markers, including cost-effective motion capture of the human body, active 3D scanning using our fiducial markers' array as structured light patterns, and robust augmented reality rendering of virtual objects on dynamic surfaces.
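To make the "36-bit messages" figure in the abstract concrete, the following minimal sketch shows how many distinct marker IDs 36 bits allow and how an ID can be round-tripped through a bit vector. It is not the authors' method: in the paper, encoding and decoding are performed by jointly trained networks. The function names (`id_to_bits`, `bits_to_id`, `decode_soft_bits`) and the assumption that a detector emits one probability per bit, thresholded at 0.5, are hypothetical and used only for illustration.

```python
from typing import List

NUM_BITS = 36  # message length stated in the abstract


def id_to_bits(marker_id: int, num_bits: int = NUM_BITS) -> List[int]:
    """Convert an integer ID to a big-endian list of 0/1 bits."""
    if not 0 <= marker_id < 2 ** num_bits:
        raise ValueError(f"marker_id must fit in {num_bits} bits")
    return [(marker_id >> i) & 1 for i in reversed(range(num_bits))]


def bits_to_id(bits: List[int]) -> int:
    """Convert a big-endian list of 0/1 bits back to an integer ID."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def decode_soft_bits(probs: List[float], threshold: float = 0.5) -> int:
    """Threshold per-bit probabilities (assumed detector output) into an ID."""
    return bits_to_id([1 if p >= threshold else 0 for p in probs])


if __name__ == "__main__":
    print("distinct 36-bit messages:", 2 ** NUM_BITS)
    marker_id = 123_456_789
    bits = id_to_bits(marker_id)
    # Simulate confident but imperfect per-bit predictions.
    probs = [0.9 if b else 0.1 for b in bits]
    print("round trip ok:", decode_soft_bits(probs) == marker_id)
```

A real system would typically add error detection or correction on top of raw thresholding; this sketch omits that to keep the capacity arithmetic visible.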

Related papers

Edge-case Synthesis for Fisheye Object Detection: A Data-centric Perspective [2.4603149388689514]
Fisheye cameras introduce significant distortion and pose unique challenges to object detection models trained on conventional datasets. We propose a data-centric pipeline that systematically improves detection performance by focusing on the key question of identifying the blind spots of the model.
arXiv Detail & Related papers (2025-07-22T06:07:07Z)
MAMMA: Markerless & Automatic Multi-Person Motion Action Capture [37.06717786024836]
MAMMA is a markerless motion-capture pipeline that recovers SMPL-X parameters from multi-view video of two-person interaction sequences. We introduce a method that predicts dense 2D surface landmarks conditioned on segmentation masks. We demonstrate that our approach can handle complex person-person interaction and offers greater accuracy than existing methods.
arXiv Detail & Related papers (2025-06-16T02:04:51Z)
New Efficient Visual OILU Markers [0.5120567378386615]
We will exploit basic patterns to develop new, efficient visual markers. The proposed markers allow producing a rich panel of unique identifiers. The robustness of the markers against acquisition and geometric distortions is validated.
arXiv Detail & Related papers (2024-04-12T13:55:05Z)
OsmLocator: locating overlapping scatter marks with a non-training generative perspective [48.50108853199417]
Locating overlapping marks faces many difficulties, such as no texture, less contextual information, hollow shape, and tiny size. Here, we formulate it as an optimization problem on clustering-based re-visualization from a non-training generative perspective. We especially built a dataset named 2023 containing hundreds of scatter images with different markers and various levels of overlapping severity, and tested the proposed method and compared it to existing methods.
arXiv Detail & Related papers (2023-12-18T12:39:48Z)
Differentiable Registration of Images and LiDAR Point Clouds with VoxelPoint-to-Pixel Matching [58.10418136917358]
Cross-modality registration between 2D images from cameras and 3D point clouds from LiDARs is a crucial task in computer vision and robotic training. Previous methods estimate 2D-3D correspondences by matching point and pixel patterns learned by neural networks. We learn a structured cross-modality matching solver to represent 3D features via a different latent pixel space.
arXiv Detail & Related papers (2023-12-07T05:46:10Z)
A Locality-based Neural Solver for Optical Motion Capture [37.28597049192196]
Given noisy marker data, we propose a new heterogeneous graph neural network which treats markers and joints as different types of nodes. We show that our method outperforms state-of-the-art methods in terms of prediction accuracy of occluded marker position error.
arXiv Detail & Related papers (2023-09-01T12:40:17Z)
Generalizable Person Re-Identification via Viewpoint Alignment and Fusion [74.30861504619851]
This work proposes to use a 3D dense pose estimation model and a texture mapping module to map pedestrian images to canonical view images. Due to the imperfection of the texture mapping module, the canonical view images may lose the discriminative detail clues from the original images. We show that our method can lead to superior performance over the existing approaches in various evaluation settings.
arXiv Detail & Related papers (2022-12-05T16:24:09Z)
NeuralMarker: A Framework for Learning General Marker Correspondence [25.822657926255573]
We tackle the problem of estimating correspondences from a general marker, such as a movie poster, to an image that captures such a marker. We propose a novel framework NeuralMarker, training a neural network estimating dense marker correspondences under various challenging conditions. We show that NeuralMarker significantly outperforms previous methods and enables new interesting applications, including Augmented Reality (AR) and video editing.
arXiv Detail & Related papers (2022-09-19T10:04:38Z)
Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images. Basically, we design a dual-branch network equipped with an active labeling strategy, to maximize the power of tiny parts of labels. Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z)
DeepTag: A General Framework for Fiducial Marker Design and Detection [1.2180122937388957]
We propose a general deep learning based framework, DeepTag, for fiducial marker design and detection. DeepTag supports detection of a wide variety of existing marker families and makes it possible to design new marker families with customized local patterns. Experiments show that DeepTag well supports different marker families and greatly outperforms the existing methods in terms of both detection robustness and pose accuracy.
arXiv Detail & Related papers (2021-05-28T10:54:59Z)
MarkerPose: Robust Real-time Planar Target Tracking for Accurate Stereo Pose Estimation [0.0]
MarkerPose is a real-time pose estimation system based on a planar target of three circles and a stereo vision system. Our method consists of two deep neural networks for marker point detection. We demonstrate the suitability of MarkerPose in a 3D freehand ultrasound system.
arXiv Detail & Related papers (2021-05-02T01:09:13Z)
Supervision by Registration and Triangulation for Landmark Detection [70.13440728689231]
We present Supervision by Registration and Triangulation (SRT), an unsupervised approach that utilizes unlabeled multi-view video to improve the accuracy and precision of landmark detectors. Being able to utilize unlabeled data enables our detectors to learn from massive amounts of unlabeled data freely available.
arXiv Detail & Related papers (2021-01-25T02:48:21Z)
Deep Soft Procrustes for Markerless Volumetric Sensor Alignment [81.13055566952221]
In this work, we improve markerless data-driven correspondence estimation to achieve more robust multi-sensor spatial alignment. We incorporate geometric constraints in an end-to-end manner into a typical segmentation based model and bridge the intermediate dense classification task with the targeted pose estimation one. Our model is experimentally shown to achieve similar results with marker-based methods and outperform the markerless ones, while also being robust to the pose variations of the calibration structure.
arXiv Detail & Related papers (2020-03-23T10:51:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.