Kineo: Calibration-Free Metric Motion Capture From Sparse RGB Cameras
- URL: http://arxiv.org/abs/2510.24464v2
- Date: Mon, 03 Nov 2025 12:48:15 GMT
- Title: Kineo: Calibration-Free Metric Motion Capture From Sparse RGB Cameras
- Authors: Charles Javerliat, Pierre Raimbaud, Guillaume Lavoué,
- Abstract summary: We present Kineo, a fully automatic, calibration-free pipeline for markerless motion capture from videos captured by unsynchronized, uncalibrated RGB cameras. A confidence-driven keypoint sampling strategy, combined with graph-based global optimization, ensures robust calibration at a fixed computational cost independent of sequence length. Kineo reduces camera translation error by approximately 83-85%, camera angular error by 86-92%, and world mean-per-joint error (W-MPJPE) by 83-91%.
- Score: 2.6941922156574267
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Markerless multiview motion capture is often constrained by the need for precise camera calibration, limiting accessibility for non-experts and in-the-wild captures. Existing calibration-free approaches mitigate this requirement but suffer from high computational cost and reduced reconstruction accuracy. We present Kineo, a fully automatic, calibration-free pipeline for markerless motion capture from videos captured by unsynchronized, uncalibrated, consumer-grade RGB cameras. Kineo leverages 2D keypoints from off-the-shelf detectors to simultaneously calibrate cameras, including Brown-Conrady distortion coefficients, and reconstruct 3D keypoints and dense scene point maps at metric scale. A confidence-driven spatio-temporal keypoint sampling strategy, combined with graph-based global optimization, ensures robust calibration at a fixed computational cost independent of sequence length. We further introduce a pairwise reprojection consensus score to quantify 3D reconstruction reliability for downstream tasks. Evaluations on EgoHumans and Human3.6M demonstrate substantial improvements over prior calibration-free methods. Compared to previous state-of-the-art approaches, Kineo reduces camera translation error by approximately 83-85%, camera angular error by 86-92%, and world mean-per-joint error (W-MPJPE) by 83-91%. Kineo is also efficient in real-world scenarios, processing multi-view sequences faster than their duration in specific configurations (e.g., 36 min to process 1 h 20 min of footage). The full pipeline and evaluation code are openly released to promote reproducibility and practical adoption at https://liris-xr.github.io/kineo/.
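The abstract mentions that Kineo estimates Brown-Conrady distortion coefficients alongside the other camera parameters. The Brown-Conrady model itself is standard; as a minimal sketch, the forward distortion it applies to normalized image coordinates looks like the following (coefficient names follow the common OpenCV convention; the paper's exact parameterization may differ):

```python
import numpy as np

def brown_conrady_distort(xy, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Apply Brown-Conrady radial (k1, k2, k3) and tangential (p1, p2)
    distortion to normalized image coordinates xy of shape (N, 2)."""
    x, y = xy[:, 0], xy[:, 1]
    r2 = x * x + y * y                      # squared radius from optical axis
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return np.stack([xd, yd], axis=1)

# With all coefficients at zero, the mapping is the identity:
pts = np.array([[0.1, -0.2], [0.0, 0.0], [0.3, 0.4]])
assert np.allclose(brown_conrady_distort(pts), pts)
```

During calibration, these coefficients would be optimized jointly with the intrinsics and extrinsics so that distorted reprojections of the 3D keypoints match the observed 2D detections.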
Related papers
- PTZ-Calib: Robust Pan-Tilt-Zoom Camera Calibration [32.4466455429431]
In this paper, we present a robust two-stage camera calibration method. In the offline stage, we first uniformly select a set of reference images that sufficiently overlap to encompass a complete 360° view. In the online stage, we cast the calibration of any new viewpoint as a relocalization problem.
arXiv Detail & Related papers (2025-02-13T08:45:43Z) - CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras [18.51320244029833]
It is now possible to estimate 3D human pose from monocular images with off-the-shelf 3D pose estimators.
Many practical applications require fine-grained absolute pose information for which multi-view cues and camera calibration are necessary.
Our goal is full automation, which includes temporal synchronization, as well as intrinsic and extrinsic camera calibration.
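Like Kineo, CasCalib must temporally synchronize cameras before calibrating them. As an illustrative sketch only (not either paper's actual procedure), an integer frame offset between two views can be recovered by cross-correlating a 1-D motion signal, e.g. the mean 2D keypoint speed per frame, from each camera:

```python
import numpy as np

def estimate_frame_offset(sig_a, sig_b, max_lag=120):
    """Return the integer lag L such that sig_b[t] ~ sig_a[t + L],
    found by maximizing the normalized cross-correlation over lags."""
    a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-8)
    b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-8)
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], b[: len(b) - lag]
        else:
            x, y = a[: len(a) + lag], b[-lag:]
        n = min(len(x), len(y))
        if n < 2:
            continue
        score = float(np.dot(x[:n], y[:n])) / n  # mean correlation at this lag
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag
```

With sub-frame accuracy, rolling shutter, and drift, real synchronization is harder than this; the sketch only conveys the core alignment idea.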
arXiv Detail & Related papers (2024-05-10T23:02:23Z) - PeLiCal: Targetless Extrinsic Calibration via Penetrating Lines for RGB-D Cameras with Limited Co-visibility [11.048526314073886]
We present PeLiCal, a novel line-based calibration approach for RGB-D camera systems exhibiting limited overlap.
Our method leverages long line features from surroundings, and filters out outliers with a novel convergence voting algorithm.
arXiv Detail & Related papers (2024-04-22T07:50:24Z) - W-HMR: Monocular Human Mesh Recovery in World Space with Weak-Supervised Calibration [57.37135310143126]
Previous methods for 3D motion recovery from monocular images often fall short due to reliance on camera coordinates.
We introduce W-HMR, a weak-supervised calibration method that predicts "reasonable" focal lengths based on body distortion information.
We also present the OrientCorrect module, which corrects body orientation for plausible reconstructions in world space.
arXiv Detail & Related papers (2023-11-29T09:02:07Z) - Pixel-wise Smoothing for Certified Robustness against Camera Motion Perturbations [45.576866560987405]
We present a framework for certifying the robustness of 3D-2D projective transformations against camera motion perturbations.
Our approach leverages a smoothing distribution over the 2D pixel space instead of in the 3D physical space.
Our approach achieves approximately 80% certified accuracy while utilizing only 30% of the projected image frames.
arXiv Detail & Related papers (2023-09-22T19:15:49Z) - Self-Supervised Camera Self-Calibration from Video [34.35533943247917]
We propose a learning algorithm to regress per-sequence calibration parameters using an efficient family of general camera models.
Our procedure achieves self-calibration results with sub-pixel reprojection error, outperforming other learning-based methods.
arXiv Detail & Related papers (2021-12-06T19:42:05Z) - Unsupervised Depth Completion with Calibrated Backprojection Layers [79.35651668390496]
We propose a deep neural network architecture to infer dense depth from an image and a sparse point cloud.
It is trained using a video stream and corresponding synchronized sparse point cloud, as obtained from a LIDAR or other range sensor, along with the intrinsic calibration parameters of the camera.
At inference time, the calibration of the camera, which can be different from the one used for training, is fed as an input to the network along with the sparse point cloud and a single image.
arXiv Detail & Related papers (2021-08-24T05:41:59Z) - Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature correspondences.
arXiv Detail & Related papers (2021-07-08T15:19:36Z) - Infrastructure-based Multi-Camera Calibration using Radial Projections [117.22654577367246]
Pattern-based calibration techniques can be used to calibrate the intrinsics of the cameras individually.
Infrastructure-based calibration techniques are able to estimate the extrinsics using 3D maps pre-built via SLAM or Structure-from-Motion.
We propose to fully calibrate a multi-camera system from scratch using an infrastructure-based approach.
arXiv Detail & Related papers (2020-07-30T09:21:04Z) - Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.