Semantic keypoint extraction for scanned animals using
multi-depth-camera systems
- URL: http://arxiv.org/abs/2211.08634v1
- Date: Wed, 16 Nov 2022 03:06:17 GMT
- Title: Semantic keypoint extraction for scanned animals using
multi-depth-camera systems
- Authors: Raphael Falque and Teresa Vidal-Calleja and Alen Alempijevic
- Abstract summary: Keypoint annotation in point clouds is an important task for 3D reconstruction, object tracking and alignment.
In the context of agriculture, it is a critical task for livestock automation to work toward condition assessment or behaviour recognition.
We propose a novel approach for semantic keypoint annotation in point clouds, by reformulating the keypoint extraction as a regression problem.
Our method is tested on data collected in the field, on moving beef cattle, with a calibrated system of multiple hardware-synchronised RGB-D cameras.
- Score: 2.513785998932353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Keypoint annotation in point clouds is an important task for 3D
reconstruction, object tracking and alignment, in particular in deformable or
moving scenes. In the context of agriculture robotics, it is a critical task
for livestock automation to work toward condition assessment or behaviour
recognition. In this work, we propose a novel approach for semantic keypoint
annotation in point clouds, by reformulating the keypoint extraction as a
regression problem of the distance between the keypoints and the rest of the
point cloud. We use the distance on the point cloud manifold mapped into a
radial basis function (RBF), which is then learned using an encoder-decoder
architecture. Special consideration is given to the data augmentation specific
to multi-depth-camera systems by considering noise over the extrinsic
calibration and camera frame dropout. Additionally, we investigate
computationally efficient non-rigid deformation methods that can be applied to
animal point clouds. Our method is tested on data collected in the field, on
moving beef cattle, with a calibrated system of multiple hardware-synchronised
RGB-D cameras.
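The core idea above, regressing an RBF-mapped manifold distance per point, can be illustrated with a short sketch. Below is a minimal, hypothetical construction of such a regression target, assuming a k-NN graph as the manifold approximation and a Gaussian RBF; the function name and the k and sigma values are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.sparse.csgraph import dijkstra
from sklearn.neighbors import kneighbors_graph

def rbf_keypoint_targets(points, keypoint_indices, k=8, sigma=0.05):
    """Per-point regression targets: geodesic distance from each keypoint,
    mapped through a Gaussian RBF (1.0 at the keypoint, decaying with
    distance). Illustrative sketch, not the paper's reference implementation."""
    # Approximate the point-cloud manifold with a k-NN graph whose edge
    # weights are Euclidean distances between neighbouring points.
    graph = kneighbors_graph(points, n_neighbors=k, mode="distance")
    # Shortest-path (geodesic) distance from every keypoint to all points.
    geo = dijkstra(graph, directed=False, indices=keypoint_indices)
    geo[~np.isfinite(geo)] = geo[np.isfinite(geo)].max()  # disconnected points
    # The Gaussian RBF turns distances into smooth, heatmap-like targets.
    return np.exp(-geo**2 / (2.0 * sigma**2))  # shape: (n_keypoints, n_points)
```

An encoder-decoder network can then regress one such channel per keypoint, with keypoints recovered at inference time as the per-channel argmax.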
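The multi-depth-camera augmentation the abstract describes (noise over the extrinsic calibration plus camera frame dropout) could look roughly like the following, assuming per-camera clouds in camera coordinates and 4x4 camera-to-world extrinsics; all function names and noise magnitudes are assumptions for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def augment_multi_camera(cam_clouds, extrinsics, rot_std_deg=0.5,
                         trans_std_m=0.01, drop_prob=0.1, rng=None):
    """Merge per-camera clouds after perturbing each camera's extrinsic
    and randomly dropping whole camera frames (illustrative sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    merged = []
    for cloud, T in zip(cam_clouds, extrinsics):
        if rng.random() < drop_prob:  # simulate a dropped camera frame
            continue
        T_noisy = T.copy()
        # Small random rotation (rotation-vector noise) and translation
        # noise model imperfect extrinsic calibration.
        dR = Rotation.from_rotvec(
            rng.normal(0.0, np.deg2rad(rot_std_deg), 3)).as_matrix()
        T_noisy[:3, :3] = dR @ T[:3, :3]
        T_noisy[:3, 3] += rng.normal(0.0, trans_std_m, 3)
        merged.append(cloud @ T_noisy[:3, :3].T + T_noisy[:3, 3])
    return np.concatenate(merged) if merged else np.empty((0, 3))
```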
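The abstract also mentions computationally efficient non-rigid deformation for animal point clouds without naming a specific method; one common lightweight family is an RBF-weighted control-point warp, sketched below under that assumption (the Gaussian weighting and the sigma value are illustrative, not from the paper).

```python
import numpy as np

def rbf_warp(points, control_pts, control_disp, sigma=0.15):
    """Smooth non-rigid warp: each point moves by the Gaussian-weighted
    average of the control-point displacements (illustrative sketch)."""
    d2 = ((points[:, None, :] - control_pts[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma**2))           # (n_points, n_controls)
    w /= w.sum(axis=1, keepdims=True) + 1e-12    # normalise weights per point
    return points + w @ control_disp             # displace by weighted field
```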
Related papers
- Trainable Pointwise Decoder Module for Point Cloud Segmentation [12.233802912441476]
Point cloud segmentation (PCS) aims to make per-point predictions and enables robots and autonomous vehicles to understand the environment.
We propose a trainable pointwise decoder module (PDM) as the post-processing approach.
We also introduce a virtual range image-guided copy-rotate-paste strategy in data augmentation.
arXiv Detail & Related papers (2024-08-02T19:29:35Z)
- Track Initialization and Re-Identification for 3D Multi-View Multi-Object Tracking [12.389483990547223]
We propose a 3D multi-object tracking (MOT) solution using only 2D detections from monocular cameras.
We exploit the 2D detections and extracted features from multiple cameras to provide a better approximation of the multi-object filtering density.
arXiv Detail & Related papers (2024-05-28T21:36:16Z)
- PosDiffNet: Positional Neural Diffusion for Point Cloud Registration in a Large Field of View with Perturbations [27.45001809414096]
PosDiffNet is a model for point cloud registration in 3D computer vision.
We leverage a graph neural partial differential equation (PDE) based on Beltrami flow to obtain high-dimensional features.
We employ multi-level correspondences derived from high feature-similarity scores to facilitate alignment between point clouds.
We evaluate PosDiffNet on several 3D point cloud datasets, verifying that it achieves state-of-the-art (SOTA) performance for point cloud registration in large fields of view with perturbations.
arXiv Detail & Related papers (2024-01-06T08:58:15Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Ponder: Point Cloud Pre-training via Neural Rendering [93.34522605321514]
We propose a novel approach to self-supervised learning of point cloud representations via differentiable neural rendering.
The learned point-cloud representation can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but also low-level tasks like 3D reconstruction and image rendering.
arXiv Detail & Related papers (2022-12-31T08:58:39Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection [64.2159881697615]
Object detection from 3D point clouds remains a challenging task, though recent studies have pushed the envelope with deep learning techniques.
We propose a domain-adaptation-like approach to enhance the robustness of the feature representation.
Our simple yet effective approach fundamentally boosts the performance of 3D point cloud object detection and achieves the state-of-the-art results.
arXiv Detail & Related papers (2020-06-08T05:15:06Z)
- Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline in identifying whether a recalibration of the camera's intrinsic parameters is required.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
- View Invariant Human Body Detection and Pose Estimation from Multiple Depth Sensors [0.7080990243618376]
We propose an end-to-end multi-person 3D pose estimation network, Point R-CNN, using multiple point cloud sources.
We conduct extensive experiments to simulate challenging real world cases, such as individual camera failures, various target appearances, and complex cluttered scenes.
We also show that our end-to-end network greatly outperforms cascaded state-of-the-art models.
arXiv Detail & Related papers (2020-05-08T19:06:28Z)