3DMotion-Net: Learning Continuous Flow Function for 3D Motion Prediction
- URL: http://arxiv.org/abs/2006.13906v1
- Date: Wed, 24 Jun 2020 17:39:19 GMT
- Title: 3DMotion-Net: Learning Continuous Flow Function for 3D Motion Prediction
- Authors: Shuaihang Yuan, Xiang Li, Anthony Tzes, Yi Fang
- Abstract summary: We deal with the problem of predicting the future 3D motion of 3D object scans from the previous two consecutive frames.
We propose a self-supervised approach that leverages deep neural networks to learn a continuous flow function over 3D point clouds.
We perform extensive experiments on the D-FAUST, SCAPE, and TOSCA benchmark datasets, and the results demonstrate that our approach is capable of handling temporally inconsistent input.
- Score: 12.323767993152968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we deal with the problem of predicting the future 3D motions of
3D object scans from the previous two consecutive frames. Previous methods mostly
focus on sparse motion prediction in the form of skeletons, whereas in this paper
we focus on predicting dense 3D motions in the form of 3D point clouds. To
approach this problem, we propose a self-supervised approach that leverages
deep neural networks to learn a continuous flow function of 3D point clouds
that can predict temporally consistent future motions and, at the same time,
naturally bring out the correspondences among consecutive point clouds. More
specifically, to eliminate the unsolved and challenging process of defining a
discrete point convolution on 3D point cloud sequences to encode spatial and
temporal information, we introduce a learnable latent code to represent the
temporal-aware shape descriptor, which is optimized during model training.
Moreover, a temporally consistent motion Morpher is proposed to learn a
continuous flow field which deforms a 3D scan from the current frame to the
next frame. We perform extensive experiments on the D-FAUST, SCAPE, and TOSCA
benchmark datasets, and the results demonstrate that our approach is capable
of handling temporally inconsistent input and produces consistent future 3D
motion while requiring no ground-truth supervision.
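To make the two components concrete, below is a minimal, hedged sketch of the idea: a latent code z serves as the temporal-aware shape descriptor and is optimized jointly with an MLP flow function that displaces each point of the current frame toward the next frame, supervised only by a Chamfer distance. All names (ContinuousFlowMLP, latent_dim) and sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): an MLP flow function
# f(x, z) displaces each point of the current frame, conditioned on a
# learnable latent code z acting as the temporal-aware shape descriptor.
import torch
import torch.nn as nn

class ContinuousFlowMLP(nn.Module):
    def __init__(self, latent_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # per-point 3D displacement
        )

    def forward(self, points, z):
        # points: (N, 3); z: (latent_dim,) broadcast to every point
        z_tiled = z.unsqueeze(0).expand(points.shape[0], -1)
        return points + self.net(torch.cat([points, z_tiled], dim=-1))

def chamfer(a, b):
    # Symmetric Chamfer distance between point sets a: (N, 3) and b: (M, 3).
    d = torch.cdist(a, b)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

# Self-supervised step: the latent code is optimized jointly with the
# network (auto-decoder style), so no discrete point convolution over the
# point cloud sequence is ever defined.
flow = ContinuousFlowMLP()
z = nn.Parameter(torch.zeros(128))
opt = torch.optim.Adam(list(flow.parameters()) + [z], lr=1e-3)

frame_t, frame_t1 = torch.rand(1024, 3), torch.rand(1024, 3)  # stand-ins
opt.zero_grad()
loss = chamfer(flow(frame_t, z), frame_t1)  # no ground-truth flow needed
loss.backward()
opt.step()
```

Because z is optimized directly rather than predicted by an encoder, the same trained flow function can be queried continuously to morph the current scan forward in time.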
Related papers
- Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences [25.74000325019015]
We introduce a novel LiDAR 3D object detection framework, namely LiSTM, to facilitate spatial-temporal feature learning with cross-frame motion forecasting information.
We have conducted experiments on benchmark datasets including nuScenes to demonstrate that the proposed framework achieves superior 3D detection performance.
arXiv Detail & Related papers (2024-09-06T16:29:04Z)
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of points.
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
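As a rough illustration of the SPCV representation above, the sketch below stores each point cloud frame as an H x W x 3 grid whose pixel values are 3D coordinates, so a sequence becomes an ordinary (T, H, W, 3) video tensor. The fixed spherical projection used here is an assumed stand-in for the paper's learned, self-supervised structurization.

```python
# Hedged sketch: represent a point cloud frame as a 2D "image" whose pixel
# values are 3D coordinates, so a sequence becomes a (T, H, W, 3) video.
import numpy as np

def to_grid(points, H=64, W=256):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-8
    yaw = np.arctan2(y, x)                    # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / r, -1, 1))  # elevation in [-pi/2, pi/2]
    u = ((yaw + np.pi) / (2 * np.pi) * (W - 1)).astype(int)
    v = ((pitch + np.pi / 2) / np.pi * (H - 1)).astype(int)
    grid = np.zeros((H, W, 3), dtype=np.float32)
    grid[v, u] = points  # pixel value = 3D coordinate (collisions keep last)
    return grid

frames = [np.random.rand(2048, 3) for _ in range(4)]  # stand-in sequence
video = np.stack([to_grid(f) for f in frames])        # (T, H, W, 3)
```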
- Learning Spatial and Temporal Variations for 4D Point Cloud Segmentation [0.39373541926236766]
We argue that temporal information across frames provides crucial knowledge for 3D scene perception.
We design a temporal variation-aware module and a temporal voxel-point refiner to capture the temporal variation in the 4D point cloud.
arXiv Detail & Related papers (2022-07-11T07:36:26Z)
- RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds [44.034836961967144]
3D motion estimation including scene flow and point cloud registration has drawn increasing interest.
Recent methods employ deep neural networks to construct the cost volume for estimating accurate 3D flow.
We decompose the problem into two interlaced stages: the 3D flows are optimized point-wise in the first stage and then globally regularized by a recurrent network in the second stage.
arXiv Detail & Related papers (2022-05-23T04:04:30Z)
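A hedged sketch of the two-stage decomposition in RCP above: stage one initializes a per-point flow from closest points, and stage two is replaced here by a simple iterative k-NN smoothing as a stand-in for the recurrent global regularization network.

```python
# Illustrative only, not the RCP architecture.
import numpy as np

def closest_point_flow(src, dst):
    # Stage 1: each source point flows toward its nearest target point.
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
    return dst[d.argmin(axis=1)] - src        # (N, 3) initial flow

def regularize(src, flow, k=8, iters=3):
    # Stage 2 stand-in: repeatedly average each point's flow with its
    # k nearest neighbors so the field becomes globally smooth.
    d = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    knn = d.argsort(axis=1)[:, :k]
    for _ in range(iters):
        flow = flow[knn].mean(axis=1)
    return flow

src, dst = np.random.rand(512, 3), np.random.rand(512, 3)  # stand-ins
flow = regularize(src, closest_point_flow(src, dst))
```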
- Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks [27.49539859498477]
Exploiting past 3D LiDAR scans to predict future point clouds is a promising method for autonomous mobile systems.
We propose an end-to-end approach that exploits a 2D range image representation of each 3D LiDAR scan.
We develop an encoder-decoder architecture using 3D convolutions to jointly aggregate spatial and temporal information of the scene.
arXiv Detail & Related papers (2021-09-28T19:58:13Z)
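As a minimal illustration of the encoder-decoder idea above, the sketch below stacks T range images into a 5D tensor and aggregates spatial and temporal information jointly with 3D convolutions; the layer sizes and shapes are assumptions, not the paper's architecture.

```python
# Hedged sketch: treat T past range images as the depth dimension of a
# 3D convolutional encoder-decoder that predicts future range images.
import torch
import torch.nn as nn

class RangeImagePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Conv3d(32, 1, kernel_size=3, padding=1)

    def forward(self, x):                    # x: (B, 1, T, H, W) past scans
        return self.decode(self.encode(x))   # predicted future range images

model = RangeImagePredictor()
past = torch.rand(1, 1, 5, 64, 256)  # 5 past scans as 64x256 range images
future = model(past)                 # (1, 1, 5, 64, 256)
```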
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
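The cylindrical partition mentioned above can be illustrated in a few lines: points are binned by radius, azimuth, and height instead of a Cartesian grid, so sparse distant regions fall into proportionally larger cells. The bin counts and ranges below are illustrative assumptions.

```python
# Hedged sketch of a cylindrical partition for LiDAR points.
import numpy as np

def cylindrical_voxel_ids(points, n_rho=32, n_phi=64, n_z=16,
                          rho_max=50.0, z_min=-3.0, z_max=3.0):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.sqrt(x**2 + y**2)            # radial distance from sensor
    phi = np.arctan2(y, x)                # azimuth in [-pi, pi]
    i = np.clip((rho / rho_max * n_rho).astype(int), 0, n_rho - 1)
    j = np.clip(((phi + np.pi) / (2 * np.pi) * n_phi).astype(int), 0, n_phi - 1)
    k = np.clip(((z - z_min) / (z_max - z_min) * n_z).astype(int), 0, n_z - 1)
    return np.stack([i, j, k], axis=1)    # one (rho, phi, z) cell per point

pts = np.random.randn(4096, 3) * 10       # stand-in LiDAR points
cells = cylindrical_voxel_ids(pts)
```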
- Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds [96.9027094562957]
We introduce a spatio-temporal representation learning (STRL) framework, capable of learning from unlabeled 3D point clouds.
Inspired by how infants learn from visual data in the wild, we explore rich cues derived from the 3D data.
STRL takes two temporally related frames from a 3D point cloud sequence as input, transforms them with spatial data augmentation, and learns an invariant representation in a self-supervised manner.
arXiv Detail & Related papers (2021-09-01T04:17:11Z)
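A hedged sketch of the two-frame self-supervised setup described above: two temporally adjacent frames are independently augmented, encoded, and pulled together in embedding space. The tiny PointNet-style encoder and cosine loss are simplified stand-ins, not the STRL architecture.

```python
# Illustrative two-frame self-supervision on point clouds.
import math, random
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointEncoder(nn.Module):
    # Tiny PointNet-style encoder: per-point MLP followed by max pooling.
    def __init__(self, dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, dim))

    def forward(self, pts):                      # pts: (N, 3)
        return self.mlp(pts).max(dim=0).values   # (dim,) global feature

def augment(pts):
    # Spatial augmentation: random rotation about the up axis plus jitter.
    a = random.uniform(0, 2 * math.pi)
    R = torch.tensor([[math.cos(a), -math.sin(a), 0.0],
                      [math.sin(a),  math.cos(a), 0.0],
                      [0.0, 0.0, 1.0]])
    return pts @ R.T + 0.01 * torch.randn_like(pts)

enc = PointEncoder()
frame_t, frame_t1 = torch.rand(1024, 3), torch.rand(1024, 3)  # stand-ins
z1, z2 = enc(augment(frame_t)), enc(augment(frame_t1))
loss = 1 - F.cosine_similarity(z1, z2, dim=0)  # pull embeddings together
loss.backward()
```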
- Self-Attentive 3D Human Pose and Shape Estimation from Videos [82.63503361008607]
We present a video-based learning algorithm for 3D human pose and shape estimation.
We exploit temporal information in videos and propose a self-attention module.
We evaluate our method on the 3DPW, MPI-INF-3DHP, and Human3.6M datasets.
arXiv Detail & Related papers (2021-03-26T00:02:19Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a framework based on 3D cylinder partition and 3D cylinder convolution, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- DeepTracking-Net: 3D Tracking with Unsupervised Learning of Continuous Flow [12.690471276907445]
This paper deals with the problem of 3D tracking, i.e., finding dense correspondences in a sequence of time-varying 3D shapes.
We propose a novel unsupervised 3D shape framework named DeepTracking-Net, which uses deep neural networks (DNNs) as auxiliary functions.
In addition, we contribute a new synthetic 3D dataset, named SynMotions, to the 3D tracking and recognition community.
arXiv Detail & Related papers (2020-06-24T16:20:48Z)