No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static
Models by Fitting Feature-level Space-time Surfaces
- URL: http://arxiv.org/abs/2203.11113v2
- Date: Wed, 23 Mar 2022 16:36:25 GMT
- Title: No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static
Models by Fitting Feature-level Space-time Surfaces
- Authors: Jia-Xing Zhong, Kaichen Zhou, Qingyong Hu, Bing Wang, Niki Trigoni,
Andrew Markham
- Abstract summary: We propose a kinematics-inspired neural network (Kinet) to capture 3D motions without explicitly tracking correspondences.
Kinet implicitly encodes feature-level dynamics and gains advantages from the use of mature backbones for static point cloud processing.
Kinet achieves an accuracy of 93.27% on MSRAction-3D with only 3.20M parameters and 10.35G FLOPs.
- Score: 46.8891422128
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene flow is a powerful tool for capturing the motion field of 3D point
clouds. However, it is difficult to directly apply flow-based models to dynamic
point cloud classification since the unstructured points make it hard or even
impossible to efficiently and effectively trace point-wise correspondences. To
capture 3D motions without explicitly tracking correspondences, we propose a
kinematics-inspired neural network (Kinet) by generalizing the kinematic
concept of ST-surfaces to the feature space. By unrolling the normal solver of
ST-surfaces in the feature space, Kinet implicitly encodes feature-level
dynamics and gains advantages from the use of mature backbones for static point
cloud processing. With only minor changes in network structures and low
computing overhead, it is painless to jointly train and deploy our framework
with a given static model. Experiments on NvGesture, SHREC'17, MSRAction-3D,
and NTU-RGBD demonstrate its efficacy in performance, its efficiency in both
parameter count and computational complexity, and its versatility across various
static backbones. Notably, Kinet achieves an accuracy of 93.27% on MSRAction-3D
with only 3.20M parameters and 10.35G FLOPs.
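The abstract does not spell out the feature-space normal solver itself. As a rough, non-authoritative illustration of the underlying kinematic idea only, the sketch below (Python/NumPy, with made-up function and variable names) estimates the normal of a local space-time (ST) surface directly on raw (x, y, z, t) coordinates via a least-squares plane fit, so that local motion is encoded without tracing point-wise correspondences; Kinet performs the analogous computation in feature space rather than on raw coordinates.

```python
# Illustrative sketch only (not the paper's feature-space solver): estimate
# the normal of a local space-time (ST) surface from two consecutive point
# cloud frames, without explicit point-to-point correspondences.
import numpy as np

def st_surface_normal(query_xyz, frame_t, frame_t1, dt=1.0, k=16):
    """Fit a local plane in (x, y, z, t) around query_xyz and return its 4D
    normal; the temporal component reflects local motion of the surface."""
    # Stack both frames into a single set of 4D space-time points.
    pts_t = np.hstack([frame_t, np.zeros((len(frame_t), 1))])
    pts_t1 = np.hstack([frame_t1, np.full((len(frame_t1), 1), dt)])
    st_points = np.vstack([pts_t, pts_t1])                  # shape (N, 4)

    # Gather the k nearest space-time neighbours of the query (taken at t=0).
    query = np.append(query_xyz, 0.0)
    dists = np.linalg.norm(st_points - query, axis=1)
    neigh = st_points[np.argsort(dists)[:k]]

    # The plane normal is the right singular vector with the smallest
    # singular value of the centred neighbourhood.
    centred = neigh - neigh.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    normal = vt[-1]
    return normal / (np.linalg.norm(normal) + 1e-12)

# Toy usage: a random cloud translating along +x between two frames.
rng = np.random.default_rng(0)
frame0 = rng.normal(size=(256, 3))
frame1 = frame0 + np.array([0.1, 0.0, 0.0])
print(st_surface_normal(frame0[0], frame0, frame1))
```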
Related papers
- Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories [28.701879490459675]
We aim to learn an implicit motion field parameterized by a neural network to predict the movement of novel points within the same domain.
We exploit the intrinsic regularization provided by SIREN and modify the input layer to produce a temporally smooth motion field.
Our experiments assess the model's performance in predicting unseen point trajectories and its application in temporal mesh alignment with deformation.
arXiv Detail & Related papers (2024-06-05T21:02:10Z)
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of points (a naive packing sketch is given after this list).
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- StarNet: Style-Aware 3D Point Cloud Generation [82.30389817015877]
StarNet is able to reconstruct and generate high-fidelity, evenly distributed 3D point clouds using a mapping network.
Our framework achieves comparable state-of-the-art performance on various metrics in the point cloud reconstruction and generation tasks.
arXiv Detail & Related papers (2023-03-28T08:21:44Z)
- DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets [95.84755169585492]
We present Dynamic Sparse Voxel Transformer (DSVT), a single-stride window-based voxel Transformer backbone for outdoor 3D perception.
Our model achieves state-of-the-art performance across a broad range of 3D perception tasks.
arXiv Detail & Related papers (2023-01-15T09:31:58Z)
- CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z)
- DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization [0.0]
We propose a novel two-stage framework for efficient 3D point cloud object detection.
We parse the raw point cloud data directly in 3D space, yet achieve impressive efficiency and accuracy.
We highlight inference speeds of 75 FPS on the KITTI 3D object detection dataset and 25 FPS on the Open dataset, with satisfactory accuracy.
arXiv Detail & Related papers (2021-07-27T10:07:39Z)
- Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but directly applying convolutions after voxelizing entire point clouds into a dense regular 3D grid is computationally costly.
We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently.
We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2020-07-04T13:57:43Z)
- Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision [68.35777836993212]
We propose a Pseudo-LiDAR point cloud network to generate temporally and spatially high-quality point cloud sequences.
By exploiting the scene flow between point clouds, the proposed network is able to learn a more accurate representation of the 3D spatial motion relationship.
arXiv Detail & Related papers (2020-06-20T03:11:04Z)
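As referenced in the "Dynamic 3D Point Cloud Sequences as 2D Videos" entry above, a minimal sketch of the SPCV idea follows. It naively packs each point cloud frame into an H x W x 3 image whose pixel values are 3D coordinates; the ordering used here is an arbitrary raster sort, not the paper's learned, spatially smooth and temporally consistent mapping, and all names are illustrative.

```python
# Illustrative only: pack a point cloud sequence into a "2D video" whose
# pixel values are 3D coordinates, in the spirit of SPCVs. A naive raster
# ordering is used; SPCVs instead learn a structured mapping with spatial
# smoothness and temporal consistency.
import numpy as np

def naive_point_cloud_video(frames, height, width):
    """frames: list of (height * width, 3) arrays.
    Returns a (T, height, width, 3) array digestible by 2D video backbones."""
    video = np.empty((len(frames), height, width, 3), dtype=np.float32)
    for t, cloud in enumerate(frames):
        assert cloud.shape == (height * width, 3)
        # Deterministic (but arbitrary) ordering: sort by z, then y, then x.
        order = np.lexsort((cloud[:, 0], cloud[:, 1], cloud[:, 2]))
        video[t] = cloud[order].reshape(height, width, 3)
    return video

rng = np.random.default_rng(1)
seq = [rng.normal(size=(32 * 32, 3)).astype(np.float32) for _ in range(8)]
print(naive_point_cloud_video(seq, 32, 32).shape)   # -> (8, 32, 32, 3)
```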
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.