Masked Spatio-Temporal Structure Prediction for Self-supervised Learning
on Point Cloud Videos
- URL: http://arxiv.org/abs/2308.09245v1
- Date: Fri, 18 Aug 2023 02:12:54 GMT
- Title: Masked Spatio-Temporal Structure Prediction for Self-supervised Learning
on Point Cloud Videos
- Authors: Zhiqiang Shen and Xiaoxiao Sheng and Hehe Fan and Longguang Wang and
Yulan Guo and Qiong Liu and Hao Wen and Xi Zhou
- Abstract summary: We propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to capture the structure of point cloud videos without human annotations.
MaST-Pre consists of two self-supervised learning tasks. First, by reconstructing masked point tubes, our method is able to capture appearance information of point cloud videos.
Second, to learn motion, we propose a temporal cardinality difference prediction task that estimates the change in the number of points within a point tube.
- Score: 75.9251839023226
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the community has made tremendous progress in developing effective
methods for point cloud video understanding that learn from massive amounts of
labeled data. However, annotating point cloud videos is notoriously expensive.
Moreover, training on only one or a few traditional tasks (e.g., classification)
may be insufficient to learn the subtle details of the spatio-temporal structure
present in point cloud videos. In this paper, we
propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to
capture the structure of point cloud videos without human annotations. MaST-Pre
is based on spatio-temporal point-tube masking and consists of two
self-supervised learning tasks. First, by reconstructing masked point tubes,
our method is able to capture the appearance information of point cloud videos.
Second, to learn motion, we propose a temporal cardinality difference
prediction task that estimates the change in the number of points within a
point tube. In this way, MaST-Pre is forced to model the spatial and temporal
structure in point cloud videos. Extensive experiments on MSRAction-3D,
NTU-RGBD, NvGesture, and SHREC'17 demonstrate the effectiveness of the proposed
method.
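The two pre-text tasks can be made concrete with a small sketch. The NumPy code below, written only from the abstract above, groups a point cloud video into spatio-temporal point tubes and computes the temporal cardinality difference, i.e. the change in the number of points inside a tube between consecutive frames, which MaST-Pre uses as its motion target. The tube construction (anchor sampling, radius, temporal span) and all names here (build_point_tubes, cardinality_difference, radius, span) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of MaST-Pre's two pre-text targets, assuming a simple
# radius-based tube construction; details differ from the paper's code.
import numpy as np

def build_point_tubes(video, anchors, radius=0.1, span=3):
    """Group a point cloud video into spatio-temporal point tubes.

    video   : list of (N_t, 3) arrays, one per frame
    anchors : (M, 3) array of tube centres (e.g. farthest-point sampled)
    returns : tubes[m][t] = indices of frame-t points within `radius`
              of anchor m, over the first `span` frames
    """
    tubes = []
    for c in anchors:
        per_frame = []
        for t in range(min(span, len(video))):
            d = np.linalg.norm(video[t] - c, axis=1)   # distance to tube centre
            per_frame.append(np.where(d < radius)[0])  # points inside the tube at frame t
        tubes.append(per_frame)
    return tubes

def cardinality_difference(tube):
    """Temporal cardinality difference: change in point count between
    consecutive frames of one tube (the motion prediction target)."""
    counts = np.array([len(idx) for idx in tube])
    return counts[1:] - counts[:-1]

# Toy usage: 4 random frames, 2 tube centres, high mask ratio.
rng = np.random.default_rng(0)
video = [rng.uniform(-1, 1, size=(256, 3)) for _ in range(4)]
anchors = rng.uniform(-1, 1, size=(2, 3))
tubes = build_point_tubes(video, anchors, radius=0.3, span=4)

mask = rng.random(len(tubes)) < 0.75  # masked tubes, as in masked modelling
for m, tube in enumerate(tubes):
    if mask[m]:
        # A masked tube yields two targets: its raw points (appearance) and
        # its temporal cardinality difference (motion).
        target_points = [video[t][idx] for t, idx in enumerate(tube)]
        target_motion = cardinality_difference(tube)
```

In this reading, the appearance task reconstructs target_points for masked tubes, while the motion task regresses target_motion, so the encoder must capture both spatial structure and its change over time.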
Related papers
- PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds [18.840000859663153]
We propose PRED, a novel image-assisted pre-training framework for outdoor point clouds.
The main ingredient of our framework is semantic rendering conditioned on Bird's-Eye-View (BEV) feature maps.
We further enhance our model's performance by incorporating point-wise masking with a high mask ratio.
arXiv Detail & Related papers (2023-11-08T07:26:09Z)
- CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation [60.0893353960514]
We study the task of weakly-supervised point cloud semantic segmentation with sparse annotations.
We propose a Contextual Point Cloud Modeling (CPCM) method that consists of two parts: a region-wise masking (RegionMask) strategy and a contextual masked training (CMT) method.
arXiv Detail & Related papers (2023-07-19T04:41:18Z)
- 3DInAction: Understanding Human Actions in 3D Point Clouds [31.66883982183386]
We propose a novel method for 3D point cloud action recognition.
We show that our method achieves improved performance on existing datasets, including ASM videos.
arXiv Detail & Related papers (2023-03-11T08:42:54Z)
- PointCaM: Cut-and-Mix for Open-Set Point Cloud Learning [72.07350827773442]
We propose to solve open-set point cloud learning using a novel Point Cut-and-Mix mechanism.
We use the Unknown-Point Simulator to simulate out-of-distribution data in the training stage.
The Unknown-Point Estimator module learns to exploit the point cloud's feature context for discriminating the known and unknown data.
arXiv Detail & Related papers (2022-12-05T03:53:51Z)
- PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences [51.53563462897779]
We propose a point spatio-temporal (PST) convolution to achieve informative representations of point cloud sequences.
PST convolution first disentangles space and time in point cloud sequences; a spatial convolution is then employed to capture the local structure of points in 3D space, and a temporal convolution is used to model the dynamics of the spatial regions along the time dimension.
We incorporate the proposed PST convolution into a deep network, namely PSTNet, to extract features of point cloud sequences in a hierarchical manner.
arXiv Detail & Related papers (2022-05-27T02:14:43Z)
- PointAttN: You Only Need Attention for Point Cloud Completion [89.88766317412052]
Point cloud completion refers to completing 3D shapes from partial 3D point clouds.
We propose a novel neural network that processes point clouds in a per-point manner, eliminating the need for kNN search.
The proposed framework, namely PointAttN, is simple, neat, and effective, and can precisely capture the structural information of 3D shapes.
arXiv Detail & Related papers (2022-03-16T09:20:01Z)
- CP-Net: Contour-Perturbed Reconstruction Network for Self-Supervised Point Cloud Learning [53.1436669083784]
We propose a generic Contour-Perturbed Reconstruction Network (CP-Net), which can effectively guide self-supervised reconstruction to learn semantic content in the point cloud.
For classification, we achieve results competitive with fully-supervised methods on ModelNet40 (92.5% accuracy) and ScanObjectNN (87.9% accuracy).
arXiv Detail & Related papers (2022-01-20T15:04:12Z)
- Unsupervised Learning of Global Registration of Temporal Sequence of Point Clouds [16.019588704177288]
Global registration of point clouds aims to find an optimal alignment of a sequence of 2D or 3D point sets.
We present a novel method that takes advantage of current deep learning techniques for unsupervised learning of global registration from a temporal sequence of point clouds.
arXiv Detail & Related papers (2020-06-17T06:00:36Z)
- Review: deep learning on 3D point clouds [9.73176900969663]
The point cloud is one of the most important data formats for 3D representation.
Deep learning is now the most powerful tool for data processing in computer vision.
arXiv Detail & Related papers (2020-01-17T12:55:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.