Related papers: 3DInAction: Understanding Human Actions in 3D Point Clouds

URL: http://arxiv.org/abs/2303.06346v2
Date: Fri, 29 Mar 2024 15:10:29 GMT
Title: 3DInAction: Understanding Human Actions in 3D Point Clouds
Authors: Yizhak Ben-Shabat, Oren Shrout, Stephen Gould
Abstract summary: We propose a novel method for 3D point cloud action recognition. We show that our method achieves improved performance on existing datasets, including DFAUST and IKEA ASM.
Score: 31.66883982183386
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a novel method for 3D point cloud action recognition. Understanding human actions in RGB videos has been widely studied in recent years, however, its 3D point cloud counterpart remains under-explored. This is mostly due to the inherent limitation of the point cloud data modality -- lack of structure, permutation invariance, and varying number of points -- which makes it difficult to learn a spatio-temporal representation. To address this limitation, we propose the 3DinAction pipeline that first estimates patches moving in time (t-patches) as a key building block, alongside a hierarchical architecture that learns an informative spatio-temporal representation. We show that our method achieves improved performance on existing datasets, including DFAUST and IKEA ASM. Code is publicly available at https://github.com/sitzikbs/3dincaction.

Related papers

DG-MVP: 3D Domain Generalization via Multiple Views of Point Clouds for Classification [10.744510913722817]
Deep neural networks have achieved significant success in 3D point cloud classification. In this paper, we focus on the 3D point cloud domain generalization problem. We propose a novel method for 3D point cloud domain generalization, which can generalize to unseen domains of point clouds.
arXiv Detail & Related papers (2025-04-16T19:43:32Z)
SPiKE: 3D Human Pose from Point Cloud Sequences [1.8024397171920885]
3D Human Pose Estimation (HPE) is the task of locating keypoints of the human body in 3D space from 2D or 3D representations such as RGB images, depth maps or point clouds. This paper presents SPiKE, a novel approach to 3D HPE using point cloud sequences. Experiments on the ITOP benchmark for 3D HPE show that SPiKE reaches 89.19% mAP, achieving state-of-the-art performance with significantly lower inference times.
arXiv Detail & Related papers (2024-09-03T13:22:01Z)
P2P-Bridge: Diffusion Bridges for 3D Point Cloud Denoising [81.92854168911704]
We tackle the task of point cloud denoising through a novel framework that adapts Diffusion Schrödinger bridges to point clouds. Experiments on object datasets show that P2P-Bridge achieves significant improvements over existing methods.
arXiv Detail & Related papers (2024-08-29T08:00:07Z)
FASTC: A Fast Attentional Framework for Semantic Traversability Classification Using Point Cloud [7.711666704468952]
We address the problem of traversability assessment using point clouds. We propose a pillar feature extraction module that utilizes PointNet to capture features from point clouds organized in vertical volume. We then propose a new temporal attention module to fuse multi-frame information, which can properly handle the varying density problem of LIDAR point clouds.
arXiv Detail & Related papers (2024-06-24T12:01:55Z)
PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection. We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement. We conduct extensive experiments on the large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
arXiv Detail & Related papers (2023-12-13T18:59:13Z)
Nothing Stands Still: A Spatiotemporal Benchmark on 3D Point Cloud Registration Under Large Geometric and Temporal Change [86.44429778015657]
Building 3D geometric maps of man-made spaces is fundamental to computer vision and robotics. The Nothing Stands Still (NSS) benchmark focuses on the spatiotemporal registration of 3D scenes undergoing large spatial and temporal change. As part of NSS, we introduce a dataset of 3D point clouds recurrently captured in large-scale building indoor environments that are under construction or renovation.
arXiv Detail & Related papers (2023-11-15T20:09:29Z)
RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds [44.034836961967144]
3D motion estimation, including scene flow and point cloud registration, has drawn increasing interest. Recent methods employ deep neural networks to construct the cost volume for estimating accurate 3D flow. We decompose the problem into two interlaced stages, where the 3D flows are optimized point-wise in the first stage and then globally regularized in a recurrent network in the second stage.
arXiv Detail & Related papers (2022-05-23T04:04:30Z)
IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment [58.8330387551499]
We formulate the problem as estimation of point-wise trajectories (i.e., smooth curves). We propose IDEA-Net, an end-to-end deep learning framework, which disentangles the problem with the assistance of the explicitly learned temporal consistency. We demonstrate the effectiveness of our method on various point cloud sequences and observe large improvements over state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2022-03-22T10:14:08Z)
PointAttN: You Only Need Attention for Point Cloud Completion [89.88766317412052]
Point cloud completion refers to completing 3D shapes from partial 3D point clouds. We propose a novel neural network for processing point clouds in a per-point manner to eliminate kNNs. The proposed framework, namely PointAttN, is simple, neat and effective, and can precisely capture the structural information of 3D shapes.
arXiv Detail & Related papers (2022-03-16T09:20:01Z)
D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds. We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point. Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.