TTPOINT: A Tensorized Point Cloud Network for Lightweight Action
Recognition with Event Cameras
- URL: http://arxiv.org/abs/2308.09993v1
- Date: Sat, 19 Aug 2023 11:58:31 GMT
- Title: TTPOINT: A Tensorized Point Cloud Network for Lightweight Action
Recognition with Event Cameras
- Authors: Hongwei Ren, Yue Zhou, Haotian Fu, Yulong Huang, Renjing Xu, Bojun
Cheng
- Abstract summary: Event cameras generate sparse and asynchronous data, which is incompatible with the traditional frame-based method.
We propose a point cloud network called TTPOINT which achieves competitive results even compared to the state-of-the-art (SOTA) frame-based method in action recognition tasks.
- Score: 5.925545594655497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event cameras have gained popularity in computer vision due to their data
sparsity, high dynamic range, and low latency. As a bio-inspired sensor, event
cameras generate sparse and asynchronous data, which is inherently incompatible
with the traditional frame-based method. Alternatively, the point-based method
can avoid additional modality transformation and naturally adapt to the
sparsity of events. Still, it typically cannot reach accuracy comparable to
the frame-based method. We propose a lightweight and generalized point cloud
network called TTPOINT which achieves competitive results even compared to the
state-of-the-art (SOTA) frame-based method in action recognition tasks while
only using 1.5 % of the computational resources. The model is adept at
abstracting local and global geometry through its hierarchical structure. By leveraging
tensor-train compressed feature extractors, TTPOINT can be designed with
minimal parameters and computational complexity. Additionally, we developed a
straightforward downsampling algorithm to preserve the spatio-temporal features.
In the experiment, TTPOINT emerged as the SOTA method on three datasets while
also attaining SOTA among point cloud methods on all five datasets. Moreover,
by using the tensor-train decomposition method, the accuracy of the proposed
TTPOINT is almost unaffected while its parameter size is compressed by 55 % on
all five datasets.
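The tensor-train (TT) compressed feature extractors are the mechanism behind the roughly 55 % parameter reduction claimed above. As a rough illustration, the sketch below shows how a dense linear feature extractor can be replaced by a chain of small TT cores. It is a minimal, generic PyTorch sketch, not the authors' implementation: the class name TTLinear, the mode shapes, the ranks, and the initialization are illustrative assumptions.

```python
import torch
import torch.nn as nn


class TTLinear(nn.Module):
    """Tensor-train (TT) factorized linear layer -- illustrative sketch only.

    A dense weight of shape (prod(in_modes), prod(out_modes)) is replaced by
    TT cores G_k of shape (ranks[k], in_modes[k], out_modes[k], ranks[k+1]),
    so the parameter count scales with the sum of the core sizes instead of
    the product of the full input and output dimensions.
    """

    def __init__(self, in_modes, out_modes, ranks):
        super().__init__()
        # ranks must start and end at 1, e.g. (1, 4, 4, 1) for three modes.
        assert len(in_modes) == len(out_modes) == len(ranks) - 1
        self.in_modes, self.out_modes = tuple(in_modes), tuple(out_modes)
        self.cores = nn.ParameterList([
            nn.Parameter(0.1 * torch.randn(ranks[k], in_modes[k],
                                           out_modes[k], ranks[k + 1]))
            for k in range(len(in_modes))
        ])

    def forward(self, x):
        # x: (batch, prod(in_modes))
        batch = x.shape[0]
        h = x.reshape(batch, *self.in_modes, 1)   # trailing TT-rank dim of 1
        for core in self.cores:
            # Contract the current input mode and incoming rank with core G_k;
            # the produced output mode and outgoing rank move to the back.
            h = torch.einsum('bi...r,rios->b...os', h, core)
        return h.reshape(batch, -1)               # (batch, prod(out_modes))


# Example: a 512 -> 256 extractor factorized as (8, 8, 8) -> (8, 8, 4).
# Dense weight: 512 * 256 = 131,072 parameters; TT cores: 256 + 1,024 + 128.
layer = TTLinear(in_modes=(8, 8, 8), out_modes=(8, 8, 4), ranks=(1, 4, 4, 1))
y = layer(torch.randn(32, 512))                   # -> shape (32, 256)
```

The compression ratio is controlled by the TT ranks: smaller ranks shrink the cores further at the cost of expressiveness, which is consistent with the abstract's observation that accuracy is almost unaffected at a 55 % parameter reduction.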
Related papers
- PUMPS: Skeleton-Agnostic Point-based Universal Motion Pre-Training for Synthesis in Human Motion Tasks [44.19486142246208]
Motion skeletons drive 3D character animation by transforming bone hierarchies, but differences in proportions or structure make motion data hard to transfer across skeletons. Temporal Point Clouds (TPCs) offer an unstructured, cross-compatible motion representation. We propose PUMPS, the primordial autoencoder architecture for TPC data.
arXiv Detail & Related papers (2025-07-27T08:20:49Z) - PointODE: Lightweight Point Cloud Learning with Neural Ordinary Differential Equations on Edge [0.8403582577557918]
We introduce a parameter-efficient architecture for point cloud feature extraction based on a continuous stack of blocks with residual connections. PointODE shows accuracy competitive with state-of-the-art models on both synthetic and real-world datasets.
arXiv Detail & Related papers (2025-05-31T07:34:54Z) - FASTer: Focal Token Acquiring-and-Scaling Transformer for Long-term 3D Object Detection [9.291995455336929]
We propose the Focal Token Acquiring-and-Scaling Transformer (FASTer).
FASTer condenses token sequences in an adaptive and lightweight manner.
It significantly outperforms other state-of-the-art detectors in both performance and efficiency.
arXiv Detail & Related papers (2025-02-28T03:15:33Z) - Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement [19.575833741231953]
We use the KNN method to determine the neighborhoods of raw surface points (see the KNN-grouping sketch after this list).
The conditional probability model adapts to local geometry, leading to a significant rate reduction.
We incorporate an implicit neural representation into the refinement layer, allowing the decoder to sample points on the underlying surface at arbitrary densities.
arXiv Detail & Related papers (2024-08-06T05:24:06Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point
Cloud Learning [81.85951026033787]
In this work we adopt transformers and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - VPIT: Real-time Embedded Single Object 3D Tracking Using Voxel Pseudo Images [90.60881721134656]
We propose a novel voxel-based 3D single object tracking (3D SOT) method called Voxel Pseudo Image Tracking (VPIT).
Experiments on KITTI Tracking dataset show that VPIT is the fastest 3D SOT method and maintains competitive Success and Precision values.
arXiv Detail & Related papers (2022-06-06T14:02:06Z) - CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z) - Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z) - Learning Semantic Segmentation of Large-Scale Point Clouds with Random
Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z) - FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose
Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z) - Spherical Interpolated Convolutional Network with Distance-Feature
Density for 3D Semantic Segmentation of Point Clouds [24.85151376535356]
A spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator.
The proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.
arXiv Detail & Related papers (2020-11-27T15:35:12Z) - Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but directly applying convolutions after voxelizing the entire point cloud into a dense regular 3D grid is computationally costly.
We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently.
We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2020-07-04T13:57:43Z) - StickyPillars: Robust and Efficient Feature Matching on Point Clouds
using Graph Neural Networks [16.940377259203284]
StickyPillars is a fast, accurate and extremely robust deep middle-end 3D feature matching method on point clouds.
We present state-of-the-art accuracy results on the registration problem, demonstrated on the KITTI dataset.
We integrate our matching system into a LiDAR odometry pipeline, yielding the most accurate results on the KITTI dataset.
arXiv Detail & Related papers (2020-02-10T17:53:41Z)
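As referenced in the Fast Point Cloud Geometry Compression entry above, the snippet below is a minimal, generic illustration of building k-nearest-neighbor neighborhoods for raw surface points. It is a brute-force sketch using torch.cdist, not the cited paper's implementation; the function name and the choice of k are assumptions.

```python
import torch


def knn_neighborhoods(points: torch.Tensor, k: int = 16) -> torch.Tensor:
    """Gather the k nearest neighbors of every point (illustrative only).

    points: (N, 3) raw surface points; returns (N, k, 3) neighborhoods.
    Brute-force pairwise distances; real compression pipelines would use
    a spatial index for large N.
    """
    dists = torch.cdist(points, points)             # (N, N) pairwise distances
    idx = dists.topk(k + 1, largest=False).indices  # +1 because each point is its own nearest neighbor
    idx = idx[:, 1:]                                # drop the self-match
    return points[idx]                              # (N, k, 3)


# Example: neighborhoods of 1024 random surface points
pts = torch.rand(1024, 3)
nbrs = knn_neighborhoods(pts, k=16)                 # -> shape (1024, 16, 3)
```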
This list is automatically generated from the titles and abstracts of the papers on this site.