Fugu-MT Paper Translation (Abstract): PointTransformerX: Portable and Efficient 3D Point Cloud Processing without Sparse Algorithms

Paper overview: PointTransformerX: Portable and Efficient 3D Point Cloud Processing without Sparse Algorithms

arxiv url: http://arxiv.org/abs/2604.24169v2
Date: Wed, 29 Apr 2026 07:44:36 GMT
Status: Translation complete
Last updated in system: 2026-04-30 13:51:54.056853
Title: PointTransformerX: Portable and Efficient 3D Point Cloud Processing without Sparse Algorithms
Title (reference translation): PointTransformerX: Portable and Efficient 3D Point Cloud Processing without Sparse Algorithms
Authors: Laurenz Reichardt, Nikolas Ebert, Oliver Wasenmüller
Abstract summary: PointTransformerX (PTX) is a PyTorch-native vision transformer backbone for 3D point clouds. PTX removes all custom CUDA operators and external libraries while retaining competitive accuracy.
Reference score (independently computed attention measure): 0.10705399532413612
License: http://creativecommons.org/licenses/by/4.0/
Abstract: 3D point cloud perception remains tightly coupled to custom CUDA operators for spatial operations, limiting portability and efficiency on non-NVIDIA, AMD, and embedded hardware. We introduce PointTransformerX (PTX), a fully PyTorch-native vision transformer backbone for 3D point clouds, removing all custom CUDA operators and external libraries while retaining competitive accuracy. PTX introduces 3D-GS-RoPE, a rotary positional embedding that encodes 3D spatial relationships directly in self-attention without neighborhood construction, and further replaces sparse convolutional patch embedding with a linear projection. PTX explores inference-time scaling of attention windows to improve accuracy without retraining. With a redesigned feed-forward network, PTX achieves 98.7% of PointTransformer V3's accuracy on ScanNet with 79.2% fewer parameters and executing 1.6× faster while requiring just 253 MB memory. PTX runs natively on NVIDIA GPUs, AMD GPUs (ROCm), and CPUs, providing an efficient and portable foundation for point cloud perception.
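Abstract (reference translation): 3D point cloud perception is tightly coupled to custom CUDA operators for spatial operations, which limits portability and efficiency on non-NVIDIA, AMD, and embedded hardware. We introduce PointTransformerX (PTX), a PyTorch-native backbone for 3D point clouds that removes all custom CUDA operators and external libraries while retaining competitive accuracy. PTX introduces 3D-GS-RoPE, a rotary positional embedding that encodes 3D spatial relationships directly without neighborhood construction, and replaces the sparse convolutional patch embedding with a linear projection. PTX explores inference-time scaling of attention windows to improve accuracy without retraining. With a redesigned feed-forward network, PTX reaches 98.7% of PointTransformer V3's accuracy on ScanNet with 79.2% fewer parameters, runs 1.6× faster, and requires just 253 MB of memory. PTX runs natively on NVIDIA GPUs, AMD GPUs (ROCm), and CPUs, providing an efficient and portable foundation for point cloud perception.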
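
The abstract describes a rotary positional embedding that encodes 3D spatial relationships directly in self-attention, with no neighborhood construction. The details of 3D-GS-RoPE are not given here, so the following is only a minimal, dependency-free sketch of a generic axis-wise 3D rotary embedding, not the authors' method. The function names `rope_3d` and `dot`, the `base` frequency, and the equal three-way split of the feature dimension across x, y, and z are all assumptions made for illustration. It shows the property that makes rotary embeddings attractive for point clouds: the attention score between a query and a key depends only on their relative offset.

```python
import math


def rope_3d(vec, xyz, base=100.0):
    """Rotate feature pairs of `vec` by angles derived from a 3D position.

    Assumption of this sketch: the feature dimension is split into three
    equal chunks (one per axis), and consecutive pairs inside each chunk
    are rotated by coordinate * frequency.
    """
    dim = len(vec)
    if dim % 6 != 0:
        raise ValueError("feature dimension must be divisible by 6")
    chunk = dim // 3   # features assigned to each axis
    pairs = chunk // 2  # rotated 2D pairs per axis
    out = []
    for axis, coord in enumerate(xyz):
        start = axis * chunk
        for i in range(pairs):
            freq = base ** (-i / pairs)  # geometric frequency schedule
            angle = coord * freq
            c, s = math.cos(angle), math.sin(angle)
            a, b = vec[start + 2 * i], vec[start + 2 * i + 1]
            out.extend([a * c - b * s, a * s + b * c])
    return out


def dot(u, v):
    """Plain dot product, standing in for one attention logit."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 12-dimensional query/key features and point coordinates.
q = [0.5, -1.0, 2.0, 0.3, 1.5, -0.7, 0.1, 0.9, -0.4, 1.1, 0.8, -1.2]
k = [1.2, 0.4, -0.6, 0.9, -1.1, 0.2, 0.7, -0.3, 1.0, 0.5, -0.8, 0.6]
p_q, p_k = (0.2, 1.0, -0.5), (1.4, 0.3, 0.6)
shift = (3.0, -2.0, 5.0)

score = dot(rope_3d(q, p_q), rope_3d(k, p_k))

# Translate both points by the same offset; the relative position is unchanged.
p_q_shifted = tuple(a + b for a, b in zip(p_q, shift))
p_k_shifted = tuple(a + b for a, b in zip(p_k, shift))
score_shifted = dot(rope_3d(q, p_q_shifted), rope_3d(k, p_k_shifted))

print(abs(score - score_shifted) < 1e-9)
```

Because each pair is rotated by an angle proportional to the coordinate, the product of a rotated query and a rotated key carries only the difference of the two angles, so the score is unchanged when both points are translated together. A tensor-based version would apply the same rotation to whole batches of queries and keys before attention.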


Related papers