Fugu-MT paper translation (abstract): Efficient RWKV-based Representation Learning for 3D Point Clouds

Paper overview: Efficient RWKV-based Representation Learning for 3D Point Clouds

arxiv url: http://arxiv.org/abs/2606.10395v1
Date: Tue, 09 Jun 2026 04:16:39 GMT
Status: translation complete
Last updated in system: 2026-06-10 15:40:58.315948
Title: Efficient RWKV-based Representation Learning for 3D Point Clouds
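Title (reference translation): Efficient RWKV-based representation learning for 3D point clouds
Authors: Yun Liu, Xuefeng Yan, Liangliang Nan, Xianzhi Li, Peng Li, Zhe Zhu, Honghua Chen, Mingqiang Wei
Abstract summary: The P-RWKV block bridges the gap between sequence modeling and irregular 3D geometry. The P-RWKV block and its key sub-modules achieve competitive performance across various tasks with lower computational cost and inference latency.
Reference score (independently computed attention score): 53.911842457391636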
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The recent receptance weighted key value (RWKV) model combines RNN-style recurrence, offering a linear-complexity alternative to Transformers' quadratic self-attention for modeling global dependencies. However, when directly applied to point clouds, RWKV, originally developed for sequential text, struggles to capture local geometric structures and model spatial dependencies effectively. To address this, we propose the P-RWKV block, which bridges the gap between sequence modeling and irregular 3D geometry while preserving the efficiency advantages of RWKV. It consists of a Local Perception Expansion (LPE) component to expand contextual perception along the spatio-temporal sequence and a Spatial Context Enhancement (SCE) component to strengthen spatial awareness. To validate the effectiveness of P-RWKV for point cloud understanding, we construct PointER, a single-modality self-supervised representation learning framework whose encoder is composed of stacked P-RWKV blocks. Furthermore, we extend P-RWKV to a cross-modality setting and integrate the proposed core sub-modules into multiple architectures, demonstrating strong plug-and-play flexibility and architectural generality. Extensive experiments show that the P-RWKV block and its key sub-modules achieve competitive performance across various tasks with lower computational cost and inference latency. Code will be released upon acceptance.
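Abstract (reference translation): The recent receptance weighted key value (RWKV) model uses RNN-style recurrence and provides a linear-complexity alternative to the quadratic self-attention of Transformers for modeling global dependencies. RWKV was originally developed for sequential text, however, and when applied directly to point clouds it struggles to capture local geometric structure and to model spatial dependencies effectively. This work therefore proposes the P-RWKV block, which bridges the gap between sequence modeling and irregular 3D geometry while keeping the efficiency of RWKV. The block consists of a Local Perception Expansion (LPE) component and a Spatial Context Enhancement (SCE) component that strengthens spatial awareness. To verify the effectiveness of P-RWKV for point cloud understanding, the authors build PointER, a single-modality self-supervised representation learning framework whose encoder is made of stacked P-RWKV blocks. They also extend P-RWKV to a cross-modality setting and integrate the proposed core sub-modules into multiple architectures, showing plug-and-play flexibility and architectural generality. Extensive experiments show that the P-RWKV block and its key sub-modules achieve competitive performance across various tasks with lower computational cost and inference latency. The code will be released upon acceptance.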
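The linear-complexity claim in the abstract can be illustrated with a minimal sketch of an RWKV-4 style WKV recurrence for a single channel. This is not the paper's P-RWKV block (the LPE and SCE components are not modeled), and the function names and the scalar parameters `w` (decay) and `u` (current-token bonus) are assumptions made for illustration. The sketch also omits the numerical stabilization a practical implementation would need.

```python
import math


def wkv_quadratic(k, v, w, u):
    """Direct evaluation: each step re-sums over all earlier tokens, O(T^2)."""
    out = []
    for t in range(len(k)):
        # The current token gets a separate bonus weight exp(u + k_t).
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            # Earlier tokens are down-weighted by an exponential decay in distance.
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out


def wkv_recurrent(k, v, w, u):
    """Same quantity computed with a running state (a, b), O(T)."""
    a = 0.0  # decayed sum of exp(k_i) * v_i over past tokens
    b = 0.0  # decayed sum of exp(k_i) over past tokens
    decay = math.exp(-w)
    out = []
    for kt, vt in zip(k, v):
        bonus = math.exp(u + kt)
        out.append((a + bonus * vt) / (b + bonus))
        # Fold the current token into the state for the following steps.
        a = decay * a + math.exp(kt) * vt
        b = decay * b + math.exp(kt)
    return out
```

Both functions return the same sequence. The recurrent form only carries two numbers of state per channel from one token to the next, which is why the cost grows linearly with sequence length instead of quadratically.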

Related papers