Fugu-MT Paper Translation (Abstract): M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

Paper abstract: M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

arxiv url: http://arxiv.org/abs/2309.13235v1
Date: Sat, 23 Sep 2023 02:19:21 GMT
Status: Translation complete
Last updated in system: 2023-09-26 21:11:47.813766
Title: M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders
Title (reference translation): M$^3$CS: Multi-Target Masked Point Modeling with a Learnable Codebook and Siamese Decoders
Authors: Qibo Qiu, Honghui Yang, Wenxiao Wang, Shun Zhang, Haiming Gao, Haochao Ying, Wei Hua, Xiaofei He
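Abstract summary: Masked point modeling has become a promising scheme for self-supervised pre-training on point clouds. M$^3$CS is proposed to give the model both low- and high-level representation modeling capabilities.
Reference score (independently computed attention score): 19.68592678093725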
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Masked point modeling has become a promising scheme of self-supervised pre-training for point clouds. Existing methods reconstruct either the original points or related features as the objective of pre-training. However, considering the diversity of downstream tasks, it is necessary for the model to have both low- and high-level representation modeling capabilities to capture geometric details and semantic contexts during pre-training. To this end, M$^3$CS is proposed to enable the model with the above abilities. Specifically, with masked point cloud as input, M$^3$CS introduces two decoders to predict masked representations and the original points simultaneously. While an extra decoder doubles parameters for the decoding process and may lead to overfitting, we propose siamese decoders to keep the amount of learnable parameters unchanged. Further, we propose an online codebook projecting continuous tokens into discrete ones before reconstructing masked points. In such a way, we can enforce the decoder to take effect through the combinations of tokens rather than remembering each token. Comprehensive experiments show that M$^3$CS achieves superior performance at both classification and segmentation tasks, outperforming existing methods.
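Abstract (reference translation): Masked point modeling has become a promising scheme for self-supervised pre-training on point clouds. Existing methods reconstruct either the original points or related features as the pre-training objective. However, given the diversity of downstream tasks, the model needs both low- and high-level representation modeling capabilities so that it captures geometric details and semantic contexts during pre-training. To this end, M$^3$CS is proposed to give the model these capabilities. Specifically, taking a masked point cloud as input, M$^3$CS introduces two decoders that predict the masked representations and the original points simultaneously. Because an extra decoder doubles the parameters of the decoding process and may lead to overfitting, siamese decoders are proposed to keep the number of learnable parameters unchanged. Further, an online codebook is proposed that projects continuous tokens into discrete ones before the masked points are reconstructed. In this way, the decoder is forced to work through combinations of tokens rather than memorizing each token. Comprehensive experiments show that M$^3$CS achieves superior performance on both classification and segmentation tasks, outperforming existing methods.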
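
The abstract's idea of projecting continuous tokens into discrete ones can be illustrated with a nearest-neighbor codebook lookup. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the codebook is queried or updated, so the squared Euclidean distance, the function names `squared_distance` and `quantize`, and the toy values are all hypothetical, and the online (learnable) update of the codebook is not shown.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    tokens: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each continuous token to its nearest codebook entry.

    Assumption: nearest-neighbor lookup under squared Euclidean distance.
    The paper may use a different rule; this is illustrative only.
    """
    indices: List[int] = []
    for token in tokens:
        distances = [squared_distance(token, code) for code in codebook]
        # Index of the closest entry (first one wins on ties).
        indices.append(distances.index(min(distances)))
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Hypothetical toy data: three codebook entries and four continuous tokens.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
tokens = [(0.1, -0.2), (0.9, 1.2), (0.2, 1.9), (0.8, 0.7)]

indices, quantized = quantize(tokens, codebook)
print(indices)  # discrete token ids, one per input token
```

Because several continuous tokens can share one codebook entry (the second and fourth tokens here), a decoder that receives only the discrete entries has to rely on combinations of tokens rather than on each token's exact value, which is the effect the abstract describes.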


Related papers