Fugu-MT Paper Translation (Abstract): GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

Paper overview: GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

arxiv url: http://arxiv.org/abs/2212.03010v1
Date: Tue, 6 Dec 2022 14:32:55 GMT
Status: Translation complete
Last updated in system: 2022-12-07 17:08:40.222904
Title: GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds
Title (reference translation): GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds
Authors: Honghui Yang and Tong He and Jiaheng Liu and Hua Chen and Boxi Wu and Binbin Lin and Xiaofei He and Wanli Ouyang
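Abstract summary: Masked Autoencoders (MAE) remain difficult to explore on large-scale 3D point clouds. We propose a Generative Decoder for MAE (GD-MAE) that automatically merges the surrounding context. We demonstrate the effectiveness of the proposed method on three large-scale benchmarks: Waymo, KITTI, and ONCE.
Reference score (independently computed attention score): 72.60362979456035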
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the tremendous progress of Masked Autoencoders (MAE) in vision tasks such as image and video, exploring MAE on large-scale 3D point clouds remains challenging due to their inherent irregularity. In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from the retained regions or adopt sophisticated masking strategies, we propose a much simpler paradigm. The core idea is to apply a Generative Decoder for MAE (GD-MAE) that automatically merges the surrounding context to restore the masked geometric knowledge in a hierarchical fusion manner. In doing so, our approach avoids heuristic decoder design and retains the flexibility to explore various masking strategies. The corresponding part costs less than 12% latency compared with conventional methods, while achieving better performance. We demonstrate the efficacy of the proposed method on several large-scale benchmarks: Waymo, KITTI, and ONCE. Consistent improvement on downstream detection tasks illustrates strong robustness and generalization capability. Not only does our method achieve state-of-the-art results, but remarkably, it reaches comparable accuracy even with 20% of the labeled data on the Waymo dataset. The code will be released at https://github.com/Nightmare-n/GD-MAE.
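Abstract (reference translation): Although Masked Autoencoders (MAE) have made remarkable progress on vision tasks such as image and video, exploring MAE on large-scale 3D point clouds remains difficult because of their irregularity. In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from the retained regions or adopt sophisticated masking strategies, we propose a simpler paradigm. The core idea is to apply a Generative Decoder for MAE (GD-MAE) that automatically merges the surrounding context and restores the masked geometric knowledge through hierarchical fusion. As a result, the method needs no heuristic decoder design and retains the flexibility to explore various masking strategies. The corresponding part has lower latency than conventional methods while also performing better. We demonstrate the effectiveness of the proposed method on large-scale benchmarks including Waymo, KITTI, and ONCE. Consistent improvement on downstream detection tasks indicates strong robustness and generalization capability. Our method not only achieves state-of-the-art results but also reaches comparable accuracy with only 20% of the labeled data on the Waymo dataset. The code will be released at https://github.com/Nightmare-n/GD-MAE.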

Related papers

DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning [21.77406648840365]
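DeepMesh is a framework that optimizes mesh generation through two key innovations. It incorporates a novel tokenization algorithm together with improvements in data curation and processing. It generates meshes with intricate details and precise topology, outperforming state-of-the-art methods in both precision and quality.
Paper reference translation (metadata) (2025-03-19T14:39:30Z)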
MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
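Radiance fields represented by 3D Gaussians excel at novel view synthesis, offering both high training efficiency and fast rendering. Existing methods often incorporate depth priors from dense estimation networks but overlook the multi-view consistency inherent in the input images. This paper proposes a view synthesis framework based on 3D Gaussian Splatting (MCGS) that reconstructs scenes from sparse input views.
Paper reference translation (metadata) (2024-10-15T08:39:05Z)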
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
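Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework. We introduce ColorMAE, a data-independent method that generates different binary mask patterns by filtering random noise. We show the advantage of this strategy over random masking on downstream tasks.
Paper reference translation (metadata) (2024-07-17T22:04:00Z)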
Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
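Contrastive learning (CL) for Vision Transformers (ViT) in the image domain has achieved performance comparable to CL for traditional convolutional backbones. For 3D point cloud pre-training with ViTs, masked autoencoder (MAE) modeling remains dominant.
Paper reference translation (metadata) (2024-07-08T12:28:56Z)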
GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D [18.33878596057853]
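We introduce a pioneering approach to self-supervised learning for point clouds. We adopt a geometrically informed mask selection strategy called GeoMask3D (GM3D) to boost the efficiency of Masked Autoencoders.
Paper reference translation (metadata) (2024-05-20T23:53:42Z)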
UGMAE: A Unified Framework for Graph Masked Autoencoders [67.75493040186859]
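We propose UGMAE, a unified framework for graph masked autoencoders. First, we develop an adaptive feature mask generator that accounts for the uniqueness of nodes. Next, we combine hierarchical structure reconstruction with feature reconstruction to capture comprehensive graph information.
Paper reference translation (metadata) (2024-02-12T19:39:26Z)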
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders [52.66195794216989]
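This paper proposes Point Feature Enhancement Masked Autoencoders (Point-FEMAE) to learn compact 3D representations. Point-FEMAE consists of a global branch and a local branch that capture latent semantic features. The method significantly improves pre-training efficiency compared with cross-modal alternatives.
Paper reference translation (metadata) (2023-12-17T14:17:05Z)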
How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders [21.849681446573257]
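Masked autoencoders (MAE) based on a reconstruction task have become a promising paradigm for self-supervised learning (SSL). This paper gives a theoretical account of how masking matters for MAE to learn meaningful features.
Paper reference translation (metadata) (2022-10-15T17:36:03Z)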
MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition [160.49403075559158]
This paper proposes the Masked Pseudo-Labeling autoEncoder (MAPLE) framework. In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE. MAPLE achieves superior results on three public benchmarks and improves accuracy on MSR-Action3 by 8.08%.
Paper reference translation (metadata) (2022-09-01T12:32:40Z)

The related papers list is generated automatically from the titles and abstracts of papers on this site.
