Fugu-MT Paper Translation (Summary): GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

Paper summary: GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

arxiv url: http://arxiv.org/abs/2603.26661v1
Date: Fri, 27 Mar 2026 17:58:05 GMT
Status: Translation complete
Last updated in system: 2026-03-30 21:49:48.63502
Title: GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation
Authors: Nicolas von Lützow, Barbara Rössle, Katharina Schmid, Matthias Nießner
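Abstract summary: This paper proposes a transformer model that directly generates 3D Gaussians. Tokens obtained from a vector-quantized latent grid are serialized and modeled with a causal transformer with 3D rotary positional embedding. Unlike diffusion-based methods that refine scenes holistically, the formulation constructs scenes step by step, naturally supporting completion, outpainting, controllable sampling via temperature, and flexible generation horizons.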
Reference score (independently computed interest score): 42.49842620609683
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Most recent advances in 3D generative modeling rely on diffusion or flow-matching formulations. We instead explore a fully autoregressive alternative and introduce GaussianGPT, a transformer-based model that directly generates 3D Gaussians via next-token prediction, thus facilitating full 3D scene generation. We first compress Gaussian primitives into a discrete latent grid using a sparse 3D convolutional autoencoder with vector quantization. The resulting tokens are serialized and modeled using a causal transformer with 3D rotary positional embedding, enabling sequential generation of spatial structure and appearance. Unlike diffusion-based methods that refine scenes holistically, our formulation constructs scenes step-by-step, naturally supporting completion, outpainting, controllable sampling via temperature, and flexible generation horizons. This formulation leverages the compositional inductive biases and scalability of autoregressive modeling while operating on explicit representations compatible with modern neural rendering pipelines, positioning autoregressive transformers as a complementary paradigm for controllable and context-aware 3D generation.
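The abstract's "next-token prediction" with "controllable sampling via temperature" can be illustrated with a minimal sketch. This is not the authors' implementation: `next_logits` is a hypothetical stand-in for the trained causal transformer, `toy_next_logits` is an invented toy model over an assumed 4-entry codebook, and all function names are illustrative. The sketch only shows how dividing logits by a temperature changes sampling, and how passing existing tokens as `prefix` corresponds to completion.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(next_logits, prefix, num_new_tokens, temperature, seed=0):
    """Extend `prefix` one token at a time (autoregressive sampling).

    `next_logits` is a hypothetical stand-in for the trained transformer:
    it maps the tokens so far to one logit per codebook entry.
    """
    rng = random.Random(seed)
    tokens = list(prefix)
    for _ in range(num_new_tokens):
        tokens.append(sample_token(next_logits(tokens), temperature, rng))
    return tokens


def toy_next_logits(tokens):
    """Toy model (assumption): favors (last token + 1) mod 4."""
    favored = (tokens[-1] + 1) % 4 if tokens else 0
    return [3.0 if i == favored else 0.0 for i in range(4)]


if __name__ == "__main__":
    # Low temperature follows the favored token closely; high temperature is more random.
    print(generate(toy_next_logits, prefix=[0], num_new_tokens=8, temperature=0.5))
    print(generate(toy_next_logits, prefix=[0], num_new_tokens=8, temperature=2.0))
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution. Because generation simply continues from `prefix`, the same loop covers both generation from scratch (empty prefix) and completion of a partial token sequence.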

Related papers