Fugu-MT Paper Translation (Abstract): DiffATS: Diffusion in Aligned Tensor Space

Paper abstract: DiffATS: Diffusion in Aligned Tensor Space

arxiv url: http://arxiv.org/abs/2605.09275v1
Date: Sun, 10 May 2026 02:53:43 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:50.160168
Title: DiffATS: Diffusion in Aligned Tensor Space
Title (reference translation): DiffATS: Diffusion in Aligned Tensor Space
Authors: Jinhua Lyu, Tianmin Yu, Brian Kim, Lizhuo Zhou, Chanwook Park, Naichen Shi,
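Abstract summary: The paper constructs data-dependent tensor primitives without pretrained compression autoencoders. *Diffusion in Aligned Tensor Space* (DiffATS) is a generative framework that trains diffusion models directly on these tensor primitives.
Reference score (independently computed attention measure): 3.8961572818716768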
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Direct diffusion modeling of high-resolution spatiotemporal fields is computationally challenging. Parameter-efficient primitives address this by representing high-dimensional data with a compact set of parameters. In this paper, we construct data-dependent tensor primitives without pretrained compression autoencoders. Our construction starts from Tucker decomposition, which captures low-rank multilinear structure through a core tensor and mode-wise factors. However, Tucker factors are non-unique: the same tensor can be represented by different rotated factors, which complicates generative modeling. We address this issue with orthogonal Procrustes (OP) alignment. Specifically, we select medoid anchor matrices from the data and align the factor matrices to resolve the gauge ambiguity. This yields matrix Grassmannian primitives and tensor Grassmannian primitives that are compact, data-adaptive, and directly decodable by explicit multilinear reconstruction. Theoretically, we prove that the proposed primitive maps are homeomorphisms between low-rank tensors and their corresponding primitive spaces, certifying that the representations are non-degenerate and topologically faithful. Building on these primitives, we propose *Diffusion in Aligned Tensor Space* (DiffATS), a generative framework that trains diffusion models directly on aligned tensor primitives. Across images, videos, and PDE solutions, DiffATS achieves strong unconditional and conditional generation performance while compressing original data by $3.9\times$ to $210\times$, without relying on any pretrained deep compression autoencoders.
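Abstract (reference translation): Directly applying diffusion models to high-resolution spatiotemporal fields is computationally difficult. Parameter-efficient primitives address this problem by representing high-dimensional data with a compact set of parameters. In this paper, we construct data-dependent tensor primitives without using pretrained compression autoencoders. Our construction starts from the Tucker decomposition, which captures low-rank multilinear structure through a core tensor and mode-wise factors. However, Tucker factors are non-unique: the same tensor can be represented by differently rotated factors, which complicates generative modeling. We address this problem with orthogonal Procrustes (OP) alignment. Specifically, we select medoid anchor matrices from the data and align the factor matrices to resolve the gauge ambiguity. This yields matrix Grassmannian primitives and tensor Grassmannian primitives that are compact, data-adaptive, and directly decodable through explicit multilinear reconstruction. Theoretically, we prove that the proposed primitive maps are homeomorphisms between low-rank tensors and their corresponding primitive spaces, which certifies that the representations are non-degenerate and topologically faithful. Building on these primitives, we propose *Diffusion in Aligned Tensor Space* (DiffATS), a generative framework that trains diffusion models directly on aligned tensor primitives. Across images, videos, and PDE solutions, DiffATS achieves strong unconditional and conditional generation performance while compressing the original data by $3.9\times$ to $210\times$, without relying on any pretrained deep compression autoencoders.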
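
The rotation ambiguity of Tucker factors and its removal by orthogonal Procrustes alignment, as described in the abstract, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names (`mode_product`, `reconstruct`, `procrustes_rotation`, `align_tucker`), the tensor sizes, and the use of random matrices as anchors (the paper selects medoid anchors from the data) are all assumptions made for illustration.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (mode-n product)."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)

def reconstruct(core, factors):
    """Explicit multilinear reconstruction: core x_1 U_1 x_2 U_2 ..."""
    out = core
    for mode, U in enumerate(factors):
        out = mode_product(out, U, mode)
    return out

def procrustes_rotation(U, anchor):
    """Orthogonal Q minimizing ||U @ Q - anchor||_F (classical SVD solution)."""
    W, _, Vt = np.linalg.svd(U.T @ anchor)
    return W @ Vt

def align_tucker(core, factors, anchors):
    """Rotate each factor toward its anchor and counter-rotate the core."""
    new_core, new_factors = core, []
    for mode, (U, A) in enumerate(zip(factors, anchors)):
        Q = procrustes_rotation(U, A)
        new_factors.append(U @ Q)
        new_core = mode_product(new_core, Q.T, mode)
    return new_core, new_factors

def random_orthonormal(rng, rows, cols):
    """Random matrix with orthonormal columns (reduced QR)."""
    Q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return Q

def demo(seed=0):
    rng = np.random.default_rng(seed)
    dims, ranks = (6, 5, 4), (2, 3, 2)  # hypothetical sizes
    core = rng.standard_normal(ranks)
    factors = [random_orthonormal(rng, d, r) for d, r in zip(dims, ranks)]
    tensor = reconstruct(core, factors)

    # A second, equally valid Tucker representation of the same tensor:
    # rotate every factor and counter-rotate the core.
    rots = [random_orthonormal(rng, r, r) for r in ranks]
    factors_b = [U @ R for U, R in zip(factors, rots)]
    core_b = core
    for mode, R in enumerate(rots):
        core_b = mode_product(core_b, R.T, mode)

    # Random anchors stand in for the medoid anchors used in the paper.
    anchors = [random_orthonormal(rng, d, r) for d, r in zip(dims, ranks)]
    core_1, fac_1 = align_tucker(core, factors, anchors)
    core_2, fac_2 = align_tucker(core_b, factors_b, anchors)

    return {
        "same_tensor": np.allclose(reconstruct(core_b, factors_b), tensor),
        "factors_differ_before": not np.allclose(factors[0], factors_b[0]),
        "factors_match_after": all(np.allclose(a, b) for a, b in zip(fac_1, fac_2)),
        "cores_match_after": np.allclose(core_1, core_2),
        "reconstruction_kept": np.allclose(reconstruct(core_1, fac_1), tensor),
    }

if __name__ == "__main__":
    print(demo())
```

In this sketch, two different Tucker representations of the same tensor map to the same aligned core and factors, provided each product `U.T @ anchor` is nonsingular so that the Procrustes rotation is unique.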


Related papers