Paper summary: Likelihood estimation of sparse topic distributions in topic models and
- arxiv url: http://arxiv.org/abs/2107.05766v1
- Date: Mon, 12 Jul 2021 22:22:32 GMT
- Status: Processing complete
- Last updated in system: 2021-07-14 14:32:40.535825
- Title: Likelihood estimation of sparse topic distributions in topic models and
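- Title (reference translation): Likelihood estimation of sparse topic distributions in topic models and its application to Wasserstein document distance calculations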
- Authors: Xin Bing and Florentina Bunea and Seth Strimas-Mackey and Marten
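- Abstract summary: In topic models, the $p\times n$ expected word frequency matrix is factorized into a $p\times K$ word-topic matrix $A$ and a $K\times n$ topic-document matrix $T$.
- Reference score (independently computed attention measure): 3.679981089267181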
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the estimation of high-dimensional, discrete, possibly
sparse, mixture models in topic models. The data consists of observed
multinomial counts of $p$ words across $n$ independent documents. In topic
models, the $p\times n$ expected word frequency matrix is assumed to be
factorized as a $p\times K$ word-topic matrix $A$ and a $K\times n$
topic-document matrix $T$. Since columns of both matrices represent conditional
probabilities belonging to probability simplices, columns of $A$ are viewed as
$p$-dimensional mixture components that are common to all documents while
columns of $T$ are viewed as the $K$-dimensional mixture weights that are
document specific and are allowed to be sparse. The main interest is to provide
sharp, finite sample, $\ell_1$-norm convergence rates for estimators of the
mixture weights $T$ when $A$ is either known or unknown. For known $A$, we
suggest MLE estimation of $T$. Our non-standard analysis of the MLE not only
establishes its $\ell_1$ convergence rate, but reveals a remarkable property:
the MLE, with no extra regularization, can be exactly sparse and contain the
true zero pattern of $T$. We further show that the MLE is both minimax optimal
and adaptive to the unknown sparsity in a large class of sparse topic
distributions. When $A$ is unknown, we estimate $T$ by optimizing the
likelihood function corresponding to a plug in, generic, estimator $\hat{A}$ of
$A$. For any estimator $\hat{A}$ that satisfies carefully detailed conditions
for proximity to $A$, the resulting estimator of $T$ is shown to retain the
properties established for the MLE. The ambient dimensions $K$ and $p$ are
allowed to grow with the sample sizes. Our application is to the estimation of
1-Wasserstein distances between document generating distributions. We propose,
estimate and analyze new 1-Wasserstein distances between two probabilistic
document representations.
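- Abstract (reference translation): This paper studies the estimation of high-dimensional, discrete, possibly sparse mixture models in topic models.
In topic models, the $p\times n$ expected word frequency matrix is assumed to factorize into a $p\times K$ word-topic matrix $A$ and a $K\times n$ topic-document matrix $T$.
Our non-standard analysis of the MLE not only establishes its $\ell_1$ convergence rate but also reveals a remarkable property: the MLE, with no extra regularization, can be exactly sparse and contain the true zero pattern of $T$.
When $A$ is unknown, $T$ is estimated by optimizing the likelihood function corresponding to a generic plug-in estimator $\hat{A}$ of $A$.
For any estimator $\hat{A}$ that satisfies detailed conditions for proximity to $A$, the resulting estimator of $T$ retains the properties established for the MLE.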
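
As an illustration of the known-$A$ case described in the abstract, the sketch below computes the maximum likelihood estimate of one document's mixture weights (one column of $T$) from its word counts. It uses the standard EM fixed-point iteration for mixture weights with known components. The abstract does not say which optimizer the authors use, so this choice is an assumption. The function name `mle_topic_weights`, its parameters, and the toy matrix are hypothetical. The sketch also assumes that every observed word has positive probability under at least one topic.

```python
def mle_topic_weights(A, counts, n_iter=500, tol=1e-12):
    """Approximate the MLE of one document's topic weights when A is known.

    A      : list of p rows with K entries each; every column sums to 1
             (hypothetical input format, column k is topic k's word distribution).
    counts : list of p nonnegative word counts for a single document.
    Returns a list of K weights on the probability simplex.
    """
    p, K = len(A), len(A[0])
    total = sum(counts)
    freq = [c / total for c in counts]      # observed word frequencies
    t = [1.0 / K] * K                       # start from uniform weights
    for _ in range(n_iter):
        # Fitted word probabilities (A t)_j under the current weights.
        mix = [sum(A[j][k] * t[k] for k in range(K)) for j in range(p)]
        # EM update: t_k <- t_k * sum_j freq_j * A_jk / (A t)_j.
        # Words with zero count contribute nothing and are skipped.
        new_t = [
            t[k] * sum(freq[j] * A[j][k] / mix[j] for j in range(p) if freq[j] > 0)
            for k in range(K)
        ]
        converged = max(abs(a - b) for a, b in zip(new_t, t)) < tol
        t = new_t
        if converged:
            break
    return t


# Toy word-topic matrix with p = 4 words and K = 2 topics on disjoint words.
A = [[0.5, 0.0], [0.5, 0.0], [0.0, 0.5], [0.0, 0.5]]
print(mle_topic_weights(A, [3, 3, 2, 2]))   # approximately [0.6, 0.4]
print(mle_topic_weights(A, [5, 5, 0, 0]))   # second weight is exactly zero
```

In the second call the document contains no words from the second topic, and the estimate assigns that topic a weight of exactly zero with no penalty term. This is a toy instance of the exact-sparsity property that the abstract attributes to the MLE.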
