- Abstract: We study the problem of testing the goodness of fit of occurrences of items from many categories to an identical Poisson distribution over the categories. As a class of alternative hypotheses, we consider the removal of an $\ell_p$ ball, $p \leq 2$, of radius $\epsilon$ from a hypercube around the sequence of uniform Poisson rates. When the expected number of samples $n$ and the number of categories $N$ go to infinity while $\epsilon$ is small, the minimax risk asymptotes to $2\Phi(-n N^{2-2/p} \epsilon^2/\sqrt{8N})$, where $\Phi(x)$ is the standard normal CDF. This result allows the comparison of the many estimators previously proposed for this problem at the constant level, rather than at the rate of convergence of the risk or the scaling order of the sample complexity. The minimax test mostly relies on collisions in the small sample limit but behaves like the chi-squared test. Empirical studies over a range of problem parameters show that the asymptotic risk estimate is accurate in finite samples and that the minimax test is significantly better than the chi-squared test or a test that only uses collisions. Our analysis combines standard ideas from non-parametric hypothesis testing with new results in the low count limit of multiple Poisson distributions, including the convexity of certain kernels and a central limit theorem for linear test statistics.
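- Abstract (reference translation): This work considers the problem of testing the goodness of fit of occurrences of items from many categories to an identical Poisson distribution.
As a class of alternative hypotheses, we consider the removal of an $\ell_p$ ball, $p \leq 2$, of radius $\epsilon$ from a hypercube around the sequence of uniform Poisson rates.
When the expected number of samples $n$ and the number of categories $N$ go to infinity while $\epsilon$ is small, the minimax risk asymptotes to $2\Phi(-n N^{2-2/p} \epsilon^2/\sqrt{8N})$, where $\Phi(x)$ is the standard normal CDF.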
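
The asymptotic risk expression quoted in the abstract can be evaluated numerically to see how it varies with $n$, $N$, $\epsilon$, and $p$. The following is a minimal sketch, not the authors' code: the function names `normal_cdf` and `asymptotic_minimax_risk` are hypothetical, and the formula is valid only in the asymptotic regime the abstract describes (large $n$ and $N$, small $\epsilon$).

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF Phi(x), computed via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def asymptotic_minimax_risk(n: float, N: int, eps: float, p: float) -> float:
    """Evaluate 2 * Phi(-n * N^(2 - 2/p) * eps^2 / sqrt(8 N)) from the abstract.

    n   : expected number of samples
    N   : number of categories
    eps : radius of the removed l_p ball
    p   : norm index, assumed to satisfy 0 < p <= 2
    """
    if not 0 < p <= 2:
        raise ValueError("the abstract only covers p <= 2 (and p must be positive)")
    arg = n * N ** (2.0 - 2.0 / p) * eps ** 2 / math.sqrt(8.0 * N)
    return 2.0 * normal_cdf(-arg)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    for n in (100, 1000, 10000):
        print(n, asymptotic_minimax_risk(n=n, N=1000, eps=0.01, p=2.0))
```

With $\epsilon = 0$ the expression equals 1, the trivial risk when the alternative cannot be separated from the null, and it decreases toward 0 as $n \epsilon^2$ grows.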
