Fugu-MT Paper Translation (Abstract): Redesigning Multi-Scale Neural Network for Crowd Counting

Paper overview: Redesigning Multi-Scale Neural Network for Crowd Counting

arxiv url: http://arxiv.org/abs/2208.02894v2
Date: Tue, 4 Jul 2023 01:55:13 GMT
Status: Translation complete
Last updated in system: 2023-07-07 00:16:06.943682
Title: Redesigning Multi-Scale Neural Network for Crowd Counting
Title (reference translation): Redesigning a Multi-Scale Neural Network for Crowd Counting
Authors: Zhipeng Du, Miaojing Shi, Jiankang Deng, Stefanos Zafeiriou
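Abstract summary: This paper introduces a hierarchical mixture of density experts that hierarchically merges multi-scale density maps for crowd counting. Within the hierarchical structure, an expert competition and collaboration scheme is presented to encourage contributions from all scales. Experiments show that the proposed method achieves state-of-the-art performance on five public datasets.
Reference score (independently computed attention metric): 68.674652984003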
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Perspective distortions and crowd variations make crowd counting a challenging task in computer vision. To tackle it, many previous works have used multi-scale architectures in deep neural networks (DNNs). Multi-scale branches can be either directly merged (e.g. by concatenation) or merged through the guidance of proxies (e.g. attention) in the DNNs. Despite their prevalence, these combination methods are not sophisticated enough to deal with the per-pixel performance discrepancy over multi-scale density maps. In this work, we redesign the multi-scale neural network by introducing a hierarchical mixture of density experts, which hierarchically merges multi-scale density maps for crowd counting. Within the hierarchical structure, an expert competition and collaboration scheme is presented to encourage contributions from all scales; pixel-wise soft gating nets are introduced to provide pixel-wise soft weights for scale combinations in different hierarchies. The network is optimized using both the crowd density map and the local counting map, where the latter is obtained by local integration on the former. Optimizing both can be problematic because of their potential conflicts. We introduce a new relative local counting loss based on relative count differences among hard-predicted local regions in an image, which proves to be complementary to the conventional absolute error loss on the density map. Experiments show that our method achieves state-of-the-art performance on five public datasets, i.e. ShanghaiTech, UCF_CC_50, JHU-CROWD++, NWPU-Crowd and Trancos.
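Abstract (reference translation): Perspective distortion and crowd variation make crowd counting a difficult task in computer vision. To address this, many prior studies have used multi-scale architectures in deep neural networks (DNNs). Multi-scale branches are either merged directly (for example, by concatenation) or merged under the guidance of proxies (for example, attention) in the DNN. Despite their prevalence, these combination methods are insufficient for handling the per-pixel performance differences across multi-scale density maps. In this work, the multi-scale neural network is redesigned by introducing a hierarchical mixture of density experts that hierarchically merges multi-scale density maps. Within the hierarchical structure, an expert competition and collaboration scheme is presented to encourage contributions from all scales, and pixel-wise soft gating nets are introduced to provide pixel-wise soft weights for the scale combinations at different levels of the hierarchy. The network is optimized using both the crowd density map and the local counting map, where the latter is obtained by local integration of the former. Optimizing both can be problematic because of potential conflicts between them. A new relative local counting loss, based on the relative count differences among hard-predicted local regions in an image, is introduced and shown to be complementary to the conventional absolute error loss on the density map. Experiments show that the proposed method achieves state-of-the-art performance on five public datasets: ShanghaiTech, UCF_CC_50, JHU-CROWD++, NWPU-Crowd, and Trancos.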
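
Two ideas from the abstract can be made concrete with a small sketch: merging multi-scale density maps with pixel-wise soft weights, and obtaining a local counting map by integrating the density map over local regions. The code below is only an illustration under stated assumptions, not the authors' implementation. The function names, the use of a softmax over gating logits, the single (non-hierarchical) merge step, and the non-overlapping square blocks are all assumptions made here; the abstract does not specify them. The relative local counting loss is left out because the abstract does not give its formula.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def merge_density_maps(density_maps, gate_logits):
    """Merge K density maps (each H x W) using per-pixel soft weights.

    gate_logits has shape K x H x W. In a real network these would be
    produced by a learned gating net; here they are plain inputs
    (hypothetical stand-in). Softmax weighting is an assumption.
    """
    k = len(density_maps)
    h, w = len(density_maps[0]), len(density_maps[0][0])
    merged = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            weights = softmax([gate_logits[s][i][j] for s in range(k)])
            merged[i][j] = sum(
                weights[s] * density_maps[s][i][j] for s in range(k)
            )
    return merged


def local_counting_map(density, block):
    """Sum the density over non-overlapping block x block regions.

    Non-overlapping square blocks are an assumption; the abstract only
    says the local counting map comes from local integration.
    """
    h, w = len(density), len(density[0])
    return [
        [
            sum(
                density[i][j]
                for i in range(bi, min(bi + block, h))
                for j in range(bj, min(bj + block, w))
            )
            for bj in range(0, w, block)
        ]
        for bi in range(0, h, block)
    ]


if __name__ == "__main__":
    # Two toy 4 x 4 density maps standing in for two scales.
    fine = [[0.1] * 4 for _ in range(4)]
    coarse = [[0.3] * 4 for _ in range(4)]
    # Equal logits give equal weights at every pixel.
    logits = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]

    merged = merge_density_maps([fine, coarse], logits)
    counts = local_counting_map(merged, block=2)
    print("estimated total count:", sum(map(sum, counts)))
```

Because the per-pixel weights sum to one, the merged value at each pixel is a convex combination of the per-scale predictions, and summing the local counting map recovers the total count of the merged density map. A hierarchical variant in the spirit of the abstract would apply such a merge repeatedly across levels, each with its own gating logits.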

Related papers