Fugu-MT Paper Translation (Abstract): Count2Density: Crowd Density Estimation without Location-level Annotations

Paper overview: Count2Density: Crowd Density Estimation without Location-level Annotations

arxiv url: http://arxiv.org/abs/2509.03170v1
Date: Wed, 03 Sep 2025 09:36:34 GMT
Status: Translation complete
Last updated in system: 2025-09-04 21:40:46.477664
Title: Count2Density: Crowd Density Estimation without Location-level Annotations
Title (reference translation): Count2Density: Crowd Density Estimation without Location-Level Annotations
Authors: Mattia Litrico, Feng Chen, Michael Pound, Sotirios A Tsaftaris, Sebastiano Battiato, Mario Valerio Giuffrida
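Abstract summary: We propose Count2Density, a novel pipeline designed to predict meaningful density maps using only count-level annotations during training. We show that the proposed method significantly outperforms cross-domain adaptation methods and achieves better results than recent state-of-the-art methods in semi-supervised settings.
Reference score (independently computed attention score): 12.745949100586278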
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Crowd density estimation is a well-known computer vision task aimed at estimating the density distribution of people in an image. The main challenge in this domain is the reliance on fine-grained location-level annotations (i.e., points placed on top of each individual) to train deep networks. Collecting such detailed annotations is tedious and time-consuming, and poses a significant barrier to scalability for real-world applications. To alleviate this burden, we present Count2Density: a novel pipeline designed to predict meaningful density maps containing quantitative spatial information using only count-level annotations (i.e., total number of people) during training. To achieve this, Count2Density generates pseudo-density maps leveraging past predictions stored in a Historical Map Bank, thereby reducing confirmation bias. This bank is initialised using an unsupervised saliency estimator to provide an initial spatial prior and is iteratively updated with an EMA of predicted density maps. These pseudo-density maps are obtained by sampling locations from estimated crowd areas using a hypergeometric distribution, with the number of samples determined by the count-level annotations. To further enhance the spatial awareness of the model, we add a self-supervised contrastive spatial regulariser to encourage similar feature representations within crowded regions while maximising dissimilarity with background regions. Experimental results demonstrate that our approach significantly outperforms cross-domain adaptation methods and achieves better results than recent state-of-the-art approaches in semi-supervised settings across several datasets. Additional analyses validate the effectiveness of each individual component of our pipeline, confirming the ability of Count2Density to effectively retrieve spatial information from count-level annotations and enabling accurate subregion counting.
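Abstract (reference translation): Crowd density estimation is a well-known computer vision task that aims to estimate the density distribution of people in an image. The main challenge in this area is the dependence on fine-grained location-level annotations (i.e., points placed on each individual) for training deep networks. Collecting such detailed annotations is tedious and time-consuming, and it is a major barrier to scalability in real-world applications. To reduce this burden, we propose Count2Density, a novel pipeline designed to predict meaningful density maps containing quantitative spatial information while using only count-level annotations (i.e., the total number of people) during training. To achieve this, Count2Density generates pseudo-density maps from past predictions stored in a Historical Map Bank, which reduces confirmation bias. The bank is initialised with an unsupervised saliency estimator that provides an initial spatial prior, and it is iteratively updated with an EMA of the predicted density maps. The pseudo-density maps are obtained by sampling locations from the estimated crowd areas using a hypergeometric distribution, with the number of samples determined by the count-level annotations. To further improve the spatial awareness of the model, a self-supervised contrastive spatial regulariser is added that encourages similar feature representations within crowded regions while maximising dissimilarity with background regions. Experiments show that the method significantly outperforms cross-domain adaptation methods and achieves better results than recent state-of-the-art methods in semi-supervised settings across several datasets. Additional analyses validate the effectiveness of each individual component of the pipeline and confirm that Count2Density can effectively recover spatial information from count-level annotations, enabling accurate subregion counting.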
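
The abstract's two core steps, an exponential moving average (EMA) update of a stored map and count-driven sampling of locations, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `update_bank` and `sample_pseudo_density`, the `momentum` and `scale` parameters, and the quantisation of the stored map into integer per-pixel capacities are all assumptions made for illustration. The abstract does not say how the hypergeometric distribution is parameterised, and the smoothing of sampled points into a density map is left out here.

```python
import numpy as np


def update_bank(bank_map, predicted_map, momentum=0.9):
    """EMA update of one stored map (the momentum value is an assumption)."""
    return momentum * bank_map + (1.0 - momentum) * predicted_map


def sample_pseudo_density(bank_map, count, rng, scale=100):
    """Draw `count` locations from a stored map; the result sums to `count`.

    Assumption: the stored map is quantised into integer per-pixel
    capacities so that a multivariate hypergeometric draw (sampling
    without replacement) can be applied to it.
    """
    out = np.zeros(bank_map.size, dtype=np.float64)
    if count <= 0:
        return out.reshape(bank_map.shape)

    weights = np.clip(bank_map, 0.0, None).ravel()
    if weights.sum() <= 0:
        # No spatial prior available: fall back to uniform weights.
        weights = np.ones_like(weights)

    # Integer capacities proportional to the stored map; rounding up
    # guarantees that the total capacity is at least `count`.
    capacities = np.ceil(weights / weights.sum() * scale * count).astype(np.int64)
    idx = np.flatnonzero(capacities)

    # Each pixel receives between 0 and its capacity draws.
    out[idx] = rng.multivariate_hypergeometric(capacities[idx], count)
    return out.reshape(bank_map.shape)


rng = np.random.default_rng(0)
bank = np.zeros((8, 8))
bank[2:6, 2:6] = 1.0  # hypothetical "crowd area" from a saliency prior
pseudo = sample_pseudo_density(bank, count=5, rng=rng)
bank = update_bank(bank, pseudo)
```

Because the draw is without replacement over capacities, the sampled map always sums exactly to the annotated count and never places mass where the stored map is zero, which is the property the count-level supervision relies on in this sketch.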


Related papers