Fugu-MT Paper Translation (Summary): Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO

Paper summary: Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO

arxiv url: http://arxiv.org/abs/2201.02396v1
Date: Fri, 7 Jan 2022 11:00:11 GMT
Status: Translation complete
Last updated in system: 2022-01-10 14:47:40.660173
Title: Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO
Authors: Astrid Orcesi, Romaric Audigier, Fritz Poka Toukam and Bertrand Luvison
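Abstract summary: We propose a new interaction dataset, Human-to-Human-or-Object (H2O), that covers both human-to-human and human-to-object interactions. In addition, we introduce a novel taxonomy of verbs intended to be closer to a description of human body attitude in relation to the surrounding targets of interaction. We also propose DIABOLO, an efficient subject-centric single-shot method that detects all interactions in one forward pass.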
Reference score (independently computed attention score): 29.0200561485714
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Detecting human interactions is crucial for human behavior analysis. Many methods have been proposed to deal with Human-to-Object Interaction (HOI) detection, i.e., detecting which person and object in an image interact and classifying the type of interaction. However, Human-to-Human Interactions, such as social and violent interactions, are generally not considered in available HOI training datasets. As we think these types of interactions cannot be ignored or decorrelated from HOI when analyzing human behavior, we propose a new interaction dataset to deal with both types of human interactions: Human-to-Human-or-Object (H2O). In addition, we introduce a novel taxonomy of verbs, intended to be closer to a description of human body attitude in relation to the surrounding targets of interaction, and more independent of the environment. Unlike some existing datasets, we strive to avoid defining synonymous verbs when their use highly depends on the target type or requires a high level of semantic interpretation. As the H2O dataset includes V-COCO images annotated with this new taxonomy, its images contain more interactions. This can be an issue for HOI detection methods whose complexity depends on the number of people, targets or interactions. Thus, we propose DIABOLO (Detecting InterActions By Only Looking Once), an efficient subject-centric single-shot method to detect all interactions in one forward pass, with constant inference time independent of image content. In addition, this multi-task network simultaneously detects all people and objects. We show that sharing a network for these tasks not only saves computational resources but also improves performance collaboratively. Finally, DIABOLO is a strong baseline for the newly proposed challenge of H2O Interaction detection, as it outperforms all state-of-the-art methods when trained and evaluated on the HOI dataset V-COCO.


Related papers