Fugu-MT paper translation (abstract): Learning Human-Object Interaction for 3D Human Pose Estimation from LiDAR Point Clouds

Paper overview: Learning Human-Object Interaction for 3D Human Pose Estimation from LiDAR Point Clouds

arxiv url: http://arxiv.org/abs/2603.16343v1
Date: Tue, 17 Mar 2026 10:20:29 GMT
Status: translation complete
Last updated in system: 2026-03-18 17:42:07.218507
Title: Learning Human-Object Interaction for 3D Human Pose Estimation from LiDAR Point Clouds
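Title (reference translation): Learning human-object interaction for 3D human pose estimation from LiDAR point clouds
Authors: Daniel Sungho Jung, Dohee Cho, Kyoung Mu Lee
Abstract summary: Understanding humans from LiDAR point clouds is one of the most critical tasks in autonomous driving. Existing methods largely overlook the potential of leveraging human-object interactions to build robust 3D pose estimation frameworks. The authors therefore propose a human-object interaction learning framework for robust 3D pose estimation from LiDAR point clouds.
Reference score (attention score computed independently by the site): 49.219348802596876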
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Understanding humans from LiDAR point clouds is one of the most critical tasks in autonomous driving due to its close relationship with pedestrian safety, yet it remains challenging in the presence of diverse human-object interactions and cluttered backgrounds. Nevertheless, existing methods largely overlook the potential of leveraging human-object interactions to build robust 3D human pose estimation frameworks. Two major challenges motivate the incorporation of human-object interaction. First, human-object interactions introduce spatial ambiguity between human and object points, which often leads to erroneous 3D human keypoint predictions in interaction regions. Second, there is a severe class imbalance in the number of points between interacting and non-interacting body parts, with interaction-frequent regions such as the hands and feet being sparsely observed in LiDAR data. To address these challenges, we propose a Human-Object Interaction Learning (HOIL) framework for robust 3D human pose estimation from LiDAR point clouds. To mitigate the spatial ambiguity issue, we present human-object interaction-aware contrastive learning (HOICL) that effectively enhances feature discrimination between human and object points, particularly in interaction regions. To alleviate the class imbalance issue, we introduce contact-aware part-guided pooling (CPPool) that adaptively reallocates representational capacity by compressing overrepresented points while preserving informative points from interacting body parts. In addition, we present an optional contact-based temporal refinement that refines erroneous per-frame keypoint estimates using contact cues over time. As a result, HOIL effectively leverages human-object interaction to resolve spatial ambiguity and class imbalance in interaction regions. Code will be released.
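Abstract (reference translation): Understanding humans from LiDAR point clouds is one of the most critical tasks in autonomous driving because it is closely tied to pedestrian safety. Nevertheless, existing methods largely overlook the potential of leveraging human-object interactions to build robust 3D pose estimation frameworks. Two major challenges motivate the incorporation of human-object interaction. First, human-object interactions introduce spatial ambiguity between human and object points, which often leads to erroneous 3D human keypoint predictions in interaction regions. Second, there is a severe imbalance in the number of points between interacting and non-interacting body parts, and regions that interact frequently, such as the hands and feet, are only sparsely observed in LiDAR data. To address these challenges, the authors propose the Human-Object Interaction Learning (HOIL) framework for robust 3D pose estimation from LiDAR point clouds. To mitigate the spatial ambiguity issue, they present human-object interaction-aware contrastive learning (HOICL), which effectively enhances feature discrimination between human and object points, particularly in interaction regions. To alleviate the class imbalance issue, they introduce contact-aware part-guided pooling (CPPool), which adaptively reallocates representational capacity by compressing overrepresented points while preserving informative points from interacting body parts. In addition, they present an optional contact-based temporal refinement that corrects erroneous per-frame keypoint estimates using contact cues over time. As a result, HOIL effectively leverages human-object interaction to resolve spatial ambiguity and class imbalance in interaction regions. Code will be released.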
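
The abstract does not spell out the HOICL objective, so the following is only a rough illustration of the general idea it names: a contrastive loss that pulls together features of points with the same label (human or object) and pushes apart features of points with different labels. It uses a generic supervised contrastive (InfoNCE-style) loss, not the paper's formulation. The function name, the `temperature` value, and the binary label encoding are all assumptions made for this sketch.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over per-point features.

    Hypothetical sketch, not the HOICL loss from the paper.
    features: (N, D) float array, one embedding per LiDAR point (N >= 2).
    labels:   (N,) int array, assumed encoding 1 = human, 0 = object.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # L2-normalize so dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    # Log-softmax over all *other* points, shifted for numerical stability.
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # Positives: other points that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # skip anchors that have no positive
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return float(-mean_log_prob_pos.mean())

# Toy example: two tight clusters of 2-D features.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_separated = supervised_contrastive_loss(feats, np.array([1, 1, 0, 0]))
loss_entangled = supervised_contrastive_loss(feats, np.array([1, 0, 1, 0]))
```

In the toy example, the loss is small when labels agree with the feature clusters and large when each cluster mixes human and object labels, which is the kind of entanglement the abstract describes in interaction regions.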
