Modeling human intention inference in continuous 3D domains by inverse
planning and body kinematics
- URL: http://arxiv.org/abs/2112.00903v1
- Date: Thu, 2 Dec 2021 00:55:58 GMT
- Title: Modeling human intention inference in continuous 3D domains by inverse
planning and body kinematics
- Authors: Yingdong Qian, Marta Kryven, Tao Gao, Hanbyul Joo, Josh Tenenbaum
- Abstract summary: We describe a computational framework for evaluating models of goal inference in the domain of 3D motor actions.
We evaluate our framework in three behavioural experiments using a novel Target Reaching Task, in which human observers infer the intentions of actors reaching for targets among distractors.
We show that human observers indeed rely on inverse body kinematics in such scenarios, suggesting that modeling body kinematics can improve the performance of inference algorithms.
- Score: 31.421686048250827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can we build AI that understands human intentions and uses this
knowledge to collaborate with people? We describe a computational framework for evaluating
models of goal inference in the domain of 3D motor actions, which receives as
input the 3D coordinates of an agent's body, and of possible targets, to
produce a continuously updated inference of the intended target. We evaluate
our framework in three behavioural experiments using a novel Target Reaching
Task, in which human observers infer intentions of actors reaching for targets
among distractors. We describe the Generative Body Kinematics model, which predicts
human intention inference in this domain using Bayesian inverse planning and
inverse body kinematics. We compare our model to three heuristics, which
formalize the principle of least effort using simple assumptions about the
actor's constraints, without the use of inverse planning. Despite being more
computationally costly, the Generative Body Kinematics model outperforms the
heuristics in certain scenarios, such as environments with obstacles, and at
the beginning of reaching actions while the actor is relatively far from the
intended target. The heuristics make increasingly accurate predictions during
later stages of reaching actions, such as when the intended target is close
and can be inferred by extrapolating the wrist trajectory. Our results identify
contexts in which inverse body kinematics is useful for intention inference. We
show that human observers indeed rely on inverse body kinematics in such
scenarios, suggesting that modeling body kinematics can improve the performance of
inference algorithms.
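The contrast the abstract draws can be made concrete with a small sketch. The code below is an illustrative toy, not the authors' implementation: it infers a posterior over candidate 3D targets from an observed wrist trajectory using a noisy-rational inverse-planning likelihood, and contrasts it with the kind of least-effort heuristic that extrapolates the wrist trajectory. The function names, the exponential likelihood, and the parameter `beta` are all assumptions made for illustration.

```python
# Illustrative sketch (NOT the paper's Generative Body Kinematics model):
# Bayesian goal inference over candidate 3D targets from a wrist trajectory,
# versus a simple wrist-extrapolation heuristic.
import math

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inverse_planning_posterior(trajectory, targets, beta=5.0):
    """Posterior over targets under a noisy-rational likelihood: a target is
    more probable the less the observed path deviates from the efficient
    (straight-line) path to it. `beta` is an assumed rationality parameter."""
    start = trajectory[0]
    travelled = sum(_dist(trajectory[i], trajectory[i + 1])
                    for i in range(len(trajectory) - 1))
    scores = []
    for g in targets:
        # Cost incurred so far + cost-to-go, minus the optimal direct cost.
        extra = travelled + _dist(trajectory[-1], g) - _dist(start, g)
        scores.append(math.exp(-beta * extra))
    z = sum(scores)
    return [s / z for s in scores]

def extrapolation_heuristic(trajectory, targets):
    """Least-effort heuristic: score each target by how well the current
    wrist velocity points at it (softmax over cosine alignment)."""
    v = [b - a for a, b in zip(trajectory[-2], trajectory[-1])]
    vn = math.sqrt(sum(x * x for x in v)) or 1.0
    scores = []
    for g in targets:
        d = [gx - px for gx, px in zip(g, trajectory[-1])]
        dn = math.sqrt(sum(x * x for x in d)) or 1.0
        cos = sum(a * b for a, b in zip(v, d)) / (vn * dn)
        scores.append(math.exp(cos))
    z = sum(scores)
    return [s / z for s in scores]

# A reach that curves toward the second of two targets.
targets = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
wrist = [(0.0, 0.0, 0.0), (0.05, 0.2, 0.0), (0.1, 0.45, 0.0)]
print(inverse_planning_posterior(wrist, targets))
print(extrapolation_heuristic(wrist, targets))
```

On this toy trajectory both scorers favor the second target, but they diverge in exactly the regimes the abstract describes: early in a reach the velocity-extrapolation heuristic is weakly informative, while the path-efficiency likelihood can already discriminate; late in a reach, extrapolating the wrist is enough.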
Related papers
- Kinematics-based 3D Human-Object Interaction Reconstruction from Single View [10.684643503514849]
Existing methods predict body poses by relying merely on network training on indoor datasets.
We propose a kinematics-based method that can drive the joints of human body to the human-object contact regions accurately.
arXiv Detail & Related papers (2024-07-19T05:44:35Z)
- 3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects [13.58353565350936]
We contribute methodology to jointly estimate the geometry and pose of objects grasped by a robot.
Our method transforms the estimated geometry into the robot's coordinate frame.
We empirically evaluate our approach on a robot manipulator holding a diverse set of real-world objects.
arXiv Detail & Related papers (2024-07-14T21:02:55Z)
- Multimodal Sense-Informed Prediction of 3D Human Motions [16.71099574742631]
This work introduces a novel multi-modal sense-informed motion prediction approach, which conditions high-fidelity generation on two modal information.
Gaze information is regarded as the human intention and, combined with motion and scene features, is used to construct a ternary intention-aware attention that supervises the generation.
On two real-world benchmarks, the proposed method achieves state-of-the-art performance both in 3D human pose and trajectory prediction.
arXiv Detail & Related papers (2024-05-05T12:38:10Z)
- WANDR: Intention-guided Human Motion Generation [67.07028110459787]
We introduce WANDR, a data-driven model that takes an avatar's initial pose and a goal's 3D position and generates natural human motions that place the end effector (wrist) on the goal location.
Intention guides the agent to the goal, and interactively adapts the generation to novel situations without needing to define sub-goals or the entire motion path.
We evaluate our method extensively and demonstrate its ability to generate natural, long-term motions that reach 3D goals and generalize to unseen goal locations.
arXiv Detail & Related papers (2024-04-23T10:20:17Z)
- Controllable Human-Object Interaction Synthesis [77.56877961681462]
We propose Controllable Human-Object Interaction Synthesis (CHOIS) to generate synchronized object motion and human motion in 3D scenes.
Here, language descriptions inform style and intent, and waypoints, which can be effectively extracted from high-level planning, ground the motion in the scene.
Our module seamlessly integrates with a path planning module, enabling the generation of long-term interactions in 3D environments.
arXiv Detail & Related papers (2023-12-06T21:14:20Z)
- Goal-directed Planning and Goal Understanding by Active Inference: Evaluation Through Simulated and Physical Robot Experiments [3.7660066212240757]
We show that goal-directed action planning can be formulated using the free energy principle.
The proposed model is built on a variational recurrent neural network model.
arXiv Detail & Related papers (2022-02-21T03:48:35Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study of various pose representations, with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
- 3D Pose Estimation and Future Motion Prediction from 2D Images [26.28886209268217]
This paper jointly tackles the highly correlated tasks of estimating 3D human body poses and predicting future 3D motions from RGB image sequences.
Based on Lie algebra pose representation, a novel self-projection mechanism is proposed that naturally preserves human motion kinematics.
arXiv Detail & Related papers (2021-11-26T01:02:00Z)
- Self-Attentive 3D Human Pose and Shape Estimation from Videos [82.63503361008607]
We present a video-based learning algorithm for 3D human pose and shape estimation.
We exploit temporal information in videos and propose a self-attention module.
We evaluate our method on the 3DPW, MPI-INF-3DHP, and Human3.6M datasets.
arXiv Detail & Related papers (2021-03-26T00:02:19Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations, named forward-kinematics, camera-projection, and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.