Learning Dense Hand Contact Estimation from Imbalanced Data
- URL: http://arxiv.org/abs/2505.11152v1
- Date: Fri, 16 May 2025 11:54:25 GMT
- Title: Learning Dense Hand Contact Estimation from Imbalanced Data
- Authors: Daniel Sungho Jung, Kyoung Mu Lee
- Abstract summary: There are two major challenges for learning dense hand contact estimation. First, hand contact datasets suffer from a class imbalance issue: the majority of samples are not in contact. Second, hand contact datasets contain a spatial imbalance issue, with most hand contact exhibited at the fingertips.
- Score: 51.54990464786128
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Hands are essential to human interaction, and understanding contact between hands and the world can promote a comprehensive understanding of their function. Recently, a growing number of hand interaction datasets have covered interaction with objects, other hands, scenes, and bodies. Despite the significance of the task and the increasing amount of high-quality data, how to effectively learn dense hand contact estimation remains largely underexplored. There are two major challenges. First, hand contact datasets suffer from a class imbalance issue: the majority of samples are not in contact. Second, hand contact datasets contain a spatial imbalance issue, with most hand contact exhibited at the fingertips, which hinders generalization to contacts in other hand regions. To tackle these issues, we present a framework that learns dense HAnd COntact estimation (HACO) from imbalanced data. To resolve the class imbalance issue, we introduce balanced contact sampling, which builds and samples from multiple sampling groups that fairly represent diverse contact statistics for both contact and non-contact samples. Moreover, to address the spatial imbalance issue, we propose a vertex-level class-balanced (VCB) loss, which incorporates the spatially varying contact distribution by separately reweighting the loss contribution of each vertex based on its contact frequency across the dataset. As a result, we effectively learn to predict dense hand contact from large-scale hand contact data without suffering from class and spatial imbalance. The code will be released.
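The abstract's vertex-level reweighting idea can be illustrated with a short sketch. The function names, the effective-number weighting formula, and the toy data below are assumptions for illustration, not the paper's actual implementation; the paper only specifies that each vertex's loss is reweighted by its contact frequency across the dataset.

```python
import numpy as np

def vcb_weights(contact_labels: np.ndarray, beta: float = 0.999) -> np.ndarray:
    """Per-vertex weights from contact frequency across the dataset.

    contact_labels: (N, V) binary array, N samples, V hand-mesh vertices.
    Returns a (V,) weight vector that up-weights rarely contacted vertices,
    here via effective-number reweighting as one plausible choice.
    """
    freq = contact_labels.sum(axis=0)                  # contacts per vertex
    eff = (1.0 - np.power(beta, freq)) / (1.0 - beta)  # effective number
    w = 1.0 / np.maximum(eff, 1e-8)
    return w / w.mean()                                # normalize to mean 1

def vcb_loss(pred: np.ndarray, target: np.ndarray, w: np.ndarray) -> float:
    """Vertex-weighted binary cross-entropy, averaged over batch and vertices."""
    eps = 1e-7
    p = np.clip(pred, eps, 1.0 - eps)
    bce = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))  # (B, V)
    return float((bce * w).mean())

# toy check: vertex 0 (fingertip-like) contacts often, vertex 2 rarely
labels = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 0], [1, 0, 1]])
w = vcb_weights(labels)
assert w[2] > w[0]  # rarely contacted vertices receive larger weight
```

With such weights, errors on rarely contacted regions (e.g. the palm) contribute more to the loss than errors on fingertips, which is the stated goal of the VCB loss.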
Related papers
- Collaborative Imputation of Urban Time Series through Cross-city Meta-learning [54.438991949772145]
We propose a novel collaborative imputation paradigm leveraging meta-learned implicit neural representations (INRs). We then introduce a cross-city collaborative learning scheme through model-agnostic meta-learning. Experiments on a diverse urban dataset from 20 global cities demonstrate our model's superior imputation performance and generalizability.
arXiv Detail & Related papers (2025-01-20T07:12:40Z)
- Ask, Pose, Unite: Scaling Data Acquisition for Close Interactions with Vision Language Models [5.541130887628606]
Social dynamics in close human interactions pose significant challenges for Human Mesh Estimation (HME).
We introduce a novel data generation method that utilizes Large Vision Language Models (LVLMs) to annotate contact maps which guide test-time optimization to produce paired image and pseudo-ground truth meshes.
This methodology not only alleviates the annotation burden but also enables the assembly of a comprehensive dataset specifically tailored for close interactions in HME.
arXiv Detail & Related papers (2024-10-01T01:14:24Z)
- Ins-HOI: Instance Aware Human-Object Interactions Recovery [44.02128629239429]
We propose an end-to-end Instance-aware Human-Object Interactions recovery (Ins-HOI) framework.
Ins-HOI supports instance-level reconstruction and provides reasonable and realistic invisible contact surfaces.
We collect a large-scale, high-fidelity 3D scan dataset, including 5.2k high-quality scans with real-world human-chair and hand-object interactions.
arXiv Detail & Related papers (2023-12-15T09:30:47Z)
- DECO: Dense Estimation of 3D Human-Scene Contact In The Wild [54.44345845842109]
We train a novel 3D contact detector that uses both body-part-driven and scene-context-driven attention to estimate contact on the SMPL body.
We significantly outperform existing SOTA methods across all benchmarks.
We also show qualitatively that DECO generalizes well to diverse and challenging real-world human interactions in natural images.
arXiv Detail & Related papers (2023-09-26T21:21:07Z)
- MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems [74.73881579517055]
We propose a framework to generate such dialogues by pairing human teachers with a Large Language Model prompted to represent common student errors.
We describe how we use this framework to collect MathDial, a dataset of 3k one-to-one teacher-student tutoring dialogues.
arXiv Detail & Related papers (2023-05-23T21:44:56Z)
- Contact2Grasp: 3D Grasp Synthesis via Hand-Object Contact Constraint [18.201389966034263]
3D grasp synthesis generates grasping poses given an input object.
We introduce an intermediate variable for grasp contact areas to constrain the grasp generation.
Our method outperforms state-of-the-art methods regarding grasp generation on various metrics.
arXiv Detail & Related papers (2022-10-17T16:39:25Z)
- Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution [49.10497573378427]
Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality.
Our algorithm is agnostic to object models, and it learns the physical rules governing hand-object interaction.
Experiments using four widely-used benchmarks show that our framework achieves beyond state-of-the-art accuracy in 3D pose estimation, as well as recovers dense 3D hand and object shapes.
arXiv Detail & Related papers (2022-04-27T17:00:54Z)
- Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction [27.87570749976023]
We introduce a video-based method for predicting contact between a hand and an object.
Annotating a large number of hand-object tracks and contact labels is costly.
We propose a semi-supervised framework consisting of (i) automatic collection of training data with motion-based pseudo-labels and (ii) guided progressive label correction (gPLC).
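The motion-based pseudo-labeling step can be sketched as follows. This is a hypothetical illustration, not the paper's actual rule: the function name, thresholds, and the "close and moving together" heuristic are assumptions.

```python
import numpy as np

def motion_pseudo_labels(hand_pos, obj_pos, dist_thresh=0.02, vel_thresh=0.005):
    """Hypothetical motion-based pseudo-labeling sketch: a frame is marked
    'in contact' when the hand and object are close AND move together
    (small relative velocity).

    hand_pos, obj_pos: (T, 3) arrays of 3D positions in meters per frame.
    Returns a (T,) boolean array of pseudo contact labels.
    """
    rel = obj_pos - hand_pos
    dist = np.linalg.norm(rel, axis=1)                      # hand-object distance
    rel_vel = np.linalg.norm(np.diff(rel, axis=0), axis=1)  # relative motion
    rel_vel = np.append(rel_vel, rel_vel[-1])               # pad last frame
    return (dist < dist_thresh) & (rel_vel < vel_thresh)

# toy trajectory: the object is first far away, then travels with the hand
hand = np.array([[0.0, 0, 0], [0.1, 0, 0], [0.2, 0, 0], [0.3, 0, 0]])
obj = np.array([[0.5, 0, 0], [0.5, 0, 0], [0.21, 0, 0], [0.31, 0, 0]])
labels = motion_pseudo_labels(hand, obj)
```

Labels produced this way are noisy, which is why a correction stage such as the paper's gPLC is needed downstream.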
arXiv Detail & Related papers (2021-10-19T18:00:02Z)
- Few-shot Partial Multi-view Learning [103.33865779721458]
We propose a new task called few-shot partial multi-view learning.
It focuses on overcoming the negative impact of the view-missing issue in the low-data regime.
We conduct extensive experiments to evaluate our method.
arXiv Detail & Related papers (2021-05-05T13:34:43Z)
- Data Profiling for Adversarial Training: On the Ruin of Problematic Data [27.11328449349065]
Problems in adversarial training include robustness-accuracy trade-off, robust overfitting, and gradient masking.
We show that these problems share one common cause -- low quality samples in the dataset.
We find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated.
arXiv Detail & Related papers (2021-02-15T10:17:24Z)
- ContactPose: A Dataset of Grasps with Object Contact and Hand Pose [27.24450178180785]
We introduce ContactPose, the first dataset of hand-object contact paired with hand pose, object pose, and RGB-D images.
ContactPose has 2306 unique grasps of 25 household objects grasped with 2 functional intents by 50 participants, and more than 2.9 M RGB-D grasp images.
arXiv Detail & Related papers (2020-07-19T01:01:14Z)
- Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strength of diverse classifiers.
Our ensemble of class-balanced experts reaches results close to state-of-the-art and an extended ensemble establishes a new state-of-the-art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)
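A standard ingredient behind class-balanced experts is class-balanced sampling, sketched below. This is an illustrative sketch of the general technique, not code from the paper; the function name and toy data are assumptions.

```python
import numpy as np

def class_balanced_sampling_probs(labels):
    """Per-sample probabilities so each class is drawn uniformly, regardless
    of how many examples it has. Returns probabilities that sum to 1."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    per_class = 1.0 / (len(classes) * counts)   # equal total mass per class
    return per_class[np.searchsorted(classes, labels)]

# long-tailed toy set: 90 'head' samples vs 10 'tail' samples
probs = class_balanced_sampling_probs([0] * 90 + [1] * 10)
```

Under these probabilities each class receives half the total sampling mass, so an individual tail sample is drawn far more often than an individual head sample.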
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.