6D Pose Estimation on Spoons and Hands
- URL: http://arxiv.org/abs/2505.02335v1
- Date: Mon, 05 May 2025 03:15:12 GMT
- Title: 6D Pose Estimation on Spoons and Hands
- Authors: Kevin Tan, Fan Yang, Yuhao Chen
- Abstract summary: This paper implements a system that analyzes a stationary video feed of people eating. It uses 6D pose estimation to track hand and spoon movements, capturing their spatial position and orientation.
- Score: 7.17871898732232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate dietary monitoring is essential for promoting healthier eating habits. A key area of research is how people interact and consume food using utensils and hands. By tracking their position and orientation, it is possible to estimate the volume of food being consumed, or monitor eating behaviours, highly useful insights into nutritional intake that can be more reliable than popular methods such as self-reporting. Hence, this paper implements a system that analyzes stationary video feed of people eating, using 6D pose estimation to track hand and spoon movements to capture spatial position and orientation. In doing so, we examine the performance of two state-of-the-art (SOTA) video object segmentation (VOS) models, both quantitatively and qualitatively, and identify main sources of error within the system.
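For readers unfamiliar with the term, a 6D pose combines three rotational and three translational degrees of freedom. The following is a minimal sketch of that representation, not the paper's implementation: the function names, the Euler-angle convention (Z-Y-X), and the example spoon pose and tip offset are all assumptions chosen for illustration.

```python
import math

def rotation_from_euler(roll, pitch, yaw):
    """Return the 3x3 rotation matrix R = Rz(yaw) Ry(pitch) Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def apply_pose(rotation, translation, point):
    """Map a point from the object frame to the camera frame: p_cam = R p_obj + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

# Hypothetical spoon pose: rotated 90 degrees about the camera z-axis,
# positioned 0.5 m in front of the camera.
R = rotation_from_euler(0.0, 0.0, math.pi / 2)
t = [0.0, 0.0, 0.5]

# Hypothetical spoon tip, 0.1 m along the spoon's own x-axis.
tip_cam = apply_pose(R, t, [0.1, 0.0, 0.0])
print(tip_cam)
```

Tracking the rotation and translation per frame is what lets a system of this kind follow where a utensil is and how it is tilted over time.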
Related papers
- FoodTrack: Estimating Handheld Food Portions with Egocentric Video [5.010690651107531]
FoodTrack estimates food volume directly, without relying on gestures or fixed assumptions about bite size. We achieve absolute percentage loss of approximately 7.01% on a handheld food object.
arXiv Detail & Related papers (2025-05-07T01:53:16Z)
- Dietary Intake Estimation via Continuous 3D Reconstruction of Food [5.010690651107531]
This study proposes an approach to accurately monitor ingest behaviours by leveraging 3D food models constructed from monocular 2D video. Experiments with toy models and real food items demonstrate the approach's potential.
arXiv Detail & Related papers (2025-05-01T15:35:42Z)
- DietGlance: Dietary Monitoring and Personalized Analysis at a Glance with Knowledge-Empowered AI Assistant [36.806619917276414]
We present DietGlance, a system that automatically monitors dietary intake in daily routines and delivers personalized analysis from knowledge sources. DietGlance first detects ingestive episodes from multimodal inputs using eyeglasses, capturing privacy-preserving meal images of various dishes being consumed. Based on the inferred food items and consumed quantities from these images, DietGlance further provides nutritional analysis and personalized dietary suggestions.
arXiv Detail & Related papers (2025-02-03T12:46:37Z)
- NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reporting methods are often inaccurate and suffer from substantial bias.
Recent work has explored using computer vision prediction systems to predict nutritional information from food images.
This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z)
- How Much You Ate? Food Portion Estimation on Spoons [63.611551981684244]
Current image-based food portion estimation algorithms assume that users take images of their meals one or two times.
We introduce an innovative solution that utilizes stationary user-facing cameras to track food items on utensils.
The system is reliable for estimation of nutritional content of liquid-solid heterogeneous mixtures such as soups and stews.
arXiv Detail & Related papers (2024-05-12T00:16:02Z)
- NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation [68.49526750115429]
We introduce NutritionVerse-Real, an open access manually collected 2D food scene dataset for dietary intake estimation.
The NutritionVerse-Real dataset was created by manually collecting images of food scenes in real life, measuring the weight of every ingredient and computing the associated dietary content of each dish.
arXiv Detail & Related papers (2023-11-20T11:05:20Z)
- NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating.
Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images.
We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information.
We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
- Vision-Based Food Analysis for Automatic Dietary Assessment [49.32348549508578]
This review presents one unified Vision-Based Dietary Assessment (VBDA) framework, which generally consists of three stages: food image analysis, volume estimation and nutrient derivation.
Deep learning makes VBDA gradually move to an end-to-end implementation, which applies food images to a single network to directly estimate the nutrition.
arXiv Detail & Related papers (2021-08-06T05:46:01Z)
- An Intelligent Passive Food Intake Assessment System with Egocentric Cameras [14.067860492694251]
Malnutrition is a major public health concern in low- and middle-income countries (LMICs).
We propose to implement an intelligent passive food intake assessment system via egocentric cameras.
Our method is able to reliably monitor food intake and give feedback on users' eating behaviour.
arXiv Detail & Related papers (2021-05-07T09:47:51Z)
- MyFood: A Food Segmentation and Classification System to Aid Nutritional Monitoring [1.5469452301122173]
The absence of food monitoring has contributed significantly to the increase in the population's weight.
Some solutions have been proposed in computer vision to recognize food images, but few are specialized in nutritional monitoring.
This work presents the development of an intelligent system that classifies and segments food presented in images to help the automatic monitoring of user diet and nutritional intake.
arXiv Detail & Related papers (2020-12-05T17:40:05Z)
- Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another.
We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities.
We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
arXiv Detail & Related papers (2020-03-09T07:41:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.