VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation
- URL: http://arxiv.org/abs/2505.10205v1
- Date: Thu, 15 May 2025 12:03:05 GMT
- Title: VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation
- Authors: Umair Haroon, Ahmad AlMughrabi, Thanasis Zoumpekas, Ricardo Marques, Petia Radeva
- Abstract summary: We present VolE, a novel framework that leverages mobile device-driven 3D reconstruction to estimate food volume. VolE captures images and camera locations in free motion to generate precise 3D models, thanks to AR-capable mobile devices. Our experiments demonstrate that VolE outperforms existing volume estimation techniques across multiple datasets, achieving 2.22% MAPE.
- Score: 4.621139625109643
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurate food volume estimation is crucial for medical nutrition management and health monitoring applications, but current food volume estimation methods are often limited to monocular data, rely on single-purpose hardware such as 3D scanners, require sensor-derived information such as depth, or depend on camera calibration against a reference object. In this paper, we present VolE, a novel framework that leverages mobile device-driven 3D reconstruction to estimate food volume. VolE captures images and camera locations in free motion to generate precise 3D models, thanks to AR-capable mobile devices. To achieve real-world measurement, VolE is a reference- and depth-free framework that leverages food video segmentation for food mask generation. We also introduce a new food dataset encompassing challenging scenarios absent from previous benchmarks. Our experiments demonstrate that VolE outperforms existing volume estimation techniques across multiple datasets, achieving 2.22% MAPE and highlighting its superior performance in food volume estimation.
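To make the reported numbers concrete, the sketch below shows the generic last step of such a pipeline: turning a reconstructed food point cloud into a volume, and scoring predictions with MAPE, the metric VolE reports. It is a minimal illustration, not VolE's actual method; a convex hull over-estimates concave foods, so real systems use tighter surfaces such as meshes or alpha shapes.

```python
import numpy as np
from scipy.spatial import ConvexHull

def convex_hull_volume(points: np.ndarray) -> float:
    """Volume of the convex hull of an (N, 3) point cloud, in the
    cloud's own cubic units."""
    return float(ConvexHull(points).volume)

def mape(predicted, actual) -> float:
    """Mean Absolute Percentage Error, the metric quoted above."""
    predicted, actual = np.asarray(predicted, float), np.asarray(actual, float)
    return float(np.mean(np.abs((predicted - actual) / actual)) * 100.0)

# Toy check: the eight corners of a unit cube enclose volume 1.
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
print(convex_hull_volume(cube))  # 1.0
print(mape([102.22], [100.0]))   # 2.22 (percent)
```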
Related papers
- VolTex: Food Volume Estimation using Text-Guided Segmentation and Neural Surface Reconstruction [4.282795945742752]
Existing 3D food volume estimation methods compute the food volume accurately but lack support for selecting specific food portions. We present VolTex, a framework that improves food object selection in food volume estimation.
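As a rough illustration of the selection idea (not VolTex's interface), suppose an open-vocabulary segmenter has already produced labeled masks for one frame; portion selection then reduces to picking the mask that matches a text prompt, and only those pixels feed the surface reconstruction. The input format here is hypothetical.

```python
import numpy as np

def select_portion(masks: dict, prompt: str) -> np.ndarray:
    """Pick the binary mask whose label matches the text prompt.
    `masks` maps label -> (H, W) bool mask, a hypothetical output
    format for an open-vocabulary segmenter."""
    key = prompt.strip().lower()
    for label, mask in masks.items():
        if key in label.lower():
            return mask
    raise KeyError(f"no segment matching {prompt!r}")

masks = {"rice": np.zeros((4, 4), bool), "chicken breast": np.ones((4, 4), bool)}
print(select_portion(masks, "chicken").sum())  # 16 pixels selected
```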
arXiv Detail & Related papers (2025-06-03T14:03:28Z)
- MFP3D: Monocular Food Portion Estimation Leveraging 3D Point Clouds [7.357322789192671]
In this paper, we introduce a new framework for accurate food portion estimation using only a single monocular image.
The framework consists of three key modules: (1) a 3D Reconstruction Module that generates a 3D point cloud representation of the food from the 2D image, (2) a Feature Extraction Module that extracts and represents features from both the 3D point cloud and the 2D RGB image, and (3) a Portion Regression Module that employs a deep regression model to estimate the food's volume and energy content.
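A structural skeleton of that three-module layout is sketched below; the function bodies are placeholders standing in for the paper's learned networks, and the feature and output shapes are assumptions made for illustration.

```python
import numpy as np

def reconstruct_point_cloud(rgb: np.ndarray) -> np.ndarray:
    """Module 1: lift a single RGB image to an (N, 3) point cloud.
    Placeholder for a learned monocular reconstruction network."""
    h, w, _ = rgb.shape
    return np.random.default_rng(0).random((h * w, 3))

def extract_features(points: np.ndarray, rgb: np.ndarray) -> np.ndarray:
    """Module 2: fuse 3D and 2D cues into one feature vector.
    Placeholder: mean point position concatenated with mean color."""
    return np.concatenate([points.mean(axis=0), rgb.reshape(-1, 3).mean(axis=0)])

def regress_portion(feats: np.ndarray) -> tuple:
    """Module 3: map features to (volume, energy).
    Placeholder for the paper's deep regression model."""
    return float(feats.sum()), float(2.0 * feats.sum())

rgb = np.zeros((8, 8, 3))
volume, energy = regress_portion(extract_features(reconstruct_point_cloud(rgb), rgb))
```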
arXiv Detail & Related papers (2024-11-14T22:17:27Z)
- MetaFood3D: 3D Food Dataset with Nutrition Values [52.16894900096017]
This dataset consists of 743 meticulously scanned and labeled 3D food objects across 131 categories. Our MetaFood3D dataset emphasizes intra-class diversity and includes rich modalities such as textured mesh files, RGB-D videos, and segmentation masks.
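For a sense of how one might iterate such a multi-modal dataset, the sketch below assumes a hypothetical category/object directory layout with one mesh, one RGB-D video, and per-frame masks per object; MetaFood3D's real on-disk structure may differ.

```python
from pathlib import Path

def iter_samples(root: str):
    """Yield one record per scanned object (hypothetical layout)."""
    for category_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for obj_dir in sorted(p for p in category_dir.iterdir() if p.is_dir()):
            yield {
                "category": category_dir.name,        # one of the 131 classes
                "mesh": obj_dir / "model.obj",        # textured mesh
                "rgbd_video": obj_dir / "rgbd.mp4",   # RGB-D capture
                "masks": sorted(obj_dir.glob("masks/*.png")),
            }

for sample in iter_samples("MetaFood3D"):
    print(sample["category"], sample["mesh"])
```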
arXiv Detail & Related papers (2024-09-03T15:02:52Z)
- MetaFood CVPR 2024 Challenge on Physically Informed 3D Food Reconstruction: Methods and Results [52.07174491056479]
We host the MetaFood Workshop and its challenge for Physically Informed 3D Food Reconstruction.
This challenge focuses on reconstructing volume-accurate 3D models of food items from 2D images, using a visible checkerboard as a size reference.
The solutions developed in this challenge achieved promising results in 3D food reconstruction, with significant potential for improving portion estimation for dietary assessment and nutritional monitoring.
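The checkerboard reference works because its square size is known in millimeters: once the board's inner corners are located in the (arbitrary-scale) reconstruction, the ratio of real to reconstructed edge length converts units, and volumes scale by its cube. The sketch below illustrates only that conversion step, not any particular team's solution; the row-major corner layout is an assumption.

```python
import numpy as np

def scale_from_checkerboard(corners: np.ndarray, square_mm: float, cols: int) -> float:
    """mm per reconstruction unit, from checkerboard inner corners expressed
    in the reconstruction frame, in row-major order (e.g. triangulated from
    cv2.findChessboardCorners detections)."""
    grid = corners.reshape(-1, cols, 3)
    right = np.linalg.norm(np.diff(grid, axis=1), axis=-1)  # horizontal edges
    down = np.linalg.norm(np.diff(grid, axis=0), axis=-1)   # vertical edges
    return square_mm / np.concatenate([right.ravel(), down.ravel()]).mean()

# A reconstruction in half-millimeter units: 25 mm squares measure 50 units,
# so the factor is 0.5 mm/unit and volumes are multiplied by 0.5**3.
rows, cols, edge = 6, 9, 50.0
xy = np.stack(np.meshgrid(np.arange(cols), np.arange(rows), indexing="xy"), -1) * edge
corners = np.concatenate([xy.reshape(-1, 2), np.zeros((rows * cols, 1))], axis=1)
print(scale_from_checkerboard(corners, 25.0, cols))  # 0.5
```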
arXiv Detail & Related papers (2024-07-12T14:15:48Z)
- VolETA: One- and Few-shot Food Volume Estimation [4.282795945742752]
We present VolETA, a sophisticated methodology for estimating food volume using 3D generative techniques.
Our approach creates a scaled 3D mesh of food objects from one or a few RGBD images.
We achieve robust and accurate volume estimates with 10.97% MAPE on the MTF dataset.
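Once a scaled, watertight mesh exists, its volume follows directly from the divergence theorem as a sum of signed tetrahedra; the sketch below shows that standard computation (a textbook formula, not VolETA-specific code).

```python
import numpy as np

def mesh_volume(vertices: np.ndarray, faces: np.ndarray) -> float:
    """Volume of a closed, consistently oriented triangle mesh:
    sum of signed tetrahedra formed by each face and the origin."""
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    return float(abs(np.einsum("ij,ij->i", v0, np.cross(v1, v2)).sum()) / 6.0)

# Unit cube as 12 outward-facing triangles; expected volume 1.0.
V = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
F = np.array([[0, 1, 3], [0, 3, 2], [4, 6, 7], [4, 7, 5], [0, 4, 5], [0, 5, 1],
              [2, 3, 7], [2, 7, 6], [0, 2, 6], [0, 6, 4], [1, 5, 7], [1, 7, 3]])
print(mesh_volume(V, F))  # 1.0
```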
arXiv Detail & Related papers (2024-07-01T18:47:15Z)
- Food Portion Estimation via 3D Object Scaling [8.164262056488447]
We propose a new framework to estimate both food volume and energy from 2D images.
Our method estimates the pose of the camera and the food object in the input image.
We also introduce a new dataset, SimpleFood45, which contains 2D images of 45 food items.
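The core intuition of scaling-based estimation fits in a few lines: if a canonical 3D model, rendered with the estimated camera and object pose, covers a different image area than the observed food mask, the linear scale is the square root of the area ratio and volume grows with its cube. This is a simplification for illustration, not the paper's exact alignment procedure.

```python
import math

def scaled_volume(model_volume_ml: float, mask_area_px: float,
                  proj_area_px: float) -> float:
    """Rescale a canonical model's volume so its projection matches the
    observed mask area (toy version of pose-based 3D object scaling)."""
    s = math.sqrt(mask_area_px / proj_area_px)  # linear scale factor
    return model_volume_ml * s ** 3             # volume scales cubically

# Mask covering twice the projected area -> volume scales by 2**1.5.
print(scaled_volume(100.0, 2000.0, 1000.0))  # ~282.8 ml
```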
arXiv Detail & Related papers (2024-04-18T15:23:37Z)
- VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
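At its core, the lifting step from image to pseudo-LiDAR is pinhole backprojection of a per-pixel depth map; the sketch below shows that generic step only, not VFMM3D's full pipeline (the depth would come from a depth network, and the intrinsics here are illustrative).

```python
import numpy as np

def backproject_depth(depth: np.ndarray, fx: float, fy: float,
                      cx: float, cy: float) -> np.ndarray:
    """Lift an (H, W) depth map to an (N, 3) pseudo-LiDAR point cloud:
    X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pts = np.stack([(u - cx) * depth / fx, (v - cy) * depth / fy, depth], -1)
    pts = pts.reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid zero-depth pixels

cloud = backproject_depth(np.full((480, 640), 5.0), 500.0, 500.0, 320.0, 240.0)
print(cloud.shape)  # (307200, 3)
```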
arXiv Detail & Related papers (2024-04-15T03:12:12Z)
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose OccNeRF, a method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
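Reparameterizing a field for cameras with unbounded range is commonly done with a contraction that maps all of space into a finite ball; the sketch below shows the well-known mip-NeRF 360-style contraction as one concrete example, which may differ in detail from OccNeRF's own parameterization.

```python
import numpy as np

def contract(x: np.ndarray) -> np.ndarray:
    """Map unbounded 3D points into a ball of radius 2: points with
    |x| <= 1 pass through; farther points are squashed so that
    infinity lands on the radius-2 sphere."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    safe = np.maximum(norm, 1e-9)  # avoid division by zero at the origin
    return np.where(norm <= 1.0, x, (2.0 - 1.0 / safe) * x / safe)

print(contract(np.array([[0.5, 0.0, 0.0], [100.0, 0.0, 0.0]])))
# [[0.5   0.   0.  ]
#  [1.99  0.   0.  ]]
```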
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
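For a feel of what such a network must internalize, the textbook special case is worth stating: for a vertical object on the ground plane seen by a roll-free pinhole camera, object height relates to camera height through image rows alone, H / h_cam = (v_bottom - v_top) / (v_bottom - v_horizon). The sketch below computes that classical relation; the paper's contribution is learning to handle wild scenes where these quantities are not directly observable.

```python
def object_height(v_top: float, v_bottom: float, v_horizon: float,
                  camera_height_m: float) -> float:
    """Classical single-view metrology for a vertical object standing on
    the ground plane (image rows grow downward; roll-free camera)."""
    return camera_height_m * (v_bottom - v_top) / (v_bottom - v_horizon)

# Camera 1.6 m above ground; object spans rows 200..500, horizon at row 240.
print(object_height(200.0, 500.0, 240.0, 1.6))  # ~1.85 m
```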
arXiv Detail & Related papers (2020-07-18T22:31:33Z)