An End-to-end Food Portion Estimation Framework Based on Shape
Reconstruction from Monocular Image
- URL: http://arxiv.org/abs/2308.01810v1
- Date: Thu, 3 Aug 2023 15:17:24 GMT
- Title: An End-to-end Food Portion Estimation Framework Based on Shape
Reconstruction from Monocular Image
- Authors: Zeman Shao, Gautham Vinod, Jiangpeng He, Fengqing Zhu
- Abstract summary: We propose an end-to-end deep learning framework for food energy estimation from a monocular image through 3D shape reconstruction.
Our method is evaluated on a publicly available food image dataset, Nutrition5k, resulting in a Mean Absolute Error (MAE) of 40.05 kCal and a Mean Absolute Percentage Error (MAPE) of 11.47% for food energy estimation.
- Score: 7.380382380564532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dietary assessment is a key contributor to monitoring health status. Existing
self-report methods are tedious and time-consuming with substantial biases and
errors. Image-based food portion estimation aims to estimate food energy values
directly from food images, showing great potential for automated dietary
assessment solutions. Existing image-based methods either use a single-view
image or incorporate multi-view images and depth information to estimate the
food energy, which either has limited performance or creates user burdens. In
this paper, we propose an end-to-end deep learning framework for food energy
estimation from a monocular image through 3D shape reconstruction. We leverage
a generative model to reconstruct the voxel representation of the food object
from the input image to recover the missing 3D information. Our method is
evaluated on a publicly available food image dataset, Nutrition5k, resulting in
a Mean Absolute Error (MAE) of 40.05 kCal and a Mean Absolute Percentage Error
(MAPE) of 11.47% for food energy estimation. Our method uses an RGB image as
the only input at the inference stage and achieves competitive results compared
to existing methods that require both RGB and depth information.
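The pipeline the abstract describes — reconstruct a voxel occupancy grid from a single RGB image, then map the recovered volume to energy — can be sketched at toy scale. The stubbed reconstruction step, grid size, and flat energy-density constant below are illustrative assumptions, not the authors' generative model:

```python
import numpy as np

def reconstruct_voxels(rgb: np.ndarray, grid: int = 32) -> np.ndarray:
    """Stand-in for the learned 3D reconstruction step: produce a binary
    occupancy grid from an RGB image. The paper uses a generative model;
    here we extrude a block-averaged intensity map into pseudo-heights,
    purely for illustration."""
    gray = rgb.mean(axis=2)
    bh, bw = gray.shape[0] // grid, gray.shape[1] // grid
    coarse = gray[:bh * grid, :bw * grid].reshape(grid, bh, grid, bw).mean(axis=(1, 3))
    heights = (coarse / 255.0 * grid).astype(int)   # pseudo food height per cell
    vox = np.zeros((grid, grid, grid), dtype=bool)
    for i in range(grid):
        for j in range(grid):
            vox[i, j, :heights[i, j]] = True        # fill the column up to its height
    return vox

def estimate_energy(vox: np.ndarray, kcal_per_voxel: float = 0.02) -> float:
    """Map reconstructed volume to energy with a flat density (an assumption;
    the paper regresses energy with a learned network instead)."""
    return float(vox.sum()) * kcal_per_voxel

rgb = np.full((64, 64, 3), 128, dtype=np.uint8)     # dummy uniform gray image
print(round(estimate_energy(reconstruct_voxels(rgb)), 2))
```

The point of the sketch is the data flow (image → occupancy grid → scalar energy); the learned components replace both hand-written functions in the actual framework.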
Related papers
- Vision-Based Approach for Food Weight Estimation from 2D Images [0.9208007322096533]
The study employs a dataset of 2380 images comprising fourteen different food types in various portions, orientations, and containers.
The proposed methodology integrates deep learning and computer vision techniques, specifically employing Faster R-CNN for food detection and MobileNetV3 for weight estimation.
arXiv Detail & Related papers (2024-05-26T08:03:51Z)
- NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reporting methods are often inaccurate and suffer from substantial bias.
Recent work has explored using computer vision prediction systems to predict nutritional information from food images.
This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z)
- How Much You Ate? Food Portion Estimation on Spoons [63.611551981684244]
Current image-based food portion estimation algorithms assume that users take images of their meals one or two times.
We introduce an innovative solution that utilizes stationary user-facing cameras to track food items on utensils.
The system is reliable for estimation of nutritional content of liquid-solid heterogeneous mixtures such as soups and stews.
arXiv Detail & Related papers (2024-05-12T00:16:02Z)
- NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation [68.49526750115429]
We introduce NutritionVerse-Real, an open access manually collected 2D food scene dataset for dietary intake estimation.
The NutritionVerse-Real dataset was created by manually collecting images of food scenes in real life, measuring the weight of every ingredient and computing the associated dietary content of each dish.
arXiv Detail & Related papers (2023-11-20T11:05:20Z)
- DPF-Nutrition: Food Nutrition Estimation via Depth Prediction and Fusion [0.8579795118452238]
DPF-Nutrition is an end-to-end nutrition estimation method using monocular images.
In DPF-Nutrition, we introduced a depth prediction module to generate depth maps, thereby improving the accuracy of food portion estimation.
We also designed an RGB-D fusion module that combined monocular images with the predicted depth information, resulting in better performance for nutrition estimation.
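The fusion step described here is essentially combining the image with its predicted depth map into one input tensor. A minimal sketch under that assumption (the channel-stacking and normalisation below are illustrative, not DPF-Nutrition's actual fusion architecture):

```python
import numpy as np

def fuse_rgbd(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Illustrative RGB-D fusion: normalise the predicted depth map to
    [0, 1] and stack it as a fourth channel, yielding an H x W x 4 array
    a downstream nutrition regressor could consume."""
    d = (depth - depth.min()) / (np.ptp(depth) + 1e-8)
    return np.concatenate([rgb.astype(np.float32) / 255.0, d[..., None]], axis=2)

rgb = np.zeros((8, 8, 3), dtype=np.uint8)           # dummy image
depth = np.linspace(0.0, 1.0, 64).reshape(8, 8)     # dummy predicted depth
fused = fuse_rgbd(rgb, depth)
print(fused.shape)
```

Concatenation at the input is the simplest fusion choice; learned fusion modules typically merge RGB and depth features deeper in the network instead.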
arXiv Detail & Related papers (2023-10-18T04:23:05Z)
- NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating.
Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images.
We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information.
We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
- Image Based Food Energy Estimation With Depth Domain Adaptation [6.602838826255494]
We propose an "Energy Density Map" which is a pixel-to-pixel mapping from the RGB image to the energy density of the food.
We then incorporate the "Energy Density Map" with an associated depth map that is captured by a depth sensor to estimate the food energy.
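The two-step idea above — a pixel-to-pixel energy density map weighted by a depth map — amounts to an integration over the image. The shapes, units, and dummy maps below are made up for illustration:

```python
import numpy as np

def energy_from_maps(energy_density: np.ndarray, depth: np.ndarray,
                     kcal_scale: float = 1.0) -> float:
    """Combine a per-pixel energy density map (illustrative kCal-per-unit-
    volume units) with a depth map: each pixel contributes its density
    weighted by the food thickness implied by depth."""
    assert energy_density.shape == depth.shape
    return float((energy_density * depth).sum()) * kcal_scale

density = np.zeros((4, 4))
density[1:3, 1:3] = 0.5                 # food occupies the central 2x2 pixels
depth = np.full((4, 4), 2.0)            # uniform food thickness
print(energy_from_maps(density, depth))
```

In the actual method both maps come from learned models and a depth sensor respectively; the sketch only shows how the two are combined.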
arXiv Detail & Related papers (2022-08-25T15:18:48Z)
- Towards the Creation of a Nutrition and Food Group Based Image Database [58.429385707376554]
We propose a framework to create a nutrition and food group based image database.
We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS).
Our proposed method is used to build a nutrition and food group based image database including 16,114 food datasets.
arXiv Detail & Related papers (2022-06-05T02:41:44Z)
- Vision-Based Food Analysis for Automatic Dietary Assessment [49.32348549508578]
This review presents one unified Vision-Based Dietary Assessment (VBDA) framework, which generally consists of three stages: food image analysis, volume estimation and nutrient derivation.
Deep learning makes VBDA gradually move to an end-to-end implementation, which applies food images to a single network to directly estimate the nutrition.
arXiv Detail & Related papers (2021-08-06T05:46:01Z)
- Towards Learning Food Portion From Monocular Images With Cross-Domain Feature Adaptation [6.648441500207032]
We propose a deep regression process for portion size estimation by combining features estimated from both RGB and learned energy distribution domains.
Our estimates of food energy achieve state-of-the-art performance with a MAPE of 11.47%, significantly outperforming non-expert human estimates by 27.56%.
arXiv Detail & Related papers (2021-03-12T22:58:37Z)
- An End-to-End Food Image Analysis System [8.622335099019214]
We propose an image-based food analysis framework that integrates food localization, classification and portion size estimation.
Our proposed framework is end-to-end, i.e., the input can be an arbitrary food image containing multiple food items.
Our framework is evaluated on a real life food image dataset collected from a nutrition feeding study.
arXiv Detail & Related papers (2021-02-01T05:36:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.