Related papers: Food Portion Estimation: From Pixels to Calories

URL: http://arxiv.org/abs/2602.05078v1
Date: Wed, 04 Feb 2026 21:53:21 GMT
Title: Food Portion Estimation: From Pixels to Calories
Authors: Gautham Vinod, Fengqing Zhu
Abstract summary: Image-based dietary assessment suffers from the difficulty of estimating the three-dimensional size of food from 2D image inputs. Deep learning also helps bridge the gap by using either monocular images or combinations of the image and auxiliary inputs to precisely predict the output portion from the image input.
Score: 9.670264791361605
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reliance on images for dietary assessment is an important strategy to accurately and conveniently monitor an individual's health, making it a vital mechanism in the prevention and care of chronic diseases and obesity. However, image-based dietary assessment suffers from the difficulty of estimating the three-dimensional size of food from 2D image inputs. Many strategies have been devised to overcome this critical limitation, such as the use of auxiliary inputs like depth maps, multi-view inputs, or model-based approaches such as template matching. Deep learning also helps bridge the gap by using either monocular images or combinations of the image and auxiliary inputs to precisely predict the output portion from the image input. In this paper, we explore the different strategies employed for accurate portion estimation.

Related papers

Vision-Based Approach for Food Weight Estimation from 2D Images [0.9208007322096533]
The study employs a dataset of 2380 images comprising fourteen different food types in various portions, orientations, and containers. The proposed methodology integrates deep learning and computer vision techniques, specifically employing Faster R-CNN for food detection and MobileNetV3 for weight estimation.
arXiv Detail & Related papers (2024-05-26T08:03:51Z)
NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reporting methods are often inaccurate and suffer from substantial bias. Recent work has explored using computer vision prediction systems to predict nutritional information from food images. This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z)
How Much You Ate? Food Portion Estimation on Spoons [63.611551981684244]
Current image-based food portion estimation algorithms assume that users take images of their meals one or two times. We introduce an innovative solution that utilizes stationary user-facing cameras to track food items on utensils. The system is reliable for estimation of nutritional content of liquid-solid heterogeneous mixtures such as soups and stews.
arXiv Detail & Related papers (2024-05-12T00:16:02Z)
NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation [68.49526750115429]
We introduce NutritionVerse-Real, an open access manually collected 2D food scene dataset for dietary intake estimation. The NutritionVerse-Real dataset was created by manually collecting images of food scenes in real life, measuring the weight of every ingredient and computing the associated dietary content of each dish.
arXiv Detail & Related papers (2023-11-20T11:05:20Z)
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images. We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information. We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
An End-to-end Food Portion Estimation Framework Based on Shape Reconstruction from Monocular Image [7.380382380564532]
We propose an end-to-end deep learning framework for food energy estimation from a monocular image through 3D shape reconstruction. Our method is evaluated on the publicly available food image dataset Nutrition5k, resulting in a Mean Absolute Error (MAE) of 40.05 kCal and a Mean Absolute Percentage Error (MAPE) of 11.47% for food energy estimation.
arXiv Detail & Related papers (2023-08-03T15:17:24Z)
Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food. One challenge is that food items can overlap and mix, making them difficult to distinguish. Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT). The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
Saliency-Aware Class-Agnostic Food Image Segmentation [10.664526852464812]
We propose a class-agnostic food image segmentation method. Using information from both the before-eating and after-eating images, we can segment food images by finding the salient missing objects. Our method is validated on food images collected from a dietary study.
arXiv Detail & Related papers (2021-02-13T08:05:19Z)
An End-to-End Food Image Analysis System [8.622335099019214]
We propose an image-based food analysis framework that integrates food localization, classification, and portion size estimation. Our proposed framework is end-to-end, i.e., the input can be an arbitrary food image containing multiple food items. Our framework is evaluated on a real-life food image dataset collected from a nutrition feeding study.
arXiv Detail & Related papers (2021-02-01T05:36:20Z)
Multi-Task Image-Based Dietary Assessment for Food Recognition and Portion Size Estimation [6.603050343996914]
We propose an end-to-end multi-task framework that can achieve both food classification and food portion size estimation. Our results outperform the baseline methods in both classification accuracy and mean absolute error for portion estimation.
arXiv Detail & Related papers (2020-04-27T21:35:07Z)
Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another. We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities. We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
arXiv Detail & Related papers (2020-03-09T07:41:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.