NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches
- URL: http://arxiv.org/abs/2309.07704v2
- Date: Mon, 2 Sep 2024 02:08:22 GMT
- Title: NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches
- Authors: Chi-en Amy Tai, Matthew Keller, Saeejith Nair, Yuhao Chen, Yifan Wu, Olivia Markham, Krish Parmar, Pengcheng Xi, Heather Keller, Sharon Kirkpatrick, Alexander Wong,
- Abstract summary: Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating.
Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images.
We introduce NutritionVerse- Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information.
We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
- Score: 59.38343165508926
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating, as malnutrition has been directly linked to decreased quality of life. However self-reporting methods such as food diaries suffer from substantial bias. Other conventional dietary assessment techniques and emerging alternative approaches such as mobile applications incur high time costs and may necessitate trained personnel. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images, but the lack of comprehensive datasets with diverse viewpoints, modalities and food annotations hinders the accuracy and realism of such methods. To address this limitation, we introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 photorealistic synthetic 2D food images with associated dietary information and multimodal annotations (including depth images, instance masks, and semantic masks). Additionally, we collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism. Leveraging these novel datasets, we develop and benchmark NutritionVerse, an empirical study of various dietary intake estimation approaches, including indirect segmentation-based and direct prediction networks. We further fine-tune models pretrained on synthetic data with real images to provide insights into the fusion of synthetic and real data. Finally, we release both datasets (NutritionVerse-Synth, NutritionVerse-Real) on https://www.kaggle.com/nutritionverse/datasets as part of an open initiative to accelerate machine learning for dietary sensing.
Related papers
- NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reporting methods are often inaccurate and suffer from substantial bias.
Recent work has explored using computer vision prediction systems to predict nutritional information from food images.
This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z) - NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene
Dataset for Dietary Intake Estimation [68.49526750115429]
We introduce NutritionVerse-Real, an open access manually collected 2D food scene dataset for dietary intake estimation.
The NutritionVerse-Real dataset was created by manually collecting images of food scenes in real life, measuring the weight of every ingredient and computing the associated dietary content of each dish.
arXiv Detail & Related papers (2023-11-20T11:05:20Z) - Personalized Food Image Classification: Benchmark Datasets and New
Baseline [8.019925729254178]
We propose a new framework for personalized food image classification by leveraging self-supervised learning and temporal image feature information.
Our method is evaluated on both benchmark datasets and shows improved performance compared to existing works.
arXiv Detail & Related papers (2023-09-15T20:11:07Z) - NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake
Estimation [65.47310907481042]
One in four older adults are malnourished.
Machine learning and computer vision show promise of automated nutrition tracking methods of food.
NutritionVerse-3D is a large-scale high-resolution dataset of 105 3D food models.
arXiv Detail & Related papers (2023-04-12T05:27:30Z) - Towards the Creation of a Nutrition and Food Group Based Image Database [58.429385707376554]
We propose a framework to create a nutrition and food group based image database.
We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS)
Our proposed method is used to build a nutrition and food group based image database including 16,114 food datasets.
arXiv Detail & Related papers (2022-06-05T02:41:44Z) - Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food [8.597152169571057]
We introduce Nutrition5k, a novel dataset of 5k diverse, real world food dishes with corresponding video streams, depth images, component weights, and high accuracy nutritional content annotation.
We demonstrate the potential of this dataset by training a computer vision algorithm capable of predicting the caloric and macronutrient values of a complex, real world dish at an accuracy that outperforms professional nutritionists.
arXiv Detail & Related papers (2021-03-04T22:59:22Z) - An Artificial Intelligence-Based System to Assess Nutrient Intake for
Hospitalised Patients [4.048427587958764]
Regular monitoring of nutrient intake in hospitalised patients plays a critical role in reducing the risk of disease-related malnutrition.
We propose a novel system based on artificial intelligence (AI) to accurately estimate nutrient intake.
arXiv Detail & Related papers (2020-03-18T15:28:51Z) - Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images
and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another.
We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities.
We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
arXiv Detail & Related papers (2020-03-09T07:41:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.