Related papers: UMDFood: Vision-language models boost food composition compilation

URL: http://arxiv.org/abs/2306.01747v2
Date: Tue, 7 Nov 2023 02:09:18 GMT
Title: UMDFood: Vision-language models boost food composition compilation
Authors: Peihua Ma, Yixin Wu, Ning Yu, Yang Zhang, Michael Backes, Qin Wang, Cheng-I Wei
Abstract summary: We propose a novel vision-language model, UMDFood-VL, using front-of-package labeling and product images to accurately estimate food composition profiles. For up to 82.2% of selected products, the error between chemical analysis results and model estimates is less than 10%. This performance sheds light on generalization towards other food and nutrition-related data compilation and catalyzation.
Score: 26.5694236976957
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Nutrition information is crucial in precision nutrition and the food industry. The current food composition compilation paradigm relies on laborious and experience-dependent methods. However, these methods struggle to keep up with the dynamic consumer market, resulting in delayed and incomplete nutrition data. In addition, earlier machine learning methods overlook the information in food ingredient statements or ignore the features of food images. To this end, we propose a novel vision-language model, UMDFood-VL, using front-of-package labeling and product images to accurately estimate food composition profiles. In order to empower model training, we established UMDFood-90k, the most comprehensive multimodal food database to date, containing 89,533 samples, each labeled with image and text-based ingredient descriptions and 11 nutrient annotations. UMDFood-VL achieves a macro-AUCROC of up to 0.921 for fat content estimation, which is significantly higher than existing baseline methods and satisfies the practical requirements of food composition compilation. Meanwhile, for up to 82.2% of selected products, the error between chemical analysis results and model estimates is less than 10%. This performance sheds light on generalization towards other food and nutrition-related data compilation and catalyzation for the evolution of generative AI-based technology in other food applications that require personalization.

Related papers

Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion [69.84988999191343]
We introduce FastFood, a dataset with 84,446 images across 908 fast food categories, featuring ingredient and nutritional annotations. We propose a new model-agnostic Visual-Ingredient Feature Fusion (VIF$^2$) method to enhance nutrition estimation.
arXiv Detail & Related papers (2025-05-13T17:01:21Z)
Personalized Class Incremental Context-Aware Food Classification for Food Intake Monitoring Systems [3.8767314375943918]
Existing class-incremental food classification models have low accuracy for the new classes and lack personalization. This paper introduces a personalized, class-incremental food classification model designed to overcome these challenges. Our approach adapts itself to the new array of food classes, maintaining applicability and accuracy, both for new and existing classes by using personalization.
arXiv Detail & Related papers (2025-03-09T14:50:56Z)
NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reporting methods are often inaccurate and suffer from substantial bias. Recent work has explored using computer vision prediction systems to predict nutritional information from food images. This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z)
NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation [68.49526750115429]
We introduce NutritionVerse-Real, an open access manually collected 2D food scene dataset for dietary intake estimation. The NutritionVerse-Real dataset was created by manually collecting images of food scenes in real life, measuring the weight of every ingredient and computing the associated dietary content of each dish.
arXiv Detail & Related papers (2023-11-20T11:05:20Z)
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images. We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information. We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
Muti-Stage Hierarchical Food Classification [9.013592803864086]
We propose a multi-stage hierarchical framework for food item classification by iteratively clustering and merging food items during the training process. Our method is evaluated on the VFN-nutrient dataset and achieves promising results compared with existing work in terms of both food type and food item classification.
arXiv Detail & Related papers (2023-09-03T04:45:44Z)
NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake Estimation [65.47310907481042]
One in four older adults is malnourished. Machine learning and computer vision show promise for automated food nutrition tracking. NutritionVerse-3D is a large-scale high-resolution dataset of 105 3D food models.
arXiv Detail & Related papers (2023-04-12T05:27:30Z)
Towards the Creation of a Nutrition and Food Group Based Image Database [58.429385707376554]
We propose a framework to create a nutrition and food group based image database. We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS). Our proposed method is used to build a nutrition and food group based image database including 16,114 food datasets.
arXiv Detail & Related papers (2022-06-05T02:41:44Z)
Towards Building a Food Knowledge Graph for Internet of Food [66.57235827087092]
We review the evolution of food knowledge organization, from food classification to food ontology to food knowledge graphs. Food knowledge graphs play an important role in food search and Question Answering (QA), personalized dietary recommendation, food analysis and visualization. Future directions for food knowledge graphs cover several fields such as multimodal food knowledge graphs and food intelligence.
arXiv Detail & Related papers (2021-07-13T06:26:53Z)
Picture-to-Amount (PITA): Predicting Relative Ingredient Amounts from Food Images [24.26111169033236]
We study the novel and challenging problem of predicting the relative amount of each ingredient from a food image. We propose PITA, the Picture-to-Amount deep learning architecture to solve the problem. Experiments on a dataset of recipes collected from the Internet show the model generates promising results.
arXiv Detail & Related papers (2020-10-17T06:43:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.