MetaFood3D: Large 3D Food Object Dataset with Nutrition Values
- URL: http://arxiv.org/abs/2409.01966v1
- Date: Tue, 3 Sep 2024 15:02:52 GMT
- Authors: Yuhao Chen, Jiangpeng He, Chris Czarnecki, Gautham Vinod, Talha Ibn Mahmud, Siddeshwar Raghavan, Jinge Ma, Dayou Mao, Saeejith Nair, Pengcheng Xi, Alexander Wong, Edward Delp, Fengqing Zhu
- Abstract summary: This dataset consists of 637 meticulously labeled 3D food objects across 108 categories, featuring detailed nutrition information, weight, and food codes linked to a comprehensive nutrition database.
Experimental results demonstrate our dataset's significant potential for improving algorithm performance, highlight the challenging gap between video captures and 3D scanned data, and show the strength of the MetaFood3D dataset in high-quality data generation, simulation, and augmentation.
- Score: 53.24500333363066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Food computing is both important and challenging in computer vision (CV). It significantly contributes to the development of CV algorithms due to its frequent presence in datasets across various applications, ranging from classification and instance segmentation to 3D reconstruction. The polymorphic shapes and textures of food, coupled with high variation in forms and vast multimodal information, including language descriptions and nutritional data, make food computing a complex and demanding task for modern CV algorithms. 3D food modeling is a new frontier for addressing food-related problems, due to its inherent capability to deal with random camera views and its straightforward representation for calculating food portion size. However, the primary hurdle in the development of algorithms for food object analysis is the lack of nutrition values in existing 3D datasets. Moreover, in the broader field of 3D research, there is a critical need for domain-specific test datasets. To bridge the gap between general 3D vision and food computing research, we propose MetaFood3D. This dataset consists of 637 meticulously labeled 3D food objects across 108 categories, featuring detailed nutrition information, weight, and food codes linked to a comprehensive nutrition database. The dataset emphasizes intra-class diversity and includes rich modalities such as textured mesh files, RGB-D videos, and segmentation masks. Experimental results demonstrate our dataset's significant potential for improving algorithm performance, highlight the challenging gap between video captures and 3D scanned data, and show the strength of the MetaFood3D dataset in high-quality data generation, simulation, and augmentation.
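The abstract notes that a 3D representation makes food portion-size calculation straightforward. As a hedged illustration (a minimal sketch, not code from the MetaFood3D toolchain), the volume enclosed by a watertight triangle mesh can be computed by summing the signed volumes of tetrahedra formed by each face and the origin; the vertices and faces below are an illustrative unit cube, not MetaFood3D data:

```python
# Sketch: volume of a closed, consistently wound triangle mesh via the
# signed-tetrahedron (divergence theorem) method.

def mesh_volume(vertices, faces):
    """Sum signed volumes of tetrahedra formed by each triangle and the origin."""
    total = 0.0
    for i, j, k in faces:
        (ax, ay, az), (bx, by, bz), (cx, cy, cz) = vertices[i], vertices[j], vertices[k]
        # Scalar triple product a . (b x c) equals 6x the signed tetrahedron volume
        total += (ax * (by * cz - bz * cy)
                  + ay * (bz * cx - bx * cz)
                  + az * (bx * cy - by * cx))
    return abs(total) / 6.0

# Unit cube: 8 vertices, 12 outward-wound triangles; expected volume 1.0
verts = [(0,0,0),(1,0,0),(1,1,0),(0,1,0),(0,0,1),(1,0,1),(1,1,1),(0,1,1)]
tris = [(0,2,1),(0,3,2),(4,5,6),(4,6,7),(0,1,5),(0,5,4),
        (1,2,6),(1,6,5),(2,3,7),(2,7,6),(3,0,4),(3,4,7)]
print(mesh_volume(verts, tris))  # → 1.0
```

Given a food mesh in real-world units, the resulting volume can be combined with a density estimate to approximate portion weight, which is the link to nutrition values the dataset targets.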
Related papers
- MetaFood CVPR 2024 Challenge on Physically Informed 3D Food Reconstruction: Methods and Results [52.07174491056479]
We host the MetaFood Workshop and its challenge for Physically Informed 3D Food Reconstruction.
This challenge focuses on reconstructing volume-accurate 3D models of food items from 2D images, using a visible checkerboard as a size reference.
The solutions developed in this challenge achieved promising results in 3D food reconstruction, with significant potential for improving portion estimation for dietary assessment and nutritional monitoring.
arXiv Detail & Related papers (2024-07-12T14:15:48Z)
- NutritionVerse-Synth: An Open Access Synthetically Generated 2D Food Scene Dataset for Dietary Intake Estimation [71.22646949733833]
We introduce NutritionVerse-Synth (NV-Synth), a large-scale synthetic food image dataset.
NV-Synth contains 84,984 photorealistic meal images rendered from 7,082 dynamically plated 3D scenes.
As the largest open-source synthetic food dataset, NV-Synth highlights the value of physics-based simulations.
arXiv Detail & Related papers (2023-12-11T08:15:49Z)
- Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared: one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
- NutritionVerse-3D: A 3D Food Model Dataset for Nutritional Intake Estimation [65.47310907481042]
One in four older adults is malnourished.
Machine learning and computer vision show promise for automated food nutrition tracking.
NutritionVerse-3D is a large-scale high-resolution dataset of 105 3D food models.
arXiv Detail & Related papers (2023-04-12T05:27:30Z)
- Deep Learning based Food Instance Segmentation using Synthetic Data [0.0]
This paper proposes a food segmentation method applicable to real-world scenes through synthetic data.
To perform food segmentation on healthcare robot systems, we generate synthetic data using the open-source 3D graphics software Blender.
As a result, on our real-world dataset, a model trained only on synthetic data can segment previously unseen food instances with 52.2% mask AP@all.
arXiv Detail & Related papers (2021-07-15T08:36:54Z)
- Towards Building a Food Knowledge Graph for Internet of Food [66.57235827087092]
We review the evolution of food knowledge organization, from food classification and food ontology to food knowledge graphs.
Food knowledge graphs play an important role in food search and Question Answering (QA), personalized dietary recommendation, food analysis and visualization.
Future directions for food knowledge graphs cover several fields such as multimodal food knowledge graphs and food intelligence.
arXiv Detail & Related papers (2021-07-13T06:26:53Z)
- Visual Aware Hierarchy Based Food Recognition [10.194167945992938]
We propose a new two-step food recognition system using Convolutional Neural Networks (CNNs) as the backbone architecture.
The food localization step is based on an implementation of the Faster R-CNN method to identify food regions.
In the food classification step, visually similar food categories can be clustered together automatically to generate a hierarchical structure.
arXiv Detail & Related papers (2020-12-06T20:25:31Z)
- Are We Hungry for 3D LiDAR Data for Semantic Segmentation? A Survey and Experimental Study [5.6780397318769245]
3D semantic segmentation is a fundamental task for robotic and autonomous driving applications.
Recent works have focused on deep learning techniques, while developing finely annotated 3D LiDAR datasets remains extremely labor-intensive.
The performance limitation caused by insufficient datasets is called the data hunger problem.
arXiv Detail & Related papers (2020-06-08T01:20:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.