Multi-Task Image-Based Dietary Assessment for Food Recognition and
Portion Size Estimation
- URL: http://arxiv.org/abs/2004.13188v1
- Date: Mon, 27 Apr 2020 21:35:07 GMT
- Title: Multi-Task Image-Based Dietary Assessment for Food Recognition and
Portion Size Estimation
- Authors: Jiangpeng He, Zeman Shao, Janine Wright, Deborah Kerr, Carol Boushey
and Fengqing Zhu
- Abstract summary: We propose an end-to-end multi-task framework that can achieve both food classification and food portion size estimation.
Our results outperform the baseline methods in both classification accuracy and mean absolute error for portion estimation.
- Score: 6.603050343996914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning based methods have achieved impressive results in many
applications for image-based diet assessment such as food classification and
food portion size estimation. However, existing methods only focus on one task
at a time, making it difficult to apply in real life when multiple tasks need
to be processed together. In this work, we propose an end-to-end multi-task
framework that can achieve both food classification and food portion size
estimation. We introduce a food image dataset collected from a nutrition study
where the ground-truth food portion is provided by registered dietitians. The
multi-task learning uses L2-norm based soft parameter sharing to train the
classification and regression tasks simultaneously. We also propose the use of
cross-domain feature adaptation together with normalization to further improve
the performance of food portion size estimation. Our results outperform the
baseline methods in both classification accuracy and mean absolute error for
portion estimation, which shows great potential for advancing the field of
image-based dietary assessment.
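The abstract's L2-norm soft parameter sharing can be illustrated with a short sketch. This is a minimal, hypothetical rendering of the idea, not the authors' implementation: the classification and regression branches keep separate weights, and an L2 penalty on the distance between corresponding layer weights keeps the two tasks close. The weighting `lam` and the function names are assumptions for illustration.

```python
import numpy as np

def soft_sharing_penalty(params_cls, params_reg, lam=0.01):
    """L2-norm soft parameter sharing: penalize the squared distance
    between corresponding layer weights of the classification and
    regression branches, so they stay similar without being tied."""
    penalty = 0.0
    for w_c, w_r in zip(params_cls, params_reg):
        penalty += np.sum((w_c - w_r) ** 2)
    return lam * penalty

def multitask_loss(cls_loss, reg_loss, params_cls, params_reg, lam=0.01):
    # Joint objective: classification loss + portion-regression loss
    # + the soft-sharing regularizer (lam is a hypothetical weight).
    return cls_loss + reg_loss + soft_sharing_penalty(params_cls, params_reg, lam)
```

Training both heads against this joint objective is what lets the two tasks be learned simultaneously rather than one at a time.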
Related papers
- From Canteen Food to Daily Meals: Generalizing Food Recognition to More
Practical Scenarios [92.58097090916166]
We present two new benchmarks, namely DailyFood-172 and DailyFood-16, designed to curate food images from everyday meals.
These two datasets are used to evaluate the transferability of approaches from the well-curated food image domain to the everyday-life food image domain.
arXiv Detail & Related papers (2024-03-12T08:32:23Z) - FoodLMM: A Versatile Food Assistant using Large Multi-modal Model [96.76271649854542]
Large Multi-modal Models (LMMs) have made impressive progress in many vision-language tasks.
This paper proposes FoodLMM, a versatile food assistant based on LMMs with various capabilities.
We introduce a series of novel task-specific tokens and heads, enabling the model to predict food nutritional values and multiple segmentation masks.
arXiv Detail & Related papers (2023-12-22T11:56:22Z) - NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating.
Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images.
We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information.
We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z) - Single-Stage Heavy-Tailed Food Classification [7.800379384628357]
We introduce a novel single-stage heavy-tailed food classification framework.
Our method is evaluated on two heavy-tailed food benchmark datasets, Food101-LT and VFN-LT.
arXiv Detail & Related papers (2023-07-01T00:45:35Z) - Transferring Knowledge for Food Image Segmentation using Transformers
and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional representation for Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z) - Long-tailed Food Classification [5.874935571318868]
We introduce two new benchmark datasets for long-tailed food classification including Food101-LT and VFN-LT.
We propose a novel 2-Phase framework to address the class-imbalance problem by undersampling the head classes to remove redundant samples while maintaining the learned information through knowledge distillation.
We show the effectiveness of our method by comparing with existing state-of-the-art long-tailed classification methods and show improved performance on both Food101-LT and VFN-LT benchmarks.
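The head-class undersampling in the 2-Phase framework above can be sketched as follows. This is a simplified illustration, not the paper's method: each class is capped at a fixed per-class budget, and the knowledge-distillation step that preserves the discarded information is not shown. The function name and `max_per_class` parameter are assumptions.

```python
import random
from collections import defaultdict

def undersample_head(samples, labels, max_per_class, seed=0):
    """Cap every class at max_per_class samples, leaving tail classes
    (those already at or below the cap) untouched."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    kept_s, kept_y = [], []
    for y, items in by_class.items():
        if len(items) > max_per_class:
            items = rng.sample(items, max_per_class)  # random subset of the head class
        kept_s.extend(items)
        kept_y.extend([y] * len(items))
    return kept_s, kept_y
```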
arXiv Detail & Related papers (2022-10-26T14:29:30Z) - Towards the Creation of a Nutrition and Food Group Based Image Database [58.429385707376554]
We propose a framework to create a nutrition and food group based image database.
We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS).
Our proposed method is used to build a nutrition and food group based image database including 16,114 food images.
arXiv Detail & Related papers (2022-06-05T02:41:44Z) - Improving Dietary Assessment Via Integrated Hierarchy Food
Classification [7.398060062678395]
We introduce a new food classification framework to improve the quality of predictions by integrating the information from multiple domains.
Our method is validated on the modified VIPER-FoodNet (VFN) food image dataset by including associated energy and nutrient information.
arXiv Detail & Related papers (2021-09-06T20:59:58Z) - Saliency-Aware Class-Agnostic Food Image Segmentation [10.664526852464812]
We propose a class-agnostic food image segmentation method.
Using information from both the before and after eating images, we can segment food images by finding the salient missing objects.
Our method is validated on food images collected from a dietary study.
arXiv Detail & Related papers (2021-02-13T08:05:19Z) - An End-to-End Food Image Analysis System [8.622335099019214]
We propose an image-based food analysis framework that integrates food localization, classification and portion size estimation.
Our proposed framework is end-to-end, i.e., the input can be an arbitrary food image containing multiple food items.
Our framework is evaluated on a real life food image dataset collected from a nutrition feeding study.
arXiv Detail & Related papers (2021-02-01T05:36:20Z) - Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images
and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another.
We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities by aligning their output semantic probabilities.
We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
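The semantic-probability alignment described for SCAN can be illustrated with a toy sketch. This is an assumed rendering of the idea rather than the paper's exact loss: the image and recipe branches each emit class logits, and a symmetric cross-entropy-style penalty pulls their probability distributions toward agreement.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def semantic_consistency_loss(img_logits, rec_logits):
    """Hypothetical sketch: symmetric cross-entropy between the
    class-probability outputs of the image and recipe branches.
    The loss is minimized when the two distributions agree."""
    p, q = softmax(img_logits), softmax(rec_logits)
    eps = 1e-12  # guard against log(0)
    return -0.5 * (np.sum(p * np.log(q + eps)) + np.sum(q * np.log(p + eps)))
```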
arXiv Detail & Related papers (2020-03-09T07:41:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.