Visual Aware Hierarchy Based Food Recognition
- URL: http://arxiv.org/abs/2012.03368v1
- Date: Sun, 6 Dec 2020 20:25:31 GMT
- Title: Visual Aware Hierarchy Based Food Recognition
- Authors: Runyu Mao, Jiangpeng He, Zeman Shao, Sri Kalyan Yarlagadda, Fengqing Zhu
- Abstract summary: We propose a new two-step food recognition system using Convolutional Neural Networks (CNNs) as the backbone architecture.
The food localization step is based on an implementation of the Faster R-CNN method to identify food regions.
In the food classification step, visually similar food categories can be clustered together automatically to generate a hierarchical structure.
- Score: 10.194167945992938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Food recognition is one of the most important components in image-based
dietary assessment. However, because food images vary in complexity and food
categories exhibit inter-class similarity, it is challenging for an
image-based food recognition system to achieve high accuracy for a variety of
publicly available datasets. In this work, we propose a new two-step food
recognition system that includes food localization and hierarchical food
classification using Convolutional Neural Networks (CNNs) as the backbone
architecture. The food localization step is based on an implementation of the
Faster R-CNN method to identify food regions. In the food classification step,
visually similar food categories can be clustered together automatically to
generate a hierarchical structure that represents the semantic visual relations
among food categories. A multi-task CNN model is then proposed to perform the
classification task based on this visual-aware hierarchical structure. Because
the size and quality of a dataset are key to data-driven methods, we introduce
a new food image dataset, the VIPER-FoodNet (VFN) dataset, which consists of 82
food categories with 15k images based on the most commonly consumed foods in
the United States. A semi-automatic crowdsourcing tool is used to provide the
ground-truth information for this dataset including food object bounding boxes
and food object labels. Experimental results demonstrate that our system can
significantly improve both classification and recognition performance on four
publicly available datasets and the new VFN dataset.
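The idea of grouping visually similar categories into a hierarchy can be illustrated with a small sketch. The abstract does not state which clustering algorithm or features are used, so everything below is an assumption made for illustration: the feature vectors are invented toy values standing in for per-category CNN features, the clustering is plain average-linkage agglomerative merging on cosine similarity, and the names `TOY_FEATURES`, `cluster_categories`, and `two_level_labels` are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Toy per-category feature vectors (invented values, NOT from the paper).
# In practice these might be averaged CNN embeddings per food category.
TOY_FEATURES = {
    "burger": [0.9, 0.1, 0.0],
    "sandwich": [0.8, 0.2, 0.1],
    "salad": [0.1, 0.9, 0.2],
    "coleslaw": [0.2, 0.8, 0.3],
    "soup": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_categories(features, threshold):
    """Average-linkage agglomerative clustering (an assumed stand-in method).

    Repeatedly merges the two most similar clusters until no pair has an
    average pairwise cosine similarity of at least `threshold`.
    Returns a sorted list of sorted category-name lists.
    """
    clusters = [[name] for name in sorted(features)]

    def linkage(c1, c2):
        sims = [cosine_similarity(features[a], features[b]) for a in c1 for b in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[i], clusters[j]) < threshold:
            break  # remaining clusters are not similar enough to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    return sorted(sorted(c) for c in clusters)


def two_level_labels(clusters):
    """Map each category to a (cluster index, category) pair.

    A multi-task classifier could use the first element as a coarse target
    and the second as a fine-grained target.
    """
    return {name: (cid, name) for cid, members in enumerate(clusters) for name in members}


if __name__ == "__main__":
    groups = cluster_categories(TOY_FEATURES, threshold=0.9)
    print(groups)
    print(two_level_labels(groups))
```

With these toy values and a threshold of 0.9, the burger and sandwich vectors end up in one cluster, salad and coleslaw in another, and soup stays alone. The threshold is an arbitrary choice for the example; a real system would need to tune it or use a method that selects the number of clusters automatically.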
Related papers
- Personalized Food Image Classification: Benchmark Datasets and New Baseline [8.019925729254178]
We propose a new framework for personalized food image classification by leveraging self-supervised learning and temporal image feature information.
Our method is evaluated on both benchmark datasets and shows improved performance compared to existing works.
arXiv Detail & Related papers (2023-09-15T20:11:07Z)
- Muti-Stage Hierarchical Food Classification [9.013592803864086]
We propose a multi-stage hierarchical framework for food item classification by iteratively clustering and merging food items during the training process.
Our method is evaluated on the VFN-nutrient dataset and achieves promising results compared with existing work in terms of both food type and food item classification.
arXiv Detail & Related papers (2023-09-03T04:45:44Z)
- Diffusion Model with Clustering-based Conditioning for Food Image Generation [22.154182296023404]
Deep learning-based techniques are commonly used to perform image analysis such as food classification, segmentation, and portion size estimation.
One potential solution is to use synthetic food images for data augmentation.
In this paper, we propose an effective clustering-based training framework, named ClusDiff, for generating high-quality and representative food images.
arXiv Detail & Related papers (2023-09-01T01:40:39Z)
- Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
- Towards the Creation of a Nutrition and Food Group Based Image Database [58.429385707376554]
We propose a framework to create a nutrition and food group based image database.
We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS)
Our proposed method is used to build a nutrition and food group based image database including 16,114 food datasets.
arXiv Detail & Related papers (2022-06-05T02:41:44Z)
- Learning Structural Representations for Recipe Generation and Food Retrieval [101.97397967958722]
We propose a novel framework of Structure-aware Generation Network (SGN) to tackle the food recipe generation task.
Our proposed model can produce high-quality and coherent recipes and achieves state-of-the-art performance on the benchmark Recipe1M dataset.
arXiv Detail & Related papers (2021-10-04T06:36:31Z)
- Improving Dietary Assessment Via Integrated Hierarchy Food Classification [7.398060062678395]
We introduce a new food classification framework to improve the quality of predictions by integrating the information from multiple domains.
Our method is validated on the modified VIPER-FoodNet (VFN) food image dataset by including associated energy and nutrient information.
arXiv Detail & Related papers (2021-09-06T20:59:58Z)
- A Large-Scale Benchmark for Food Image Segmentation [62.28029856051079]
We build a new food image dataset FoodSeg103 (and its extension FoodSeg154) containing 9,490 images.
We annotate these images with 154 ingredient classes and each image has an average of 6 ingredient labels and pixel-wise masks.
We propose a multi-modality pre-training approach called ReLeM that explicitly equips a segmentation model with rich and semantic food knowledge.
arXiv Detail & Related papers (2021-05-12T03:00:07Z)
- Structure-Aware Generation Network for Recipe Generation from Images [142.047662926209]
We investigate an open research task of generating cooking instructions based on only food images and ingredients.
Target recipes are long-length paragraphs and do not have annotations on structure information.
We propose a novel framework of Structure-aware Generation Network (SGN) to tackle the food recipe generation task.
arXiv Detail & Related papers (2020-09-02T10:54:25Z)
- ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network [50.7720194859196]
We introduce the dataset ISIA Food-500, with 500 categories from the list in Wikipedia and 399,726 images.
This dataset surpasses existing popular benchmark datasets by category coverage and data volume.
We propose a stacked global-local attention network, which consists of two sub-networks for food recognition.
arXiv Detail & Related papers (2020-08-13T02:48:27Z)