Related papers: Single-Stage Heavy-Tailed Food Classification

URL: http://arxiv.org/abs/2307.00182v1
Date: Sat, 1 Jul 2023 00:45:35 GMT
Title: Single-Stage Heavy-Tailed Food Classification
Authors: Jiangpeng He and Fengqing Zhu
Abstract summary: We introduce a novel single-stage heavy-tailed food classification framework. Our method is evaluated on two heavy-tailed food benchmark datasets, Food101-LT and VFN-LT.
Score: 7.800379384628357
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning based food image classification has enabled more accurate nutrition content analysis for image-based dietary assessment by predicting the types of food in eating occasion images. However, there are two major obstacles to applying food classification in real-life applications. First, real-life food images usually follow a heavy-tailed distribution, resulting in a severe class-imbalance issue. Second, it is challenging to train a single-stage (i.e., end-to-end) framework under a heavy-tailed data distribution, which causes over-prediction of head classes with rich instances and under-prediction of tail classes with rare instances. In this work, we address both issues by introducing a novel single-stage heavy-tailed food classification framework. Our method is evaluated on two heavy-tailed food benchmark datasets, Food101-LT and VFN-LT, and achieves the best performance compared to existing work, with over 5% improvement in top-1 accuracy.

Related papers

From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios [92.58097090916166]
We present two new benchmarks, namely DailyFood-172 and DailyFood-16, designed to curate food images from everyday meals. These two datasets are used to evaluate the transferability of approaches from the well-curated food image domain to the everyday-life food image domain.
arXiv Detail & Related papers (2024-03-12T08:32:23Z)
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images. We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information. We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
Diffusion Model with Clustering-based Conditioning for Food Image Generation [22.154182296023404]
Deep learning-based techniques are commonly used to perform image analysis such as food classification, segmentation, and portion size estimation. One potential solution is to use synthetic food images for data augmentation. In this paper, we propose an effective clustering-based training framework, named ClusDiff, for generating high-quality and representative food images.
arXiv Detail & Related papers (2023-09-01T01:40:39Z)
Long-Tailed Continual Learning For Visual Food Recognition [5.377869029561348]
The distribution of food images in real life is usually long-tailed, as a small number of popular food types are consumed more frequently than others. We propose a novel end-to-end framework for long-tailed continual learning, which effectively addresses catastrophic forgetting. We also introduce a novel data augmentation technique by integrating class-activation-map (CAM) and CutMix.
arXiv Detail & Related papers (2023-07-01T00:55:05Z)
Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food. One challenge is that food items can overlap and mix, making them difficult to distinguish. Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional representation for Image Transformers (BEiT). The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
Self-Supervised Visual Representation Learning on Food Images [6.602838826255494]
Existing deep learning-based methods learn the visual representation for downstream tasks based on human annotation of each food image. Most food images in real life are obtained without labels, and data annotation requires plenty of time and human effort. In this paper, we focus on the implementation and analysis of existing representative self-supervised learning methods on food images.
arXiv Detail & Related papers (2023-03-16T02:31:51Z)
Long-tailed Food Classification [5.874935571318868]
We introduce two new benchmark datasets for long-tailed food classification, Food101-LT and VFN-LT. We propose a novel 2-Phase framework to address the problem of class imbalance by undersampling the head classes to remove redundant samples while maintaining the learned information through knowledge distillation. We show the effectiveness of our method by comparing it with existing state-of-the-art long-tailed classification methods, and show improved performance on both the Food101-LT and VFN-LT benchmarks.
arXiv Detail & Related papers (2022-10-26T14:29:30Z)
Towards the Creation of a Nutrition and Food Group Based Image Database [58.429385707376554]
We propose a framework to create a nutrition and food group based image database. We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS). Our proposed method is used to build a nutrition and food group based image database including 16,114 food datasets.
arXiv Detail & Related papers (2022-06-05T02:41:44Z)
A Large-Scale Benchmark for Food Image Segmentation [62.28029856051079]
We build a new food image dataset FoodSeg103 (and its extension FoodSeg154) containing 9,490 images. We annotate these images with 154 ingredient classes and each image has an average of 6 ingredient labels and pixel-wise masks. We propose a multi-modality pre-training approach called ReLeM that explicitly equips a segmentation model with rich and semantic food knowledge.
arXiv Detail & Related papers (2021-05-12T03:00:07Z)
Multi-Task Image-Based Dietary Assessment for Food Recognition and Portion Size Estimation [6.603050343996914]
We propose an end-to-end multi-task framework that can achieve both food classification and food portion size estimation. Our results outperform the baseline methods for both classification accuracy and mean absolute error for portion estimation.
arXiv Detail & Related papers (2020-04-27T21:35:07Z)
Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another. We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities. We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
arXiv Detail & Related papers (2020-03-09T07:41:17Z)