Self-Supervised Visual Representation Learning on Food Images
- URL: http://arxiv.org/abs/2303.09046v1
- Date: Thu, 16 Mar 2023 02:31:51 GMT
- Title: Self-Supervised Visual Representation Learning on Food Images
- Authors: Andrew Peng, Jiangpeng He, Fengqing Zhu
- Abstract summary: Existing deep learning-based methods learn the visual representation for downstream tasks based on human annotation of each food image.
Most food images in real life are obtained without labels, and data annotation requires plenty of time and human effort.
In this paper, we focus on the implementation and analysis of existing representative self-supervised learning methods on food images.
- Score: 6.602838826255494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Food image analysis is the groundwork for image-based dietary assessment,
which is the process of monitoring what kinds of food and how much energy is
consumed using captured food or eating scene images. Existing deep
learning-based methods learn the visual representation for downstream tasks
based on human annotation of each food image. However, most food images in real
life are obtained without labels, and data annotation requires plenty of time
and human effort, which is not feasible for real-world applications. To make
use of the vast amount of unlabeled images, many existing works focus on
unsupervised or self-supervised learning of visual representations directly
from unlabeled data. However, none of these existing works focus on food
images, which is more challenging than general objects due to its high
inter-class similarity and intra-class variance.
In this paper, we focus on the implementation and analysis of existing
representative self-supervised learning methods on food images. Specifically,
we first compare the performance of six selected self-supervised learning
models on the Food-101 dataset. Then we analyze the pros and cons of each
selected model when training on food data to identify the key factors that can
help improve the performance. Finally, we propose several ideas for future work
on self-supervised visual representation learning for food images.
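The abstract does not name the six compared models, but contrastive methods such as SimCLR are typical representatives of this family. As an illustration only (not code from the paper), a minimal NumPy sketch of the NT-Xent loss that contrastive self-supervised methods optimize; all shapes and names are illustrative:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss,
    as used by contrastive SSL methods such as SimCLR.
    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    """
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize rows
    sim = z @ z.T / temperature                       # scaled cosine similarities
    n = z1.shape[0]
    # Mask out self-similarity so it never counts as a candidate pair.
    np.fill_diagonal(sim, -np.inf)
    # The positive for row i is the other augmented view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Cross-entropy: -log softmax(sim)[i, pos[i]], averaged over 2N rows.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - sim[np.arange(2 * n), pos]))
```

The loss is small when the two views of each image embed close together and far from all other images, which is what makes the learned features transferable to downstream food classification.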
Related papers
- Shape-Preserving Generation of Food Images for Automatic Dietary Assessment [1.602820210496921]
We present a simple GAN-based neural network architecture for conditional food image generation.
The shapes of the food and container in the generated images closely resemble those in the reference input image.
arXiv Detail & Related papers (2024-08-23T20:18:51Z)
- How Much You Ate? Food Portion Estimation on Spoons [63.611551981684244]
Current image-based food portion estimation algorithms assume that users take images of their meals one or two times.
We introduce an innovative solution that utilizes stationary user-facing cameras to track food items on utensils.
The system is reliable for estimation of nutritional content of liquid-solid heterogeneous mixtures such as soups and stews.
arXiv Detail & Related papers (2024-05-12T00:16:02Z)
- From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios [92.58097090916166]
We present two new benchmarks, namely DailyFood-172 and DailyFood-16, designed to curate food images from everyday meals.
These two datasets are used to evaluate the transferability of approaches from the well-curated food image domain to the everyday-life food image domain.
arXiv Detail & Related papers (2024-03-12T08:32:23Z)
- Personalized Food Image Classification: Benchmark Datasets and New Baseline [8.019925729254178]
We propose a new framework for personalized food image classification by leveraging self-supervised learning and temporal image feature information.
Our method is evaluated on both benchmark datasets and shows improved performance compared to existing works.
arXiv Detail & Related papers (2023-09-15T20:11:07Z)
- NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating.
Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images.
We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information.
We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
- Diffusion Model with Clustering-based Conditioning for Food Image Generation [22.154182296023404]
Deep learning-based techniques are commonly used to perform image analysis such as food classification, segmentation, and portion size estimation.
One potential solution is to use synthetic food images for data augmentation.
In this paper, we propose an effective clustering-based training framework, named ClusDiff, for generating high-quality and representative food images.
arXiv Detail & Related papers (2023-09-01T01:40:39Z)
- Food Image Classification and Segmentation with Attention-based Multiple Instance Learning [51.279800092581844]
The paper presents a weakly supervised methodology for training food image classification and semantic segmentation models.
The proposed methodology is based on a multiple instance learning approach in combination with an attention-based mechanism.
We conduct experiments on two meta-classes within the FoodSeg103 data set to verify the feasibility of the proposed approach.
arXiv Detail & Related papers (2023-08-22T13:59:47Z)
- Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
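The mean intersection over union (mIoU) quoted here is the standard per-class average of intersection over union between predicted and ground-truth masks. A minimal generic sketch (not the FoodSeg103 evaluation code):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union (mIoU) averaged over the classes
    that appear in either mask.
    pred, gt: integer class-label masks of the same shape.
    """
    ious = []
    for c in range(num_classes):
        gt_c, pred_c = (gt == c), (pred == c)
        union = np.logical_or(gt_c, pred_c).sum()
        if union == 0:  # class absent from both masks: skip it
            continue
        inter = np.logical_and(gt_c, pred_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```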
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
- Online Continual Learning For Visual Food Classification [7.704949298975352]
Existing methods require static datasets for training and are not capable of learning from sequentially available new food images.
We introduce a novel clustering based exemplar selection algorithm to store the most representative data belonging to each learned food.
Our results show significant improvements compared with existing state-of-the-art online continual learning methods.
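The paper's exact exemplar-selection algorithm is not given in this summary; a generic clustering-based sketch (k-means centroids with the nearest sample kept per cluster, a hypothetical simplification) conveys the idea of storing the most representative data per learned food class:

```python
import numpy as np

def select_exemplars(features, k, iters=20, seed=0):
    """Generic clustering-based exemplar selection: run k-means on the
    feature vectors of one food class, then keep the sample nearest to
    each centroid as that cluster's representative exemplar.
    (A sketch of the idea only; the paper's algorithm may differ.)
    features: (N, D) array of feature vectors; returns exemplar indices.
    """
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest centroid.
        d = np.linalg.norm(features[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        # Recompute centroids (keep the old one if a cluster empties).
        for j in range(k):
            if np.any(assign == j):
                centers[j] = features[assign == j].mean(axis=0)
    d = np.linalg.norm(features[:, None] - centers[None], axis=2)
    return np.unique(d.argmin(axis=0))  # indices of exemplar samples
```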
arXiv Detail & Related papers (2021-08-15T17:48:03Z)
- A Large-Scale Benchmark for Food Image Segmentation [62.28029856051079]
We build a new food image dataset FoodSeg103 (and its extension FoodSeg154) containing 9,490 images.
We annotate these images with 154 ingredient classes and each image has an average of 6 ingredient labels and pixel-wise masks.
We propose a multi-modality pre-training approach called ReLeM that explicitly equips a segmentation model with rich and semantic food knowledge.
arXiv Detail & Related papers (2021-05-12T03:00:07Z)
- Saliency-Aware Class-Agnostic Food Image Segmentation [10.664526852464812]
We propose a class-agnostic food image segmentation method.
Using information from both the before and after eating images, we can segment food images by finding the salient missing objects.
Our method is validated on food images collected from a dietary study.
arXiv Detail & Related papers (2021-02-13T08:05:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.