Deep learning approaches in food recognition
- URL: http://arxiv.org/abs/2004.03357v2
- Date: Wed, 8 Apr 2020 08:44:48 GMT
- Title: Deep learning approaches in food recognition
- Authors: Chairi Kiourt, George Pavlidis, Stella Markantonatou
- Abstract summary: This chapter focuses on the presentation of some popular approaches and techniques applied in image-based food recognition.
The three main lines of solutions, namely the design from scratch, the transfer learning and the platform-based approaches, are outlined.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic image-based food recognition is a particularly challenging task.
Traditional image analysis approaches have achieved low classification accuracy
in the past, whereas deep learning approaches enabled the identification of
food types and their ingredients. The contents of food dishes are typically
deformable objects, usually including complex semantics, which makes the task
of defining their structure very difficult. Deep learning methods have already
shown very promising results in such challenges, so this chapter focuses on the
presentation of some popular approaches and techniques applied in image-based
food recognition. The three main lines of solutions, namely the design from
scratch, the transfer learning and the platform-based approaches, are outlined,
particularly for the task at hand, and are tested and compared to reveal the
inherent strengths and weaknesses. The chapter is complemented with basic
background material, a section devoted to the relevant datasets that are
crucial in light of the empirical approaches adopted, and some concluding
remarks that underline the future directions.
Related papers
- Shape-Preserving Generation of Food Images for Automatic Dietary Assessment [1.602820210496921]
We present a simple GAN-based neural network architecture for conditional food image generation.
The shapes of the food and container in the generated images closely resemble those in the reference input image.
arXiv Detail & Related papers (2024-08-23T20:18:51Z)
- Understanding the Limitations of Diffusion Concept Algebra Through Food [68.48103545146127]
Latent diffusion models offer crucial insights into biases and concept relationships.
The food domain offers unique challenges through complex compositions and regional biases.
We reveal measurable insights into the model's ability to capture and represent the nuances of culinary diversity.
arXiv Detail & Related papers (2024-06-05T18:57:02Z)
- Food Image Classification and Segmentation with Attention-based Multiple Instance Learning [51.279800092581844]
The paper presents a weakly supervised methodology for training food image classification and semantic segmentation models.
The proposed methodology is based on a multiple instance learning approach in combination with an attention-based mechanism.
We conduct experiments on two meta-classes within the FoodSeg103 data set to verify the feasibility of the proposed approach.
arXiv Detail & Related papers (2023-08-22T13:59:47Z)
- Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
- Self-Supervised Visual Representation Learning on Food Images [6.602838826255494]
Existing deep learning-based methods learn the visual representation for downstream tasks based on human annotation of each food image.
Most food images in real life are obtained without labels, and data annotation requires plenty of time and human effort.
In this paper, we focus on the implementation and analysis of existing representative self-supervised learning methods on food images.
arXiv Detail & Related papers (2023-03-16T02:31:51Z)
- Rethinking Cooking State Recognition with Vision Transformers [0.0]
The self-attention mechanism of the Vision Transformer (ViT) architecture is proposed for the Cooking State Recognition task.
The proposed approach encapsulates the globally salient features from images, while also exploiting the weights learned from a larger dataset.
Our framework has an accuracy of 94.3%, which significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2022-12-16T17:06:28Z)
- Self-Supervised Representation Learning: Introduction, Advances and Challenges [125.38214493654534]
Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets.
This article introduces this vibrant area including key concepts, the four main families of approach and associated state of the art, and how self-supervised methods are applied to diverse modalities of data.
arXiv Detail & Related papers (2021-10-18T13:51:22Z)
- Deep Long-Tailed Learning: A Survey [163.16874896812885]
Deep long-tailed learning aims to train well-performing deep models from a large number of images that follow a long-tailed class distribution.
Long-tailed class imbalance is a common problem in practical visual recognition tasks.
This paper provides a comprehensive survey on recent advances in deep long-tailed learning.
arXiv Detail & Related papers (2021-10-09T15:25:22Z)
- Online Continual Learning For Visual Food Classification [7.704949298975352]
Existing methods require static datasets for training and are not capable of learning from sequentially available new food images.
We introduce a novel clustering-based exemplar selection algorithm to store the most representative data belonging to each learned food.
Our results show significant improvements compared with existing state-of-the-art online continual learning methods.
arXiv Detail & Related papers (2021-08-15T17:48:03Z)
- Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another.
We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities.
We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
arXiv Detail & Related papers (2020-03-09T07:41:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.