Conditional Synthetic Food Image Generation
- URL: http://arxiv.org/abs/2303.09005v1
- Date: Thu, 16 Mar 2023 00:23:20 GMT
- Title: Conditional Synthetic Food Image Generation
- Authors: Wenjin Fu, Yue Han, Jiangpeng He, Sriram Baireddy, Mridul Gupta,
Fengqing Zhu
- Abstract summary: Generative Adversarial Networks (GANs) have been widely investigated for image synthesis owing to their powerful representation learning ability.
We aim to explore the capability and improve the performance of GAN methods for food image generation.
- Score: 12.235703733345833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) have been widely investigated
for image synthesis because of their powerful representation learning ability. In
this work, we explore StyleGAN and its application to synthetic food image
generation. Despite the impressive performance of GANs for natural image
generation, food images suffer from high intra-class diversity and inter-class
similarity, resulting in overfitting and visual artifacts in synthetic images.
We therefore aim to explore the capability and improve the performance of GAN
methods for food image generation. Specifically, we first choose StyleGAN3 as
the baseline method to generate synthetic food images and analyze its
performance. We then identify two issues that can cause performance degradation
on food images during training: (1) inter-class feature entanglement when
training on multiple food classes and (2) loss of high-resolution detail during
image downsampling. To address both issues, we propose to train on one food
category at a time to avoid feature entanglement and to leverage image patches
cropped from high-resolution datasets to retain fine details. We evaluate our
method on the Food-101 dataset and show improved quality of generated synthetic
food images compared with the baseline. Finally, we demonstrate the potential
to improve downstream tasks, such as food image classification, by including
high-quality synthetic training samples in data augmentation.
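The two remedies described in the abstract (training one category at a time and cropping patches from high-resolution images) amount to a simple dataset-preparation step before GAN training. Below is a minimal sketch, assuming a per-category image directory; the paths, patch size, and patch count are illustrative, and the actual StyleGAN3 training loop is not shown.

```python
import random
from pathlib import Path
from PIL import Image

PATCH_SIZE = 512  # hypothetical training resolution

def crop_random_patch(image: Image.Image, size: int) -> Image.Image:
    """Crop a random size x size patch, keeping full-resolution detail
    instead of downsampling the whole image."""
    w, h = image.size
    if w < size or h < size:
        # Fall back to resizing when the image is smaller than the patch.
        return image.resize((size, size), Image.LANCZOS)
    left = random.randint(0, w - size)
    top = random.randint(0, h - size)
    return image.crop((left, top, left + size, top + size))

def build_single_class_dataset(src_dir: Path, dst_dir: Path,
                               patches_per_image: int = 4) -> None:
    """Prepare a training set for ONE food category at a time, so that
    features of different classes cannot entangle during GAN training."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    for i, path in enumerate(sorted(src_dir.glob("*.jpg"))):
        img = Image.open(path).convert("RGB")
        for j in range(patches_per_image):
            crop_random_patch(img, PATCH_SIZE).save(dst_dir / f"{i:05d}_{j}.png")

# Example: one StyleGAN3 run per category (paths are hypothetical).
build_single_class_dataset(Path("food101/images/pizza"), Path("train/pizza_patches"))
```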
Related papers
- Foodfusion: A Novel Approach for Food Image Composition via Diffusion Models [48.821150379374714]
We introduce a large-scale, high-quality food image composition dataset, FC22k, which comprises 22,000 foreground, background, and ground-truth image triplets.
We propose a novel food image composition method, Foodfusion, which incorporates a Fusion Module for processing and integrating foreground and background information.
arXiv Detail & Related papers (2024-08-26T09:32:16Z)
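Foodfusion's Fusion Module is a learned component inside a diffusion model, but the composition task it tackles can be illustrated with a naive hand-crafted baseline: alpha-blending a masked foreground food item onto a background. The function below is a hypothetical sketch, not the paper's method.

```python
from PIL import Image

def naive_composite(foreground: Image.Image, mask: Image.Image,
                    background: Image.Image, position=(0, 0)) -> Image.Image:
    """Alpha-blend a masked foreground onto a background.
    Assumes mask and foreground share the same size; a learned fusion
    module would instead harmonize lighting, color, and edges."""
    out = background.copy()
    out.paste(foreground, position, mask.convert("L"))  # mask acts as alpha
    return out
```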
- Shape-Preserving Generation of Food Images for Automatic Dietary Assessment [1.602820210496921]
We present a simple GAN-based neural network architecture for conditional food image generation.
The shapes of the food and container in the generated images closely resemble those in the reference input image.
arXiv Detail & Related papers (2024-08-23T20:18:51Z)
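The paper's architecture is not reproduced here, but shape-preserving conditional generation can be sketched as a generator that receives a shape mask derived from the reference image alongside the noise input. The toy network below is an assumption for illustration only.

```python
import torch
import torch.nn as nn

class MaskConditionedGenerator(nn.Module):
    """Toy conditional generator: concatenates a noise map with a shape mask,
    so generated food/container contours follow the reference input."""
    def __init__(self, z_channels: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(z_channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, mask], dim=1))

z = torch.randn(1, 16, 128, 128)
mask = torch.ones(1, 1, 128, 128)           # binary shape mask from the reference
fake = MaskConditionedGenerator()(z, mask)  # -> (1, 3, 128, 128)
```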
- Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
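The bridged-transfer recipe reduces to two sequential fine-tuning stages: first on synthetic images, then on real ones. A skeletal sketch, where `model`, `synthetic_loader`, and `real_loader` are assumed placeholders:

```python
import torch
import torch.nn as nn

def finetune(model: nn.Module, loader, epochs: int, lr: float) -> None:
    """One fine-tuning stage over a dataloader of (image, label) batches."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Stage 1: bridge on synthetic images. Stage 2: adapt on real images.
# (Hyperparameters are illustrative; the loaders are assumed to exist.)
# finetune(model, synthetic_loader, epochs=5, lr=1e-2)
# finetune(model, real_loader, epochs=5, lr=1e-3)
```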
- Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy for bolstering image classification performance is to augment the training set with synthetic images generated by text-to-image (T2I) models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
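Diff-Mix performs learned image translations between classes with a diffusion model; as a crude stand-in that conveys the inter-class interpolation idea, classic mixup blends images and labels across classes. The snippet below is a simplified analogue, not the paper's method.

```python
import torch
import torch.nn.functional as F

def interclass_mixup(x_a, y_a, x_b, y_b, num_classes: int, lam: float = 0.7):
    """Blend images and one-hot labels drawn from two DIFFERENT classes.
    Diff-Mix replaces this pixel-space blend with a diffusion-based
    translation between the classes."""
    x = lam * x_a + (1.0 - lam) * x_b
    y = lam * F.one_hot(y_a, num_classes).float() \
        + (1.0 - lam) * F.one_hot(y_b, num_classes).float()
    return x, y

# Example with random stand-ins for two images of classes 3 and 7.
x_a, x_b = torch.rand(3, 224, 224), torch.rand(3, 224, 224)
x, y = interclass_mixup(x_a, torch.tensor(3), x_b, torch.tensor(7), num_classes=101)
```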
- FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation [69.91401809979709]
Current state-of-the-art image generation models such as Latent Diffusion Models (LDMs) have demonstrated the capacity to produce visually striking food-related images.
We introduce FoodFusion, a Latent Diffusion model engineered specifically for the faithful synthesis of realistic food images from textual descriptions.
The development of the FoodFusion model involves harnessing an extensive array of open-source food datasets, resulting in over 300,000 curated image-caption pairs.
arXiv Detail & Related papers (2023-12-06T15:07:12Z)
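FoodFusion itself is not assumed to be publicly packaged; as a generic illustration of caption-conditioned latent diffusion, the diffusers library can synthesize a food image from a text prompt. The model id and prompt below are examples, and a CUDA GPU is assumed.

```python
# pip install diffusers transformers torch
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Generate one food image from a caption-style prompt.
image = pipe("a plate of spaghetti carbonara, food photography").images[0]
image.save("synthetic_food.png")
```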
- Diffusion Model with Clustering-based Conditioning for Food Image Generation [22.154182296023404]
Deep learning-based techniques are commonly used to perform image analysis such as food classification, segmentation, and portion size estimation.
One potential solution is to use synthetic food images for data augmentation.
In this paper, we propose an effective clustering-based training framework, named ClusDiff, for generating high-quality and representative food images.
arXiv Detail & Related papers (2023-09-01T01:40:39Z)
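Clustering-based conditioning can be sketched as: embed each training image, cluster the embeddings, and use the cluster ids as pseudo-labels that condition the generative model. The feature source and cluster count below are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_condition_labels(features: np.ndarray, n_clusters: int = 10) -> np.ndarray:
    """Assign each image a pseudo-label by clustering its feature embedding;
    these labels then condition the diffusion model instead of raw class ids."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)

# features: (N, D) embeddings from any pretrained backbone (assumed available).
labels = cluster_condition_labels(np.random.rand(100, 64))
```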
- Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared: one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
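The reported metric, mean intersection over union (mIoU), averages per-class IoU between predicted and ground-truth label maps. A small reference implementation; the class count of 104 assumes FoodSeg103's 103 food classes plus background.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean IoU over classes that appear in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 104, size=(64, 64))
target = np.random.randint(0, 104, size=(64, 64))
print(mean_iou(pred, target, num_classes=104))
```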
- IMAGINE: Image Synthesis by Image-Guided Model Inversion [79.4691654458141]
We introduce an inversion-based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images.
We leverage the knowledge of image semantics from a pre-trained classifier to achieve plausible generations.
IMAGINE enables the synthesis procedure to simultaneously 1) enforce semantic specificity constraints during the synthesis, 2) produce realistic images without generator training, and 3) give users intuitive control over the generation process.
arXiv Detail & Related papers (2021-04-13T02:00:24Z)
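Image-guided model inversion synthesizes images by optimizing pixels directly against a frozen pre-trained classifier, with no generator training. A bare-bones gradient-ascent sketch, where the model choice, target class index, and step count are illustrative; IMAGINE adds semantic and prior constraints on top of this.

```python
import torch
import torchvision.models as models

# Frozen classifier supplies the semantic signal.
model = models.resnet18(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

target_class = 963  # ImageNet "pizza" (index used for illustration)
x = torch.randn(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    logits = model(x)
    loss = -logits[0, target_class]  # ascend the target-class logit
    loss.backward()
    opt.step()
```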