Cook-Gen: Robust Generative Modeling of Cooking Actions from Recipes
- URL: http://arxiv.org/abs/2306.01805v1
- Date: Thu, 1 Jun 2023 18:49:47 GMT
- Title: Cook-Gen: Robust Generative Modeling of Cooking Actions from Recipes
- Authors: Revathy Venkataramanan, Kaushik Roy, Kanak Raj, Renjith Prasad, Yuxin Zi, Vignesh Narayanan, Amit Sheth
- Abstract summary: Food computation models have become increasingly popular in assisting people in maintaining healthy eating habits.
In this study, we explore the use of generative AI methods to extend current food computation models to include cooking actions.
We propose novel aggregation-based generative AI methods, Cook-Gen, that reliably generate cooking actions from recipes.
- Score: 6.666528076345153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As people become more aware of their food choices, food computation models
have become increasingly popular in assisting people in maintaining healthy
eating habits. For example, food recommendation systems analyze recipe
instructions to assess nutritional contents and provide recipe recommendations.
The recent and remarkable successes of generative AI methods, such as
auto-regressive large language models, can lead to robust methods for a more
comprehensive understanding of recipes for healthy food recommendations beyond
surface-level nutrition content assessments. In this study, we explore the use
of generative AI methods to extend current food computation models, primarily
involving the analysis of nutrition and ingredients, to also incorporate
cooking actions (e.g., add salt, fry the meat, boil the vegetables, etc.).
Cooking actions are notoriously hard to model using statistical learning
methods due to irregular data patterns - significantly varying natural language
descriptions for the same action (e.g., marinate the meat vs. marinate the meat
and leave overnight) and infrequently occurring patterns (e.g., add salt occurs
far more frequently than marinating the meat). The prototypical approach to
handling irregular data patterns is to increase the volume of data that the
model ingests by orders of magnitude. Unfortunately, in the cooking domain,
these problems are further compounded by larger data volumes, presenting a
unique challenge that is not easily handled by simply scaling up. In this work,
we propose novel aggregation-based generative AI methods, Cook-Gen, that
reliably generate cooking actions from recipes, despite difficulties with
irregular data patterns, while also outperforming Large Language Models and
other strong baselines.
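The abstract does not spell out Cook-Gen's architecture, so the sketch below only illustrates the underlying task: mapping free-form recipe text to a normalized cooking action with a generative language model. The prompt format, the gpt2 stand-in backbone, and the extract_actions helper are illustrative assumptions, not the paper's pipeline; in practice the backbone would be fine-tuned on recipe/action pairs.
```python
# Minimal sketch (assumed, not Cook-Gen itself): prompt a causal LM to emit a
# normalized cooking-action string for one recipe step.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in backbone; the paper's model is not specified here

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def extract_actions(recipe_step: str) -> str:
    """Generate a normalized cooking-action string for one recipe step."""
    prompt = f"Recipe step: {recipe_step}\nCooking action:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=16,
        do_sample=False,  # greedy decoding for stable outputs
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return text[len(prompt):].strip()

print(extract_actions("Marinate the meat and leave overnight."))
```
Without fine-tuning, the base model will not produce reliable action labels; the sketch only fixes the input/output interface that an aggregation-based generator would have to implement.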
Related papers
- Retrieval Augmented Recipe Generation [96.43285670458803]
We propose a retrieval augmented large multimodal model for recipe generation.
It retrieves recipes semantically related to the image from an existing datastore as a supplement.
It calculates the consistency among generated recipe candidates, which use different retrieved recipes as context for generation.
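As a rough illustration of this retrieve-then-check pattern, the sketch below uses TF-IDF as a stand-in for the paper's multimodal retriever and scores hand-written candidates by mutual agreement; none of the names or data here come from the paper.
```python
# Assumed sketch of retrieve-then-check: fetch the recipes most similar to a
# query, then rank candidate generations by how much they agree with each other.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

datastore = [
    "Fry the onions, add tomatoes, simmer with spices.",
    "Boil pasta, toss with garlic and olive oil.",
    "Marinate chicken overnight, then grill.",
]
query = "grilled chicken with marinade"  # the paper retrieves from an image instead

vec = TfidfVectorizer().fit(datastore + [query])
sims = cosine_similarity(vec.transform([query]), vec.transform(datastore))[0]
retrieved = [datastore[i] for i in np.argsort(sims)[::-1][:2]]  # top-2 recipes

# Each retrieved recipe would condition one generated candidate; keep the
# candidate that agrees most with the others (the consistency score).
candidates = ["Grill marinated chicken.", "Simmer chicken in tomato sauce."]
consistency = cosine_similarity(vec.transform(candidates)).mean(axis=1)
print(retrieved, candidates[int(np.argmax(consistency))])
```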
arXiv Detail & Related papers (2024-11-13T15:58:50Z)
- NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reported dietary intake methods are often inaccurate and suffer from substantial bias.
Recent work has explored using computer vision prediction systems to predict nutritional information from food images.
This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z)
- FIRE: Food Image to REcipe generation [10.45344523054623]
Food computing aims to develop end-to-end intelligent systems capable of autonomously producing recipe information for a food image.
This paper proposes FIRE, a novel methodology tailored to recipe generation in the food computing domain.
We showcase two practical applications that can benefit from integrating FIRE with large language model prompting.
arXiv Detail & Related papers (2023-08-28T08:14:20Z)
- Large Language Models as Sous Chefs: Revising Recipes with GPT-3 [56.7155146252028]
We focus on recipes as an example of complex, diverse, and widely used instructions.
We develop a prompt grounded in the original recipe and ingredients list that breaks recipes down into simpler steps.
We also contribute an Amazon Mechanical Turk task that is carefully designed to reduce fatigue while collecting human judgment of the quality of recipe revisions.
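A plausible shape for such a prompt, with the template wording and helper name as pure assumptions rather than the paper's actual prompt:
```python
# Hypothetical prompt template in the spirit described above; the wording is
# an assumption, not the prompt used in the paper.
def build_revision_prompt(recipe: str, ingredients: list[str]) -> str:
    """Ground the model in the original recipe and ask for simple steps."""
    return (
        "Ingredients: " + ", ".join(ingredients) + "\n"
        "Original recipe: " + recipe + "\n"
        "Rewrite the recipe as numbered, single-action steps:\n1."
    )

print(build_revision_prompt(
    "Marinate the chicken overnight, then grill until done.",
    ["chicken", "soy sauce", "garlic"],
))
```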
arXiv Detail & Related papers (2023-06-24T14:42:43Z)
- Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario [60.20197771545983]
We design the counterfactual recipe generation task, which asks models to modify a base recipe according to the change of an ingredient.
We collect a large-scale recipe dataset in Chinese for models to learn culinary knowledge.
Results show that existing models have difficulties in modifying the ingredients while preserving the original text style, and often miss actions that need to be adjusted.
arXiv Detail & Related papers (2022-10-20T17:21:46Z)
- Simulating Personal Food Consumption Patterns using a Modified Markov Chain [5.874935571318868]
We propose a novel framework to simulate personal food consumption data patterns, leveraging the use of a modified Markov chain model and self-supervised learning.
Our experimental results demonstrate promising performance compared with random simulation and the original Markov chain method.
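The basic simulation idea can be shown with a plain first-order Markov chain; the food states and transition matrix below are toy values, and the paper's modification and self-supervised component are not reproduced.
```python
# Toy first-order Markov chain walk over meal states (assumed values).
import numpy as np

rng = np.random.default_rng(0)
foods = ["salad", "pasta", "chicken"]
# transitions[i][j]: probability of eating foods[j] right after foods[i]
transitions = np.array([
    [0.2, 0.5, 0.3],
    [0.4, 0.1, 0.5],
    [0.5, 0.3, 0.2],
])

def simulate(days: int, start: int = 0) -> list[str]:
    """Sample a meal sequence by walking the transition matrix."""
    state, history = start, []
    for _ in range(days):
        state = rng.choice(len(foods), p=transitions[state])
        history.append(foods[state])
    return history

print(simulate(7))
```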
arXiv Detail & Related papers (2022-08-13T18:50:23Z)
- Assistive Recipe Editing through Critiquing [34.1050269670062]
RecipeCrit is a hierarchical denoising auto-encoder that edits recipes given ingredient-level critiques.
Our work's main innovation is our unsupervised critiquing module that allows users to edit recipes by interacting with the predicted ingredients.
arXiv Detail & Related papers (2022-05-05T05:52:27Z)
- Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning [17.42688184238741]
Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives.
We propose a simplified end-to-end model based on well established and high performing encoders for text and images.
Our proposed method achieves state-of-the-art performance in the cross-modal recipe retrieval task on the Recipe1M dataset.
arXiv Detail & Related papers (2021-03-24T10:17:09Z)
- Picture-to-Amount (PITA): Predicting Relative Ingredient Amounts from Food Images [24.26111169033236]
We study the novel and challenging problem of predicting the relative amount of each ingredient from a food image.
We propose PITA, the Picture-to-Amount deep learning architecture to solve the problem.
Experiments on a dataset of recipes collected from the Internet show the model generates promising results.
arXiv Detail & Related papers (2020-10-17T06:43:18Z)
- Multi-modal Cooking Workflow Construction for Food Recipes [147.4435186953995]
We build MM-ReS, the first large-scale dataset for cooking workflow construction.
We propose a neural encoder-decoder model that utilizes both visual and textual information to construct the cooking workflow.
arXiv Detail & Related papers (2020-08-20T18:31:25Z)
- Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another.
We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities.
We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
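The joint-embedding idea can be sketched with a symmetric contrastive loss that pulls matching image-recipe pairs together in the shared space; the linear encoders, dimensions, and temperature below are stand-ins, not SCAN's architecture or its semantic-consistency regularizer.
```python
# Assumed sketch: learn a joint image-recipe space with an InfoNCE-style loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

image_encoder = nn.Linear(512, 128)   # stand-in for a CNN image encoder
recipe_encoder = nn.Linear(300, 128)  # stand-in for a recipe text encoder

images = torch.randn(8, 512)          # a batch of image features
recipes = torch.randn(8, 300)         # the matching recipe features

img_emb = F.normalize(image_encoder(images), dim=-1)
rec_emb = F.normalize(recipe_encoder(recipes), dim=-1)

logits = img_emb @ rec_emb.t() / 0.07  # pairwise similarities, scaled
targets = torch.arange(8)              # the i-th image matches the i-th recipe
loss = (F.cross_entropy(logits, targets) +
        F.cross_entropy(logits.t(), targets)) / 2  # symmetric in both directions
loss.backward()
print(float(loss))
```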
arXiv Detail & Related papers (2020-03-09T07:41:17Z)