KitchenScale: Learning to predict ingredient quantities from recipe
contexts
- URL: http://arxiv.org/abs/2304.10739v1
- Date: Fri, 21 Apr 2023 04:28:16 GMT
- Title: KitchenScale: Learning to predict ingredient quantities from recipe
contexts
- Authors: Donghee Choi, Mogan Gim, Samy Badreddine, Hajung Kim, Donghyeon Park,
Jaewoo Kang
- Abstract summary: KitchenScale is a model that predicts a target ingredient's quantity and measurement unit given its recipe context.
We formulate an ingredient quantity prediction task that consists of three sub-tasks: ingredient measurement type classification, unit classification, and quantity regression.
Experiments with our newly constructed dataset and recommendation examples demonstrate KitchenScale's understanding of various recipe contexts.
- Score: 13.001618172288198
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Determining proper quantities for ingredients is an essential part of cooking
practice from the perspective of enriching tastiness and promoting healthiness.
We introduce KitchenScale, a fine-tuned Pre-trained Language Model (PLM) that
predicts a target ingredient's quantity and measurement unit given its recipe
context. To effectively train our KitchenScale model, we formulate an
ingredient quantity prediction task that consists of three sub-tasks:
ingredient measurement type classification, unit classification, and quantity
regression. Furthermore, we utilized transfer learning of cooking
knowledge from recipe texts to PLMs. We adopted the Discrete Latent Exponent
(DExp) method to cope with high variance of numerical scales in recipe corpora.
Experiments with our newly constructed dataset and recommendation examples
demonstrate KitchenScale's understanding of various recipe contexts and
generalizability in predicting ingredient quantities. We implemented a web
application for KitchenScale to demonstrate its functionality in recommending
ingredient quantities expressed in numerals (e.g., 2) with units (e.g., ounce).
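As a concrete illustration of the task decomposition and the DExp idea described above, the sketch below shows one plausible way the three sub-task heads could sit on top of a shared PLM encoder. It is not the authors' released implementation: the PLM checkpoint, label-set sizes, exponent range, input formatting, and the reconstruction of a quantity from a discrete exponent and a continuous mantissa are all illustrative assumptions.

```python
# Minimal sketch of a KitchenScale-style model; hyperparameters and the
# DExp-style quantity parameterization are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

class KitchenScaleSketch(nn.Module):
    def __init__(self, plm_name="bert-base-uncased",
                 n_measurement_types=3,   # e.g., volume / weight / count (assumed)
                 n_units=20,              # e.g., cup, tablespoon, ounce (assumed)
                 n_exponents=9,           # discrete base-10 exponents (assumed)
                 min_exponent=-4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(plm_name)
        hidden = self.encoder.config.hidden_size
        # Sub-task 1: measurement type classification
        self.type_head = nn.Linear(hidden, n_measurement_types)
        # Sub-task 2: unit classification
        self.unit_head = nn.Linear(hidden, n_units)
        # Sub-task 3: quantity regression, split DExp-style into a discrete
        # exponent class and a continuous (positive) mantissa
        self.exp_head = nn.Linear(hidden, n_exponents)
        self.mantissa_head = nn.Linear(hidden, 1)
        self.min_exponent = min_exponent

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state[:, 0]  # [CLS]
        exp_logits = self.exp_head(h)
        mantissa = F.softplus(self.mantissa_head(h)).squeeze(-1)
        # Reconstruct the quantity from the most likely exponent and the mantissa
        exponent = exp_logits.argmax(dim=-1) + self.min_exponent
        quantity = mantissa * torch.pow(10.0, exponent.float())
        return {"type_logits": self.type_head(h),
                "unit_logits": self.unit_head(h),
                "exp_logits": exp_logits,
                "quantity": quantity}

# Illustrative usage: the input formatting is a guess, not the paper's template.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer("Chocolate chip cookies. Predict the quantity of [MASK] flour.",
                  return_tensors="pt")
model = KitchenScaleSketch()
out = model(batch["input_ids"], batch["attention_mask"])
print(out["quantity"], out["unit_logits"].argmax(dim=-1))
```

Under this reading, training would combine cross-entropy losses on the measurement-type, unit, and exponent heads with a regression loss on the mantissa, which is one way to realize the three-sub-task formulation and the DExp treatment of widely varying numerical scales.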
Related papers
- RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models [96.43285670458803]
Uni-Food is a unified food dataset that comprises over 100,000 images with various food labels.
Uni-Food is designed to provide a more holistic approach to food data analysis.
We introduce a novel Linear Rectification Mixture of Diverse Experts (RoDE) approach to address the inherent challenges of food-related multitasking.
arXiv Detail & Related papers (2024-07-17T16:49:34Z)
- FoodLMM: A Versatile Food Assistant using Large Multi-modal Model [96.76271649854542]
Large Multi-modal Models (LMMs) have made impressive progress in many vision-language tasks.
This paper proposes FoodLMM, a versatile food assistant based on LMMs with various capabilities.
We introduce a series of novel task-specific tokens and heads, enabling the model to predict food nutritional values and multiple segmentation masks.
arXiv Detail & Related papers (2023-12-22T11:56:22Z)
- Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario [60.20197771545983]
We design the counterfactual recipe generation task, which asks models to modify a base recipe according to the change of an ingredient.
We collect a large-scale recipe dataset in Chinese for models to learn culinary knowledge.
Results show that existing models have difficulties in modifying the ingredients while preserving the original text style, and often miss actions that need to be adjusted.
arXiv Detail & Related papers (2022-10-20T17:21:46Z)
- RecipeMind: Guiding Ingredient Choices from Food Pairing to Recipe Completion using Cascaded Set Transformer [15.170251924099807]
RecipeMind is a food affinity score prediction model that quantifies the suitability of adding an ingredient to a set of other ingredients.
We constructed a large-scale dataset containing ingredient co-occurrence based scores to train and evaluate RecipeMind on food affinity score prediction.
arXiv Detail & Related papers (2022-10-14T06:35:49Z)
- Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning [17.42688184238741]
Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives.
We propose a simplified end-to-end model based on well established and high performing encoders for text and images.
Our proposed method achieves state-of-the-art performance in the cross-modal recipe retrieval task on the Recipe1M dataset.
arXiv Detail & Related papers (2021-03-24T10:17:09Z)
- Multi-Task Learning for Calorie Prediction on a Novel Large-Scale Recipe Dataset Enriched with Nutritional Information [25.646488178514186]
In this work, we aim to estimate the calorie amount of a meal directly from an image by learning from recipes people have published on the Internet.
We propose the pic2kcal benchmark comprising 308,000 images from over 70,000 recipes including photographs, ingredients and instructions.
Our experiments demonstrate clear benefits of multi-task learning for calorie estimation, surpassing the single-task calorie regression by 9.9%.
arXiv Detail & Related papers (2020-11-02T16:11:51Z)
- Multi-modal Cooking Workflow Construction for Food Recipes [147.4435186953995]
We build MM-ReS, the first large-scale dataset for cooking workflow construction.
We propose a neural encoder-decoder model that utilizes both visual and textual information to construct the cooking workflow.
arXiv Detail & Related papers (2020-08-20T18:31:25Z)
- Decomposing Generation Networks with Structure Prediction for Recipe Generation [142.047662926209]
We propose a novel framework: Decomposing Generation Networks (DGN) with structure prediction.
Specifically, we split each cooking instruction into several phases, and assign different sub-generators to each phase.
Our approach includes two novel ideas: (i) learning the recipe structures with the global structure prediction component and (ii) producing recipe phases in the sub-generator output component based on the predicted structure.
arXiv Detail & Related papers (2020-07-27T08:47:50Z)
- Classification of Cuisines from Sequentially Structured Recipes [8.696042114987966]
Classification of cuisines based on their culinary features is an outstanding problem.
We have implemented a range of classification techniques by accounting for this information on the RecipeDB dataset.
The state-of-the-art RoBERTa model presented the highest accuracy of 73.30% among a range of classification models.
arXiv Detail & Related papers (2020-04-26T05:40:36Z)
- A Named Entity Based Approach to Model Recipes [9.18959130745234]
We propose a structure that can accurately represent the recipe as well as a pipeline to infer the best representation of the recipe in this uniform structure.
The ingredients section of a recipe typically lists the ingredients required and corresponding attributes such as quantity, temperature, and processing state.
The instruction section lists a series of events in which a cooking technique or process is applied to these utensils and ingredients.
arXiv Detail & Related papers (2020-04-25T16:37:26Z)
- Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism [70.85894675131624]
We learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another.
We propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities through aligning output semantic probabilities.
We show that we can outperform several state-of-the-art cross-modal retrieval strategies for food images and cooking recipes by a significant margin.
arXiv Detail & Related papers (2020-03-09T07:41:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.