TASTEset -- Recipe Dataset and Food Entities Recognition Benchmark
- URL: http://arxiv.org/abs/2204.07775v1
- Date: Sat, 16 Apr 2022 10:52:21 GMT
- Title: TASTEset -- Recipe Dataset and Food Entities Recognition Benchmark
- Authors: Ania Wróblewska, Agnieszka Kaliska, Maciej Pawłowski, Dawid
Wiśniewski, Witold Sosnowski, Agnieszka Ławrynowicz
- Abstract summary: NER models are expected to find or infer various types of entities helpful in processing recipes.
The dataset consists of 700 recipes with more than 13,000 entities to extract.
We provide a few state-of-the-art baselines of named entity recognition models, which show that our dataset poses a solid challenge to existing models.
- Score: 1.0569625612398386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Food Computing is currently a fast-growing field of research. Natural
language processing (NLP) is also increasingly essential in this field,
especially for recognising food entities. However, there are still only a few
well-defined tasks that serve as benchmarks for solutions in this area. We
introduce a new dataset -- called TASTEset -- to bridge this gap. In
this dataset, Named Entity Recognition (NER) models are expected to find or
infer various types of entities helpful in processing recipes, e.g. food
products, quantities and their units, names of cooking processes, physical
qualities of ingredients, their purpose, and taste.
The dataset consists of 700 recipes with more than 13,000 entities to
extract. We provide a few state-of-the-art baselines of named entity
recognition models, which show that our dataset poses a solid challenge to
existing models. The best model achieved an average F1 score of 0.95, with
per-entity-type scores ranging from 0.781 to 0.982. We share the dataset and
the task to encourage progress on more in-depth and complex information
extraction from recipes.
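The entity types described above can be illustrated with a minimal, hand-rolled sketch. This is a toy rule-based tagger, not a TASTEset baseline (those are learned models), and the small lexicons are illustrative assumptions rather than dataset vocabulary:

```python
# Toy sketch: BIO-style tagging of a recipe ingredient phrase, showing the
# kind of entities (quantity, unit, process, food) a recipe NER model must
# recover. UNITS/PROCESSES are hypothetical lexicons, not from TASTEset.

UNITS = {"cup", "cups", "tbsp", "tsp", "g", "ml"}
PROCESSES = {"chopped", "diced", "minced", "grated"}

def tag_ingredient(phrase):
    """Return (token, BIO-label) pairs for a simple ingredient phrase."""
    tags = []
    for tok in phrase.split():
        low = tok.lower()
        if low.replace("/", "").replace(".", "").isdigit():
            tags.append((tok, "B-QUANTITY"))   # e.g. "2", "1/2"
        elif low in UNITS:
            tags.append((tok, "B-UNIT"))
        elif low in PROCESSES:
            tags.append((tok, "B-PROCESS"))
        else:
            # everything else is treated as (part of) the food name
            prev = tags[-1][1] if tags else "O"
            label = "I-FOOD" if prev in ("B-FOOD", "I-FOOD") else "B-FOOD"
            tags.append((tok, label))
    return tags

print(tag_ingredient("2 cups chopped red onion"))
# [('2', 'B-QUANTITY'), ('cups', 'B-UNIT'), ('chopped', 'B-PROCESS'),
#  ('red', 'B-FOOD'), ('onion', 'I-FOOD')]
```

Real systems replace the lexicon lookups with a learned token classifier, but the BIO output format is the same.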
Related papers
- MetaFood3D: Large 3D Food Object Dataset with Nutrition Values [53.24500333363066]
This dataset consists of 637 meticulously labeled 3D food objects across 108 categories, featuring detailed nutrition information, weight, and food codes linked to a comprehensive nutrition database.
Experimental results demonstrate our dataset's significant potential for improving algorithm performance, highlight the challenging gap between video captures and 3D scanned data, and show the strength of the MetaFood3D dataset in high-quality data generation, simulation, and augmentation.
arXiv Detail & Related papers (2024-09-03T15:02:52Z)
- RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models [96.43285670458803]
Uni-Food is a unified food dataset that comprises over 100,000 images with various food labels.
Uni-Food is designed to provide a more holistic approach to food data analysis.
We introduce a novel Linear Rectification Mixture of Diverse Experts (RoDE) approach to address the inherent challenges of food-related multitasking.
arXiv Detail & Related papers (2024-07-17T16:49:34Z)
- Deep Learning Based Named Entity Recognition Models for Recipes [7.507956305171027]
Named entity recognition (NER) is a technique for extracting information from unstructured or semi-structured data with known labels.
We created an augmented dataset of 26,445 phrases cumulatively.
We analyzed ingredient phrases from RecipeDB, the gold-standard recipe data repository, and annotated them using the Stanford NER.
A thorough investigation of NER approaches on these datasets, covering statistical models and fine-tuned deep learning-based language models, provides deep insights.
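Benchmarks of this kind are typically scored with entity-level (span-level) F1, which is also how per-type ranges such as TASTEset's 0.781-0.982 arise. Below is a minimal sketch of that metric under strict span matching; it is an illustration, not any paper's actual evaluation code:

```python
# Sketch of entity-level F1 over BIO tag sequences: an entity counts as
# correct only if its start, end, and type all match the gold annotation.

def spans(bio_tags):
    """Extract (start, end, type) entity spans from a BIO sequence."""
    out, start, etype = [], None, None
    for i, tag in enumerate(bio_tags + ["O"]):  # sentinel flushes last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:
                out.append((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
        # "I-" tags continue the current span
    return set(out)

def f1(gold, pred):
    """Strict entity-level F1 between gold and predicted BIO sequences."""
    g, p = spans(gold), spans(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)
```

Libraries such as seqeval implement the same idea with more care for malformed tag sequences.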
arXiv Detail & Related papers (2024-02-27T12:03:56Z)
- FoodLMM: A Versatile Food Assistant using Large Multi-modal Model [96.76271649854542]
Large Multi-modal Models (LMMs) have made impressive progress in many vision-language tasks.
This paper proposes FoodLMM, a versatile food assistant based on LMMs with various capabilities.
We introduce a series of novel task-specific tokens and heads, enabling the model to predict food nutritional values and multiple segmentation masks.
arXiv Detail & Related papers (2023-12-22T11:56:22Z)
- Assorted, Archetypal and Annotated Two Million (3A2M) Cooking Recipes Dataset based on Active Learning [2.40907745415345]
We present a novel dataset of two million culinary recipes labeled in respective categories.
To construct the dataset, we collect the recipes from the RecipeNLG dataset.
There are more than two million recipes in our dataset, each of which is categorized and has a confidence score linked with it.
arXiv Detail & Related papers (2023-03-27T07:53:18Z)
- Structured Vision-Language Pretraining for Computational Cooking [54.0571416522547]
Vision-Language Pretraining and Foundation models have been the go-to recipe for achieving SoTA performance on general benchmarks.
We propose to leverage these techniques for structured-text based computational cuisine tasks.
arXiv Detail & Related papers (2022-12-08T13:37:17Z)
- Attention-based Ingredient Phrase Parser [3.499870393443268]
We propose a new ingredient parsing model that parses a recipe's ingredient phrase into a structured form with its corresponding attributes, achieving an F1-score of over 0.93.
Experimental results show that our model achieves state-of-the-art performance on AllRecipes and Food.com datasets.
arXiv Detail & Related papers (2022-10-05T20:09:35Z)
- A Large-Scale Benchmark for Food Image Segmentation [62.28029856051079]
We build a new food image dataset FoodSeg103 (and its extension FoodSeg154) containing 9,490 images.
We annotate these images with 154 ingredient classes and each image has an average of 6 ingredient labels and pixel-wise masks.
We propose a multi-modality pre-training approach called ReLeM that explicitly equips a segmentation model with rich and semantic food knowledge.
arXiv Detail & Related papers (2021-05-12T03:00:07Z)
- CHEF: Cross-modal Hierarchical Embeddings for Food Domain Retrieval [20.292467149387594]
We introduce a novel cross-modal learning framework to jointly model the latent representations of images and text in the food image-recipe association and retrieval tasks.
Our experiments show that by making use of efficient tree-structured Long Short-Term Memory as the text encoder in our computational cross-modal retrieval framework, we are able to identify the main ingredients and cooking actions in the recipe descriptions without explicit supervision.
arXiv Detail & Related papers (2021-02-04T11:24:34Z)
- ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network [50.7720194859196]
We introduce the dataset ISIA Food-500, with 500 categories drawn from a Wikipedia list and 399,726 images.
This dataset surpasses existing popular benchmark datasets by category coverage and data volume.
We propose a stacked global-local attention network, which consists of two sub-networks for food recognition.
arXiv Detail & Related papers (2020-08-13T02:48:27Z)
- A Named Entity Based Approach to Model Recipes [9.18959130745234]
We propose a structure that can accurately represent the recipe as well as a pipeline to infer the best representation of the recipe in this uniform structure.
The ingredients section of a recipe typically lists the required ingredients and corresponding attributes such as quantity, temperature, and processing state.
The instruction section lists a series of events in which a cooking technique or process is applied to utensils and ingredients.
arXiv Detail & Related papers (2020-04-25T16:37:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.