Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task
- URL: http://arxiv.org/abs/2411.01996v1
- Date: Mon, 04 Nov 2024 11:31:18 GMT
- Title: Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task
- Authors: Hoonick Lee, Mogan Gim, Donghyeon Park, Donghee Choi, Jaewoo Kang,
- Abstract summary: Large Language Models (LLMs) have shown promise in various creative domains, including culinary arts.
This study focuses on cuisine transfer-applying elements of one cuisine to another to assess LLMs' culinary creativity.
- Score: 18.229223484919775
- License:
- Abstract: The advent of Large Language Models (LLMs) have shown promise in various creative domains, including culinary arts. However, many LLMs still struggle to deliver the desired level of culinary creativity, especially when tasked with adapting recipes to meet specific cultural requirements. This study focuses on cuisine transfer-applying elements of one cuisine to another-to assess LLMs' culinary creativity. We employ a diverse set of LLMs to generate and evaluate culturally adapted recipes, comparing their evaluations against LLM and human judgments. We introduce the ASH (authenticity, sensitivity, harmony) benchmark to evaluate LLMs' recipe generation abilities in the cuisine transfer task, assessing their cultural accuracy and creativity in the culinary domain. Our findings reveal crucial insights into both generative and evaluative capabilities of LLMs in the culinary domain, highlighting strengths and limitations in understanding and applying cultural nuances in recipe creation. The code and dataset used in this project will be openly available in \url{http://github.com/dmis-lab/CulinaryASH}.
Related papers
- LLaVA-Chef: A Multi-modal Generative Model for Food Recipes [17.705244174235045]
Large language models (LLMs) have paved the way for Natural Language Processing approaches to delve deeper into food-related tasks.
This work proposes LLaVA-Chef, a novel model trained on a curated dataset of diverse recipe prompts.
A detailed qualitative analysis reveals that LLaVA-Chef generates more detailed recipes with precise ingredient mentions.
arXiv Detail & Related papers (2024-08-29T20:20:49Z) - Translating Across Cultures: LLMs for Intralingual Cultural Adaptation [12.5954253354303]
We define the task of cultural adaptation and create an evaluation framework to evaluate the performance of modern LLMs.
We analyze possible issues with automatic adaptation.
We hope that this paper will offer more insight into the cultural understanding of LLMs and their creativity in cross-cultural scenarios.
arXiv Detail & Related papers (2024-06-20T17:06:58Z) - FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture [60.51749998013166]
We introduce FoodieQA, a manually curated, fine-grained image-text dataset capturing the intricate features of food cultures across various regions in China.
We evaluate vision-language Models (VLMs) and large language models (LLMs) on newly collected, unseen food images and corresponding questions.
Our findings highlight that understanding food and its cultural implications remains a challenging and under-explored direction.
arXiv Detail & Related papers (2024-06-16T17:59:32Z) - Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense [98.09670425244462]
Large language models (LLMs) have demonstrated substantial commonsense understanding.
This paper examines the capabilities and limitations of several state-of-the-art LLMs in the context of cultural commonsense tasks.
arXiv Detail & Related papers (2024-05-07T20:28:34Z) - Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge [47.57055368312541]
We introduce FmLAMA, a multilingual dataset centered on food-related cultural facts and variations in food practices.
We analyze LLMs across various architectures and configurations, evaluating their performance in both monolingual and multilingual settings.
arXiv Detail & Related papers (2024-04-10T08:49:27Z) - CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge [69.82940934994333]
We introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build challenging evaluation dataset.
Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating cultural questions.
CULTURALBENCH-V0.1 is a compact yet high-quality evaluation dataset with users' red-teaming attempts.
arXiv Detail & Related papers (2024-04-10T00:25:09Z) - FoodLMM: A Versatile Food Assistant using Large Multi-modal Model [96.76271649854542]
Large Multi-modal Models (LMMs) have made impressive progress in many vision-language tasks.
This paper proposes FoodLMM, a versatile food assistant based on LMMs with various capabilities.
We introduce a series of novel task-specific tokens and heads, enabling the model to predict food nutritional values and multiple segmentation masks.
arXiv Detail & Related papers (2023-12-22T11:56:22Z) - Cultural Adaptation of Recipes [24.825456977440616]
We introduce a new task involving the translation and cultural adaptation of recipes between Chinese and English-speaking cuisines.
To support this investigation, we present CulturalRecipes, a unique dataset comprised of automatically paired recipes written in Mandarin Chinese and English.
We evaluate the performance of various methods, including GPT-4 and other Large Language Models, traditional machine translation, and information retrieval techniques.
arXiv Detail & Related papers (2023-10-26T12:39:20Z) - Counterfactual Recipe Generation: Exploring Compositional Generalization
in a Realistic Scenario [60.20197771545983]
We design the counterfactual recipe generation task, which asks models to modify a base recipe according to the change of an ingredient.
We collect a large-scale recipe dataset in Chinese for models to learn culinary knowledge.
Results show that existing models have difficulties in modifying the ingredients while preserving the original text style, and often miss actions that need to be adjusted.
arXiv Detail & Related papers (2022-10-20T17:21:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.