DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization
- URL: http://arxiv.org/abs/2406.04669v1
- Date: Fri, 7 Jun 2024 06:35:21 GMT
- Title: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization
- Authors: Chengang Hu, Xiao Liu, Yansong Feng
- Abstract summary: We propose the DIsh NamE Recognition (DiNeR) task and create a large realistic Chinese dataset.
Given a recipe instruction, models are required to recognize the dish name composed of diverse combinations of food, actions, and flavors.
Our dataset consists of 3,811 dishes and 228,114 recipes, and covers a wide range of linguistic phenomena such as anaphora, omission, and ambiguity.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Most existing compositional generalization datasets are synthetically generated, resulting in a lack of natural language variation. While there have been recent attempts to introduce non-synthetic datasets for compositional generalization, they suffer from either limited data scale or a lack of diversity in the forms of combinations. To better investigate compositional generalization with more linguistic phenomena and compositional diversity, we propose the DIsh NamE Recognition (DiNeR) task and create a large realistic Chinese dataset. Given a recipe instruction, models are required to recognize the dish name composed of diverse combinations of food, actions, and flavors. Our dataset consists of 3,811 dishes and 228,114 recipes, and covers a wide range of linguistic phenomena such as anaphora, omission, and ambiguity. We provide two strong baselines based on T5 and large language models (LLMs). This work contributes a challenging task, baseline methods to tackle the task, and insights into compositional generalization in the context of dish name recognition. Code and data are available at https://github.com/Jumpy-pku/DiNeR.
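For concreteness, a T5 baseline for this task amounts to casting it as sequence-to-sequence generation from recipe text to dish name. Below is a minimal sketch using Hugging Face Transformers; the checkpoint, prompt, and recipe text are illustrative assumptions rather than the authors' released code, and the model would need fine-tuning on DiNeR before its outputs mean anything.

```python
# Minimal seq2seq sketch of dish name recognition (not the authors' exact baseline).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL = "uer/t5-base-chinese-cluecorpussmall"  # assumed Chinese T5 checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

# Toy recipe instruction; after fine-tuning, the target output is the dish name.
recipe = "鸡肉切丁，加花生米、干辣椒和花椒快炒，调入糖醋汁。"
inputs = tokenizer("识别菜名：" + recipe, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```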
Related papers
- LaiDA: Linguistics-aware In-context Learning with Data Augmentation for Metaphor Components Identification (arXiv, 2024-08-10)
Large language models (LLMs) offer new avenues for accurate comprehension of complex natural language texts.
A new LLM-based framework named Linguistics-aware In-context Learning with Data Augmentation (LaiDA) is proposed.
LaiDA incorporates a simile dataset for pre-training. A graph attention network encoder generates linguistically rich feature representations to retrieve similar examples.
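As a rough illustration of the retrieval step behind this kind of in-context example selection: LaiDA's actual encoder is a graph attention network, so the TF-IDF stand-in and toy metaphor pool below are assumptions for demonstration only.

```python
# Toy retrieval of similar examples for in-context learning.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pool = [
    "Her smile was a ray of sunshine.",
    "The market is a battlefield.",
    "Time is a thief that steals our years.",
]
query = "His words were a dagger to her heart."

# Character n-gram TF-IDF stands in for the paper's learned encoder.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)).fit(pool + [query])
sims = cosine_similarity(vectorizer.transform([query]), vectorizer.transform(pool))[0]
top_k = sims.argsort()[::-1][:2]  # indices of the most similar pool examples
print([pool[i] for i in top_k])
```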
- Deep Learning Based Named Entity Recognition Models for Recipes (arXiv, 2024-02-27)
Named entity recognition (NER) is a technique for extracting information from unstructured or semi-structured data with known labels.
We analyzed ingredient phrases from RecipeDB, the gold-standard recipe data repository, and annotated them using the Stanford NER.
Building on these annotations, we created an augmented dataset of 26,445 phrases in total.
A thorough investigation of NER approaches on these datasets, covering both statistical methods and fine-tuned deep learning-based language models, provides useful insights.
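To make the task concrete, here is a toy sketch of the token-level output such a recipe-NER model produces; the tag set and the example phrase are illustrative assumptions, not the paper's exact annotation scheme.

```python
# BIO-tagged ingredient phrase (assumed tag set, for illustration only).
phrase = ["2", "cups", "finely", "chopped", "red", "onion"]
tags   = ["B-QUANTITY", "B-UNIT", "B-STATE", "I-STATE", "B-NAME", "I-NAME"]

for token, tag in zip(phrase, tags):
    print(f"{token:10s} {tag}")
```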
- Data Factors for Better Compositional Generalization (arXiv, 2023-11-08)
We conduct an empirical analysis by training Transformer models on a variety of training sets with different data factors.
We show that increased dataset complexity can lead to better generalization behavior on multiple different generalization challenges.
We explore how training examples of different difficulty levels influence generalization differently.
- Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems (arXiv, 2023-07-26)
Multi3WOZ is a novel multilingual, multi-domain, multi-parallel task-oriented dialog (ToD) dataset.
It is large-scale and offers culturally adapted dialogs in 4 languages.
We describe a complex bottom-up data collection process that yielded the final dataset.
- On Evaluating Multilingual Compositional Generalization with Translated Datasets (arXiv, 2023-06-20)
We craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese.
Using it, we show that compositional generalization abilities differ across languages.
Even with the resulting robust benchmark, which we call MCWQ-R, we show that the distribution of compositions still suffers due to linguistic divergences.
- Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets (arXiv, 2023-05-28)
We present a novel tri-level joint natural language understanding approach that adds a domain level and explicitly exchanges semantic information between all levels.
We evaluate our model on two multi-turn datasets for which we are the first to conduct joint slot-filling and intent detection.
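A hedged sketch of what a tri-level joint architecture can look like follows: one shared encoder feeding domain, intent, and slot heads. The explicit cross-level exchange of semantic information described in the paper is omitted here, and all names and sizes are illustrative.

```python
# Schematic tri-level joint NLU model (illustrative, not the paper's architecture).
import torch
import torch.nn as nn

class TriLevelNLU(nn.Module):
    def __init__(self, emb_dim=300, hidden=256, n_domains=5, n_intents=20, n_slots=40):
        super().__init__()
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.domain_head = nn.Linear(hidden, n_domains)  # utterance-level
        self.intent_head = nn.Linear(hidden, n_intents)  # utterance-level
        self.slot_head = nn.Linear(hidden, n_slots)      # token-level

    def forward(self, x):                 # x: (batch, seq_len, emb_dim) embeddings
        states, _ = self.encoder(x)
        pooled = states[:, -1]            # last hidden state as utterance summary
        return self.domain_head(pooled), self.intent_head(pooled), self.slot_head(states)

model = TriLevelNLU()
domain, intent, slots = model(torch.randn(2, 12, 300))
print(domain.shape, intent.shape, slots.shape)  # (2, 5) (2, 20) (2, 12, 40)
```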
- CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models (arXiv, 2023-05-23)
We systematically study decompounding, the task of splitting compound words into their constituents.
We introduce a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
We introduce a novel methodology to train dedicated models for decompounding.
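For intuition, a naive lexicon-based decompounder is sketched below; CompoundPiece trains dedicated models rather than relying on a lexicon, so this is only a statement of the task, not their method.

```python
# Naive binary-split decompounder over a toy lexicon (illustration only).
LEXICON = {"book", "shop", "butter", "fly", "schreib", "tisch"}

def decompound(word: str) -> list[str]:
    # Return the first split whose halves are both in the lexicon.
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in LEXICON and right in LEXICON:
            return [left, right]
    return [word]  # treated as a non-compound

print(decompound("bookshop"))   # ['book', 'shop']
print(decompound("butterfly"))  # ['butter', 'fly'] -- a spurious split, the
                                # kind of error non-compound words in the
                                # dataset are meant to expose
```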
- Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality (arXiv, 2023-01-31)
Quantifying compositionality of data is a challenging task, which has been investigated primarily for short utterances.
We show that comparing a dataset's representations in models with and without a bottleneck yields a compositionality metric.
The procedure is applied to the evaluation of arithmetic expressions using synthetic data, and sentiment classification using natural language data.
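Schematically, the diagnostic boils down to training one model with a representation bottleneck and one without, then comparing them; the scoring function below is an assumed simplification for illustration, not the paper's exact metric.

```python
# Assumed simplification: if a bottleneck costs little performance,
# the data behaves compositionally.
def compositionality_score(perf_with_bottleneck: float, perf_without: float) -> float:
    return perf_with_bottleneck / perf_without  # near 1.0 => compositional

print(compositionality_score(0.93, 0.95))  # e.g. arithmetic expressions
print(compositionality_score(0.61, 0.90))  # e.g. a less compositional task
```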
- Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification (arXiv, 2022-05-15)
Complex word identification (CWI) is a cornerstone of proper text simplification.
CWI is highly dependent on context, and its difficulty is compounded by the scarcity of available datasets.
We propose a novel training technique for the CWI task based on domain adaptation to improve the target character and context representations.
- Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization (arXiv, 2022-04-28)
In zero-shot multilingual extractive text summarization, a model is typically trained on an English dataset and then applied to summarization datasets in other languages.
We propose NLS (Neural Label Search for Summarization), which jointly learns hierarchical weights for different sets of labels together with our summarization model.
We conduct multilingual zero-shot summarization experiments on MLSUM and WikiLingua datasets, and we achieve state-of-the-art results using both human and automatic evaluations.
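The core idea can be sketched as mixing several candidate label sets into soft sentence labels; the fixed weights below are purely illustrative, whereas NLS learns them jointly with the summarization model, and hierarchically rather than in one flat step.

```python
# Weighted combination of candidate label sets into soft labels (illustration).
import torch

label_sets = torch.tensor([  # 3 labeling strategies x 4 sentences (1 = extract)
    [1., 0., 1., 0.],
    [1., 1., 0., 0.],
    [0., 0., 1., 1.],
])
weights = torch.softmax(torch.tensor([0.8, 0.1, 0.4]), dim=0)
soft_labels = weights @ label_sets  # per-sentence supervision signal
print(soft_labels)
```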
- Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning (arXiv, 2022-03-24)
Temporal grounding in videos aims to localize one target video segment that semantically corresponds to a given query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We empirically find that existing methods fail to generalize to queries with novel combinations of seen words.
We propose a variational cross-graph reasoning framework that explicitly decomposes video and language into multiple structured hierarchies.