Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking
- URL: http://arxiv.org/abs/2112.00086v1
- Date: Tue, 30 Nov 2021 20:36:56 GMT
- Title: Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking
- Authors: Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson Liu, Reut Tsarfaty, Dafna Shahaf
- Abstract summary: Dyna-bAbI is a dynamic framework providing fine-grained control over task generation in bAbI.
We demonstrate our ideas by constructing three new tasks requiring compositional generalization.
- Score: 16.109330335379962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While neural language models often perform surprisingly well on natural
language understanding (NLU) tasks, their strengths and limitations remain
poorly understood. Controlled synthetic tasks are thus an increasingly
important resource for diagnosing model behavior. In this work we focus on
story understanding, a core competency for NLU systems. However, the main
synthetic resource for story understanding, the bAbI benchmark, lacks a
systematic mechanism for controllable task generation. We develop Dyna-bAbI, a
dynamic framework providing fine-grained control over task generation in bAbI.
We demonstrate our ideas by constructing three new tasks requiring
compositional generalization, an important evaluation setting absent from the
original benchmark. We tested both special-purpose models developed for bAbI
and state-of-the-art pre-trained methods, and found that while both
approaches solve the original tasks (>99% accuracy), neither approach succeeded
in the compositional generalization setting, indicating the limitations of the
original training data. We explored ways to augment the original data, and
found that though diversifying training data was far more useful than simply
increasing dataset size, it was still insufficient for driving robust
compositional generalization (with <70% accuracy for complex compositions). Our
results underscore the importance of highly controllable task generators for
creating robust NLU systems through a virtuous cycle of model and data
development.
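
To make the abstract's central idea concrete, here is a minimal sketch of what controllable task generation with a compositional split could look like. It is an illustration only, assuming a toy two-concept world ("move" and "grab"); none of the names below (generate_story, make_compositional_split) come from the actual Dyna-bAbI codebase. Because the generator takes the concept set as an explicit parameter, one can train on stories that use each concept in isolation and test on stories that compose them, where answering "Where is the apple?" requires chaining a grab fact with a later move fact.

```python
# Hypothetical sketch of controllable bAbI-style task generation.
# All names and the two-concept world are illustrative assumptions,
# not the authors' Dyna-bAbI API.
import random

AGENTS = ["Mary", "John", "Sandra"]
PLACES = ["kitchen", "garden", "office"]
ITEMS = ["apple", "football", "milk"]


def generate_story(concepts, length, rng):
    """Sample one story restricted to the given concept set, plus a question."""
    locations = {}  # agent -> last place the agent moved to
    carrying = {}   # agent -> item the agent currently holds
    story = []
    for _ in range(length):
        concept = rng.choice(sorted(concepts))
        agent = rng.choice(AGENTS)
        if concept == "move":
            place = rng.choice(PLACES)
            locations[agent] = place
            story.append(f"{agent} went to the {place}.")
        else:  # concept == "grab"
            item = rng.choice(ITEMS)
            carrying[agent] = item
            story.append(f"{agent} picked up the {item}.")
    # Prefer a question whose answer composes the two concepts: an item's
    # location is the last location of the agent carrying it.
    holders = sorted(a for a in carrying if a in locations)
    if holders:
        agent = rng.choice(holders)
        return story, f"Where is the {carrying[agent]}?", locations[agent]
    if locations:  # single-concept fallback: "move"-only stories
        agent = rng.choice(sorted(locations))
        return story, f"Where is {agent}?", locations[agent]
    agent = rng.choice(sorted(carrying))  # "grab"-only stories
    return story, f"What is {agent} carrying?", carrying[agent]


def make_compositional_split(n, rng):
    """Train on each concept in isolation; test on their composition."""
    train = [generate_story({c}, 6, rng) for c in ("move", "grab") for _ in range(n)]
    test = [generate_story({"move", "grab"}, 6, rng) for _ in range(n)]
    return train, test


if __name__ == "__main__":
    rng = random.Random(0)
    train, test = make_compositional_split(100, rng)
    story, question, answer = test[0]
    print("\n".join(story))
    print(question, "->", answer)
```

Under this framing, the abstract's "diversifying training data" corresponds to widening the concept sets sampled at training time, as opposed to drawing more stories from the same fixed sets.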
Related papers
- Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition [13.593511876719367]
We propose a novel skeleton-based idempotent generative model (IGM) for unsupervised representation learning.
Our experiments on benchmark datasets, NTU RGB+D and PKU-MMD, demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2024-10-27T06:29:04Z)
- Weak-to-Strong Reasoning [33.20094938292376]
We introduce a progressive learning framework that enables the strong model to autonomously refine its training data.
Our method significantly enhances the reasoning capabilities of Llama2-70b using three separate weak models.
This work paves the way for a more scalable and sophisticated strategy to enhance AI reasoning powers.
arXiv Detail & Related papers (2024-07-18T16:25:17Z)
- Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models [56.27786433792638]
STAR is a data generation method that leverages Large Language Models (LLMs) to synthesize data instances.
We design fine-grained step-by-step instructions to obtain the initial data instances.
Our experiments show that the data generated by STAR significantly improves performance on low-resource event extraction and relation extraction tasks.
arXiv Detail & Related papers (2023-05-24T12:15:19Z)
- Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Syntactic and Semantic-driven Learning for Open Information Extraction [42.65591370263333]
One of the biggest bottlenecks in building accurate, high-coverage neural open IE systems is the need for large labelled corpora.
We propose a syntactic and semantic-driven learning approach, which can learn neural open IE models without any human-labelled data.
arXiv Detail & Related papers (2021-03-05T02:59:40Z)
- Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition [22.90127409366107]
We propose an efficient yet strong baseline based on Graph Convolutional Networks (GCNs).
Inspired by the success of the ResNet architecture in Convolutional Neural Networks (CNNs), a ResGCN module is introduced into the GCN.
A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
arXiv Detail & Related papers (2020-10-20T02:56:58Z)
- Deep Imitation Learning for Bimanual Robotic Manipulation [70.56142804957187]
We present a deep imitation learning framework for robotic bimanual manipulation.
A core challenge is to generalize the manipulation skills to objects in different locations.
We propose to (i) decompose the multi-modal dynamics into elemental movement primitives, (ii) parameterize each primitive using a recurrent graph neural network to capture interactions, and (iii) integrate a high-level planner that composes primitives sequentially and a low-level controller to combine primitive dynamics and inverse kinematics control.
arXiv Detail & Related papers (2020-10-11T01:40:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.