Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks
- URL: http://arxiv.org/abs/2204.01906v1
- Date: Tue, 5 Apr 2022 00:32:04 GMT
- Title: Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks
- Authors: Tristan Thrush, Kushal Tirumala, Anmol Gupta, Max Bartolo, Pedro
Rodriguez, Tariq Kane, William Gaviria Rojas, Peter Mattson, Adina Williams,
Douwe Kiela
- Abstract summary: Dynatask is an open source system for setting up custom NLP tasks.
It is integrated with Dynabench, a research platform for rethinking benchmarking in AI.
- Score: 31.460091555017197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Dynatask: an open source system for setting up custom NLP tasks
that aims to greatly lower the technical knowledge and effort required for
hosting and evaluating state-of-the-art NLP models, as well as for conducting
model in the loop data collection with crowdworkers. Dynatask is integrated
with Dynabench, a research platform for rethinking benchmarking in AI that
facilitates human and model in the loop data collection and evaluation. To
create a task, users only need to write a short task configuration file from
which the relevant web interfaces and model hosting infrastructure are
automatically generated. The system is available at https://dynabench.org/ and
the full library can be found at https://github.com/facebookresearch/dynabench.
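As a rough illustration of the idea that a short configuration file can drive interface generation, the sketch below parses a small JSON config and derives the form widgets a task would need. The field names (`task_name`, `inputs`, `outputs`, `metric`) and the `describe_interface` helper are hypothetical and chosen only for this example; they are not Dynatask's actual schema or API, which is documented in the repository linked above.

```python
import json

# Hypothetical task configuration. These field names are illustrative
# assumptions, NOT Dynatask's real configuration schema.
TASK_CONFIG = """
{
  "task_name": "sentiment-demo",
  "inputs": [{"name": "statement", "type": "string"}],
  "outputs": [{"name": "label", "type": "multiclass",
               "labels": ["negative", "positive"]}],
  "metric": "accuracy"
}
"""


def describe_interface(config_text):
    """Derive a plain description of the form widgets implied by a config."""
    config = json.loads(config_text)
    widgets = []
    # Every declared input becomes a free-text box in this sketch.
    for spec in config["inputs"]:
        widgets.append(f"text box for '{spec['name']}'")
    # Multiclass outputs become dropdowns; anything else falls back to text.
    for spec in config["outputs"]:
        if spec["type"] == "multiclass":
            widgets.append(
                f"dropdown for '{spec['name']}' with options {spec['labels']}"
            )
        else:
            widgets.append(f"text box for '{spec['name']}'")
    return {
        "task": config["task_name"],
        "widgets": widgets,
        "metric": config["metric"],
    }


if __name__ == "__main__":
    print(describe_interface(TASK_CONFIG))
```

The point of the sketch is the direction of the dependency: the task owner edits only the declarative config, and everything downstream is computed from it.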
Related papers
- Chain-of-Skills: A Configurable Model for Open-domain Question Answering [79.8644260578301]
The retrieval model is an indispensable component for real-world knowledge-intensive tasks.
Recent work focuses on customized methods, limiting the model transferability and scalability.
We propose a modular retriever where individual modules correspond to key skills that can be reused across datasets.
Date: 2023-05-04T20:19:39Z
- TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs [71.7495056818522]
We introduce TaskMatrix.AI as a new AI ecosystem that connects foundation models with millions of APIs for task completion.
We will present our vision of how to build such an ecosystem, explain each key component, and use study cases to illustrate both the feasibility of this vision and the main challenges we need to address next.
Date: 2023-03-29T03:30:38Z
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
Date: 2022-04-22T10:45:01Z
- Dynabench: Rethinking Benchmarking in NLP [82.26699038776812]
We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking.
Dynabench runs in a web browser and supports human-and-model-in-the-loop dataset creation.
We report on four initial NLP tasks, illustrating these concepts and highlighting the promise of the platform.
Date: 2021-04-07T17:49:17Z
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
Date: 2020-09-04T15:32:19Z
- MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL).
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures.
Date: 2020-03-31T09:49:14Z
This list is automatically generated from the titles and abstracts of the papers on this site.