SCROLLS: Standardized CompaRison Over Long Language Sequences
- URL: http://arxiv.org/abs/2201.03533v1
- Date: Mon, 10 Jan 2022 18:47:15 GMT
- Title: SCROLLS: Standardized CompaRison Over Long Language Sequences
- Authors: Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv,
Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy
- Abstract summary: We introduce SCROLLS, a suite of tasks that require reasoning over long texts.
SCROLLS contains summarization, question answering, and natural language inference tasks.
We make all datasets available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.
- Score: 62.574959194373264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: NLP benchmarks have largely focused on short texts, such as sentences and
paragraphs, even though long texts comprise a considerable amount of natural
language in the wild. We introduce SCROLLS, a suite of tasks that require
reasoning over long texts. We examine existing long-text datasets, and handpick
ones where the text is naturally long, while prioritizing tasks that involve
synthesizing information across the input. SCROLLS contains summarization,
question answering, and natural language inference tasks, covering multiple
domains, including literature, science, business, and entertainment. Initial
baselines, including Longformer Encoder-Decoder, indicate that there is ample
room for improvement on SCROLLS. We make all datasets available in a unified
text-to-text format and host a live leaderboard to facilitate research on model
architecture and pretraining methods.
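Because every SCROLLS task reduces to mapping one long input string to one short output string, a single seq2seq pipeline can cover all tasks. A minimal loading sketch follows, assuming the datasets are mirrored on the Hugging Face Hub under the "tau/scrolls" identifier with "input"/"output" fields (both are assumptions here; check the live leaderboard for the canonical access point):

```python
# Minimal sketch: loading one SCROLLS task in its unified text-to-text format.
# The hub identifier "tau/scrolls" and the "input"/"output" field names are
# assumptions; consult the SCROLLS leaderboard for the canonical source.
from datasets import load_dataset

qasper = load_dataset("tau/scrolls", "qasper", split="validation")

# Every task shares the same schema: a long "input" string and a short
# "output" string, so the same preprocessing and evaluation harness can be
# reused across the summarization, QA, and NLI tasks.
example = qasper[0]
print(len(example["input"].split()), "input words")
print(example["output"])
```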
Related papers
- Long Input Benchmark for Russian Analysis [2.500659051698016]
LIBRA comprises 21 adapted datasets for studying LLMs' ability to understand long texts thoroughly.
The tests are divided into four complexity groups and allow models to be evaluated across context lengths ranging from 4k to 128k tokens.
We provide the open-source datasets, context, and public leaderboard for LIBRA to guide forthcoming research.
arXiv Detail & Related papers (2024-08-05T12:59:35Z)
- Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering [50.52792174648067]
This initiative seeks to bridge the gap between textual and visual comprehension.
We propose a new multi-task Urdu scene text dataset comprising over 1000 natural scene images.
We provide fine-grained annotations for text instances, addressing the limitations of previous datasets.
arXiv Detail & Related papers (2024-05-21T06:48:26Z)
- LongWanjuan: Towards Systematic Measurement for Long Text Quality [102.46517202896521]
LongWanjuan is a dataset of over 160B tokens specifically tailored to enhance the training of language models on long-text tasks.
In LongWanjuan, we categorize long texts into holistic, aggregated, and chaotic types, enabling a detailed analysis of long-text quality.
We devise a data mixture recipe that strategically balances different types of long texts within LongWanjuan, leading to significant improvements in model performance on long-text tasks.
arXiv Detail & Related papers (2024-02-21T07:27:18Z)
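The mixture recipe can be pictured as weighted sampling over the three text types. A hypothetical sketch follows; the type weights and documents below are illustrative assumptions, not the paper's actual values:

```python
# Hypothetical sketch of a type-balanced data mixture in the spirit of
# LongWanjuan's recipe. Weights and documents are illustrative assumptions.
import random

corpus = {
    "holistic":   ["doc_a", "doc_b"],  # coherent, self-contained long texts
    "aggregated": ["doc_c"],           # long texts stitched from related pieces
    "chaotic":    ["doc_d"],           # low-coherence long texts, downweighted
}
weights = {"holistic": 0.6, "aggregated": 0.3, "chaotic": 0.1}

def sample_document(rng: random.Random) -> str:
    """Draw one training document with probability proportional to its type weight."""
    text_type = rng.choices(list(weights), weights=list(weights.values()))[0]
    return rng.choice(corpus[text_type])

rng = random.Random(0)
print([sample_document(rng) for _ in range(5)])
```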
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization [19.301567079372436]
Text Summarization is a popular task and an active area of research for the Natural Language Processing community.
However, all publicly available summarization datasets provide only plain-text content.
We present LoRaLay, a collection of datasets for long-range summarization with accompanying visual/layout information.
arXiv Detail & Related papers (2023-01-26T18:50:54Z)
- Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents [13.755637074366813]
Summ^N is a simple, flexible, and effective multi-stage framework for input texts longer than the maximum context length of typical pretrained LMs.
It can process input text of arbitrary length by adjusting the number of stages while keeping the LM context size fixed.
Our experiments demonstrate that Summ^N significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2021-10-16T06:19:54Z)
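The stage-adjusting idea admits a compact sketch: split the input into chunks that fit the context window, summarize each chunk, concatenate the partial summaries, and recurse until the text fits in one window. The word-count chunking and the `summarize` placeholder below are simplifying assumptions; the paper's actual pipeline is more elaborate:

```python
# Minimal sketch of the Summ^N idea. `summarize` is a placeholder for any
# fixed-context summarization model; it must return text shorter than its
# input for the loop to terminate.
def summ_n(text: str, summarize, max_words: int = 1024) -> str:
    words = text.split()
    while len(words) > max_words:
        chunks = [
            " ".join(words[i : i + max_words])
            for i in range(0, len(words), max_words)
        ]
        # One coarse stage: each chunk is compressed independently.
        words = " ".join(summarize(chunk) for chunk in chunks).split()
    # Fine stage: one final pass over text that now fits the context window.
    return summarize(" ".join(words))
```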
- Pretrained Language Models for Text Generation: A Survey [46.03096493973206]
We present an overview of the major advances in pretrained language models (PLMs) for text generation.
We discuss how to adapt existing PLMs to model different input data and satisfy special properties in the generated text.
arXiv Detail & Related papers (2021-05-21T12:27:44Z)
- Natural Language Inference in Context -- Investigating Contextual Reasoning over Long Texts [19.894104911338353]
ConTRoL is a new dataset for ConTextual Reasoning over Long texts.
It consists of 8,325 expert-designed "context-hypothesis" pairs with gold labels.
It is derived from competitive selection and recruitment tests (verbal reasoning tests) used for police recruitment, with expert-level quality.
arXiv Detail & Related papers (2020-11-10T02:31:31Z)
- Deep Learning for Text Style Transfer: A Survey [71.8870854396927]
Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text.
We present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.
We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data.
arXiv Detail & Related papers (2020-11-01T04:04:43Z)
- Enabling Language Models to Fill in the Blanks [81.59381915581892]
We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document.
We train (or fine-tune) off-the-shelf language models on sequences containing the concatenation of artificially-masked text and the text which was masked.
We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics.
arXiv Detail & Related papers (2020-05-11T18:00:03Z)
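The training-sequence construction is straightforward to sketch: replace each masked span with a blank token, then append the answers after a separator so an ordinary left-to-right LM can be fine-tuned to generate them. The exact token strings below are assumptions in the spirit of the paper's scheme:

```python
# Illustrative sketch of the "infilling by language modeling" data format.
# The token strings "[blank]", "[sep]", "[answer]" are assumed placeholders.
def make_infilling_example(tokens, spans):
    """spans: non-overlapping (start, end) index pairs of tokens to mask."""
    masked, answers = [], []
    prev = 0
    for start, end in sorted(spans):
        masked += tokens[prev:start] + ["[blank]"]
        answers += tokens[start:end] + ["[answer]"]
        prev = end
    masked += tokens[prev:]
    return " ".join(masked + ["[sep]"] + answers)

print(make_infilling_example("she ate leftover pasta for lunch".split(), [(2, 4)]))
# she ate [blank] for lunch [sep] leftover pasta [answer]
```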
This list is automatically generated from the titles and abstracts of the papers listed on this site.