Fine-Grained Relevance Annotations for Multi-Task Document Ranking and
Question Answering
- URL: http://arxiv.org/abs/2008.05363v1
- Date: Wed, 12 Aug 2020 14:59:50 GMT
- Title: Fine-Grained Relevance Annotations for Multi-Task Document Ranking and
Question Answering
- Authors: Sebastian Hofstätter, Markus Zlabinger, Mete Sertkan, Michael Schröder, Allan Hanbury
- Abstract summary: We present FiRA: a novel dataset of Fine-Grained Relevance Annotations.
We extend the ranked retrieval annotations of the Deep Learning track of TREC 2019 with passage and word level graded relevance annotations for all relevant documents.
As an example, we evaluate the recently introduced TKL document ranking model. We find that although TKL exhibits state-of-the-art retrieval results for long documents, it misses many relevant passages.
- Score: 9.480648914353035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There are many existing retrieval and question answering datasets. However,
most of them either focus on ranked list evaluation or single-candidate
question answering. This divide makes it challenging to properly evaluate
approaches concerned with ranking documents and providing snippets or answers
for a given query. In this work, we present FiRA: a novel dataset of
Fine-Grained Relevance Annotations. We extend the ranked retrieval annotations
of the Deep Learning track of TREC 2019 with passage and word level graded
relevance annotations for all relevant documents. We use our newly created data
to study the distribution of relevance in long documents, as well as the
attention of annotators to specific positions of the text. As an example, we
evaluate the recently introduced TKL document ranking model. We find that
although TKL exhibits state-of-the-art retrieval results for long documents, it
misses many relevant passages.
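To make the kind of fine-grained, graded relevance described above concrete, here is a minimal Python sketch (not the authors' released tooling) of how passage-level grades for a document might be aggregated into a document-level label and used to measure how many annotated relevant passages a ranker actually surfaces. The schema, grade scale, and function names are illustrative assumptions, not the actual FiRA format.

```python
# Hypothetical passage-level annotations: (query_id, doc_id, passage_id) -> grade.
# A 0-3 grade scale is assumed for illustration; the released FiRA data may differ.
passage_grades = {
    ("q1", "docA", 0): 0,
    ("q1", "docA", 3): 3,
    ("q1", "docA", 7): 2,
    ("q1", "docB", 1): 1,
}

def document_grade(grades, query_id, doc_id):
    """Aggregate passage grades to a document grade by taking the maximum (one common convention)."""
    doc_grades = [g for (q, d, _), g in grades.items() if q == query_id and d == doc_id]
    return max(doc_grades, default=0)

def relevant_passage_recall(grades, query_id, doc_id, selected_passages, min_grade=2):
    """Fraction of annotated relevant passages (grade >= min_grade) covered by a model's selected passages."""
    relevant = {p for (q, d, p), g in grades.items()
                if q == query_id and d == doc_id and g >= min_grade}
    if not relevant:
        return None  # nothing annotated as relevant for this document
    return len(relevant & set(selected_passages)) / len(relevant)

print(document_grade(passage_grades, "q1", "docA"))                # 3
print(relevant_passage_recall(passage_grades, "q1", "docA", [3]))  # 0.5: half the relevant passages are missed
```

A coverage measure along these lines is one way to express the paper's observation that a model such as TKL can rank a long document highly while still missing many of its relevant passages.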
Related papers
- Evaluating D-MERIT of Partial-annotation on Information Retrieval [77.44452769932676]
Retrieval models are often evaluated on partially-annotated datasets.
We show that using partially-annotated datasets in evaluation can paint a distorted picture.
arXiv Detail & Related papers (2024-06-23T08:24:08Z)
- Attention Sorting Combats Recency Bias In Long Context Language Models [69.06809365227504]
Current language models often fail to incorporate long contexts efficiently during generation.
We show that attention priors, likely learned during pre-training, are a major contributor to this issue.
We leverage this fact to introduce "attention sorting": perform one step of decoding, sort the documents by the attention they receive, repeat the process, and generate the answer with the newly sorted context (a minimal sketch of this loop appears after the related-papers list).
arXiv Detail & Related papers (2023-09-28T05:19:06Z)
- PDFTriage: Question Answering over Long, Structured Documents [60.96667912964659]
Representing structured documents as plain text is incongruous with the user's mental model of these documents with rich structure.
We propose PDFTriage that enables models to retrieve the context based on either structure or content.
Our benchmark dataset consists of 900+ human-generated questions over 80 structured documents.
arXiv Detail & Related papers (2023-09-16T04:29:05Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of state-of-the-art (SoTA) passage retrievers, we find that the majority of errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval [42.73076855699184]
Multi-document summarization (MDS) assumes a set of topic-related documents are provided as input.
We study this more challenging setting by formalizing the task and bootstrapping it using existing datasets, retrievers and summarizers.
arXiv Detail & Related papers (2022-12-20T18:41:38Z)
- Cross-document Event Coreference Search: Task, Dataset and Modeling [26.36068336169796]
We propose an appealing, and often more applicable, complementary setup for the task: Cross-document Coreference Search.
To support research on this task, we create a corresponding dataset, which is derived from Wikipedia.
We present a novel model that integrates a powerful coreference scoring scheme into the DPR architecture, yielding improved performance.
arXiv Detail & Related papers (2022-10-23T08:21:25Z)
- D2S: Document-to-Slide Generation Via Query-Based Text Summarization [27.576875048631265]
We contribute a new dataset, SciDuet, consisting of pairs of papers and their corresponding slide decks from recent years' NLP and ML conferences.
We also present D2S, a novel system that tackles the document-to-slides task with a two-step approach.
Our evaluation suggests that long-form QA outperforms state-of-the-art summarization baselines on both automated ROUGE metrics and qualitative human evaluation.
arXiv Detail & Related papers (2021-05-08T10:29:41Z)
- CSFCube -- A Test Collection of Computer Science Research Articles for Faceted Query by Example [43.01717754418893]
We introduce the task of faceted Query by Example.
Users can also specify a finer grained aspect in addition to the input query document.
We envision models which are able to retrieve scientific papers analogous to a query scientific paper.
arXiv Detail & Related papers (2021-03-24T01:02:12Z)
- Knowledge-Aided Open-Domain Question Answering [58.712857964048446]
We propose a knowledge-aided open-domain QA (KAQA) method that aims to improve relevant document retrieval and answer reranking.
During document retrieval, a candidate document is scored by considering its relationship to the question and other documents.
During answer reranking, a candidate answer is reranked using not only its own context but also the clues from other documents.
arXiv Detail & Related papers (2020-06-09T13:28:57Z)
- Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers.
This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)
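The attention-sorting entry above describes an iterative procedure in a single sentence; the following is a minimal, hedged Python sketch of that loop. The `generate_step` callable is an assumed interface (one decoding step returning per-document attention mass), not an API of any particular library or of the cited paper's code.

```python
from typing import Callable, List, Tuple

def attention_sort(
    question: str,
    documents: List[str],
    generate_step: Callable[[str, List[str]], Tuple[str, List[float]]],
    rounds: int = 2,
) -> str:
    """Illustrative loop for the attention-sorting idea described above (assumed interface).

    generate_step(question, docs) is assumed to run one decoding step and return
    (partial_answer, attention_per_document), where attention_per_document[i] is the
    attention mass the model placed on docs[i] during that step.
    """
    docs = list(documents)
    for _ in range(rounds):
        _, attention = generate_step(question, docs)
        # Put the most-attended documents last, where recency-biased models
        # are most likely to actually use them during generation.
        docs = [d for _, d in sorted(zip(attention, docs), key=lambda pair: pair[0])]
    final_answer, _ = generate_step(question, docs)
    return final_answer
```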
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.