Overview of the TREC 2020 deep learning track
- URL: http://arxiv.org/abs/2102.07662v1
- Date: Mon, 15 Feb 2021 16:47:00 GMT
- Title: Overview of the TREC 2020 deep learning track
- Authors: Nick Craswell, Bhaskar Mitra, Emine Yilmaz and Daniel Campos
- Abstract summary: This year we have a document retrieval task and a passage retrieval task, each with hundreds of thousands of human-labeled training queries.
We evaluate using single-shot TREC-style evaluation, to give us a picture of which ranking methods work best when large data is available.
This year we have further evidence that rankers with BERT-style pretraining outperform other rankers in the large data regime.
- Score: 30.531644711518414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This is the second year of the TREC Deep Learning Track, with the goal of
studying ad hoc ranking in the large training data regime. We again have a
document retrieval task and a passage retrieval task, each with hundreds of
thousands of human-labeled training queries. We evaluate using single-shot
TREC-style evaluation, to give us a picture of which ranking methods work best
when large data is available, with much more comprehensive relevance labeling
on the small number of test queries. This year we have further evidence that
rankers with BERT-style pretraining outperform other rankers in the large data
regime.
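As a rough, self-contained illustration of the single-shot, TREC-style evaluation described above, the sketch below computes NDCG@10 for one query from graded relevance labels. This is a hedged sketch, not the track's own tooling; the qrels and run values are hypothetical.

```python
# Minimal sketch of TREC-style offline evaluation: NDCG@10 for a single query.
# The qrels (graded labels) and run (system ranking) below are toy data.
import math

def ndcg_at_k(run, qrels, k=10):
    """run: doc ids ranked by the system; qrels: doc id -> graded relevance."""
    gains = [qrels.get(doc, 0) for doc in run[:k]]
    dcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))
    ideal = sorted(qrels.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

qrels = {"D1": 3, "D2": 0, "D3": 2}   # human relevance judgments for the query
run = ["D3", "D1", "D2", "D4"]        # documents in system-ranked order
print(f"NDCG@10 = {ndcg_at_k(run, qrels):.4f}")
```

In the track itself, such per-query scores would be averaged over the full test set, typically with standard tools such as trec_eval.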
Related papers
- Overview of the TREC 2023 Product Product Search Track [70.56592126043546]
This is the first year of the TREC Product search track.
The focus was the creation of a reusable collection.
We leverage the new product search corpus, which includes contextual metadata.
arXiv Detail & Related papers (2023-11-14T02:25:18Z)
- Hybrid Retrieval and Multi-stage Text Ranking Solution at TREC 2022 Deep Learning Track [22.81602641419962]
We explain the hybrid text retrieval and multi-stage text ranking method adopted in our solution.
In the ranking stage, in addition to the full interaction-based ranking model built on a large pre-trained language model, we also propose a lightweight sub-ranking module.
Our models achieve the 1st and 4th rank on the passage ranking and document ranking test sets, respectively.
arXiv Detail & Related papers (2023-08-23T09:56:59Z)
- Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z)
- PASH at TREC 2021 Deep Learning Track: Generative Enhanced Model for Multi-stage Ranking [20.260222175405215]
This paper describes the PASH participation in TREC 2021 Deep Learning Track.
In the recall stage, we adopt a scheme combining sparse and dense retrieval methods (a toy score-fusion sketch in this style appears after this list).
In the multi-stage ranking phase, point-wise and pair-wise ranking strategies are used.
arXiv Detail & Related papers (2022-05-18T04:38:15Z)
- LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval [55.097573036580066]
Experimental results show that LaPraDoR achieves state-of-the-art performance compared with supervised dense retrieval models.
Compared to re-ranking, our lexicon-enhanced approach can be run in milliseconds (22.5x faster) while achieving superior performance.
arXiv Detail & Related papers (2022-03-11T18:53:12Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
- Self-training Improves Pre-training for Natural Language Understanding [63.78927366363178]
We study self-training as another way to leverage unlabeled data through semi-supervised learning.
We introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data.
Our approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks.
arXiv Detail & Related papers (2020-10-05T17:52:25Z)
- Overview of the TREC 2019 deep learning track [36.23357487158591]
The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc ranking in a large data regime.
It is the first track with large human-labeled training sets, introducing two sets corresponding to two tasks.
This year 15 groups submitted a total of 75 runs, using various combinations of deep learning, transfer learning and traditional IR ranking methods.
arXiv Detail & Related papers (2020-03-17T17:12:36Z)
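Several of the entries above (the hybrid TREC 2022 solution and the PASH recall stage, as noted) combine sparse and dense retrieval. The sketch below shows one common way to fuse the two candidate lists, min-max normalization followed by a weighted sum; the scores, weights, and document ids are hypothetical, and the papers' actual fusion schemes may differ.

```python
# Hedged sketch: fuse sparse (e.g. BM25) and dense (embedding-based) retrieval
# scores with min-max normalization and a weighted sum. Toy data only.
def minmax(scores):
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_fuse(sparse, dense, alpha=0.5):
    """alpha weights the sparse score; (1 - alpha) weights the dense score."""
    sparse_n, dense_n = minmax(sparse), minmax(dense)
    docs = set(sparse_n) | set(dense_n)
    fused = {d: alpha * sparse_n.get(d, 0.0) + (1 - alpha) * dense_n.get(d, 0.0)
             for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

sparse_scores = {"D1": 12.3, "D2": 9.8, "D3": 4.1}    # e.g. BM25 scores
dense_scores = {"D2": 0.82, "D3": 0.77, "D4": 0.55}   # e.g. cosine similarities
print(hybrid_fuse(sparse_scores, dense_scores, alpha=0.4))
```

The fused candidate list would then feed the later ranking stages (point-wise, pair-wise, or full cross-encoder models) described in those entries.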
This list is automatically generated from the titles and abstracts of the papers on this site.