Brown University at TREC Deep Learning 2019
- URL: http://arxiv.org/abs/2009.04016v1
- Date: Tue, 8 Sep 2020 22:54:03 GMT
- Title: Brown University at TREC Deep Learning 2019
- Authors: George Zerveas, Ruochen Zhang, Leila Kim, Carsten Eickhoff
- Abstract summary: This paper describes Brown University's submission to the TREC 2019 Deep Learning track.
Brown's team ranked 3rd in the passage retrieval task (including full ranking and re-ranking), and 2nd when considering only re-ranking submissions.
- Score: 11.63256359906015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes Brown University's submission to the TREC 2019 Deep
Learning track. We followed a 2-phase method for producing a ranking of
passages for a given input query: In the first phase, the user's query is
expanded by appending 3 queries generated by a transformer model which was
trained to rephrase an input query into semantically similar queries. The
expanded query can exhibit greater similarity in surface form and vocabulary
overlap with the passages of interest and can therefore serve as enriched input
to any downstream information retrieval method. In the second phase, we use a
BERT-based model, pre-trained for language modeling but fine-tuned for
query-document relevance prediction, to compute relevance scores for a set of 1000
candidate passages per query and subsequently obtain a ranking of passages by
sorting them based on the predicted relevance scores. According to the results
published in the official Overview of the TREC Deep Learning Track 2019, our
team ranked 3rd in the passage retrieval task (including full ranking and
re-ranking), and 2nd when considering only re-ranking submissions.
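The abstract above describes a two-phase, expand-then-rerank pipeline. As a rough illustration only (not the authors' released code), the sketch below shows how such a pipeline could be wired up with the Hugging Face transformers library; the checkpoint names are placeholders for a seq2seq paraphrasing model and a BERT cross-encoder fine-tuned on query-passage relevance, and the single-logit scoring head is an assumption.

```python
import torch
from transformers import (AutoModelForSeq2SeqLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

# --- Phase 1: expand the query with 3 generated rephrasings ---
PARA_CKPT = "paraphrase-model"  # placeholder: seq2seq model trained to rephrase queries
para_tok = AutoTokenizer.from_pretrained(PARA_CKPT)
para_model = AutoModelForSeq2SeqLM.from_pretrained(PARA_CKPT)

def expand_query(query, n=3):
    inputs = para_tok(query, return_tensors="pt")
    outputs = para_model.generate(**inputs, num_beams=n,
                                  num_return_sequences=n, max_new_tokens=32)
    rephrased = [para_tok.decode(o, skip_special_tokens=True) for o in outputs]
    # The expanded query is the original query with the generated queries appended.
    return " ".join([query] + rephrased)

# --- Phase 2: score the candidate passages and sort by predicted relevance ---
RANK_CKPT = "relevance-model"  # placeholder: BERT cross-encoder for query-passage relevance
rank_tok = AutoTokenizer.from_pretrained(RANK_CKPT)
rank_model = AutoModelForSequenceClassification.from_pretrained(RANK_CKPT)

def rerank(expanded_query, passages):
    scores = []
    with torch.no_grad():
        for passage in passages:
            enc = rank_tok(expanded_query, passage, truncation=True,
                           max_length=512, return_tensors="pt")
            # Assumes the fine-tuned head emits a single relevance logit.
            scores.append(rank_model(**enc).logits[0, -1].item())
    return sorted(zip(passages, scores), key=lambda p: p[1], reverse=True)

# Usage: ranked = rerank(expand_query(user_query), candidate_passages)
```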
Related papers
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval [54.54576644403115]
Many complex real-world queries require in-depth reasoning to identify relevant documents.
We introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents.
Our dataset consists of 1,384 real-world queries spanning diverse domains, such as economics, psychology, mathematics, and coding.
arXiv Detail & Related papers (2024-07-16T17:58:27Z)
- Mixed-initiative Query Rewriting in Conversational Passage Retrieval [11.644235288057123]
We report our methods and experiments for the TREC Conversational Assistance Track (CAsT) 2022.
We propose a mixed-initiative query rewriting module, which achieves query rewriting based on the mixed-initiative interaction between the users and the system.
Experiments on both TREC CAsT 2021 and TREC CAsT 2022 datasets show the effectiveness of our mixed-initiative-based query rewriting (or query reformulation) method.
arXiv Detail & Related papers (2023-07-17T19:38:40Z)
- Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
Our lexical matching based approach achieves a similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR.
For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
arXiv Detail & Related papers (2022-10-13T15:18:04Z)
- Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval [6.490148466525755]
This paper describes the participation of UvA.ILPS group at the TREC CAsT 2020 track.
Our pipeline consists of (i) an initial retrieval module that uses BM25, and (ii) a re-ranking module that combines the score of a BERT ranking model with the score of a machine comprehension model adjusted for passage retrieval.
arXiv Detail & Related papers (2021-02-17T14:41:57Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Pretrained Transformers for Text Ranking: BERT and Beyond [53.83210899683987]
This survey provides an overview of text ranking with neural network architectures known as transformers.
The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing.
arXiv Detail & Related papers (2020-10-13T15:20:32Z)
- IR-BERT: Leveraging BERT for Semantic Search in Background Linking for News Articles [2.707154152696381]
This work describes our two approaches for the background linking task of TREC 2020 News Track.
The main objective of this task is to recommend a list of relevant articles that the reader should refer to in order to understand the context.
We empirically show that employing a language model benefits our approach in understanding the context as well as the background of the query article.
arXiv Detail & Related papers (2020-07-24T16:02:14Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)
- Transformer Based Language Models for Similar Text Retrieval and Ranking [0.0]
We introduce novel approaches for effectively applying neural transformer models to similar text retrieval and ranking.
By eliminating the bag-of-words-based step, our approach is able to accurately retrieve and rank results even when they have no non-stopwords in common with the query.
arXiv Detail & Related papers (2020-05-10T06:12:53Z)
- Pre-training Tasks for Embedding-based Large-scale Retrieval [68.01167604281578]
We consider the large-scale query-document retrieval problem.
Given a query (e.g., a question), return the set of relevant documents from a large document corpus.
We show that the key ingredient of learning a strong embedding-based Transformer model is the set of pre-training tasks.
arXiv Detail & Related papers (2020-02-10T16:44:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.