DREQ: Document Re-Ranking Using Entity-based Query Understanding
- URL: http://arxiv.org/abs/2401.05939v1
- Date: Thu, 11 Jan 2024 14:27:12 GMT
- Title: DREQ: Document Re-Ranking Using Entity-based Query Understanding
- Authors: Shubham Chatterjee, Iain Mackie, Jeff Dalton
- Abstract summary: DREQ is an entity-oriented dense document re-ranking model.
We emphasize the query-relevant entities within a document's representation while simultaneously attenuating the less relevant ones.
We show that DREQ outperforms state-of-the-art neural and non-neural re-ranking methods.
- Score: 6.675805308519988
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: While entity-oriented neural IR models have advanced significantly, they
often overlook a key nuance: the varying degrees of influence individual
entities within a document have on its overall relevance. Addressing this gap,
we present DREQ, an entity-oriented dense document re-ranking model. Uniquely,
we emphasize the query-relevant entities within a document's representation
while simultaneously attenuating the less relevant ones, thus obtaining a
query-specific entity-centric document representation. We then combine this
entity-centric document representation with the text-centric representation of
the document to obtain a "hybrid" representation of the document. We learn a
relevance score for the document using this hybrid representation. Using four
large-scale benchmarks, we show that DREQ outperforms state-of-the-art neural
and non-neural re-ranking methods, highlighting the effectiveness of our
entity-oriented representation approach.
Related papers
- Generative Retrieval Meets Multi-Graded Relevance [104.75244721442756]
We introduce a framework called GRaded Generative Retrieval (GR$2$)
GR$2$ focuses on two key components: ensuring relevant and distinct identifiers, and implementing multi-graded constrained contrastive training.
Experiments on datasets with both multi-graded and binary relevance demonstrate the effectiveness of GR$2$.
arXiv Detail & Related papers (2024-09-27T02:55:53Z) - Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z) - CAPSTONE: Curriculum Sampling for Dense Retrieval with Document
Expansion [68.19934563919192]
We propose a curriculum sampling strategy that utilizes pseudo queries during training and progressively enhances the relevance between the generated query and the real query.
Experimental results on both in-domain and out-of-domain datasets demonstrate that our approach outperforms previous dense retrieval models.
arXiv Detail & Related papers (2022-12-18T15:57:46Z) - Learning Diverse Document Representations with Deep Query Interactions
for Dense Retrieval [79.37614949970013]
We propose a new dense retrieval model which learns diverse document representations with deep query interactions.
Our model encodes each document with a set of generated pseudo-queries to get query-informed, multi-view document representations.
arXiv Detail & Related papers (2022-08-08T16:00:55Z) - Relation-Specific Attentions over Entity Mentions for Enhanced
Document-Level Relation Extraction [4.685620089585031]
We propose RSMAN which performs selective attentions over different entity mentions with respect to candidate relations.
Our experiments upon two benchmark datasets show that our RSMAN can bring significant improvements for some backbone models.
arXiv Detail & Related papers (2022-05-28T10:40:31Z) - Document-Level Relation Extraction with Sentences Importance Estimation
and Focusing [52.069206266557266]
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
We propose a Sentence Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss.
Experimental results on two domains show that our SIEF not only improves overall performance, but also makes DocRE models more robust.
arXiv Detail & Related papers (2022-04-27T03:20:07Z) - Augmenting Document Representations for Dense Retrieval with
Interpolation and Perturbation [49.940525611640346]
Document Augmentation for dense Retrieval (DAR) framework augments the representations of documents with their Dense Augmentation and perturbations.
We validate the performance of DAR on retrieval tasks with two benchmark datasets, showing that the proposed DAR significantly outperforms relevant baselines on the dense retrieval of both the labeled and unlabeled documents.
arXiv Detail & Related papers (2022-03-15T09:07:38Z) - CODER: An efficient framework for improving retrieval through
COntextualized Document Embedding Reranking [11.635294568328625]
We present a framework for improving the performance of a wide class of retrieval models at minimal computational cost.
It utilizes precomputed document representations extracted by a base dense retrieval method.
It incurs a negligible computational overhead on top of any first-stage method at run time, allowing it to be easily combined with any state-of-the-art dense retrieval method.
arXiv Detail & Related papers (2021-12-16T10:25:26Z) - Improving Query Representations for Dense Retrieval with Pseudo
Relevance Feedback [29.719150565643965]
This paper proposes ANCE-PRF, a new query encoder that uses pseudo relevance feedback (PRF) to improve query representations for dense retrieval.
ANCE-PRF uses a BERT encoder that consumes the query and the top retrieved documents from a dense retrieval model, ANCE, and it learns to produce better query embeddings directly from relevance labels.
Analysis shows that the PRF encoder effectively captures the relevant and complementary information from PRF documents, while ignoring the noise with its learned attention mechanism.
arXiv Detail & Related papers (2021-08-30T18:10:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.