Sparse, Dense, and Attentional Representations for Text Retrieval
- URL: http://arxiv.org/abs/2005.00181v3
- Date: Tue, 16 Feb 2021 23:18:25 GMT
- Title: Sparse, Dense, and Attentional Representations for Text Retrieval
- Authors: Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins
- Abstract summary: Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors.
We investigate the capacity of this architecture relative to sparse bag-of-words models and attentional neural networks.
We propose a simple neural model that combines the efficiency of dual encoders with some of the expressiveness of more costly attentional architectures.
- Score: 25.670835450331943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dual encoders perform retrieval by encoding documents and queries into dense
low-dimensional vectors, scoring each document by its inner product with the
query. We investigate the capacity of this architecture relative to sparse
bag-of-words models and attentional neural networks. Using both theoretical and
empirical analysis, we establish connections between the encoding dimension,
the margin between gold and lower-ranked documents, and the document length,
suggesting limitations in the capacity of fixed-length encodings to support
precise retrieval of long documents. Building on these insights, we propose a
simple neural model that combines the efficiency of dual encoders with some of
the expressiveness of more costly attentional architectures, and explore
sparse-dense hybrids to capitalize on the precision of sparse retrieval. These
models outperform strong alternatives in large-scale retrieval.
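To make the scoring functions concrete, here is a minimal NumPy sketch of the three scores the abstract contrasts: the single-vector dual-encoder inner product, a multi-vector variant in the spirit of the proposed model (scored here as the maximum inner product over m document vectors, an assumed form), and a sparse-dense hybrid with an illustrative interpolation weight lam. The random arrays stand in for learned encoders and a sparse retriever; none of the values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_docs, m = 128, 1000, 8   # encoding dim, corpus size, vectors per doc

# Stand-ins for learned encodings (a real system would use e.g. BERT towers).
query = rng.normal(size=d)
doc_vecs = rng.normal(size=(num_docs, d))        # one vector per document
doc_multi = rng.normal(size=(num_docs, m, d))    # m vectors per document
sparse_scores = rng.random(num_docs)             # stand-in for BM25 / tf-idf overlap

# 1) Dual encoder: score each document by its inner product with the query.
dense_scores = doc_vecs @ query                  # shape (num_docs,)

# 2) Multi-vector variant: keep m vectors per document and take the best
#    match, which stays compatible with max-inner-product search indexes.
multi_scores = (doc_multi @ query).max(axis=1)   # shape (num_docs,)

# 3) Sparse-dense hybrid: interpolate sparse and dense scores (lam is a
#    hypothetical tuning knob, not a value from the paper).
lam = 0.5
hybrid_scores = lam * sparse_scores + (1 - lam) * dense_scores

for name, s in [("dense", dense_scores), ("multi", multi_scores),
                ("hybrid", hybrid_scores)]:
    print(name, "top-5 docs:", np.argsort(-s)[:5])
```

Because all three scores reduce to (max) inner products, each remains usable with standard approximate nearest-neighbor indexes at retrieval time.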
Related papers
- Summarizing long regulatory documents with a multi-step pipeline [2.2591852560804675]
We show that the effectiveness of a two-step architecture for summarizing long regulatory texts varies depending on the model used.
For abstractive encoder-decoder models with short context lengths, the effectiveness of an extractive step varies, whereas for long-context encoder-decoder models, the extractive step worsens their performance.
arXiv Detail & Related papers (2024-08-19T08:07:25Z) - Sequence Shortening for Context-Aware Machine Translation [5.803309695504831]
We show that a special case of the multi-encoder architecture achieves higher accuracy on contrastive datasets.
We introduce two novel methods, Latent Grouping and Latent Selecting, where the network learns to group tokens or to select the tokens to be cached as context.
arXiv Detail & Related papers (2024-02-02T13:55:37Z) - SparseCoder: Identifier-Aware Sparse Transformer for File-Level Code
Summarization [51.67317895094664]
This paper studies file-level code summarization, which can assist programmers in understanding and maintaining large source code projects.
We propose SparseCoder, an identifier-aware sparse transformer for effectively handling long code sequences.
arXiv Detail & Related papers (2024-01-26T09:23:27Z) - Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization [76.57699934689468]
We propose a fine-grained Token-level retrieval-augmented mechanism (Tram) on the decoder side to enhance the performance of neural models.
To overcome the challenge of token-level retrieval in capturing contextual code semantics, we also propose integrating code semantics into individual summary tokens.
arXiv Detail & Related papers (2023-05-18T16:02:04Z) - Learning Diverse Document Representations with Deep Query Interactions
for Dense Retrieval [79.37614949970013]
We propose a new dense retrieval model which learns diverse document representations with deep query interactions.
Our model encodes each document with a set of generated pseudo-queries to get query-informed, multi-view document representations.
arXiv Detail & Related papers (2022-08-08T16:00:55Z) - Autoregressive Search Engines: Generating Substrings as Document
Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that does not force any structure on the search space: using all n-grams in a passage as its possible identifiers (see the sketch after this list).
arXiv Detail & Related papers (2022-04-22T10:45:01Z) - Efficient Cross-Modal Retrieval via Deep Binary Hashing and Quantization [5.799838997511804]
Cross-modal retrieval aims to search for data with similar semantic meanings across different content modalities.
We propose a jointly learned deep hashing and quantization network (HQ) for cross-modal retrieval.
Experimental results on the NUS-WIDE, MIR-Flickr, and Amazon datasets demonstrate that HQ achieves boosts of more than 7% in precision.
arXiv Detail & Related papers (2022-02-15T22:00:04Z) - End-to-End Information Extraction by Character-Level Embedding and
Multi-Stage Attentional U-Net [0.9137554315375922]
We propose a novel deep learning architecture for end-to-end information extraction on the 2D character-grid embedding of the document.
We show that our model outperforms the baseline U-Net architecture by a large margin while using 40% fewer parameters.
arXiv Detail & Related papers (2021-06-02T05:42:51Z) - Rethinking Text Line Recognition Models [57.47147190119394]
We consider two decoder families (Connectionist Temporal Classification and Transformer) and three encoder modules (Bidirectional LSTMs, Self-Attention, and GRCLs).
We compare their accuracy and performance on widely used public datasets of scene and handwritten text.
Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length.
arXiv Detail & Related papers (2021-04-15T21:43:13Z) - A Holistically-Guided Decoder for Deep Representation Learning with
Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantically rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.