Challenging Decoder helps in Masked Auto-Encoder Pre-training for Dense
Passage Retrieval
- URL: http://arxiv.org/abs/2305.13197v1
- Date: Mon, 22 May 2023 16:27:10 GMT
- Title: Challenging Decoder helps in Masked Auto-Encoder Pre-training for Dense
Passage Retrieval
- Authors: Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie
- Abstract summary: The masked auto-encoder (MAE) pre-training architecture has emerged as the most promising approach.
We propose a novel token importance-aware masking strategy based on pointwise mutual information to intensify the challenge of the decoder.
- Score: 10.905033385938982
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, various studies have been directed towards exploring dense passage
retrieval techniques employing pre-trained language models, among which the
masked auto-encoder (MAE) pre-training architecture has emerged as the most
promising. The conventional MAE framework relies on the decoder's passage
reconstruction to bolster the encoder's text representation ability, thereby
enhancing the performance of the resulting dense retrieval systems. Since the
encoder's representation ability is built through the decoder's passage
reconstruction, it is reasonable to postulate that a ``more demanding'' decoder
will necessitate a corresponding increase in the encoder's ability. To this
end, we propose a novel token importance-aware masking strategy based on
pointwise mutual information to intensify the challenge of the decoder.
Importantly, our approach can be implemented in an unsupervised manner, without
adding extra cost to the pre-training phase. Our experiments verify that the proposed method is both
effective and robust on large-scale supervised passage retrieval datasets and
out-of-domain zero-shot retrieval benchmarks.
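As a rough illustration of the idea (not the paper's exact formulation), a token's importance can be scored by the pointwise mutual information between the token and its passage, pmi(t, p) = log(P(t | p) / P(t)), and the highest-scoring tokens masked first so the decoder must reconstruct the most informative content. All function names below are hypothetical:

```python
import math
from collections import Counter

def token_importance(passage, corpus_freq, corpus_total):
    # PMI between a token and its passage: log(P(t | passage) / P(t)).
    # Tokens frequent in the passage but rare in the corpus score high.
    counts = Counter(passage)
    n = len(passage)
    scores = {}
    for tok, c in counts.items():
        p_t_given_p = c / n
        p_t = corpus_freq.get(tok, 1) / corpus_total  # smooth unseen tokens
        scores[tok] = math.log(p_t_given_p / p_t)
    return scores

def importance_aware_mask(passage, scores, mask_ratio=0.5):
    # Mask the highest-PMI positions first, making reconstruction harder.
    k = int(len(passage) * mask_ratio)
    ranked = sorted(range(len(passage)),
                    key=lambda i: scores[passage[i]], reverse=True)
    masked = set(ranked[:k])
    return ["[MASK]" if i in masked else tok
            for i, tok in enumerate(passage)]
```

Because the scores come only from corpus statistics, the whole procedure stays unsupervised, matching the abstract's claim of no extra pre-training cost.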
Related papers
- $ε$-VAE: Denoising as Visual Decoding [61.29255979767292]
In generative modeling, tokenization simplifies complex data into compact, structured representations, creating a more efficient, learnable space.
Current visual tokenization methods rely on a traditional autoencoder framework, where the encoder compresses data into latent representations, and the decoder reconstructs the original input.
We propose denoising as decoding, shifting from single-step reconstruction to iterative refinement. Specifically, we replace the decoder with a diffusion process that iteratively refines noise to recover the original image, guided by the latents provided by the encoder.
We evaluate our approach by assessing both reconstruction (rFID) and generation quality.
arXiv Detail & Related papers (2024-10-05T08:27:53Z)
- Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval [26.00149743478937]
Masked auto-encoder pre-training has emerged as a prevalent technique for initializing and enhancing dense retrieval systems.
We propose a modification to the traditional MAE by replacing the decoder of a masked auto-encoder with a completely simplified Bag-of-Word prediction task.
Our proposed method achieves state-of-the-art retrieval performance on several large-scale retrieval benchmarks without requiring any additional parameters.
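A minimal sketch of such a bag-of-words objective, assuming a single linear head over the encoder's [CLS] vector (the names and shapes here are illustrative, not the paper's implementation):

```python
import math

def bow_target(passage_ids, vocab_size):
    # Multi-hot target: 1.0 for every vocabulary id present in the passage.
    target = [0.0] * vocab_size
    for i in passage_ids:
        target[i] = 1.0
    return target

def bow_prediction_loss(cls_vec, weight, target):
    # A single linear layer maps the encoder's [CLS] vector to vocabulary
    # logits; binary cross-entropy against the multi-hot target stands in
    # for the full auto-regressive decoder loss.
    logits = [sum(w * x for w, x in zip(row, cls_vec)) for row in weight]
    loss = 0.0
    for z, t in zip(logits, target):
        p = 1.0 / (1.0 + math.exp(-z))
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp for numerical safety
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss / len(target)
```

Since the head predicts only which tokens occur, not their order, it adds no decoder parameters beyond the projection, consistent with the blurb's "no additional parameters" claim.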
arXiv Detail & Related papers (2024-01-20T15:02:33Z)
- Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization [76.57699934689468]
We propose a fine-grained Token-level retrieval-augmented mechanism (Tram) on the decoder side to enhance the performance of neural models.
To overcome the challenge of token-level retrieval in capturing contextual code semantics, we also propose integrating code semantics into individual summary tokens.
arXiv Detail & Related papers (2023-05-18T16:02:04Z)
- ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z)
- Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
- Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation [56.44853893149365]
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers.
We propose a new architecture based on a decoder which uses a set of shallow networks for capturing more information content.
In order to further improve the architecture we introduce a weight function which aims to re-balance classes to increase the attention of the networks to under-represented objects.
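One common way to realise such a re-balancing weight function (a generic sketch, not necessarily the paper's exact scheme) is to weight each class by the inverse of its pixel frequency, normalised so the weights average to one:

```python
def class_rebalance_weights(pixel_counts):
    # Weight each class by the inverse of its pixel frequency, then
    # normalise so the weights average to 1; under-represented classes
    # thus contribute more to the training loss.
    total = sum(pixel_counts.values())
    inv = {cls: total / n for cls, n in pixel_counts.items()}
    mean_inv = sum(inv.values()) / len(inv)
    return {cls: v / mean_inv for cls, v in inv.items()}
```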
arXiv Detail & Related papers (2020-07-19T18:44:34Z)
- Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding [59.48857453699463]
In sequence-to-sequence learning, the decoder relies on the attention mechanism to efficiently extract information from the encoder.
Recent work has proposed to use representations from different encoder layers for diversified levels of information.
We propose layer-wise multi-view decoding, where for each decoder layer, together with the representations from the last encoder layer, which serve as a global view, those from other encoder layers are supplemented for a stereoscopic view of the source sequences.
arXiv Detail & Related papers (2020-05-16T20:00:39Z)
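The multi-view combination described above can be sketched as a weighted mix of encoder-layer representations fed to each decoder layer; this toy version uses fixed weights where a real model would learn them, and the function name is hypothetical:

```python
def multi_view_context(encoder_layers, layer_weights):
    # The last encoder layer serves as the global view; earlier layers are
    # mixed in with per-layer weights to give the decoder a "stereoscopic"
    # view of the source sequence. encoder_layers: list of layers, each a
    # list of token vectors.
    seq_len = len(encoder_layers[0])
    dim = len(encoder_layers[0][0])
    combined = [[0.0] * dim for _ in range(seq_len)]
    for layer, w in zip(encoder_layers, layer_weights):
        for i, vec in enumerate(layer):
            for j, x in enumerate(vec):
                combined[i][j] += w * x
    return combined
```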
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.