Modeling Document Interactions for Learning to Rank with Regularized
Self-Attention
- URL: http://arxiv.org/abs/2005.03932v1
- Date: Fri, 8 May 2020 09:53:31 GMT
- Title: Modeling Document Interactions for Learning to Rank with Regularized
Self-Attention
- Authors: Shuo Sun, Kevin Duh
- Abstract summary: We explore modeling document interactions with self-attention-based neural networks.
We propose simple yet effective regularization terms designed to model interactions between documents.
We show that a self-attention network trained with our proposed regularization terms can significantly outperform existing learning-to-rank methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to rank is an important task that has been successfully deployed in
many real-world information retrieval systems. Most existing methods compute
relevance judgments of documents independently, without holistically
considering the entire set of competing documents. In this paper, we explore
modeling document interactions with self-attention-based neural networks.
Although self-attention networks have achieved state-of-the-art results in many
NLP tasks, we find empirically that self-attention provides little benefit over
a baseline neural learning-to-rank architecture. To improve the learning of
self-attention weights, we propose simple yet effective regularization terms
designed to model interactions between documents. Evaluations on publicly
available Learning to Rank (LETOR) datasets show that a self-attention network
trained with our proposed regularization terms can significantly outperform
existing learning-to-rank methods.
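To make the idea concrete, below is a minimal PyTorch sketch of scoring all candidate documents for a query jointly with self-attention, plus an attention-entropy penalty as a stand-in regularizer. Everything here is an illustrative assumption: the model, the penalty, and all dimensions are hypothetical, since the abstract does not specify the paper's actual architecture or regularization terms.

```python
import torch
import torch.nn as nn

class SelfAttentionRanker(nn.Module):
    """Hypothetical listwise ranker: each document attends to its competitors."""
    def __init__(self, feat_dim: int, hidden_dim: int = 64, n_heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden_dim)
        self.attn = nn.MultiheadAttention(hidden_dim, n_heads, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, docs):
        # docs: (n_queries, n_docs, feat_dim) -- all candidates for each query
        h = torch.relu(self.proj(docs))
        ctx, attn_w = self.attn(h, h, h)             # cross-document interactions
        return self.score(ctx).squeeze(-1), attn_w   # per-document scores

def attention_entropy_penalty(attn_w, eps=1e-9):
    # Stand-in regularizer (NOT the paper's term): minimizing the entropy of
    # the attention rows pushes each document to focus on a few competitors.
    return -(attn_w * (attn_w + eps).log()).sum(-1).mean()

model = SelfAttentionRanker(feat_dim=136)         # e.g. 136 LETOR features
docs = torch.randn(2, 10, 136)                    # 2 queries, 10 candidates each
labels = torch.randint(0, 2, (2, 10)).float()     # binary relevance labels
scores, attn_w = model(docs)
rank_loss = -(torch.log_softmax(scores, -1) * labels).sum(-1).mean()
loss = rank_loss + 0.1 * attention_entropy_penalty(attn_w)  # joint objective
loss.backward()
```

The design point the abstract argues for is the joint forward pass: because self-attention sees the whole candidate list at once, each relevance score can depend on the competing documents rather than being computed independently.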
Related papers
- A Probabilistic Model Behind Self-Supervised Learning (2024-02-02)
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
- EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification (2023-05-11)
We design a self-attention-based fusion module that serves as a block in our ensemble trainable network.
It allows the discriminative features of the image and text modalities to be learned simultaneously throughout the training stage.
This is the first work to leverage a mutual learning approach together with a self-attention-based fusion module for document image classification.
- VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification (2022-05-24)
Multimodal learning from document data has recently achieved great success, as it allows semantically meaningful features to be pre-trained as a prior for a learnable downstream task.
In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues.
The proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities.
- Few-shot Learning for Slot Tagging with Attentive Relational Network (2021-03-03)
Metric-based learning is a well-known family of methods for few-shot learning, especially in computer vision.
We propose a novel metric-based learning architecture, the Attentive Relational Network.
The results on SNIPS show that our proposed method outperforms other state-of-the-art metric-based learning methods.
- Self-Attention Meta-Learner for Continual Learning (2021-01-28)
Self-Attention Meta-Learner (SAM) learns prior knowledge for continual learning that permits learning a sequence of tasks.
SAM incorporates an attention mechanism that learns to select the particular relevant representation for each future task.
We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks in the task inference setting.
- Improving Few-Shot Learning with Auxiliary Self-Supervised Pretext Tasks (2021-01-24)
Recent work on few-shot learning shows that the quality of learned representations plays an important role in few-shot classification performance.
On the other hand, the goal of self-supervised learning is to recover useful semantic information of the data without the use of class labels.
We exploit the complementarity of both paradigms via a multi-task framework where we leverage recent self-supervised methods as auxiliary tasks.
- Few-Shot Named Entity Recognition: A Comprehensive Study (2020-12-29)
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks (2020-11-14)
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
- Concept Learners for Few-Shot Learning (2020-07-14)
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
- Relation-Guided Representation Learning (2020-07-11)
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
- Automated Relational Meta-learning (2020-01-03)
We propose an automated relational meta-learning framework that automatically extracts the cross-task relations and constructs the meta-knowledge graph.
We conduct extensive experiments on 2D toy regression and few-shot image classification and the results demonstrate the superiority of ARML over state-of-the-art baselines.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.