Rationalizing Text Matching: Learning Sparse Alignments via Optimal
Transport
- URL: http://arxiv.org/abs/2005.13111v1
- Date: Wed, 27 May 2020 01:20:49 GMT
- Title: Rationalizing Text Matching: Learning Sparse Alignments via Optimal
Transport
- Authors: Kyle Swanson, Lili Yu, Tao Lei
- Abstract summary: In this work, we extend this selective rationalization approach to text matching.
The goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction.
Our approach employs optimal transport (OT) to find a minimal cost alignment between the inputs.
- Score: 14.86310501896212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Selecting input features of top relevance has become a popular method for
building self-explaining models. In this work, we extend this selective
rationalization approach to text matching, where the goal is to jointly select
and align text pieces, such as tokens or sentences, as a justification for the
downstream prediction. Our approach employs optimal transport (OT) to find a
minimal cost alignment between the inputs. However, directly applying OT often
produces dense and therefore uninterpretable alignments. To overcome this
limitation, we introduce novel constrained variants of the OT problem that
result in highly sparse alignments with controllable sparsity. Our model is
end-to-end differentiable using the Sinkhorn algorithm for OT and can be
trained without any alignment annotations. We evaluate our model on the
StackExchange, MultiNews, e-SNLI, and MultiRC datasets. Compared to strong
attention baselines, our model achieves very sparse rationale selections with
high fidelity while preserving prediction accuracy.
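The abstract names the two technical ingredients: an OT alignment between the two inputs and the Sinkhorn algorithm that makes it differentiable. Below is a minimal NumPy sketch of vanilla entropy-regularized Sinkhorn for intuition only; the cosine cost, uniform marginals, and hyperparameter values are illustrative assumptions, and the paper's constrained variants that enforce sparsity are not reproduced here.

```python
import numpy as np

def sinkhorn_alignment(x, y, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT plan between two sets of text-piece embeddings.

    x: (n, d) array, y: (m, d) array. Returns an (n, m) transport plan;
    larger entries indicate stronger alignment between piece i and piece j.
    """
    # Cost matrix: cosine distance is one common, illustrative choice.
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    yn = y / np.linalg.norm(y, axis=1, keepdims=True)
    cost = 1.0 - xn @ yn.T

    # Uniform marginals: every text piece carries equal mass.
    a = np.full(x.shape[0], 1.0 / x.shape[0])
    b = np.full(y.shape[0], 1.0 / y.shape[0])

    # Sinkhorn iterations on the Gibbs kernel K = exp(-C / epsilon).
    K = np.exp(-cost / epsilon)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)

    # Optimal plan P = diag(u) @ K @ diag(v).
    return u[:, None] * K * v[None, :]
```

Because every entry of the Gibbs kernel is strictly positive, this vanilla plan is dense, which is precisely the interpretability problem the abstract describes; the paper's constrained OT variants instead yield highly sparse plans with controllable sparsity while remaining end-to-end differentiable.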
Related papers
- Margin-aware Preference Optimization for Aligning Diffusion Models without Reference [19.397326645617422]
This paper focuses on the alignment of recent text-to-image diffusion models, such as Stable Diffusion XL (SDXL).
We propose margin-aware preference optimization (MaPO), a novel and memory-friendly preference alignment method for diffusion models that does not depend on any reference model.
MaPO jointly maximizes the likelihood margin between the preferred and dispreferred image sets and the likelihood of the preferred sets, simultaneously learning general stylistic features and preferences.
arXiv Detail & Related papers (2024-06-10T16:14:45Z) - Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback [70.32795295142648]
Linear alignment is a novel algorithm that aligns language models with human preferences in a single inference step.
Experiments on both general and personalized preference datasets demonstrate that linear alignment significantly enhances the performance and efficiency of LLM alignment.
arXiv Detail & Related papers (2024-01-21T10:46:23Z) - Sequential Recommendation via Stochastic Self-Attention [68.52192964559829]
Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) mechanism to overcome the limitations of this deterministic formulation.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences (a minimal sketch of the underlying Gaussian Wasserstein distance appears after this list).
arXiv Detail & Related papers (2022-01-16T12:38:45Z) - Using Optimal Transport as Alignment Objective for fine-tuning
Multilingual Contextualized Embeddings [7.026476782041066]
We propose using Optimal Transport (OT) as an alignment objective during fine-tuning to improve multilingual contextualized representations.
This approach does not require word-alignment pairs prior to fine-tuning and instead learns the word alignments within context in an unsupervised manner.
arXiv Detail & Related papers (2021-10-06T16:13:45Z) - Controlled Text Generation as Continuous Optimization with Multiple
Constraints [23.71027518888138]
We propose a flexible and modular algorithm for controllable inference from pretrained models.
We make use of Lagrange multipliers and gradient-descent-based techniques to generate the desired text.
We evaluate our approach on controllable machine translation and style transfer with multiple sentence-level attributes.
arXiv Detail & Related papers (2021-08-04T05:25:20Z) - Structured Reordering for Modeling Latent Alignments in Sequence
Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm that performs exact marginal inference over separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z) - Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z) - Logic Constrained Pointer Networks for Interpretable Textual Similarity [11.142649867439406]
We introduce a novel pointer-network-based model with a sentinel gating function to align constituent chunks.
We improve this base model with a loss function that equally penalizes misalignments in both sentences, ensuring the alignments are bidirectional.
The model achieves F1 scores of 97.73 and 96.32 on the benchmark SemEval datasets for the chunk alignment task.
arXiv Detail & Related papers (2020-07-15T13:01:44Z) - Graph Optimal Transport for Cross-Domain Alignment [121.80313648519203]
Cross-domain alignment is fundamental to computer vision and natural language processing.
We propose Graph Optimal Transport (GOT), a principled framework built on recent advances in Optimal Transport (OT).
Experiments show consistent outperformance of GOT over baselines across a wide range of tasks.
arXiv Detail & Related papers (2020-06-26T01:14:23Z) - Improve Variational Autoencoder for Text Generationwith Discrete Latent
Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)