Rationalizing Text Matching: Learning Sparse Alignments via Optimal
Transport
- URL: http://arxiv.org/abs/2005.13111v1
- Date: Wed, 27 May 2020 01:20:49 GMT
- Title: Rationalizing Text Matching: Learning Sparse Alignments via Optimal
Transport
- Authors: Kyle Swanson, Lili Yu, Tao Lei
- Abstract summary: In this work, we extend this selective rationalization approach to text matching.
The goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction.
Our approach employs optimal transport (OT) to find a minimal cost alignment between the inputs.
- Score: 14.86310501896212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Selecting input features of top relevance has become a popular method for
building self-explaining models. In this work, we extend this selective
rationalization approach to text matching, where the goal is to jointly select
and align text pieces, such as tokens or sentences, as a justification for the
downstream prediction. Our approach employs optimal transport (OT) to find a
minimal cost alignment between the inputs. However, directly applying OT often
produces dense and therefore uninterpretable alignments. To overcome this
limitation, we introduce novel constrained variants of the OT problem that
result in highly sparse alignments with controllable sparsity. Our model is
end-to-end differentiable using the Sinkhorn algorithm for OT and can be
trained without any alignment annotations. We evaluate our model on the
StackExchange, MultiNews, e-SNLI, and MultiRC datasets. Compared to strong
attention baselines, our model achieves very sparse rationale selections with
high fidelity while preserving prediction accuracy.
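The abstract names the two technical ingredients: an OT alignment between the two inputs and the Sinkhorn algorithm that makes it differentiable. Below is a minimal NumPy sketch of vanilla entropy-regularized Sinkhorn for intuition only; the cosine cost, uniform marginals, and hyperparameter values are illustrative assumptions, and the paper's constrained variants that enforce sparsity are not reproduced here.

```python
import numpy as np

def sinkhorn_alignment(x, y, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT plan between two sets of text-piece embeddings.

    x: (n, d) array, y: (m, d) array. Returns an (n, m) transport plan;
    larger entries indicate stronger alignment between piece i and piece j.
    """
    # Cost matrix: cosine distance is one common, illustrative choice.
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    yn = y / np.linalg.norm(y, axis=1, keepdims=True)
    cost = 1.0 - xn @ yn.T

    # Uniform marginals: every text piece carries equal mass.
    a = np.full(x.shape[0], 1.0 / x.shape[0])
    b = np.full(y.shape[0], 1.0 / y.shape[0])

    # Sinkhorn iterations on the Gibbs kernel K = exp(-C / epsilon).
    K = np.exp(-cost / epsilon)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)

    # Optimal plan P = diag(u) @ K @ diag(v).
    return u[:, None] * K * v[None, :]
```

Because every entry of the Gibbs kernel is strictly positive, this vanilla plan is dense, which is precisely the interpretability problem the abstract describes; the paper's constrained OT variants instead yield highly sparse plans with controllable sparsity while remaining end-to-end differentiable.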
Related papers
- Margin-aware Preference Optimization for Aligning Diffusion Models without Reference [19.397326645617422]
This paper focuses on the alignment of recent text-to-image diffusion models, such as Stable Diffusion XL (SDXL).
We propose margin-aware preference optimization (MaPO), a novel and memory-friendly preference alignment method for diffusion models that does not depend on any reference model.
MaPO jointly maximizes the likelihood margin between the preferred and dispreferred image sets and the likelihood of the preferred sets, simultaneously learning general stylistic features and preferences.
arXiv Detail & Related papers (2024-06-10T16:14:45Z) - Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback [70.32795295142648]
Linear alignment is a novel algorithm that aligns language models with human preferences in a single inference step.
Experiments on both general and personalized preference datasets demonstrate that linear alignment significantly enhances the performance and efficiency of LLM alignment.
arXiv Detail & Related papers (2024-01-21T10:46:23Z) - Sequential Recommendation via Stochastic Self-Attention [68.52192964559829]
Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) mechanism to overcome the limitations of this deterministic formulation.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences (a minimal sketch of the underlying Gaussian Wasserstein distance appears after this list).
arXiv Detail & Related papers (2022-01-16T12:38:45Z) - Using Optimal Transport as Alignment Objective for fine-tuning
Multilingual Contextualized Embeddings [7.026476782041066]
We propose using Optimal Transport (OT) as an alignment objective during fine-tuning to improve multilingual contextualized representations.
This approach does not require word-alignment pairs prior to fine-tuning and instead learns the word alignments within context in an unsupervised manner.
arXiv Detail & Related papers (2021-10-06T16:13:45Z) - Controlled Text Generation as Continuous Optimization with Multiple
Constraints [23.71027518888138]
We propose a flexible and modular algorithm for controllable inference from pretrained models.
We make use of Lagrange multipliers and gradient-descent-based techniques to generate the desired text.
We evaluate our approach on controllable machine translation and style transfer with multiple sentence-level attributes.
arXiv Detail & Related papers (2021-08-04T05:25:20Z) - Structured Reordering for Modeling Latent Alignments in Sequence
Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm that performs exact marginal inference over separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z) - Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z) - Logic Constrained Pointer Networks for Interpretable Textual Similarity [11.142649867439406]
We introduce a novel pointer-network-based model with a sentinel gating function to align constituent chunks.
We improve this base model with a loss function that equally penalizes misalignments in both sentences, ensuring the alignments are bidirectional.
The model achieves F1 scores of 97.73 and 96.32 on the benchmark SemEval datasets for the chunk alignment task.
arXiv Detail & Related papers (2020-07-15T13:01:44Z) - Graph Optimal Transport for Cross-Domain Alignment [121.80313648519203]
Cross-domain alignment is fundamental to computer vision and natural language processing.
We propose Graph Optimal Transport (GOT), a principled framework built on recent advances in Optimal Transport (OT).
Experiments show consistent outperformance of GOT over baselines across a wide range of tasks.
arXiv Detail & Related papers (2020-06-26T01:14:23Z) - Improve Variational Autoencoder for Text Generationwith Discrete Latent
Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)