Related papers: Dual Path Modeling for Semantic Matching by Perceiving Subtle Conflicts

Dual Path Modeling for Semantic Matching by Perceiving Subtle Conflicts

URL: http://arxiv.org/abs/2302.12530v1
Date: Fri, 24 Feb 2023 09:29:55 GMT
Title: Dual Path Modeling for Semantic Matching by Perceiving Subtle Conflicts
Authors: Chao Xue and Di Liang and Sirui Wang and Wei Wu and Jing Zhang
Abstract summary: Transformer-based pre-trained models have achieved great improvements in semantic matching. Existing models still suffer from insufficient ability to capture subtle differences. We propose a novel Dual Path Modeling Framework to enhance the model's ability to perceive subtle differences.
Score: 14.563722352134949
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transformer-based pre-trained models have achieved great improvements in semantic matching. However, existing models still suffer from insufficient ability to capture subtle differences. The modification, addition and deletion of words in sentence pairs may make it difficult for the model to predict their relationship. To alleviate this problem, we propose a novel Dual Path Modeling Framework to enhance the model's ability to perceive subtle differences in sentence pairs by separately modeling affinity and difference semantics. Based on dual-path modeling framework we design the Dual Path Modeling Network (DPM-Net) to recognize semantic relations. And we conduct extensive experiments on 10 well-studied semantic matching and robustness test datasets, and the experimental results show that our proposed method achieves consistent improvements over baselines.

Related papers

Multi-Level Collaboration in Model Merging [56.31088116526825]
This paper explores the intrinsic connections between model merging and model ensembling. We find that even when previous restrictions are not met, there is still a way for model merging to attain a near-identical and superior performance similar to that of ensembling.
arXiv Detail & Related papers (2025-03-03T07:45:04Z)
Comateformer: Combined Attention Transformer for Semantic Sentence Matching [11.746010399185437]
We propose a novel semantic sentence matching model named Combined Attention Network based on Transformer model (Comateformer) In Comateformer model, we design a novel transformer-based quasi-attention mechanism with compositional properties. Our proposed approach builds on the intuition of similarity and dissimilarity (negative affinity) when calculating dual affinity scores.
arXiv Detail & Related papers (2024-12-10T06:18:07Z)
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity [11.302828987873497]
We present a Cross-Architecture Layerwise Distillation (CALD) approach that jointly converts a transformer model to a linear time substitute and fine-tunes it to a target task. We show that CALD can effectively recover the result of the original model, and that the guiding strategy contributes to the result.
arXiv Detail & Related papers (2024-10-09T13:06:43Z)
InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion [53.90516061351706]
We present InterHandGen, a novel framework that learns the generative prior of two-hand interaction. For sampling, we combine anti-penetration and synthesis-free guidance to enable plausible generation. Our method significantly outperforms baseline generative models in terms of plausibility and diversity.
arXiv Detail & Related papers (2024-03-26T06:35:55Z)
Robust Training of Federated Models with Extremely Label Deficiency [84.00832527512148]
Federated semi-supervised learning (FSSL) has emerged as a powerful paradigm for collaboratively training machine learning models using distributed data with label deficiency. We propose a novel twin-model paradigm, called Twin-sight, designed to enhance mutual guidance by providing insights from different perspectives of labeled and unlabeled data. Our comprehensive experiments on four benchmark datasets provide substantial evidence that Twin-sight can significantly outperform state-of-the-art methods across various experimental settings.
arXiv Detail & Related papers (2024-02-22T10:19:34Z)
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding [57.42429912884543]
We propose Diff-LM-Speech, Tetra-Diff-Speech and Tri-Diff-Speech to solve high dimensionality and waveform distortion problems. We also introduce a prompt encoder structure based on a variational autoencoder and a prosody bottleneck to improve prompt representation ability. Experimental results show that our proposed methods outperform baseline methods.
arXiv Detail & Related papers (2023-07-28T11:20:23Z)
ContraFeat: Contrasting Deep Features for Semantic Discovery [102.4163768995288]
StyleGAN has shown strong potential for disentangled semantic control. Existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results. We propose a model that automates this process and achieves state-of-the-art semantic discovery performance.
arXiv Detail & Related papers (2022-12-14T15:22:13Z)
DABERT: Dual Attention Enhanced BERT for Semantic Matching [12.348661150707313]
We propose a novel Dual Attention Enhanced BERT (DABERT) to enhance the ability of BERT to capture fine-grained differences in sentence pairs. DABERT comprises (1) Dual Attention module, which measures soft word matches by introducing a new dual channel alignment mechanism. (2) Adaptive Fusion module, this module uses attention to learn the aggregation of difference and affinity features, and generates a vector describing the matching details of sentence pairs.
arXiv Detail & Related papers (2022-10-07T10:54:49Z)
Adversarial Training for Improving Model Robustness? Look at Both Prediction and Interpretation [21.594361495948316]
We propose a novel feature-level adversarial training method named FLAT. FLAT incorporates variational word masks in neural networks to learn global word importance. Experiments show the effectiveness of FLAT in improving the robustness with respect to both predictions and interpretations.
arXiv Detail & Related papers (2022-03-23T20:04:14Z)
IMACS: Image Model Attribution Comparison Summaries [16.80986701058596]
We introduce IMACS, a method that combines gradient-based model attributions with aggregation and visualization techniques. IMACS extracts salient input features from an evaluation dataset, clusters them based on similarity, then visualizes differences in model attributions for similar input features. We show how our technique can uncover behavioral differences caused by domain shift between two models trained on satellite images.
arXiv Detail & Related papers (2022-01-26T21:35:14Z)
Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images. We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation. We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.