Co-Driven Recognition of Semantic Consistency via the Fusion of
Transformer and HowNet Sememes Knowledge
- URL: http://arxiv.org/abs/2302.10570v1
- Date: Tue, 21 Feb 2023 09:53:19 GMT
- Title: Co-Driven Recognition of Semantic Consistency via the Fusion of
Transformer and HowNet Sememes Knowledge
- Authors: Fan Chen, Yan Huang, Xinfang Zhang, Kang Luo, Jinxuan Zhu, Ruixian He
- Abstract summary: This paper proposes a co-driven semantic consistency recognition method based on the fusion of Transformer and HowNet sememes knowledge.
A BiLSTM is exploited to encode the conceptual semantic information and infer the semantic consistency.
- Score: 6.184249194474601
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic consistency recognition aims to detect and judge whether the
semantics of two text sentences are consistent with each other. However,
existing methods usually encounter challenges with synonyms, polysemy, and
difficulty in understanding long text. To solve these problems, this paper
proposes a co-driven semantic consistency recognition method based on the
fusion of Transformer and HowNet sememes knowledge. First, a Transformer
performs data-driven, multi-level encoding of the internal sentence structures,
while the sememes knowledge base HowNet is introduced as the knowledge-driven
component to model the semantic knowledge association between sentence pairs.
Then, interactive attention is computed via soft-attention and fused with the
sememes matrix. Finally, a bidirectional long short-term memory network
(BiLSTM) is exploited to encode the conceptual semantic information and infer
the semantic consistency. Experiments are conducted on two financial text
matching datasets (BQ, AFQMC) and a cross-lingual adversarial dataset (PAWSX)
for paraphrase identification. Compared with lightweight models including
DSSM, MwAN, and DRCN, and with pre-trained models such as ERNIE, the proposed
model not only improves the accuracy of semantic consistency recognition
effectively (by 2.19%, 5.57%, and 6.51% over DSSM, MwAN, and DRCN on the BQ
dataset, respectively), but also reduces the number of model parameters (to
about 16M). In addition, driven by HowNet sememes knowledge, the proposed
method is promising for adapting to scenarios with long text.
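
As a rough illustration of the pipeline described in the abstract, the minimal PyTorch sketch below wires the four stages together: data-driven Transformer encoding, a knowledge-driven sememe embedding matrix standing in for HowNet, soft-attention interaction between the sentence pair, and a BiLSTM that infers consistency. All layer sizes, the per-token `sememe_ids` lookup, and the concatenation-based fusion are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoDrivenMatcher(nn.Module):
    """Sketch of a co-driven sentence-pair matcher: Transformer (data-driven)
    + sememe embeddings (knowledge-driven) + soft-attention + BiLSTM.
    Sizes and the fusion scheme are assumptions for illustration."""

    def __init__(self, vocab_size, n_sememes, d_model=128, hidden=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.sememe_emb = nn.Embedding(n_sememes, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.bilstm = nn.LSTM(4 * d_model, hidden, batch_first=True,
                              bidirectional=True)
        self.classifier = nn.Linear(4 * hidden, 2)

    def encode(self, tok_ids, sememe_ids):
        h = self.encoder(self.tok_emb(tok_ids))      # data-driven encoding
        s = self.sememe_emb(sememe_ids).mean(dim=2)  # average each token's sememes
        return h, s

    def forward(self, a_ids, a_sem, b_ids, b_sem):
        ha, sa = self.encode(a_ids, a_sem)
        hb, sb = self.encode(b_ids, b_sem)
        # Soft-attention: every token of A attends over B's tokens, and vice versa.
        e = torch.bmm(ha, hb.transpose(1, 2))          # (batch, len_a, len_b)
        a_align = torch.bmm(F.softmax(e, dim=2), hb)   # B summarized for each A token
        b_align = torch.bmm(F.softmax(e, dim=1).transpose(1, 2), ha)
        # Fuse contextual, aligned, and sememe views (concatenation is an assumption).
        va, _ = self.bilstm(torch.cat([ha, a_align, sa, ha - a_align], dim=-1))
        vb, _ = self.bilstm(torch.cat([hb, b_align, sb, hb - b_align], dim=-1))
        v = torch.cat([va.max(dim=1).values, vb.max(dim=1).values], dim=-1)
        return self.classifier(v)  # logits: consistent vs. inconsistent
```

In practice, the per-token sememe IDs would come from a HowNet lookup; THUNLP's OpenHowNet package exposes one (e.g., `OpenHowNet.HowNetDict().get_sememes_by_word(word)`), though the exact call should be checked against the OpenHowNet documentation rather than taken from this sketch.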
Related papers
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to SOTA while being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
- FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality.
Pretrained Language Models (PLMs) have given rise to another paradigm, which takes as inputs the sentences of textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z) - SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural-network-based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z) - Unsupervised Mismatch Localization in Cross-Modal Sequential Data [5.932046800902776]
We develop an unsupervised learning algorithm that can infer the relationship between content-mismatched cross-modal data.
We propose a hierarchical Bayesian deep learning model, named mismatch localization variational autoencoder (ML-VAE), that decomposes the generative process of the speech into hierarchically structured latent variables.
Our experimental results show that ML-VAE successfully locates the mismatch between text and speech, without the need for human annotations.
arXiv Detail & Related papers (2022-05-05T14:23:27Z) - Exploring Multi-Modal Representations for Ambiguity Detection &
Coreference Resolution in the SIMMC 2.0 Challenge [60.616313552585645]
We present models for effective Ambiguity Detection and Coreference Resolution in Conversational AI.
Specifically, we use TOD-BERT and LXMERT based models, compare them to a number of baselines and provide ablation experiments.
Our results show that (1) language models are able to exploit correlations in the data to detect ambiguity; and (2) unimodal coreference resolution models can avoid the need for a vision component.
arXiv Detail & Related papers (2022-02-25T12:10:02Z)
- Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks [21.16662651409811]
We propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together.
The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets.
arXiv Detail & Related papers (2021-04-09T17:14:34Z)
- Cascaded Semantic and Positional Self-Attention Network for Document Classification [9.292885582770092]
We propose a new architecture to aggregate the two sources of information using a cascaded semantic and positional self-attention network (CSPAN).
The CSPAN uses a semantic self-attention layer cascaded with a Bi-LSTM to process the semantic and positional information in a sequential manner, and then adaptively combines them through a residual connection.
We evaluate the CSPAN model on several benchmark datasets for document classification with careful ablation studies, and demonstrate encouraging results compared with the state of the art.
arXiv Detail & Related papers (2020-09-15T15:02:28Z)
- Towards Accurate Scene Text Recognition with Semantic Reasoning Networks [52.86058031919856]
We propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition.
A global semantic reasoning module (GSRM) is introduced to capture global semantic context through multi-way parallel transmission.
Results on 7 public benchmarks, including regular text, irregular text and non-Latin long text, verify the effectiveness and robustness of the proposed method.
arXiv Detail & Related papers (2020-03-27T09:19:25Z)