A novel hybrid methodology of measuring sentence similarity
- URL: http://arxiv.org/abs/2105.00648v1
- Date: Mon, 3 May 2021 06:50:54 GMT
- Title: A novel hybrid methodology of measuring sentence similarity
- Authors: Yongmin Yoo, Tak-Sung Heo, Yeongjoon Park
- Abstract summary: It is necessary to measure the similarity between sentences accurately.
Deep learning methods show state-of-the-art performance in many natural language processing fields.
Considering the structure of the sentence, or of the words that make up the sentence, is also important.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Measuring sentence similarity is an essential problem in the
natural language processing (NLP) area, and it must be done accurately. There
are many approaches to measuring sentence similarity. Deep learning methods
show state-of-the-art performance in many NLP fields and are widely used in
sentence similarity measurement. However, considering the structure of the
sentence, or of the words that make it up, is also important. In this study, we
propose a methodology that combines deep learning with a method that accounts
for lexical relationships. Our evaluation metrics are the Pearson correlation
coefficient and the Spearman correlation coefficient. The proposed method
outperforms current approaches on KorSTS, a standard benchmark Korean dataset,
achieving up to a 65% improvement over using deep learning alone. Experiments
show that our proposed method generally performs better than methods that use
only a deep learning model.
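The abstract's evaluation protocol can be sketched as follows. This is a minimal illustration, not the authors' code: given predicted similarity scores and gold STS labels (the sentence pairs and scores below are hypothetical), it computes the two metrics the paper reports, Pearson's r and Spearman's rho, in pure Python.

```python
# Minimal sketch of the evaluation metrics named in the abstract:
# Pearson and Spearman correlation between predicted similarity
# scores and gold STS labels. All data here is hypothetical.
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks
    (ties are ignored here for simplicity)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

# Hypothetical model scores vs. gold labels for five sentence
# pairs on a 0-5 STS scale.
predicted = [4.2, 1.1, 3.5, 0.4, 2.8]
gold = [4.5, 0.9, 3.0, 0.5, 2.5]
r = pearson(predicted, gold)
rho = spearman(predicted, gold)
```

In practice one would use `scipy.stats.pearsonr` and `scipy.stats.spearmanr`, which also handle ties correctly; the hand-rolled versions above only serve to make the metrics concrete.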
Related papers
- DenoSent: A Denoising Objective for Self-Supervised Sentence
Representation Learning [59.4644086610381]
We propose a novel denoising objective that approaches the problem from another angle, namely the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z) - Relation-aware Ensemble Learning for Knowledge Graph Embedding [68.94900786314666]
We propose to learn an ensemble by leveraging existing methods in a relation-aware manner.
However, exploring these semantics with a relation-aware ensemble leads to a much larger search space than general ensemble methods.
We propose a divide-search-combine algorithm RelEns-DSC that searches the relation-wise ensemble weights independently.
arXiv Detail & Related papers (2023-10-13T07:40:12Z) - A Comparative Study of Sentence Embedding Models for Assessing Semantic
Variation [0.0]
We compare several recent sentence embedding methods via time-series of semantic similarity between successive sentences and matrices of pairwise sentence similarity for multiple books of literature.
We find that most of the sentence embedding methods considered do infer highly correlated patterns of semantic similarity in a given document, but show interesting differences.
arXiv Detail & Related papers (2023-08-08T23:31:10Z) - Automatic Design of Semantic Similarity Ensembles Using Grammatical Evolution [0.0]
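The comparative study above builds matrices of pairwise sentence similarity from sentence embeddings. A hedged sketch of that computation (not the surveyed paper's code; the embeddings below are hypothetical placeholders for real model outputs):

```python
# Sketch: an all-pairs cosine-similarity matrix over precomputed
# sentence embedding vectors, as used in comparative studies of
# sentence embedding models. Embeddings here are hypothetical.
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_matrix(embeddings):
    """n x n matrix of pairwise cosine similarities."""
    n = len(embeddings)
    return [[cosine(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]

# Three toy 3-dimensional "sentence embeddings".
emb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
M = similarity_matrix(emb)
```

The time-series view described in the summary is simply the off-diagonal band `M[i][i+1]`, i.e. the similarity between each sentence and its successor.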
No single semantic similarity measure is the most appropriate for all tasks, and researchers often use ensemble strategies to ensure performance.
This research work proposes a method for automatically designing semantic similarity ensembles.
Our proposed method uses grammatical evolution, for the first time, to automatically select and aggregate measures from a pool of candidates to create an ensemble that maximizes correlation to human judgment.
arXiv Detail & Related papers (2023-07-03T10:53:05Z) - Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm that further explores the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z) - Differentiable Data Augmentation for Contrastive Sentence Representation
Learning [6.398022050054328]
The proposed method yields significant improvements over existing methods under both semi-supervised and supervised settings.
Our experiments under a low labeled data setting also show that our method is more label-efficient than the state-of-the-art contrastive learning methods.
arXiv Detail & Related papers (2022-10-29T08:57:45Z) - A New Sentence Ordering Method Using BERT Pretrained Model [2.1793134762413433]
We propose a method for sentence ordering which does not need a training phase and consequently a large corpus for learning.
Our proposed method outperformed other baselines on ROCStories, a corpus of 5-sentence human-made stories.
Other advantages of this method are its interpretability and its independence from linguistic knowledge.
arXiv Detail & Related papers (2021-08-26T18:47:15Z) - On Sampling-Based Training Criteria for Neural Language Modeling [97.35284042981675]
We consider Monte Carlo sampling, importance sampling, a novel method we call compensated partial summation, and noise contrastive estimation.
We show that all these sampling methods can perform equally well, as long as we correct for the intended class posterior probabilities.
Experimental results in language modeling and automatic speech recognition on Switchboard and LibriSpeech support our claim.
arXiv Detail & Related papers (2021-04-21T12:55:52Z) - A Statistical Analysis of Summarization Evaluation Metrics using
Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are.
Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings.
arXiv Detail & Related papers (2021-03-31T18:28:14Z) - A Topological Method for Comparing Document Semantics [0.0]
We propose a novel algorithm for comparing semantics similarity between two documents.
Our experiments are conducted on a document dataset with human judges' results.
Our algorithm produces highly human-consistent results and outperforms most state-of-the-art methods, while tying with NLTK.
arXiv Detail & Related papers (2020-12-08T04:21:40Z) - Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
arXiv Detail & Related papers (2020-06-12T09:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.