A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A
Transformer-based Approach
- URL: http://arxiv.org/abs/2207.11716v3
- Date: Sun, 25 Jun 2023 11:18:51 GMT
- Title: A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A
Transformer-based Approach
- Authors: Praneeth Nemani, Satyanarayana Vollala
- Abstract summary: We perform semantic similarity analysis and modeling on the U.S. Patent Phrase to Phrase Matching dataset using both traditional and transformer-based techniques.
The experimental results demonstrate our methodology's enhanced performance compared to traditional techniques, with an average Pearson correlation score of 0.79.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic similarity analysis and modeling is a fundamental task
in many applications of natural language processing today. Owing to
their strength in sequential pattern recognition, neural networks such as RNNs
and LSTMs have achieved satisfactory results in semantic similarity modeling.
However, these solutions are considered inefficient because they cannot
process information in a non-sequential manner, which leads to improper
extraction of context. Transformers serve as the state-of-the-art
architecture owing to advantages such as non-sequential data processing and
self-attention. In this paper, we perform semantic similarity analysis and
modeling on the U.S. Patent Phrase to Phrase Matching dataset using both
traditional and transformer-based techniques. We experiment with four different
variants of Decoding-enhanced BERT (DeBERTa) and enhance its performance by
performing K-fold cross-validation. The experimental results demonstrate our
methodology's enhanced performance compared to traditional techniques, with an
average Pearson correlation score of 0.79.
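The evaluation described in the abstract (K-fold cross-validation scored by an average Pearson correlation) can be illustrated with a minimal sketch. This is not the authors' code: the phrase pairs and scores below are made up, the token-overlap scorer is only a stand-in for a fine-tuned DeBERTa model, and the names `fit_overlap_baseline` and `cross_validated_pearson` are hypothetical helpers introduced here for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint folds (deterministic, no shuffling)."""
    return [list(range(i, n, k)) for i in range(k)]

def jaccard(anchor, target):
    """Token-overlap similarity in [0, 1]; a toy stand-in for a learned model."""
    a, b = set(anchor.split()), set(target.split())
    return len(a & b) / len(a | b)

def fit_overlap_baseline(train_pairs, train_scores):
    """Hypothetical 'training' step: the baseline has no parameters, so the
    training data is ignored. A real setup would fine-tune a model here."""
    return jaccard

def cross_validated_pearson(pairs, scores, fit, k):
    """Train on k-1 folds, score the held-out fold, and average Pearson r."""
    fold_scores = []
    for held_out in k_fold_indices(len(pairs), k):
        held = set(held_out)
        train = [i for i in range(len(pairs)) if i not in held]
        predict = fit([pairs[i] for i in train], [scores[i] for i in train])
        preds = [predict(*pairs[i]) for i in held_out]
        gold = [scores[i] for i in held_out]
        fold_scores.append(pearson(preds, gold))
    return sum(fold_scores) / len(fold_scores)

# Made-up (anchor, target) phrase pairs with made-up similarity scores.
PAIRS = [
    ("steel plate", "steel plate"),
    ("water pump", "water pump unit"),
    ("steel plate", "metal sheet"),
    ("water pump", "fluid pump"),
    ("optical fiber", "fiber cable"),
    ("optical fiber", "wooden chair"),
    ("data packet", "wooden chair"),
    ("data packet", "data packet header"),
]
SCORES = [1.0, 0.75, 0.5, 0.5, 0.5, 0.0, 0.0, 0.75]

if __name__ == "__main__":
    avg = cross_validated_pearson(PAIRS, SCORES, fit_overlap_baseline, k=2)
    print(f"average Pearson correlation over folds: {avg:.3f}")
```

In this sketch the per-fold correlations are averaged with equal weight, which is one common way to report a single cross-validated score; the paper may aggregate differently.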
Related papers
- Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis [63.66763657191476]
We show that efficient numerical training and inference algorithms as low-rank computation have impressive performance for learning Transformer-based adaption.
We analyze how magnitude-based models affect generalization while improving adaption.
We conclude that proper magnitude-based pruning has a slight effect on the testing performance.
arXiv Detail & Related papers (2024-06-24T23:00:58Z)
- Differentiable Retrieval Augmentation via Generative Language Modeling
for E-commerce Query Intent Classification [8.59563091603226]
We propose Differentiable Retrieval Augmentation via Generative lANguage modeling (Dragan) to address this problem by a novel differentiable reformulation.
We demonstrate the effectiveness of our proposed method on a challenging NLP task in e-commerce search, namely query intent classification.
arXiv Detail & Related papers (2023-08-18T05:05:35Z)
- Extensive Evaluation of Transformer-based Architectures for Adverse Drug
Events Extraction [6.78974856327994]
Adverse Event (ADE) extraction is one of the core tasks in digital pharmacovigilance.
We evaluate 19 Transformer-based models for ADE extraction on informal texts.
At the end of our analyses, we identify a list of take-home messages that can be derived from the experimental data.
arXiv Detail & Related papers (2023-06-08T15:25:24Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z)
- Learning Semantic Textual Similarity via Topic-informed Discrete Latent
Variables [17.57873577962635]
We develop a topic-informed discrete latent variable model for semantic textual similarity.
Our model learns a shared latent space for sentence-pair representation via vector quantization.
We show that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.
arXiv Detail & Related papers (2022-11-07T15:09:58Z)
- BayesFormer: Transformer with Uncertainty Estimation [31.206243748162553]
We introduce BayesFormer, a Transformer model with dropouts designed by Bayesian theory.
We show improvements across the board: language modeling and classification, long-sequence understanding, machine translation and acquisition function for active learning.
arXiv Detail & Related papers (2022-06-02T01:54:58Z)
- A comprehensive comparative evaluation and analysis of Distributional
Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Enriching Non-Autoregressive Transformer with Syntactic and
Semantic Structures for Neural Machine Translation [54.864148836486166]
We propose to incorporate the explicit syntactic and semantic structures of languages into a non-autoregressive Transformer.
Our model achieves a significantly faster speed, as well as keeps the translation quality when compared with several state-of-the-art non-autoregressive models.
arXiv Detail & Related papers (2021-01-22T04:12:17Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)