Boosting the Performance of Transformer Architectures for Semantic
Textual Similarity
- URL: http://arxiv.org/abs/2306.00708v1
- Date: Thu, 1 Jun 2023 14:16:53 GMT
- Title: Boosting the Performance of Transformer Architectures for Semantic
Textual Similarity
- Authors: Ivan Rep, Vladimir Čeperić
- Abstract summary: We fine-tune transformer architectures for semantic textual similarity on the Semantic Textual Similarity Benchmark.
We experiment with BERT, RoBERTa, and DeBERTaV3 cross-encoders by approaching the problem as a binary classification task or a regression task.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic textual similarity is the task of estimating the similarity between
the meaning of two texts. In this paper, we fine-tune transformer architectures
for semantic textual similarity on the Semantic Textual Similarity Benchmark by
tuning the model partially and then end-to-end. We experiment with BERT,
RoBERTa, and DeBERTaV3 cross-encoders by approaching the problem as a binary
classification task or a regression task. We combine the outputs of the
transformer models and use handmade features as inputs for boosting algorithms.
Because test set results worsened even as validation set results improved, we
experiment with different dataset splits to investigate this discrepancy
further. We also provide an error analysis focused on the edges of the
prediction range.
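The pipeline described in the abstract (cross-encoder outputs combined with handmade features as inputs to a boosting algorithm) can be sketched at the feature level. The feature choices and the model scores below are illustrative assumptions, not the authors' implementation; each resulting row would be one input to a gradient-boosting regressor.

```python
# Hedged sketch: stack transformer similarity scores with simple
# handmade lexical features. Feature choices here are assumptions
# for illustration only.

def handmade_features(s1: str, s2: str) -> list[float]:
    """Simple lexical-overlap features for a sentence pair."""
    t1, t2 = set(s1.lower().split()), set(s2.lower().split())
    union = t1 | t2
    jaccard = len(t1 & t2) / len(union) if union else 1.0
    len_ratio = min(len(t1), len(t2)) / max(len(t1), len(t2), 1)
    return [jaccard, len_ratio]

def build_features(model_scores: list[float], s1: str, s2: str) -> list[float]:
    """One feature row: transformer outputs plus handmade features."""
    return list(model_scores) + handmade_features(s1, s2)

# Hypothetical BERT / RoBERTa / DeBERTaV3 similarity outputs:
row = build_features([0.82, 0.79, 0.85],
                     "A man is playing a guitar.",
                     "A person plays a guitar.")
```

A boosting regressor (e.g. gradient boosting over such rows) would then be fit against the gold similarity scores.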
Related papers
- Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval [68.61855682218298]
Cross-modal retrieval methods employ two-stream encoders with different architectures for images and texts.
Inspired by recent advances of Transformers in vision tasks, we propose to unify the encoder architectures with Transformers for both modalities.
We design a cross-modal retrieval framework purely based on two-stream Transformers, dubbed Hierarchical Alignment Transformers (HAT), which consists of an image Transformer, a text Transformer, and a hierarchical alignment module.
arXiv Detail & Related papers (2023-08-08T15:43:59Z)
- BERT-Based Combination of Convolutional and Recurrent Neural Network for Indonesian Sentiment Analysis [0.0]
This research extends previous hybrid deep learning approaches by using BERT representations for Indonesian sentiment analysis.
Our simulation shows that the BERT representation improves the accuracies of all hybrid architectures.
arXiv Detail & Related papers (2022-11-10T00:32:40Z)
- Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [120.74406230847904]
TP-Transformer augments the traditional Transformer architecture to include an additional component to represent structure.
The second method imbues structure at the data level by segmenting the data with morphological tokenization.
We find that each of these two approaches allows the network to achieve better performance, but this improvement is dependent on the size of the dataset.
arXiv Detail & Related papers (2022-08-11T22:42:24Z)
- A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A Transformer-based Approach [0.0]
We perform semantic similarity analysis and modeling on the U.S. Patent Phrase to Phrase Matching dataset using both traditional and transformer-based techniques.
The experimental results demonstrate our methodology's enhanced performance compared to traditional techniques, with an average Pearson correlation score of 0.79.
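Pearson correlation, the metric reported above (and the standard metric for semantic textual similarity benchmarks), measures linear agreement between predicted and gold similarity scores; a minimal sketch:

```python
import math

def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between predicted and gold scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # 1.0 = perfect linear agreement
```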
arXiv Detail & Related papers (2022-07-24T11:06:56Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attention for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Transformer Models for Text Coherence Assessment [14.132559978971377]
Coherence is an important aspect of text quality and is crucial for ensuring its readability.
Previous work has leveraged entity-based methods, syntactic patterns, discourse relations, and, more recently, traditional deep learning architectures for text coherence assessment.
We propose four different Transformer-based architectures for the task: vanilla Transformer, hierarchical Transformer, multi-task learning-based model, and a model with fact-based input representation.
arXiv Detail & Related papers (2021-09-05T22:27:17Z)
- Neural String Edit Distance [77.72325513792981]
We propose the neural string edit distance model for string-pair classification and sequence generation.
We modify the original expectation-maximization learned edit distance algorithm into a differentiable loss function.
We show that we can trade off between performance and interpretability in a single framework.
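The hard, non-differentiable recurrence that such a model relaxes is the classical Levenshtein dynamic program; a minimal sketch (the paper's neural variant replaces the hard min with a learned, differentiable aggregation over the same three operations):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard DP recurrence."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match/substitution
    return d[m][n]
```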
arXiv Detail & Related papers (2021-04-16T22:16:47Z)
- Syntax-Enhanced Pre-trained Model [49.1659635460369]
We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa.
Existing methods utilize the syntax of text either in the pre-training stage or in the fine-tuning stage, so they suffer from a discrepancy between the two stages.
We present a model that utilizes the syntax of text in both pre-training and fine-tuning stages.
arXiv Detail & Related papers (2020-12-28T06:48:04Z)
- Logic Constrained Pointer Networks for Interpretable Textual Similarity [11.142649867439406]
We introduce a novel pointer network based model with a sentinel gating function to align constituent chunks.
We improve this base model with a loss function to equally penalize misalignments in both sentences, ensuring the alignments are bidirectional.
The model achieves F1 scores of 97.73 and 96.32 on the benchmark SemEval datasets for the chunk alignment task.
arXiv Detail & Related papers (2020-07-15T13:01:44Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.