A Novel Plagiarism Detection Approach Combining BERT-based Word
Embedding, Attention-based LSTMs and an Improved Differential Evolution
Algorithm
- URL: http://arxiv.org/abs/2305.02374v1
- Date: Wed, 3 May 2023 18:26:47 GMT
- Title: A Novel Plagiarism Detection Approach Combining BERT-based Word
Embedding, Attention-based LSTMs and an Improved Differential Evolution
Algorithm
- Authors: Seyed Vahid Moravvej, Seyed Jalaleddin Mousavirad, Diego Oliva, Fardin
Mohammadi
- Abstract summary: We propose a novel method for detecting plagiarism based on attention mechanism-based long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT) word embedding.
BERT could be included in a downstream task and fine-tuned as a task-specific structure, while the trained BERT model is capable of detecting various linguistic characteristics.
- Score: 11.142354615369273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting plagiarism involves finding similar items in two different sources.
In this article, we propose a novel method for detecting plagiarism that is
based on attention mechanism-based long short-term memory (LSTM) and
bidirectional encoder representations from transformers (BERT) word embedding,
enhanced with optimized differential evolution (DE) method for pre-training and
a focal loss function for training. BERT could be included in a downstream task
and fine-tuned as a task-specific BERT can be included in a downstream task and
fine-tuned as a task-specific structure, while the trained BERT model is
capable of detecting various linguistic characteristics. Unbalanced
classification is one of the primary issues with plagiarism detection. We
suggest a focal loss-based training technique that carefully learns minority
class instances to solve this. Another issue that we tackle is the training
phase itself, which typically employs gradient-based methods like
back-propagation for the learning process and thus suffers from some drawbacks,
including sensitivity to initialization. To initiate the BP process, we suggest
a novel DE algorithm that makes use of a clustering-based mutation operator.
Here, a winning cluster is identified for the current DE population, and a
fresh updating method is used to produce potential answers. We evaluate our
proposed approach on three benchmark datasets ( MSRP, SNLI, and SemEval2014)
and demonstrate that it performs well when compared to both conventional and
population-based methods.
Related papers
- A Contrastive Symmetric Forward-Forward Algorithm (SFFA) for Continual Learning Tasks [7.345136916791223]
Forward-Forward Algorithm (FFA) has recently gained momentum as an alternative to the conventional back-propagation algorithm for neural network learning.
This work proposes the Symmetric Forward-Forward Algorithm (SFFA), a novel modification of the original FFA which partitions each layer into positive and negative neurons.
arXiv Detail & Related papers (2024-09-11T16:21:44Z) - Multi-Granularity Semantic Revision for Large Language Model Distillation [66.03746866578274]
We propose a multi-granularity semantic revision method for LLM distillation.
At the sequence level, we propose a sequence correction and re-generation strategy.
At the token level, we design a distribution adaptive clipping Kullback-Leibler loss as the distillation objective function.
At the span level, we leverage the span priors of a sequence to compute the probability correlations within spans, and constrain the teacher and student's probability correlations to be consistent.
arXiv Detail & Related papers (2024-07-14T03:51:49Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - AICSD: Adaptive Inter-Class Similarity Distillation for Semantic
Segmentation [12.92102548320001]
This paper proposes a novel method called Inter-Class Similarity Distillation (ICSD) for the purpose of knowledge distillation.
The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from network outputs.
Experiments conducted on two well-known datasets for semantic segmentation, Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-08-08T13:17:20Z) - Informative regularization for a multi-layer perceptron RR Lyrae
classifier under data shift [3.303002683812084]
We propose a scalable and easily adaptable approach based on an informative regularization and an ad-hoc training procedure to mitigate the shift problem.
Our method provides a new path to incorporate knowledge from characteristic features into artificial neural networks to manage the underlying data shift problem.
arXiv Detail & Related papers (2023-03-12T02:49:19Z) - An LSTM-based Plagiarism Detection via Attention Mechanism and a
Population-based Approach for Pre-Training Parameters with imbalanced Classes [1.9949261242626626]
This paper proposes an architecture based on a Long Short-Term Memory (LSTM) and attention mechanism called LSTM-AM-ABC.
Our proposed algorithm can find the initial values for model learning in all LSTM, attention mechanism, and feed-forward neural network, simultaneously.
arXiv Detail & Related papers (2021-10-17T09:20:03Z) - Training ELECTRA Augmented with Multi-word Selection [53.77046731238381]
We present a new text encoder pre-training method that improves ELECTRA based on multi-task learning.
Specifically, we train the discriminator to simultaneously detect replaced tokens and select original tokens from candidate sets.
arXiv Detail & Related papers (2021-05-31T23:19:00Z) - Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates a attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z) - Simultaneous Perturbation Stochastic Approximation for Few-Shot Learning [0.5801044612920815]
We propose a prototypical-like few-shot learning approach based on the prototypical networks method.
The results of experiments on the benchmark dataset demonstrate that the proposed method is superior to the original networks.
arXiv Detail & Related papers (2020-06-09T09:47:58Z) - Pre-training Is (Almost) All You Need: An Application to Commonsense
Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z) - Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases serve as one of main threats to food security and crop production.
One popular approach is to transform this problem as a leaf image classification task, which can be addressed by the powerful convolutional neural networks (CNNs)
We propose a novel framework that incorporates rectified meta-learning module into common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.