Visual Spoofing in content based spam detection
- URL: http://arxiv.org/abs/2004.05265v2
- Date: Tue, 10 Nov 2020 01:44:32 GMT
- Title: Visual Spoofing in content based spam detection
- Authors: Mark Sokolov, Kehinde Olufowobi and Nic Herndon
- Abstract summary: We present a vulnerability in which one could replace some characters with corresponding characters from a different alphabet.
With this approach spammers can create messages that bypass existing spam filters.
We show that this approach can be used to avoid plagiarism detection, and in other applications that use natural language processing for automatic analysis of text documents.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although the problem of spam classification seems to be solved, there are
still vulnerabilities in the current spam filters that could be easily
exploited. We present one such vulnerability, in which one could replace some
characters with corresponding characters from a different alphabet. These
characters are visually similar, yet have a different Unicode encoding. With
this approach spammers can create messages that bypass existing spam filters.
Moreover, we show that this approach can be used to avoid plagiarism detection,
and in other applications that use natural language processing for automatic
analysis of text documents.
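The substitution described in the abstract can be illustrated with a minimal Python sketch. The mapping below is a small hand-picked set of Latin-to-Cyrillic look-alikes chosen for illustration, not the character table used by the authors, and the helper names (`spoof`, `scripts_used`, `has_mixed_scripts`) are hypothetical. It shows that a spoofed message no longer matches a plain keyword check, and that flagging words whose letters come from more than one script is one possible countermeasure.

```python
import unicodedata

# Illustrative Latin -> Cyrillic look-alike pairs.
# Assumption: hand-picked for this sketch, not the paper's mapping.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
}


def spoof(text):
    """Replace each mapped Latin letter with its Cyrillic look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def scripts_used(word):
    """Approximate the scripts in a word from Unicode character names.

    For letters, the first word of the character name is usually the
    script, e.g. "LATIN" or "CYRILLIC". This is a heuristic only.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in word
        if ch.isalpha()
    }


def has_mixed_scripts(text):
    """Return True if any whitespace-separated word mixes scripts."""
    return any(len(scripts_used(word)) > 1 for word in text.split())


original = "free prize"
spoofed = spoof(original)

# The two strings render almost identically but differ in code points,
# so a plain keyword check on the spoofed text fails.
print("free" in original)          # True
print("free" in spoofed)           # False
print(has_mixed_scripts(spoofed))  # True
```

The mixed-script check is only a heuristic: it misses words in which every letter has been substituted, and it would also flag legitimate text that mixes scripts within a word.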
Related papers
- A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have shown great ability in various natural language tasks.
There are concerns that LLMs could be used improperly or even illegally.
We propose a semantics-based watermark framework, SemaMark.
arXiv Detail & Related papers (2023-11-15T06:19:02Z)
- Application of BadNets in Spam Filters [1.5755923640031848]
We design backdoor attacks in the domain of spam filtering.
We highlight the need for careful consideration and evaluation of the models used in spam filters.
arXiv Detail & Related papers (2023-07-18T21:39:39Z)
- On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document.
We find that watermarks remain detectable even after human and machine paraphrasing.
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
- Watermarking Text Generated by Black-Box Language Models [103.52541557216766]
A watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation.
A detection algorithm aware of the list can identify the watermarked text.
We develop a watermarking framework for black-box language model usage scenarios.
arXiv Detail & Related papers (2023-05-14T07:37:33Z)
- Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
- Building an Effective Email Spam Classification Model with spaCy [0.0]
The author used the spaCy natural language processing library and three machine learning (ML) algorithms, Naive Bayes (NB), Decision Tree C4.5, and Multilayer Perceptron (MLP), in Python to detect spam emails collected from the Gmail service.
arXiv Detail & Related papers (2023-03-15T17:41:11Z)
- A Late Multi-Modal Fusion Model for Detecting Hybrid Spam E-mail [5.182080825408661]
A few studies have been conducted with the goal of detecting hybrid spam e-mails.
Optical Character Recognition is a very successful technique in processing text-and-image hybrid spam.
We propose new late multi-modal fusion training frameworks for a text-and-image hybrid spam e-mail filtering system.
arXiv Detail & Related papers (2022-10-26T10:47:12Z)
- Tracing Text Provenance via Context-Aware Lexical Substitution [81.49359106648735]
We propose a natural language watermarking scheme based on context-aware lexical substitution.
Under both objective and subjective metrics, our watermarking scheme can well preserve the semantic integrity of original sentences.
arXiv Detail & Related papers (2021-12-15T04:27:33Z)
- Privacy-Preserving Spam Filtering using Functional Encryption [1.0019926246026924]
We construct a spam classification framework that enables the classification of encrypted emails.
Our model is based on a neural network with a quadratic network part and a multi-layer perceptron network part.
The evaluation results on real-world spam datasets indicate that our proposed spam classification model achieves an accuracy of over 96%.
arXiv Detail & Related papers (2020-12-08T02:14:28Z)
- Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding [80.3811072650087]
We study natural language watermarking as a defense to help better mark and trace the provenance of text.
We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training.
AWT is the first end-to-end model to hide data in text by automatically learning -- without ground truth -- word substitutions along with their locations.
arXiv Detail & Related papers (2020-09-07T11:01:24Z)
- DeepQuarantine for Suspicious Mail [0.0]
DeepQuarantine (DQ) is a cloud technology to detect and quarantine potential spam messages.
Most of the quarantined mail is spam, which allows clients to use email without delay.
arXiv Detail & Related papers (2020-01-13T11:32:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.