Related papers: Visual Spoofing in content based spam detection

URL: http://arxiv.org/abs/2004.05265v2
Date: Tue, 10 Nov 2020 01:44:32 GMT
Title: Visual Spoofing in content based spam detection
Authors: Mark Sokolov, Kehinde Olufowobi and Nic Herndon
Abstract summary: We present a vulnerability in which one could replace some characters with corresponding characters from a different alphabet. With this approach spammers can create messages that bypass existing spam filters. We show that this approach can be used to avoid plagiarism detection, and in other applications that use natural language processing for automatic analysis of text documents.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Although the problem of spam classification seems to be solved, there are still vulnerabilities in the current spam filters that could be easily exploited. We present one such vulnerability, in which one could replace some characters with corresponding characters from a different alphabet. These characters are visually similar, yet have a different Unicode encoding. With this approach spammers can create messages that bypass existing spam filters. Moreover, we show that this approach can be used to avoid plagiarism detection, and in other applications that use natural language processing for automatic analysis of text documents.
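The substitution described in the abstract can be illustrated with a short sketch. This is not the authors' code: the `HOMOGLYPHS` table, the `naive_filter` keyword check, and the sample message are hypothetical stand-ins chosen for illustration, and a real spam filter is considerably more elaborate. The sketch only shows why text that looks unchanged to a reader no longer matches at the code-point level, and how mixed scripts could in principle be flagged.

```python
import unicodedata

# Hypothetical, deliberately small table mapping Latin letters to
# visually similar Cyrillic letters (different Unicode code points).
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
}


def spoof(text):
    """Replace each mapped Latin letter with its look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def naive_filter(text, blocklist=("free", "prize")):
    """Toy content filter (an assumption): flag text containing a blocked word."""
    lowered = text.lower()
    return any(word in lowered for word in blocklist)


def scripts_used(text):
    """Return the script names of all letters, taken from Unicode character names."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


original = "Claim your free prize"
spoofed = spoof(original)

print(naive_filter(original))  # the toy filter flags the plain message
print(naive_filter(spoofed))   # the look-alike version is not matched
print(scripts_used(spoofed))   # mixed scripts hint at the substitution
```

Checking for mixed scripts, as `scripts_used` does, is one possible countermeasure under these assumptions; it would also flag legitimate multilingual text, so it is only a starting point.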

Related papers

Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors [65.27124213266491]
We propose Contrastive Paraphrase Attack (CoPA), a training-free method that effectively deceives text detectors. CoPA constructs an auxiliary machine-like word distribution as a contrast to the human-like distribution generated by large language models. Our theoretical analysis suggests the superiority of the proposed attack.
arXiv Detail & Related papers (2025-05-21T10:08:39Z)
Different Victims, Same Layout: Email Visual Similarity Detection for Enhanced Email Protection [0.3683202928838613]
We propose an email visual similarity detection approach, named Pisco, to improve the detection capabilities of an email threat defense system. Our results show that email kits are being reused extensively and visually similar emails are sent to our customers at various time intervals.
arXiv Detail & Related papers (2024-08-29T23:51:51Z)
Topic-Based Watermarks for LLM-Generated Text [46.71493672772134]
This paper proposes a novel topic-based watermarking algorithm for large language models (LLMs). By using topic-specific token biases, we embed a topic-sensitive watermark into the generated text. We demonstrate that our proposed watermarking scheme classifies various watermarked text topics with 99.99% confidence.
arXiv Detail & Related papers (2024-04-02T17:49:40Z)
A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have shown great ability in various natural language tasks. There are concerns that LLMs could be used improperly or even illegally. We propose a semantics-based watermark framework, SemaMark.
arXiv Detail & Related papers (2023-11-15T06:19:02Z)
Application of BadNets in Spam Filters [1.5755923640031848]
We design backdoor attacks in the domain of spam filtering. We highlight the need for careful consideration and evaluation of the models used in spam filters.
arXiv Detail & Related papers (2023-07-18T21:39:39Z)
On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document. We find that watermarks remain detectable even after human and machine paraphrasing. We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
Watermarking Text Generated by Black-Box Language Models [103.52541557216766]
A watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation. A detection algorithm aware of the list can identify the watermarked text. We develop a watermarking framework for black-box language model usage scenarios.
arXiv Detail & Related papers (2023-05-14T07:37:33Z)
Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc. Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques. In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
Building an Effective Email Spam Classification Model with spaCy [0.0]
The author uses the spaCy natural language processing library and three machine learning (ML) algorithms, Naive Bayes (NB), Decision Tree C45, and Multilayer Perceptron (MLP), implemented in Python, to detect spam emails collected from the Gmail service.
arXiv Detail & Related papers (2023-03-15T17:41:11Z)
Tracing Text Provenance via Context-Aware Lexical Substitution [81.49359106648735]
We propose a natural language watermarking scheme based on context-aware lexical substitution. Under both objective and subjective metrics, our watermarking scheme preserves the semantic integrity of the original sentences well.
arXiv Detail & Related papers (2021-12-15T04:27:33Z)
Privacy-Preserving Spam Filtering using Functional Encryption [1.0019926246026924]
We construct a spam classification framework that enables the classification of encrypted emails. Our model is based on a neural network with a quadratic network part and a multi-layer perceptron network part. The evaluation results on real-world spam datasets indicate that our proposed spam classification model achieves an accuracy of over 96%.
arXiv Detail & Related papers (2020-12-08T02:14:28Z)
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding [80.3811072650087]
We study natural language watermarking as a defense to help better mark and trace the provenance of text. We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training. AWT is the first end-to-end model to hide data in text by automatically learning -- without ground truth -- word substitutions along with their locations.
arXiv Detail & Related papers (2020-09-07T11:01:24Z)
DeepQuarantine for Suspicious Mail [0.0]
DeepQuarantine (DQ) is a cloud technology to detect and quarantine potential spam messages. Most of the quarantined mail is spam, which allows clients to use email without delay.
arXiv Detail & Related papers (2020-01-13T11:32:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.