EvaLDA: Efficient Evasion Attacks Towards Latent Dirichlet Allocation
- URL: http://arxiv.org/abs/2012.04864v2
- Date: Mon, 12 Apr 2021 03:12:53 GMT
- Title: EvaLDA: Efficient Evasion Attacks Towards Latent Dirichlet Allocation
- Authors: Qi Zhou, Haipeng Chen, Yitao Zheng, Zhen Wang
- Abstract summary: We study whether Latent Dirichlet Allocation models are vulnerable to adversarial perturbations during inference time.
We propose a novel and efficient algorithm, EvaLDA, to solve it.
Our work provides significant insights into the power and limitations of evasion attacks on LDA models.
- Score: 9.277398460006394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As one of the most powerful topic models, Latent Dirichlet Allocation (LDA)
has been used in a vast range of tasks, including document understanding,
information retrieval and peer-reviewer assignment. Despite its tremendous
popularity, the security of LDA has rarely been studied. This poses severe
risks to security-critical tasks such as sentiment analysis and peer-reviewer
assignment that are based on LDA. In this paper, we are interested in knowing
whether LDA models are vulnerable to adversarial perturbations of benign
document examples during inference time. We formalize the evasion attack to LDA
models as an optimization problem and prove it to be NP-hard. We then propose a
novel and efficient algorithm, EvaLDA, to solve it. We show the effectiveness of
EvaLDA via extensive empirical evaluations. For instance, in the NIPS dataset,
EvaLDA can, on average, promote the rank of a target topic from 10 to around 7 by
replacing only 1% of the words in a victim document with similar words. Our
work provides significant insights into the power and limitations of evasion
attacks on LDA models.
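The abstract describes a word-level evasion attack: swap a small budget of words in a victim document for similar words so that a target topic rises in the inferred topic ranking. The sketch below is not the authors' EvaLDA algorithm; it is a minimal greedy baseline for the same setting, assuming a trained gensim LdaModel, its Dictionary, and a user-supplied similar_words(word) candidate generator (all of these are illustrative assumptions, not artifacts of the paper).

```python
# Minimal sketch of a greedy word-replacement evasion attack against an LDA model.
# NOT the EvaLDA algorithm from the paper; a hedged illustration of the setting:
# raise a target topic's inferred probability by swapping a small fraction of words
# in a victim document with "similar" candidates.

from gensim.corpora import Dictionary
from gensim.models import LdaModel


def target_topic_prob(lda: LdaModel, dictionary: Dictionary, tokens, topic_id: int) -> float:
    """Probability the inferred topic mixture assigns to `topic_id`."""
    bow = dictionary.doc2bow(tokens)
    dist = dict(lda.get_document_topics(bow, minimum_probability=0.0))
    return dist.get(topic_id, 0.0)


def greedy_word_replacement(lda, dictionary, tokens, topic_id, similar_words, budget_ratio=0.01):
    """Greedily replace up to `budget_ratio` of the words to raise the target topic's probability."""
    tokens = list(tokens)
    budget = max(1, int(budget_ratio * len(tokens)))
    for _ in range(budget):
        base = target_topic_prob(lda, dictionary, tokens, topic_id)
        best_gain, best_edit = 0.0, None
        for i, word in enumerate(tokens):
            for cand in similar_words(word):  # e.g. nearest neighbours in a word-embedding space
                trial = tokens[:i] + [cand] + tokens[i + 1:]
                gain = target_topic_prob(lda, dictionary, trial, topic_id) - base
                if gain > best_gain:
                    best_gain, best_edit = gain, (i, cand)
        if best_edit is None:  # no single replacement improves the objective; stop early
            break
        i, cand = best_edit
        tokens[i] = cand
    return tokens
```

Each budget step re-scores every candidate replacement with a full LDA inference, so this brute-force variant is far slower than an optimized attack; it only illustrates the objective being attacked.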
Related papers
- Evading Data Provenance in Deep Neural Networks [15.428092329709823]
We introduce a unified evasion framework, in which a teacher model first learns from the copyright dataset and then transfers task-relevant yet identifier-independent domain knowledge to a surrogate student.
Our approach simultaneously eliminates all copyright identifiers and significantly outperforms nine state-of-the-art evasion attacks in both generalization and effectiveness.
As a proof of concept, we reveal key vulnerabilities in current DOV methods, highlighting the need for long-term development to enhance practicality.
arXiv Detail & Related papers (2025-08-01T21:13:45Z)
- Paper Summary Attack: Jailbreaking LLMs through LLM Safety Papers [61.57691030102618]
We propose a novel jailbreaking method, Paper Summary Attack (PSA).
It synthesizes content from either attack-focused or defense-focused LLM safety papers to construct an adversarial prompt template.
Experiments show significant vulnerabilities not only in base LLMs, but also in state-of-the-art reasoning models like DeepSeek-R1.
arXiv Detail & Related papers (2025-07-17T18:33:50Z)
- No Query, No Access [50.18709429731724]
We introduce the Victim Data-based Adversarial Attack (VDBA), which operates using only victim texts.
To prevent access to the victim model, we create a shadow dataset with publicly available pre-trained models and clustering methods.
Experiments on the Emotion and SST5 datasets show that VDBA outperforms state-of-the-art methods, achieving an ASR improvement of 52.08%.
arXiv Detail & Related papers (2025-05-12T06:19:59Z)
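The entry above describes VDBA's shadow-dataset idea: without querying the victim model, embed the victim texts with a publicly available pre-trained model, cluster the embeddings to obtain pseudo-labels, and attack a surrogate trained on those labels. Below is a minimal, hedged sketch of that idea only; the encoder name, k-means clustering, and logistic-regression surrogate are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of a "shadow dataset" surrogate built from victim texts alone:
# public encoder -> embeddings -> clustering pseudo-labels -> lightweight shadow classifier.

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression


def build_shadow_model(victim_texts, num_classes):
    # Any publicly available encoder works; this model name is an assumption.
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = encoder.encode(victim_texts)  # shape: (n_texts, dim)

    # Cluster the embeddings to obtain pseudo-labels in place of victim-model queries.
    pseudo_labels = KMeans(n_clusters=num_classes, n_init=10).fit_predict(embeddings)

    # Train a lightweight shadow classifier on the pseudo-labelled data; adversarial
    # examples would then be crafted against this surrogate instead of the victim model.
    shadow = LogisticRegression(max_iter=1000).fit(embeddings, pseudo_labels)
    return encoder, shadow
```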
- LLM-Safety Evaluations Lack Robustness [58.334290876531036]
We argue that current safety alignment research efforts for large language models are hindered by many intertwined sources of noise.
We propose a set of guidelines for reducing noise and bias in evaluations of future attack and defense papers.
arXiv Detail & Related papers (2025-03-04T12:55:07Z)
- Training Data Attribution (TDA): Examining Its Adoption & Use Cases [5.256285764938807]
This report investigates Training Data Attribution (TDA) and its potential importance to and tractability for reducing extreme risks from AI.
We discuss how plausible it is, and how much effort it would take, to bring existing TDA research from its current state to an efficient and accurate tool for TDA inference.
We list and discuss a series of policies and systems that may be enabled by TDA.
arXiv Detail & Related papers (2025-01-22T05:03:51Z)
- Unveiling the Superior Paradigm: A Comparative Study of Source-Free Domain Adaptation and Unsupervised Domain Adaptation [52.36436121884317]
We show that Source-Free Domain Adaptation (SFDA) generally outperforms Unsupervised Domain Adaptation (UDA) in real-world scenarios.
SFDA offers advantages in time efficiency, storage requirements, targeted learning objectives, reduced risk of negative transfer, and increased robustness against overfitting.
We propose a novel weight estimation method that effectively integrates available source data into multi-SFDA approaches.
arXiv Detail & Related papers (2024-11-24T13:49:29Z)
- Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector [97.92369017531038]
We build a new laRge-scale Adversarial images dataset with Diverse hArmful Responses (RADAR).
We then develop a novel iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector distilled from the hidden states of Visual Language Models (VLMs) to detect adversarial images in the input and distinguish them from benign ones.
arXiv Detail & Related papers (2024-10-30T10:33:10Z)
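The entry above says NEARSIDE distils a single vector from VLM hidden states and uses it to separate adversarial from benign inputs. The sketch below illustrates one plausible reading of that idea: take the difference of class-mean hidden-state features as the detection direction and threshold the projection. Extracting features from the VLM's hidden states is assumed to happen elsewhere, and the mean-difference direction and midpoint threshold are assumptions, not the paper's exact method.

```python
# Hedged sketch: fit a single detection direction from labelled feature sets and
# score new inputs by projecting onto it.

import numpy as np


def fit_detection_vector(benign_feats: np.ndarray, adv_feats: np.ndarray):
    """Both arrays have shape (n_samples, dim); returns (direction, threshold)."""
    direction = adv_feats.mean(axis=0) - benign_feats.mean(axis=0)
    direction /= np.linalg.norm(direction)
    # Place the threshold midway between the two class means along the direction.
    threshold = 0.5 * (adv_feats @ direction).mean() + 0.5 * (benign_feats @ direction).mean()
    return direction, threshold


def is_adversarial(feat: np.ndarray, direction: np.ndarray, threshold: float) -> bool:
    """Flag an input as adversarial if its projection exceeds the threshold."""
    return float(feat @ direction) > threshold
```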
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- ToDA: Target-oriented Diffusion Attacker against Recommendation System [19.546532220090793]
Recommendation systems (RS) are susceptible to malicious attacks where adversaries can manipulate user profiles, leading to biased recommendations.
Recent research often integrates additional modules using generative models to craft these deceptive user profiles.
We propose a novel Target-oriented Diffusion Attack model (ToDA)
It incorporates a pre-trained autoencoder that transforms user profiles into a high-dimensional space, paired with a Latent Diffusion Attacker (LDA), the core component of ToDA.
arXiv Detail & Related papers (2024-01-23T09:12:26Z)
- Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP [83.66405397421907]
We rethink the research paradigm of textual adversarial samples in security scenarios.
We first collect, process, and release Advbench, a collection of security datasets.
Next, we propose a simple method based on rules that can easily fulfill the actual adversarial goals to simulate real-world attack methods.
arXiv Detail & Related papers (2022-10-19T15:53:36Z)
- Exploring Adversarially Robust Training for Unsupervised Domain Adaptation [71.94264837503135]
Unsupervised Domain Adaptation (UDA) methods aim to transfer knowledge from a labeled source domain to an unlabeled target domain.
This paper explores how to enhance the robustness of unlabeled data via adversarial training (AT) while learning domain-invariant features for UDA.
We propose a novel Adversarially Robust Training method for UDA accordingly, referred to as ARTUDA.
arXiv Detail & Related papers (2022-02-18T17:05:19Z)
- A Review of Adversarial Attack and Defense for Classification Methods [78.50824774203495]
This paper focuses on the generation and guarding of adversarial examples.
It is the hope of the authors that this paper will encourage more statisticians to work on this important and exciting field of generating and defending against adversarial examples.
arXiv Detail & Related papers (2021-11-18T22:13:43Z)
- DAMIA: Leveraging Domain Adaptation as a Defense against Membership Inference Attacks [22.10053473193636]
We propose and implement DAMIA, which leverages Domain Adaptation (DA) as a defense against membership inference attacks.
Our observation is that DA obfuscates the dataset to be protected using another related dataset, and derives a model that implicitly extracts features from both datasets.
The model trained by DAMIA incurs a negligible loss in usability.
arXiv Detail & Related papers (2020-05-16T15:24:28Z)
- Improving Reliability of Latent Dirichlet Allocation by Assessing Its Stability Using Clustering Techniques on Replicated Runs [0.3499870393443268]
We study the stability of LDA by comparing assignments from replicated runs.
We propose to quantify the similarity of two generated topics by a modified Jaccard coefficient.
We show that the measure S-CLOP is useful for assessing the stability of LDA models.
arXiv Detail & Related papers (2020-02-14T07:10:18Z)
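The entry above quantifies the similarity of topics from replicated LDA runs with a modified Jaccard coefficient. As a hedged illustration, the sketch below compares the top-word distributions of two topics with a weighted (min/max) Jaccard similarity; this particular weighting, the topn cut-off, and the use of gensim are assumptions and not necessarily the exact coefficient or the S-CLOP statistic defined in the paper.

```python
# Hedged sketch: compare topics from two replicated LDA runs with a weighted
# (min/max) Jaccard similarity over their top-word probabilities.

from gensim.models import LdaModel


def topic_word_probs(lda: LdaModel, topic_id: int, topn: int = 50) -> dict:
    """Top-`topn` words of a topic with their probabilities."""
    return dict(lda.show_topic(topic_id, topn=topn))


def modified_jaccard(topic_a: dict, topic_b: dict) -> float:
    """Weighted Jaccard similarity between two topic-word distributions."""
    words = set(topic_a) | set(topic_b)
    num = sum(min(topic_a.get(w, 0.0), topic_b.get(w, 0.0)) for w in words)
    den = sum(max(topic_a.get(w, 0.0), topic_b.get(w, 0.0)) for w in words)
    return num / den if den > 0 else 0.0


def pairwise_topic_similarity(run1: LdaModel, run2: LdaModel, topn: int = 50):
    """Similarity matrix between all topic pairs of two replicated runs."""
    return [
        [modified_jaccard(topic_word_probs(run1, i, topn),
                          topic_word_probs(run2, j, topn))
         for j in range(run2.num_topics)]
        for i in range(run1.num_topics)
    ]
```

A stability analysis along the lines the entry describes would then cluster or match topics across runs using this similarity matrix.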
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.