Dark patterns in e-commerce: a dataset and its baseline evaluations
- URL: http://arxiv.org/abs/2211.06543v1
- Date: Sat, 12 Nov 2022 01:53:49 GMT
- Title: Dark patterns in e-commerce: a dataset and its baseline evaluations
- Authors: Yuki Yada, Jiaying Feng, Tsuneo Matsumoto, Nao Fukushima, Fuyuko Kido,
Hayato Yamana
- Abstract summary: We constructed a dataset for dark pattern detection with state-of-the-art machine learning methods.
As a result of 5-fold cross-validation, we achieved the highest accuracy of 0.975 with RoBERTa.
- Score: 0.14680035572775535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dark patterns are user interface designs in online services that induce
users to take unintended actions. Recently, dark patterns have been raised as
an issue of privacy and fairness, so research on detecting them is eagerly
awaited. In this work, we constructed a dataset for
dark pattern detection and prepared its baseline detection performance with
state-of-the-art machine learning methods. The original dataset was obtained
from Mathur et al.'s study in 2019, which consists of 1,818 dark pattern texts
from shopping sites. Then, we added negative samples, i.e., non-dark pattern
texts, by retrieving texts from the same websites as Mathur et al.'s dataset.
We also applied state-of-the-art machine learning methods, including BERT,
RoBERTa, ALBERT, and XLNet, to provide baseline automatic detection accuracy. As
a result of 5-fold cross-validation, we achieved the highest accuracy of 0.975
with RoBERTa. The dataset and baseline source code are available at
https://github.com/yamanalab/ec-darkpattern.
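A minimal sketch of the baseline setup described above: fine-tuning RoBERTa for binary dark pattern classification under 5-fold cross-validation. The file name "dataset.tsv" and its "text"/"label" columns are assumptions for illustration; the actual layout is in the linked repository.

```python
import pandas as pd
import torch
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

df = pd.read_csv("dataset.tsv", sep="\t")     # assumed columns: "text", "label"
tok = AutoTokenizer.from_pretrained("roberta-base")

class TextDataset(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.enc = tok(list(texts), truncation=True, padding=True)
        self.labels = list(labels)
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

accs = []
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr, te) in enumerate(folds.split(df["text"], df["label"])):
    model = AutoModelForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=2)          # binary: dark pattern or not
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"out/fold{fold}",
                               num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=TextDataset(df["text"].iloc[tr], df["label"].iloc[tr]),
    )
    trainer.train()
    preds = trainer.predict(TextDataset(df["text"].iloc[te], df["label"].iloc[te]))
    accs.append(accuracy_score(df["label"].iloc[te], preds.predictions.argmax(-1)))

print("mean 5-fold accuracy:", sum(accs) / len(accs))
```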
Related papers
- Detecting Deceptive Dark Patterns in E-commerce Platforms [0.0]
Dark patterns are deceptive user interfaces employed by e-commerce websites to manipulate users' behavior in ways that benefit the website, often unethically.
Existing solutions include UIGuard, which uses computer vision and natural language processing, and approaches that categorize dark patterns by detectability or use machine learning models trained on labeled datasets.
We propose combining web scraping techniques with fine-tuned BERT language models and generative capabilities to identify dark patterns, including outliers.
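As a rough illustration of the scraping stage, the sketch below pulls short visible text segments from a page so that a fine-tuned classifier could score each one. The URL, tag filtering, and length heuristic are assumptions for illustration, not the paper's pipeline.

```python
import requests
from bs4 import BeautifulSoup

def extract_segments(url: str) -> list[str]:
    """Collect short visible text segments that a classifier could score."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()                            # drop non-visible content
    # short UI strings (buttons, banners, countdowns) are where urgency and
    # scarcity cues typically appear
    segments = list(soup.stripped_strings)
    return [s for s in segments if 3 <= len(s.split()) <= 30]

if __name__ == "__main__":
    for seg in extract_segments("https://example.com/product")[:10]:
        print(seg)   # each segment would then be fed to the fine-tuned model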
arXiv Detail & Related papers (2024-05-27T16:32:40Z) - Why is the User Interface a Dark Pattern? : Explainable Auto-Detection
and its Analysis [1.4474137122906163]
Dark patterns are deceptive user interface designs for online services that make users behave in unintended ways.
We study interpretable dark pattern auto-detection, that is, why a particular user interface is detected as having dark patterns.
Our findings may prevent users from being manipulated by dark patterns, and aid in the construction of more equitable internet services.
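The paper's own explanation technique is not detailed in this summary. As a generic stand-in, occlusion-based saliency asks which words most support the "dark pattern" prediction by masking them one at a time; `model` and `tok` are a fine-tuned classifier and tokenizer as in the earlier sketch.

```python
import torch

def token_importance(model, tok, text: str, target: int = 1):
    """Score each word by the drop in P(target) when it is masked out."""
    model.eval()
    words = text.split()
    with torch.no_grad():
        base = torch.softmax(
            model(**tok(text, return_tensors="pt")).logits, dim=-1)[0, target]
        scores = []
        for i, w in enumerate(words):
            masked = " ".join(words[:i] + [tok.mask_token] + words[i + 1:])
            p = torch.softmax(
                model(**tok(masked, return_tensors="pt")).logits, dim=-1)[0, target]
            scores.append((w, float(base - p)))   # large drop => important word
    return sorted(scores, key=lambda s: -s[1])
```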
arXiv Detail & Related papers (2023-12-30T03:53:58Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
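A toy reading of the gradient-projection idea (my paraphrase of the summary, not the paper's implementation): remove from the unlearning update the component that would interfere with gradients computed on the retained data.

```python
import torch

def project_out(g_forget: torch.Tensor, g_retain: torch.Tensor) -> torch.Tensor:
    """Remove from g_forget its component along g_retain."""
    coeff = torch.dot(g_forget, g_retain) / (g_retain.norm() ** 2 + 1e-12)
    return g_forget - coeff * g_retain

# toy usage with flattened gradients
g_f = torch.randn(1000)   # gradient of the unlearning ("forget") objective
g_r = torch.randn(1000)   # gradient of the retained-data objective
update = project_out(g_f, g_r)
assert torch.dot(update, g_r).abs() < 1e-3   # interfering component removed
```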
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - Forecasting with Deep Learning [0.0]
This paper presents a method for time series forecasting with deep learning and its assessment on two datasets.
Deep learning networks can be trained on a single time series, provided the series contains patterns that repeat, even with some variation.
For less structured time series, such as stock market closing prices, the networks perform no better than a naive baseline that repeats the last observed value.
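The naive baseline mentioned above is trivial to state in code; any learned forecaster has to beat it to be worth the effort. The price values are illustrative.

```python
import numpy as np

def naive_forecast(series: np.ndarray, horizon: int) -> np.ndarray:
    """Predict that the next `horizon` values equal the last observed one."""
    return np.full(horizon, series[-1])

prices = np.array([101.2, 100.8, 102.5, 102.1])   # illustrative closing prices
print(naive_forecast(prices, horizon=3))          # [102.1 102.1 102.1]
```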
arXiv Detail & Related papers (2023-02-17T10:09:22Z) - On the Blind Spots of Model-Based Evaluation Metrics for Text Generation [79.01422521024834]
We explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics.
We design and synthesize a wide range of potential errors and check whether they result in a commensurate drop in the metric scores.
Our experiments reveal interesting insensitivities, biases, or even loopholes in existing metrics.
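A condensed illustration of the stress-test methodology: synthesize a meaning-flipping error and check whether the metric's score drops commensurately. A shallow token-overlap F1 stands in here for the model-based metrics the paper actually studies; the sentences are invented examples.

```python
def overlap_f1(ref: str, hyp: str) -> float:
    """Token-overlap F1: a deliberately shallow stand-in metric."""
    r, h = set(ref.lower().split()), set(hyp.lower().split())
    if not r or not h:
        return 0.0
    p, rec = len(r & h) / len(h), len(r & h) / len(r)
    return 2 * p * rec / (p + rec) if p + rec else 0.0

ref = "the checkout page clearly shows the total price"
good = "the checkout page clearly shows the total price"
bad = "the checkout page never clearly shows the total price"   # meaning flipped

# the synthesized negation flips the meaning, yet the shallow score barely
# drops -- exactly the kind of insensitivity the paper's methodology exposes
print(overlap_f1(ref, good))   # 1.00
print(overlap_f1(ref, bad))    # ~0.93
```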
arXiv Detail & Related papers (2022-12-20T06:24:25Z) - TeST: Test-time Self-Training under Distribution Shift [99.68465267994783]
Test-Time Self-Training (TeST) is a technique that takes as input a model trained on some source data and a novel data distribution at test time.
We find that models adapted using TeST significantly improve over baseline test-time adaptation algorithms.
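A paraphrase of the test-time self-training loop in code (the TeST paper's exact procedure may differ): pseudo-label confident test predictions and fine-tune the source-trained model on them. The loader is assumed to yield (inputs, labels) batches whose labels are never used.

```python
import torch
import torch.nn.functional as F

def test_time_self_train(model, test_loader, steps=100, threshold=0.9, lr=1e-4):
    """Adapt a source-trained classifier to an unlabeled test distribution."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for step, (x, _) in enumerate(test_loader):   # test labels are never used
        if step >= steps:
            break
        model.eval()
        with torch.no_grad():                     # pseudo-label the batch
            probs = F.softmax(model(x), dim=-1)
            conf, pseudo = probs.max(dim=-1)
        keep = conf > threshold                   # keep only confident samples
        if keep.any():
            model.train()
            loss = F.cross_entropy(model(x[keep]), pseudo[keep])
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```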
arXiv Detail & Related papers (2022-09-23T07:47:33Z) - Automated detection of dark patterns in cookie banners: how to do it
poorly and why it is hard to do it any other way [7.2834950390171205]
A dataset of cookie banners from 300 news websites was used to train a prediction model that detects such dark patterns automatically.
The accuracy of the trained model is promising, but leaves substantial room for improvement.
We provide an in-depth analysis of the interdisciplinary challenges that automated dark pattern detection poses to artificial intelligence.
arXiv Detail & Related papers (2022-04-21T12:10:27Z) - Double Perturbation: On the Robustness of Robustness and Counterfactual
Bias Evaluation [109.06060143938052]
We propose a "double perturbation" framework to uncover model weaknesses beyond the test dataset.
We apply this framework to study two perturbation-based approaches that are used to analyze models' robustness and counterfactual bias in English.
arXiv Detail & Related papers (2021-04-12T06:57:36Z) - Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks [75.46678178805382]
In a data poisoning attack, an attacker modifies, deletes, and/or inserts some training examples to corrupt the learned machine learning model.
We prove the intrinsic certified robustness of bagging against data poisoning attacks.
Our method achieves a certified accuracy of 91.1% on MNIST when arbitrarily modifying, deleting, and/or inserting 100 training examples.
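A sketch of the bagging construction whose robustness the paper certifies: train base learners on random subsamples and predict by majority vote, so each poisoned example can influence only the fraction of subsamples containing it. `train_fn` is any base learner; the certification math itself is in the paper.

```python
import random
from collections import Counter

def bagging_predict(train_set, x, train_fn, n_models=50, subsample=100, seed=0):
    """Majority vote over base models trained on random subsamples."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # sample with replacement; each poisoned example only appears in a
        # fraction of the subsamples, which bounds its influence on the vote
        sample = [rng.choice(train_set) for _ in range(subsample)]
        model = train_fn(sample)        # train_fn: list of examples -> callable
        votes.append(model(x))
    return Counter(votes).most_common(1)[0][0]
```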
arXiv Detail & Related papers (2020-08-11T03:12:42Z) - Evaluating Models' Local Decision Boundaries via Contrast Sets [119.38387782979474]
We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data.
We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets.
Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets.
arXiv Detail & Related papers (2020-04-06T14:47:18Z)