Measuring risks inherent to our digital economies using Amazon purchase histories from US consumers
- URL: http://arxiv.org/abs/2502.18774v1
- Date: Wed, 26 Feb 2025 03:06:04 GMT
- Title: Measuring risks inherent to our digital economies using Amazon purchase histories from US consumers
- Authors: Alex Berke, Kent Larson, Sandy Pentland, Dana Calacci
- Abstract summary: We show that purchases for pickles and trampolines risk revealing clues about customers' personal attributes - in this case, their race. This work provides the first open analysis measuring these risks, using purchase histories crowdsourced from (N=4248) US Amazon.com customers. We demonstrate how easily consumers' personal attributes, such as health and lifestyle information, gender, age, and race, can be inferred from purchases. To better understand the risks that highly resourced firms like Amazon, data brokers, and advertisers present to consumers, we measure how our models' predictive power scales with more data.
- Score: 6.090518383948741
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: What do pickles and trampolines have in common? In this paper we show that while purchases for these products may seem innocuous, they risk revealing clues about customers' personal attributes - in this case, their race. As online retail and digital purchases become increasingly common, consumer data has become increasingly valuable, raising the risks of privacy violations and online discrimination. This work provides the first open analysis measuring these risks, using purchase histories crowdsourced from (N=4248) US Amazon.com customers and survey data on their personal attributes. With this limited sample and simple models, we demonstrate how easily consumers' personal attributes, such as health and lifestyle information, gender, age, and race, can be inferred from purchases. For example, our models achieve AUC values over 0.9 for predicting gender and over 0.8 for predicting diabetes status. To better understand the risks that highly resourced firms like Amazon, data brokers, and advertisers present to consumers, we measure how our models' predictive power scales with more data. Finally, we measure and highlight how different product categories contribute to inference risk in order to make our findings more interpretable and actionable for future researchers and privacy advocates.
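The abstract reports results as AUC values (area under the ROC curve). For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC can be computed from binary labels and model scores using the rank-sum identity. It is not the authors' code: the function name `auc` and the toy labels and scores are hypothetical, chosen only for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: list of 0/1 values (1 = customer has the attribute).
    scores: list of model scores, higher meaning "more likely 1".
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data, not taken from the paper.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which gives context for the values over 0.9 and 0.8 quoted above.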
Related papers
- Evaluating Amazon Effects and the Limited Impact of COVID-19 With Purchases Crowdsourced from US Consumers [42.80166440735519]
We leverage a recently published dataset of Amazon purchase histories, crowdsourced from thousands of US consumers. We study how online purchasing behaviors have changed over time, how changes vary across demographic groups, the impact of the COVID-19 pandemic, and relationships between online and offline retail.
arXiv Detail & Related papers (2025-01-17T23:03:56Z) - How Unique is Whose Web Browser? The role of demographics in browser fingerprinting among US users [50.699390248359265]
Browser fingerprinting can be used to identify and track users across the Web, even without cookies.
This technique and resulting privacy risks have been studied for over a decade.
We provide a first-of-its-kind dataset to enable further research.
arXiv Detail & Related papers (2024-10-09T14:51:58Z) - Language Models Can Reduce Asymmetry in Information Markets [100.38786498942702]
We introduce an open-source simulated digital marketplace where intelligent agents, powered by language models, buy and sell information on behalf of external participants.
The central mechanism enabling this marketplace is the agents' dual capabilities: they have the capacity to assess the quality of privileged information but also come equipped with the ability to forget.
To perform well, agents must make rational decisions, strategically explore the marketplace through generated sub-queries, and synthesize answers from purchased information.
arXiv Detail & Related papers (2024-03-21T14:48:37Z) - Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423]
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
arXiv Detail & Related papers (2023-04-06T05:20:16Z) - Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z) - Characterization of Frequent Online Shoppers using Statistical Learning with Sparsity [54.26540039514418]
This work reports a method to learn the shopping preferences of frequent shoppers to an online gift store by combining ideas from retail analytics and statistical learning with sparsity.
arXiv Detail & Related papers (2021-11-11T05:36:39Z) - Predicting Customer Lifetime Values -- ecommerce use case [0.0]
This work compares two approaches to predict customer future purchases, first using a 'buy-till-you-die' statistical model to predict customer behavior and later using a neural network on the same dataset and comparing the results.
arXiv Detail & Related papers (2021-02-10T23:17:16Z) - Face to Purchase: Predicting Consumer Choices with Structured Facial and Behavioral Traits Embedding [53.02059906193556]
We propose to predict consumers' purchases based on their facial features and purchasing histories.
We design a semi-supervised model based on a hierarchical embedding network to extract high-level features of consumers.
Our experimental results on a real-world dataset demonstrate the positive effect of incorporating facial information in predicting consumers' purchasing behaviors.
arXiv Detail & Related papers (2020-07-14T06:06:41Z) - The wisdom of the few: Predicting collective success from individual behavior [0.0]
Small sets of "discoverers" offer reliable success predictions for the brick-and-mortar stores they visit.
We find that the purchasing history alone enables the detection of small sets of "discoverers".
Our findings show that companies and organizations with access to large-scale purchasing data can detect the discoverers and leverage their behavior to anticipate market trends.
arXiv Detail & Related papers (2020-01-14T13:52:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.