When Ads Become Profiles: Large-Scale Audit of Algorithmic Biases and LLM Profiling Risks
- URL: http://arxiv.org/abs/2509.18874v1
- Date: Tue, 23 Sep 2025 10:10:37 GMT
- Title: When Ads Become Profiles: Large-Scale Audit of Algorithmic Biases and LLM Profiling Risks
- Authors: Baiyu Chen, Benjamin Tag, Hao Xue, Daniel Angus, Flora Salim
- Abstract summary: Automated ad targeting on social media is opaque, creating risks of exploitation and invisibility to external scrutiny. We introduce a multi-stage auditing framework to investigate these risks. A large-scale audit of over 435,000 ad impressions delivered to 891 Australian Facebook users reveals algorithmic biases.
- Score: 10.267951162011475
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automated ad targeting on social media is opaque, creating risks of exploitation and invisibility to external scrutiny. Users may be steered toward harmful content while independent auditing of these processes remains blocked. Large Language Models (LLMs) raise a new concern: the potential to reverse-engineer sensitive user attributes from exposure alone. We introduce a multi-stage auditing framework to investigate these risks. First, a large-scale audit of over 435,000 ad impressions delivered to 891 Australian Facebook users reveals algorithmic biases, including disproportionate Gambling and Politics ads shown to socioeconomically vulnerable and politically aligned groups. Second, a multimodal LLM can reconstruct users' demographic profiles from ad streams, outperforming census-based baselines and matching or exceeding human performance. Our results provide the first empirical evidence that ad streams constitute rich digital footprints for public AI inference, highlighting urgent privacy risks and the need for content-level auditing and governance.
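The first audit stage lends itself to a compact illustration. The sketch below computes per-group exposure shares for a sensitive ad category and tests whether ad category depends on user group; the column names, toy data, and chi-square test are illustrative assumptions, not the paper's actual schema or statistical methodology.

```python
# Minimal sketch of a content-level exposure audit: does the share of a
# sensitive ad category (e.g. Gambling) differ across user groups?
# Column names and data are illustrative, not the paper's schema.
import pandas as pd
from scipy.stats import chi2_contingency

impressions = pd.DataFrame({
    "user_id":     [1, 1, 2, 2, 3, 3, 4, 4],
    "group":       ["low_SES", "low_SES", "low_SES", "low_SES",
                    "high_SES", "high_SES", "high_SES", "high_SES"],
    "ad_category": ["Gambling", "Retail", "Gambling", "Gambling",
                    "Retail", "Retail", "Gambling", "Retail"],
})

# Contingency table: impression counts per group x category.
table = pd.crosstab(impressions["group"], impressions["ad_category"])

# Share of Gambling impressions per group (the "disproportionate exposure" rate).
rates = table["Gambling"] / table.sum(axis=1)
print(rates)

# Chi-square test of independence between group and ad category.
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.3f}")
```

In the paper's setting, the groups would presumably come from the panelists' survey-reported demographics and the categories from content-level labels on the collected ads.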
Related papers
- (Mis-)Informed Consent: Predatory Apps and the Exploitation of Populations with Limited Literacy [1.5370108793508594]
This paper examines how informed consent is often abused by predatory financial applications. We analyze a dataset of 50 Google Play Store apps to measure how many omit or obfuscate critical privacy disclosures. Our findings show that 85% of study participants did not understand basic app permissions.
arXiv Detail & Related papers (2026-01-16T20:23:33Z)
- SoK: Privacy Risks and Mitigations in Retrieval-Augmented Generation Systems [53.51921540246166]
Retrieval-Augmented Generation (RAG) techniques have become widely popular. RAG couples Large Language Models (LLMs) with domain-specific knowledge bases. The proliferation of RAG has sparked concerns about data privacy.
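To see why this coupling raises privacy concerns, consider a minimal sketch of the pipeline: retrieved knowledge-base passages are pasted verbatim into the generation prompt, so any private record in the knowledge base can flow directly toward model output. The toy bag-of-words retriever below stands in for a real embedding model; everything here is illustrative, not any specific RAG system.

```python
# Minimal RAG sketch: the retrieved document flows, unfiltered, into the
# LLM prompt -- the point at which private knowledge-base content can leak.
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding" standing in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

knowledge_base = [
    "Patient 4711 was diagnosed with condition X in March.",   # private record
    "Condition X is commonly treated with physiotherapy.",
]

query = "How is condition X treated?"
q = embed(query)
best = max(knowledge_base, key=lambda doc: cosine(q, embed(doc)))

# Retrieved text is concatenated into the prompt verbatim.
prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```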
arXiv Detail & Related papers (2026-01-07T14:50:41Z)
- When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms [101.2197679948061]
We study the risks of collective financial fraud in large-scale multi-agent systems powered by large language model (LLM) agents. We present MultiAgentFraudBench, a large-scale benchmark for simulating financial fraud scenarios.
arXiv Detail & Related papers (2025-11-09T16:30:44Z)
- Disclosure Audits for LLM Agents [44.27620230177312]
Large Language Model agents have begun to appear as personal assistants, customer service bots, and clinical aides. This study proposes an auditing framework for conversational privacy that quantifies and audits the disclosure risks these agents pose.
arXiv Detail & Related papers (2025-06-11T20:47:37Z)
- Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation [17.859942323017133]
We provide the first systematic analysis of MRAG privacy vulnerabilities across vision-language and speech-language modalities. Our experiments reveal that LMMs can both directly generate outputs resembling retrieved content and produce descriptions that indirectly expose sensitive information.
arXiv Detail & Related papers (2025-05-20T05:37:22Z)
- Automated Profile Inference with Language Model Agents [67.32226960040514]
We study a new threat that LLMs pose to online pseudonymity, called automated profile inference. An adversary can instruct LLMs to automatically scrape and extract sensitive personal attributes from publicly visible user activities on pseudonymous platforms. We introduce an automated profiling framework called AutoProfiler to assess the feasibility of such threats in real-world scenarios.
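The threat shape is simple to sketch: aggregate an account's public posts into one prompt and ask a model to infer sensitive attributes. The snippet below only assembles such a prompt; the posts and attribute list are invented for illustration, and this is not the paper's AutoProfiler agent (no platform scraping or model API is invoked).

```python
# Sketch of the profile-inference threat shape: publicly visible posts are
# aggregated into a single prompt asking an LLM to guess sensitive attributes.
# Posts and attributes are illustrative; no real platform or model is called.
public_posts = [
    "Another 6am shift at the warehouse, third one this week.",
    "Can't wait for the footy finals back home in Melbourne!",
    "Anyone know cheap childcare near the western suburbs?",
]

attributes = ["age range", "gender", "location", "occupation", "parental status"]

prompt = (
    "Below are public posts from one pseudonymous account.\n"
    "For each attribute, infer the most likely value and give a confidence.\n\n"
    "Attributes: " + ", ".join(attributes) + "\n\nPosts:\n"
    + "\n".join(f"- {p}" for p in public_posts)
)
print(prompt)
```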
arXiv Detail & Related papers (2025-05-18T13:05:17Z)
- Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication.
In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM systems have already achieved human-level or even super-human persuasiveness.
Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z)
- Auditing for Bias in Ad Delivery Using Inferred Demographic Attributes [50.37313459134418]
We study the effects of inference error on auditing for bias in one prominent application: black-box audit of ad delivery using paired ads. We propose a way to mitigate the inference error when evaluating skew in ad delivery algorithms.
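To make the problem concrete: when group membership is assigned by an imperfect classifier, the observed exposure rate is biased. One standard correction, the Rogan-Gladen estimator, is sketched below as an example of how such error can be mitigated; it is a generic illustration, not necessarily the paper's estimator.

```python
# Illustration of why inference error matters when auditing skew, plus a
# standard misclassification correction (Rogan-Gladen). Generic example,
# not necessarily the mitigation proposed in the paper.
def corrected_rate(observed_rate, sensitivity, specificity):
    """Recover the true rate from one measured with an imperfect
    classifier (sensitivity = true-positive rate, specificity = true-negative rate)."""
    return (observed_rate + specificity - 1) / (sensitivity + specificity - 1)

# Observed share of impressions attributed to a group by an inferred
# demographic attribute with 90% sensitivity and 85% specificity.
observed = 0.30
true_rate = corrected_rate(observed, sensitivity=0.90, specificity=0.85)
print(f"observed={observed:.2f}, corrected={true_rate:.2f}")  # corrected=0.20
```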
arXiv Detail & Related papers (2024-10-30T18:57:03Z)
- Mathematical Framework for Online Social Media Auditing [5.384630221560811]
Social media platforms (SMPs) leverage algorithmic filtering (AF) as a means of selecting the content that constitutes a user's feed with the aim of maximizing their rewards.
Selectively choosing which content appears in a user's feed can exert influence, minor or major, on the user's decision-making.
We mathematically formalize this framework and use it to construct a data-driven statistical auditing procedure, with sample complexity guarantees, that keeps AF from deflecting users' beliefs over time.
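For a flavor of what such sample complexity guarantees typically look like, below is a generic Hoeffding-style bound; the notation is ours, not the paper's, and the paper's actual guarantees may take a different form.

```latex
% Generic Hoeffding-style bound, of the kind such sample-complexity
% guarantees typically take (notation ours, not the paper's).
% With n i.i.d. bounded observations of a belief statistic \mu, the
% empirical mean \hat{\mu}_n satisfies
%   P(|\hat{\mu}_n - \mu| \ge \epsilon) \le 2\exp(-2 n \epsilon^2),
% so auditing \mu to tolerance \epsilon with confidence 1-\delta requires
\[
  n \;\ge\; \frac{1}{2\epsilon^{2}} \,\ln\frac{2}{\delta}.
\]
```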
arXiv Detail & Related papers (2022-09-12T19:04:14Z)
- Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest [70.02478301291264]
Social media platforms curate access to information and opportunities, and so play a critical role in shaping public discourse.
Prior studies have used black-box methods to show that these algorithms can lead to biased or discriminatory outcomes.
We propose a new method for platform-supported auditing that can meet the goals of the proposed legislation.
arXiv Detail & Related papers (2022-07-18T17:32:35Z)
- Joint Multisided Exposure Fairness for Recommendation [76.75990595228666]
This paper formalizes a family of exposure fairness metrics that model the problem jointly from the perspective of both the consumers and producers.
Specifically, we consider group attributes for both types of stakeholders to identify and mitigate fairness concerns that go beyond individual users and items towards more systemic biases in recommendation.
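The shared ingredient of such metrics is position-discounted exposure aggregated on both the consumer and the producer side. The toy computation below illustrates only that ingredient; the grouping, discount, and data are illustrative assumptions, not the paper's formal metric family.

```python
# Toy computation of group-level exposure on both sides of a recommender:
# ranked impressions receive log-discounted exposure, aggregated per
# consumer group and per producer (item) group. Data are illustrative.
import math
from collections import defaultdict

# (consumer_group, item_group, rank) for each impression in ranked feeds.
impressions = [
    ("women", "indie", 1), ("women", "major", 2), ("women", "major", 3),
    ("men",   "major", 1), ("men",   "major", 2), ("men",   "indie", 3),
]

def discount(rank):
    # Log-discounted exposure for a ranked slot, as in DCG-style metrics.
    return 1.0 / math.log2(rank + 1)

consumer_exp = defaultdict(float)   # exposure of each item group, per consumer group
producer_exp = defaultdict(float)   # total exposure received by each item group
for cg, ig, rank in impressions:
    consumer_exp[(cg, ig)] += discount(rank)
    producer_exp[ig] += discount(rank)

print(dict(consumer_exp))  # e.g. how much "indie" exposure each consumer group sees
print(dict(producer_exp))  # how total exposure splits across producer groups
```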
arXiv Detail & Related papers (2022-04-29T19:13:23Z)
- Adverse Media Mining for KYC and ESG Compliance [2.381399746981591]
Adverse media (negative news) screening is crucial for identifying non-financial risks.
We present an automated system that conducts both real-time and batch searches of adverse media for users' queries.
Our scalable, machine-learning-driven approach to high-precision adverse news filtering is based on four perspectives.
arXiv Detail & Related papers (2021-10-22T01:04:16Z)