Mitigating fairwashing using Two-Source Audits
- URL: http://arxiv.org/abs/2305.13883v2
- Date: Tue, 10 Jun 2025 12:30:27 GMT
- Title: Mitigating fairwashing using Two-Source Audits
- Authors: Jade Garcia Bourrée, Erwan Le Merrer, Gilles Tredan, Benoît Rottembourg
- Abstract summary: We propose a more pragmatic approach with the \textit{Two-Source Audit} setup. While still leveraging the API, we advocate for the addition of a second source of data, used both to audit the platform and to detect fairwashing attempts.
- Score: 2.699900017799093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent legislation requires online platforms to provide dedicated APIs to assess the compliance of their decision-making algorithms with the law. Research has nevertheless shown that the auditors of such platforms are prone to manipulation (a practice referred to as \textit{fairwashing}). To address this salient problem, recent work has considered audits under the assumption of partial knowledge of the platform's internal mechanisms. In this paper, we propose a more pragmatic approach with the \textit{Two-Source Audit} setup: while still leveraging the API, we advocate for the addition of a second source of data, used both to audit the platform and to detect fairwashing attempts. Our method is based on identifying discrepancies between the two data sources, using data proxies common in the fairness literature. We formally demonstrate the conditions for success in this fairwashing mitigation task. We then validate our method empirically, demonstrating that Two-Source Audits can achieve a Pareto-optimal balance between the two objectives. We believe this paper sets the stage for reliable audits in manipulation-prone setups, under mild assumptions.
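For intuition, here is a minimal sketch of the kind of two-source discrepancy check the abstract describes, using demographic parity as the fairness proxy. It is an illustration only: the inputs (decisions collected through the platform API plus observations from an independent second source, each with a binary protected attribute), the function names, and the thresholds are assumptions made for this sketch, not the authors' exact procedure.

```python
# Hypothetical two-source audit sketch; not the paper's actual algorithm.
import numpy as np


def demographic_parity_gap(decisions: np.ndarray, group: np.ndarray) -> float:
    """Absolute difference in positive-decision rates between the two groups."""
    return abs(decisions[group == 0].mean() - decisions[group == 1].mean())


def two_source_audit(api_decisions, api_group, second_decisions, second_group,
                     unfairness_threshold=0.1, discrepancy_tolerance=0.05):
    """Audit the platform and flag a possible fairwashing attempt.

    The same fairness proxy is computed on both data sources; a large gap
    between the two estimates is treated as evidence that the API answers
    were manipulated, in which case the verdict relies on the second source.
    """
    gap_api = demographic_parity_gap(api_decisions, api_group)
    gap_second = demographic_parity_gap(second_decisions, second_group)

    fairwashing_suspected = abs(gap_api - gap_second) > discrepancy_tolerance
    gap_used = gap_second if fairwashing_suspected else gap_api
    return {
        "fairwashing_suspected": fairwashing_suspected,
        "compliant": gap_used <= unfairness_threshold,
        "gap_api": gap_api,
        "gap_second_source": gap_second,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    group = rng.integers(0, 2, size=5_000)
    # Simulated platform: biased decisions in the wild, "fairwashed" API answers.
    second_source = rng.binomial(1, np.where(group == 0, 0.7, 0.4))
    api_answers = rng.binomial(1, np.where(group == 0, 0.55, 0.52))
    print(two_source_audit(api_answers, group, second_source, group))
```

In such a sketch, the assumed `discrepancy_tolerance` knob governs how readily a gap between the two estimates is flagged as fairwashing, which is one place the trade-off between manipulation detection and audit accuracy (the Pareto balance mentioned in the abstract) would surface.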
Related papers
- CLAIM: An Intent-Driven Multi-Agent Framework for Analyzing Manipulation in Courtroom Dialogues [0.0]
Despite the growing advancements in NLP, its application in detecting and analyzing manipulation within the legal domain remains largely unexplored. Our work addresses this gap by introducing LegalCon, a dataset of 1,063 annotated courtroom conversations labeled for manipulation detection. We propose CLAIM, a two-stage, intent-driven multi-agent framework designed to enhance manipulation analysis by enabling context-aware and informed decision-making.
arXiv Detail & Related papers (2025-06-04T16:22:59Z) - Robust ML Auditing using Prior Knowledge [3.513282443657269]
Audit manipulation occurs when a platform deliberately alters its answers to a regulator to pass an audit without modifying its answers to other users. This paper introduces a novel approach to manipulation-proof auditing by taking into account the auditor's prior knowledge of the task solved by the platform.
arXiv Detail & Related papers (2025-05-07T20:46:48Z) - Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [60.881609323604685]
Large Language Models (LLMs) accessed via black-box APIs introduce a trust challenge.
Users pay for services based on advertised model capabilities, but providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs.
This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking.
arXiv Detail & Related papers (2025-04-07T03:57:41Z) - P2NIA: Privacy-Preserving Non-Iterative Auditing [5.619344845505019]
The emergence of AI legislation has increased the need to assess the ethical compliance of high-risk AI systems. Traditional auditing methods rely on platforms' application programming interfaces (APIs). We present P2NIA, a novel auditing scheme that proposes a mutually beneficial collaboration for both the auditor and the platform.
arXiv Detail & Related papers (2025-04-01T15:04:58Z) - Fine Grained Insider Risk Detection [0.0]
We present a method to detect departures from business-justified workflows among support agents.
We apply our method to help audit millions of actions of over three thousand support agents.
arXiv Detail & Related papers (2024-11-04T22:07:38Z) - FIRE: Fact-checking with Iterative Retrieval and Verification [63.67320352038525]
FIRE is a novel framework that integrates evidence retrieval and claim verification in an iterative manner.
It achieves slightly better performance while reducing large language model (LLM) costs by an average of 7.6 times and search costs by 16.5 times.
These results indicate that FIRE holds promise for application in large-scale fact-checking operations.
arXiv Detail & Related papers (2024-10-17T06:44:18Z) - DeepREST: Automated Test Case Generation for REST APIs Exploiting Deep Reinforcement Learning [5.756036843502232]
This paper introduces DeepREST, a novel black-box approach for automatically testing REST APIs.
It leverages deep reinforcement learning to uncover implicit API constraints, that is, constraints hidden from API documentation.
Our empirical validation suggests that the proposed approach is very effective in achieving high test coverage and fault detection.
arXiv Detail & Related papers (2024-08-16T08:03:55Z) - Auditing Differential Privacy Guarantees Using Density Estimation [3.830092569453011]
We present a novel method for accurately auditing the differential privacy guarantees of DP mechanisms.
In particular, our solution is applicable to auditing DP guarantees of machine learning (ML) models.
arXiv Detail & Related papers (2024-06-07T10:52:15Z) - Pragmatic auditing: a pilot-driven approach for auditing Machine Learning systems [5.26895401335509]
We present an audit procedure that extends the AI-HLEG guidelines published by the European Commission.
Our audit procedure is based on an ML lifecycle model that explicitly focuses on documentation, accountability, and quality assurance.
We describe two pilots conducted on real-world use cases from two different organisations.
arXiv Detail & Related papers (2024-05-21T20:40:37Z) - Trustless Audits without Revealing Data or Models [49.23322187919369]
We show that it is possible to allow model providers to keep their model weights (but not architecture) and data secret while allowing other parties to trustlessly audit model and data properties.
We do this by designing a protocol called ZkAudit in which model providers publish cryptographic commitments of datasets and model weights.
arXiv Detail & Related papers (2024-04-06T04:43:06Z) - Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z) - Under manipulations, are some AI models harder to audit? [2.699900017799093]
We study the feasibility of robust audits in realistic settings, in which models exhibit large capacities.
We first prove a constraining result: if a web platform uses models that may fit any data, no audit strategy can outperform random sampling.
We then relate the manipulability of audits to the capacity of the targeted models, using the Rademacher complexity.
arXiv Detail & Related papers (2024-02-14T09:38:09Z) - On the Detection of Reviewer-Author Collusion Rings From Paper Bidding [71.43634536456844]
Collusion rings pose a major threat to the peer-review systems of computer science conferences.
One approach to solve this problem would be to detect the colluding reviewers from their manipulated bids.
However, no research has yet established that detecting collusion rings is even possible.
arXiv Detail & Related papers (2024-02-12T18:12:09Z) - The Decisive Power of Indecision: Low-Variance Risk-Limiting Audits and Election Contestation via Marginal Mark Recording [51.82772358241505]
Risk-limiting audits (RLAs) are techniques for verifying the outcomes of large elections.
We define new families of audits that improve efficiency and offer advances in statistical power.
New audits are enabled by revisiting the standard notion of a cast-vote record so that it can declare multiple possible mark interpretations.
arXiv Detail & Related papers (2024-02-09T16:23:54Z) - Exploring API Behaviours Through Generated Examples [0.768721532845575]
We present an approach to automatically generate relevant examples of behaviours of an API.
Our method can produce small and relevant examples that can help engineers to understand the system under exploration.
arXiv Detail & Related papers (2023-08-29T11:05:52Z) - Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms give tight privacy estimates only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
arXiv Detail & Related papers (2023-02-15T21:40:33Z) - Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP [64.45845091719002]
Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct.
This paper seeks to clarify the current situation and plot a course for meaningful progress in fair learning.
arXiv Detail & Related papers (2023-02-11T14:54:00Z) - REaaS: Enabling Adversarially Robust Downstream Classifiers via Robust Encoder as a Service [67.0982378001551]
We show how a service provider pre-trains an encoder and then deploys it as a cloud service API.
A client queries the cloud service API to obtain feature vectors for its training/testing inputs.
We show that the cloud service only needs to provide two APIs to enable a client to certify the robustness of its downstream classifier.
arXiv Detail & Related papers (2023-01-07T17:40:11Z) - Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest [70.02478301291264]
Social media platforms curate access to information and opportunities, and so play a critical role in shaping public discourse.
Prior studies have used black-box methods to show that these algorithms can lead to biased or discriminatory outcomes.
We propose a new method for platform-supported auditing that can meet the goals of the proposed legislation.
arXiv Detail & Related papers (2022-07-18T17:32:35Z) - Algorithmic audits of algorithms, and the law [3.9103337761169943]
We focus on external audits that are conducted by interacting with the user side of the target algorithm.
The legal framework in which these audits take place is mostly ambiguous to researchers developing them.
This article highlights the relation of current audits with law, in order to structure the growing field of algorithm auditing.
arXiv Detail & Related papers (2022-02-15T14:20:53Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.