Related papers: Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing

URL: http://arxiv.org/abs/2106.05831v3
Date: Mon, 25 Apr 2022 13:14:45 GMT
Title: Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing
Authors: Roberto Ulloa and Mykola Makhortykh and Aleksandra Urman
Abstract summary: We set up experiments for eight search engines with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections. We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time.
Score: 68.8204255655161
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Algorithm audits have increased in recent years due to a growing need to independently assess the performance of automatically curated services that process, filter, and rank the large and dynamic amount of information available on the internet. Among several methodologies to perform such audits, virtual agents stand out because they offer the ability to perform systematic experiments, simulating human behaviour without the associated costs of recruiting participants. Motivated by the importance of research transparency and replicability of results, this paper focuses on the challenges of such an approach. It provides methodological details, recommendations, lessons learned, and limitations based on our experience of setting up experiments for eight search engines (including main, news, image and video sections) with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections, with diverse experimental designs, and point to different changes and strategies that improve the quality of the method. We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time, and we hope that this paper can serve as a basis for further research in this area.

Related papers

Let the Barbarians In: How AI Can Accelerate Systems Performance Research [80.43506848683633]
We term this iterative cycle of generation, evaluation, and refinement AI-Driven Research for Systems. We demonstrate that ADRS-generated solutions can match or even outperform human state-of-the-art designs.
arXiv Detail & Related papers (2025-12-16T18:51:23Z)
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
An experimental survey and Perspective View on Meta-Learning for Automated Algorithms Selection and Parametrization [0.0]
We provide an overview of the state of the art in this continuously evolving field. AutoML makes machine learning techniques accessible to domain scientists who are interested in applying advanced analytics.
arXiv Detail & Related papers (2025-04-08T16:51:22Z)
Multimodal Machine Learning for Real Estate Appraisal: A Comprehensive Survey [8.250749654561423]
A novel approach to automated valuation, multimodal machine learning, has taken shape. Multimodal machine learning significantly outperforms single-modality or fewer-modality approaches in terms of prediction accuracy.
arXiv Detail & Related papers (2025-03-28T03:47:06Z)
Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review [50.67937325077047]
This paper is devoted to a comprehensive review of realizing the sample efficiency and generalization of RL algorithms through transfer and inverse reinforcement learning (T-IRL). Our findings indicate that a majority of recent research works have dealt with the aforementioned challenges by utilizing human-in-the-loop and sim-to-real strategies. Under the IRL structure, training schemes that require a low number of experience transitions and extension of such frameworks to multi-agent and multi-intention problems have been the priority of researchers in recent years.
arXiv Detail & Related papers (2024-11-15T15:18:57Z)
Learning to Rank for Multiple Retrieval-Augmented Models through Iterative Utility Maximization [21.115495457454365]
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents. We introduce an iterative approach where the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase. We adapt this approach to an online setting, allowing the search engine to refine its behavior based on real-time feedback from individual agents.
arXiv Detail & Related papers (2024-10-13T17:53:50Z)
Performance Evaluation in Multimedia Retrieval [7.801919915773585]
Performance evaluation in multimedia retrieval relies heavily on retrieval experiments. These can involve human-in-the-loop and machine-only settings for the retrieval process itself and the subsequent verification of results. We present a formal model to express all relevant aspects of such retrieval experiments, as well as a flexible open-source evaluation infrastructure.
arXiv Detail & Related papers (2024-10-09T08:06:15Z)
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? [1.9116784879310031]
In deep Reinforcement Learning (RL), value functions are approximated using deep neural networks and trained via mean squared error regression objectives. Recent research has proposed an alternative approach, utilizing the cross-entropy classification objective. Our work seeks to empirically investigate the impact of such a replacement in an offline RL setup.
arXiv Detail & Related papers (2024-06-10T14:25:11Z)
A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research. Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
Process Variant Analysis Across Continuous Features: A Novel Framework [0.0]
This research addresses the challenge of effectively segmenting cases within operational processes. We present a novel approach employing a sliding window technique combined with the earth mover's distance to detect changes in control flow behavior. We validate our methodology through a real-life case study in collaboration with UWV, the Dutch employee insurance agency.
arXiv Detail & Related papers (2024-05-06T16:10:13Z)
Multiobjective Optimization Analysis for Finding Infrastructure-as-Code Deployment Configurations [0.3774866290142281]
This paper is focused on a multiobjective problem related to Infrastructure-as-Code deployment configurations. We resort in this paper to nine different evolutionary-based multiobjective algorithms. Results obtained by each method after 10 independent runs have been compared using Friedman's non-parametric tests.
arXiv Detail & Related papers (2024-01-18T13:55:32Z)
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains. This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset. We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision. We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.