Competition over data: how does data purchase affect users?
- URL: http://arxiv.org/abs/2201.10774v1
- Date: Wed, 26 Jan 2022 06:44:55 GMT
- Title: Competition over data: how does data purchase affect users?
- Authors: Yongchan Kwon, Antonio Ginart, James Zou
- Abstract summary: We study what happens when the competing predictors can acquire additional labeled data to improve their prediction quality.
We show that this phenomenon naturally arises due to a trade-off whereby competition pushes each predictor to specialize in a subset of the population.
- Score: 15.644822986029377
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: As machine learning (ML) is deployed by many competing service providers, the
underlying ML predictors also compete against each other, and it is
increasingly important to understand the impacts and biases from such
competition. In this paper, we study what happens when the competing predictors
can acquire additional labeled data to improve their prediction quality. We
introduce a new environment that allows ML predictors to use active learning
algorithms to purchase labeled data within their budgets while competing
against each other to attract users. Our environment models a critical aspect
of data acquisition in competing systems which has not been well-studied
before. We found that the overall performance of an ML predictor improves when
predictors can purchase additional labeled data. Surprisingly, however, the
quality that users experience -- i.e. the accuracy of the predictor selected by
each user -- can decrease even as the individual predictors get better. We show
that this phenomenon naturally arises due to a trade-off whereby competition
pushes each predictor to specialize in a subset of the population while data
purchase has the effect of making predictors more uniform. We support our
findings with both experiments and theory.
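To make the environment concrete, below is a minimal toy simulation of the setup the abstract describes: several predictors compete for arriving users while spending a labeling budget through uncertainty-based data purchases. This is a sketch under stated assumptions, not the authors' code; the user-choice rule, the confidence threshold, and all names and parameters are illustrative.

```python
# Toy sketch (not the authors' implementation) of competing predictors
# that can purchase labeled data. Assumptions: users pick the most
# confident predictor, and a predictor buys the current label whenever
# it is uncertain and still has budget. All parameters are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
K, BUDGET, ROUNDS = 3, 100, 1000

# Synthetic population and ground-truth labels.
X = rng.normal(size=(5000, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

pools = [list(rng.choice(len(X), 30, replace=False)) for _ in range(K)]
models = [LogisticRegression(max_iter=1000).fit(X[p], y[p]) for p in pools]
budgets = [BUDGET] * K

experienced = []  # accuracy of the predictor each user actually selects
for _ in range(ROUNDS):
    i = int(rng.integers(len(X)))
    probs = [m.predict_proba(X[i:i + 1])[0] for m in models]
    chosen = int(np.argmax([p.max() for p in probs]))  # user's choice
    experienced.append(int(probs[chosen].argmax() == y[i]))
    # Active learning: buy this label if uncertain and budget remains.
    if budgets[chosen] > 0 and probs[chosen].max() < 0.8:
        budgets[chosen] -= 1
        pools[chosen].append(i)
        models[chosen] = LogisticRegression(max_iter=1000).fit(
            X[pools[chosen]], y[pools[chosen]]
        )

print("user-experienced accuracy:", np.mean(experienced))
```

Averaging the accuracy of the *selected* predictor, rather than each predictor's overall accuracy, is what lets a simulation like this exhibit the paper's trade-off: individual predictors can improve while the user-experienced quality drops.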
Related papers
- Do We Really Even Need Data? [2.3749120526936465]
Researchers increasingly use predictions from pre-trained algorithms as outcome variables.
Standard tools for inference can misrepresent the association between independent variables and the outcome of interest when the true, unobserved outcome is replaced by a predicted value.
arXiv Detail & Related papers (2024-01-14T23:19:21Z)
- Cross-Prediction-Powered Inference [15.745692520785074]
Cross-prediction is a method for valid inference powered by machine learning.
We show that cross-prediction is consistently more powerful than an adaptation of prediction-powered inference.
arXiv Detail & Related papers (2023-09-28T17:01:58Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Is augmentation effective to improve prediction in imbalanced text datasets? [3.1690891866882236]
We argue that adjusting the cutoffs without data augmentation can produce similar results to oversampling techniques.
Our findings contribute to a better understanding of the strengths and limitations of different approaches to dealing with imbalanced data.
arXiv Detail & Related papers (2023-04-20T13:07:31Z)
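As a companion to the paper above, here is a hedged sketch of the cutoff-adjustment idea: keep the imbalanced training data as-is and tune the decision threshold on held-out data instead of oversampling. The synthetic dataset, model, and threshold grid are illustrative assumptions, not the paper's setup.

```python
# Illustrative sketch: tune the classification cutoff on a validation
# split instead of oversampling the minority class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, weights=[0.95], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_val)[:, 1]

# Default 0.5 cutoff vs. the F1-optimal cutoff on the validation set.
thresholds = np.linspace(0.05, 0.95, 19)
f1s = [f1_score(y_val, (scores >= t).astype(int)) for t in thresholds]
best = thresholds[int(np.argmax(f1s))]
print("F1 at default 0.5 cutoff:", f1_score(y_val, (scores >= 0.5).astype(int)))
print(f"F1 at tuned {best:.2f} cutoff:", max(f1s))
```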
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
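The combination described above can be sketched generically: abstain below a confidence threshold (selective prediction) and spend labeling effort on the target-domain points the model is least sure about (active learning). This is a simplified sketch, not ASPEST's actual algorithm; the threshold, query size, and data are illustrative.

```python
# Generic sketch of active selective prediction (not ASPEST itself):
# abstain when confidence is below tau, and query labels for the least
# confident points in the shifted target domain.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_selective_round(model, X_target, n_queries=10, tau=0.8):
    conf = model.predict_proba(X_target).max(axis=1)
    query_idx = np.argsort(conf)[:n_queries]  # least confident = most informative
    abstain = conf < tau                      # selective prediction mask
    return query_idx, abstain

rng = np.random.default_rng(0)
X_src = rng.normal(size=(200, 5))
y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(loc=0.5, size=(100, 5))  # shifted target domain

model = LogisticRegression().fit(X_src, y_src)
queries, abstain = active_selective_round(model, X_tgt)
print("query:", queries, "| abstentions:", int(abstain.sum()))
```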
- Prediction-Powered Inference [68.97619568620709]
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.
The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients.
Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning.
arXiv Detail & Related papers (2023-01-23T18:59:28Z)
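For the mean, the construction described above reduces to a short computation: average the model's predictions on the unlabeled data, then correct that average with a "rectifier" built from labeled residuals. A minimal sketch assuming the input names below; see the paper for quantiles and regression coefficients.

```python
# Sketch of a prediction-powered estimate and confidence interval for a
# mean. Assumed inputs: yhat_unlabeled are model predictions on a large
# unlabeled set; y_labeled / yhat_labeled are gold labels and
# predictions on a small labeled set.
import numpy as np
from scipy.stats import norm

def ppi_mean_ci(yhat_unlabeled, y_labeled, yhat_labeled, alpha=0.05):
    N, n = len(yhat_unlabeled), len(y_labeled)
    rectifier = y_labeled - yhat_labeled          # corrects model bias
    theta = yhat_unlabeled.mean() + rectifier.mean()
    se = np.sqrt(yhat_unlabeled.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = norm.ppf(1 - alpha / 2)
    return theta, (theta - z * se, theta + z * se)
```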
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove in theory that this way offsets the influence of user/item propensity on the learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
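The cross-pairwise construction described above can be written compactly: for two observed interactions (u1, i1) and (u2, i2), prefer the sum of the observed scores over the sum of the crossed scores, so per-user and per-item exposure effects cancel in the difference. A hedged sketch of that core idea, not the paper's implementation:

```python
# Sketch of the core cross-pairwise ranking idea (not the official CPR
# code). Inputs s_ab are model scores for user a on item b, given as
# 1-D arrays over a batch of sampled interaction pairs.
import numpy as np

def cpr_style_loss(s11, s22, s12, s21):
    margin = (s11 + s22) - (s12 + s21)
    # softplus(-margin) == -log sigmoid(margin): push observed sums up.
    return np.mean(np.log1p(np.exp(-margin)))
```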
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
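A toy version of the setting above: each agent shares only its predictive distribution for the test point, never data or model parameters, and the group combines those distributions. The confidence weighting below is an illustrative placeholder for the paper's more principled decentralized mechanism.

```python
# Toy sketch: combine agents' local predictive distributions with
# confidence weights. Agents reveal predictions only, not data or
# parameters. The weighting scheme is an assumption for illustration.
import numpy as np

def collective_predict(agent_probs, agent_conf):
    w = np.asarray(agent_conf, dtype=float)
    w /= w.sum()                                # normalize weights
    combined = w @ np.asarray(agent_probs)      # weighted mixture over agents
    return int(combined.argmax()), combined

label, dist = collective_predict(
    agent_probs=[[0.7, 0.3], [0.4, 0.6], [0.9, 0.1]],
    agent_conf=[2.0, 1.0, 3.0],
)
print(label, dist)
```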
- Competing AI: How does competition feedback affect machine learning? [14.350250426090893]
We show that competition causes predictors to specialize for specific sub-populations at the cost of worse performance over the general population.
We show that having too few or too many competing predictors in a market can hurt the overall prediction quality.
arXiv Detail & Related papers (2020-09-15T00:13:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.