Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization?
- URL: http://arxiv.org/abs/2211.13972v1
- Date: Fri, 25 Nov 2022 09:33:11 GMT
- Title: Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization?
- Authors: Rishi Bommasani, Kathleen A. Creel, Ananya Kumar, Dan Jurafsky, Percy Liang
- Abstract summary: A recurring theme in machine learning is algorithmic monoculture: the same systems, or systems that share components, are deployed by multiple decision-makers.
We propose the component-sharing hypothesis: if decision-makers share components like training data or specific models, then they will produce more homogeneous outcomes.
We test this hypothesis on algorithmic fairness benchmarks, demonstrating that sharing training data reliably exacerbates homogenization.
We conclude with philosophical analyses of and societal challenges for outcome homogenization, with an eye towards implications for deployed machine learning systems.
- Score: 90.35044668396591
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the scope of machine learning broadens, we observe a recurring theme of
algorithmic monoculture: the same systems, or systems that share components
(e.g. training data), are deployed by multiple decision-makers. While sharing
offers clear advantages (e.g. amortizing costs), does it bear risks? We
introduce and formalize one such risk, outcome homogenization: the extent to
which particular individuals or groups experience negative outcomes from all
decision-makers. If the same individuals or groups exclusively experience
undesirable outcomes, this may institutionalize systemic exclusion and
reinscribe social hierarchy. To relate algorithmic monoculture and outcome
homogenization, we propose the component-sharing hypothesis: if decision-makers
share components like training data or specific models, then they will produce
more homogeneous outcomes. We test this hypothesis on algorithmic fairness
benchmarks, demonstrating that sharing training data reliably exacerbates
homogenization, with individual-level effects generally exceeding group-level
effects. Further, given the dominant paradigm in AI of foundation models, i.e.
models that can be adapted for myriad downstream tasks, we test whether model
sharing homogenizes outcomes across tasks. We observe mixed results: we find
that for both vision and language settings, the specific methods for adapting a
foundation model significantly influence the degree of outcome homogenization.
We conclude with philosophical analyses of and societal challenges for outcome
homogenization, with an eye towards implications for deployed machine learning
systems.
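The abstract's central notion, outcome homogenization, can be made concrete with a small illustration. The sketch below is a minimal Python example on synthetic data, not the paper's exact formalism: it compares the observed rate at which the same individuals receive negative outcomes from every decision-maker against the rate expected if the models' errors were statistically independent. All names and numbers here are hypothetical.

```python
# Illustrative sketch (not the paper's metric): quantify how often the *same*
# individuals are failed by all deployed models, relative to what independent
# errors would predict.
import numpy as np

rng = np.random.default_rng(0)
n_individuals, n_models = 1000, 3

# Hypothetical 0/1 outcome matrix: failures[i, j] = 1 if model j gives
# individual i a negative outcome (e.g., a misclassification or rejection).
failures = rng.binomial(1, 0.2, size=(n_individuals, n_models))

# Observed systemic-failure rate: individuals failed by *all* models.
observed = failures.all(axis=1).mean()

# Expected rate under independence: product of each model's failure rate.
expected = failures.mean(axis=0).prod()

# A ratio above 1 signals homogenized outcomes: the same people are being
# "picked on" more often than independent errors would predict.
homogenization_ratio = observed / expected
print(f"observed={observed:.4f}  expected={expected:.4f}  "
      f"ratio={homogenization_ratio:.2f}")
```

With independently sampled failures, as above, the ratio hovers around 1; the component-sharing hypothesis predicts that models trained on shared data or derived from a shared foundation model would produce correlated failures and hence a ratio above 1.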
Related papers
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes [72.13373216644021]
We study the societal impact of machine learning by considering the collection of models that are deployed in a given context.
We find deployed machine learning is prone to systemic failure, meaning some users are exclusively misclassified by all models available.
These examples demonstrate ecosystem-level analysis has unique strengths for characterizing the societal impact of machine learning.
arXiv Detail & Related papers (2023-07-12T01:11:52Z)
- On The Impact of Machine Learning Randomness on Group Fairness [11.747264308336012]
We investigate the impact on group fairness of different sources of randomness in training neural networks.
We show that the variance in group fairness measures is rooted in the high volatility of the learning process on under-represented groups.
We show how one can control group-level accuracy, with high efficiency and negligible impact on the model's overall performance, by simply changing the data order for a single epoch.
arXiv Detail & Related papers (2023-07-09T09:36:31Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- MultiFair: Multi-Group Fairness in Machine Learning [52.24956510371455]
We study multi-group fairness in machine learning (MultiFair).
We propose a generic end-to-end algorithmic framework to solve it.
Our proposed framework is generalizable to many different settings.
arXiv Detail & Related papers (2021-05-24T02:30:22Z)
- Recommendations for Bayesian hierarchical model specifications for case-control studies in mental health [0.0]
Researchers must choose whether to assume all subjects are drawn from a common population, or to model them as deriving from separate populations.
We ran systematic simulations on synthetic multi-group behavioural data from a commonly used bandit task.
We found that fitting groups separately provided the most accurate and robust inference across all conditions.
arXiv Detail & Related papers (2020-11-03T14:19:59Z)
- Interpretable Assessment of Fairness During Model Evaluation [1.2183405753834562]
We introduce a novel hierarchical clustering algorithm to detect heterogeneity among users in given sets of sub-populations.
We demonstrate the performance of the algorithm on real data from LinkedIn.
arXiv Detail & Related papers (2020-10-26T02:31:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.