Diversified Ensembling: An Experiment in Crowdsourced Machine Learning
- URL: http://arxiv.org/abs/2402.10795v1
- Date: Fri, 16 Feb 2024 16:20:43 GMT
- Title: Diversified Ensembling: An Experiment in Crowdsourced Machine Learning
- Authors: Ira Globus-Harris, Declan Harrison, Michael Kearns, Pietro Perona,
Aaron Roth
- Abstract summary: In arXiv:2201.10408, the authors developed an alternative crowdsourcing framework in the context of fair machine learning.
We present the first medium-scale experimental evaluation of this framework, with 46 participating teams attempting to generate models.
- Score: 18.192916651221882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crowdsourced machine learning on competition platforms such as Kaggle is a
popular and often effective method for generating accurate models. Typically,
teams vie for the most accurate model, as measured by overall error on a
holdout set, and it is common towards the end of such competitions for teams at
the top of the leaderboard to ensemble or average their models outside the
platform mechanism to get the final, best global model. In arXiv:2201.10408,
the authors developed an alternative crowdsourcing framework in the context of
fair machine learning, in order to integrate community feedback into models
when subgroup unfairness is present and identifiable. There, unlike in
classical crowdsourced ML, participants deliberately specialize their efforts
by working on subproblems, such as demographic subgroups in the service of
fairness. Here, we take a broader perspective on this work: we note that within
this framework, participants may specialize both in the service of fairness and
simply to cater to their particular expertise (e.g., focusing on identifying
bird species in an image classification task). Unlike traditional
crowdsourcing, this allows for the diversification of participants' efforts and
may provide a participation mechanism to a larger range of individuals (e.g., a
machine learning novice who has insight into a specific fairness concern). We
present the first medium-scale experimental evaluation of this framework, with
46 participating teams attempting to generate models to predict income from
American Community Survey data. We provide an empirical analysis of teams'
approaches, and discuss the novel system architecture we developed. From here,
we give concrete guidance for how best to deploy such a framework.
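To make the contrast concrete, here is a minimal sketch of the two aggregation styles the abstract discusses: classical leaderboard-style averaging of complete models, versus the group-conditional "patching" update underlying the arXiv:2201.10408 framework, in which a proposed (subgroup, model) pair is accepted only if it improves holdout accuracy on that subgroup. This is an illustration under our own assumptions, not the paper's actual system; the names `average_ensemble`, `accept_and_patch`, and the `(g, h)` proposal interface are hypothetical.

```python
import numpy as np

def average_ensemble(models, X):
    """Classical endgame on platforms like Kaggle: average the class-probability
    outputs of the top models and take the argmax. Each model maps X to an
    array of shape (n_samples, n_classes)."""
    return np.mean([m(X) for m in models], axis=0).argmax(axis=1)

def accept_and_patch(f, g, h, X_hold, y_hold):
    """Group-conditional update (illustrative, not the paper's API): a team
    proposes a subgroup indicator g and a model h claimed to beat the current
    global model f on that subgroup. Here f and h map X to label arrays, and
    g maps X to a boolean mask. Accept iff h has strictly lower holdout error
    on the subgroup; if so, return a patched model that uses h on the
    subgroup and falls back to f everywhere else."""
    mask = g(X_hold)                   # holdout points in the proposed subgroup
    if not mask.any():                 # empty subgroup: nothing to evaluate
        return f, False
    err_f = np.mean(f(X_hold)[mask] != y_hold[mask])
    err_h = np.mean(h(X_hold)[mask] != y_hold[mask])
    if err_h >= err_f:                 # no strict improvement: reject proposal
        return f, False

    def patched(X):
        preds = f(X).copy()
        m = g(X)
        preds[m] = h(X)[m]             # override predictions on the subgroup only
        return preds

    return patched, True
```

Because a patch changes predictions only inside the proposed subgroup, each accepted update strictly lowers overall holdout error, which is what allows a participant with narrow expertise (say, a novice who can only identify one subgroup where the current model fails) to make a well-defined contribution to the global model.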
Related papers
- Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO [50.58083807719749]
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
This competition targets robustness and generalization in multi-agent systems.
We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.
arXiv Detail & Related papers (2023-08-30T07:16:11Z)
- Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants.
Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- MultiModal Bias: Introducing a Framework for Stereotypical Bias Assessment beyond Gender and Race in Vision Language Models [40.12132844347926]
We provide a visual and textual bias benchmark called MMBias, consisting of around 3,800 images and phrases covering 14 population subgroups.
We utilize this dataset to assess bias in several prominent self-supervised multimodal models, including CLIP, ALBEF, and ViLT.
We introduce a debiasing method designed specifically for such large pre-trained models that can be applied as a post-processing step to mitigate bias.
arXiv Detail & Related papers (2023-03-16T17:36:37Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes two fairness criteria: group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Blackbox Post-Processing for Multiclass Fairness [1.5305403478254664]
We consider modifying the predictions of a blackbox machine learning classifier in order to achieve fairness in a multiclass setting.
We explore when our approach produces both fair and accurate predictions through systematic synthetic experiments.
We find that overall, our approach produces minor drops in accuracy and enforces fairness when the number of individuals in the dataset is high.
arXiv Detail & Related papers (2022-01-12T13:21:20Z)
- Toward Annotator Group Bias in Crowdsourcing [26.754873038110595]
We show that annotators within the same demographic group tend to show consistent group bias in annotation tasks.
We develop a novel probabilistic graphical framework GroupAnno to capture annotator group bias with a new extended Expectation Maximization (EM) training algorithm.
arXiv Detail & Related papers (2021-10-08T05:54:36Z)
- Towards Unbiased and Accurate Deferral to Multiple Experts [19.24068936057053]
We propose a framework that simultaneously learns a classifier and a deferral system, with the deferral system choosing to defer to one or more human experts.
We test our framework on a synthetic dataset and a content moderation dataset with biased synthetic experts, and show that it significantly improves the accuracy and fairness of the final predictions.
arXiv Detail & Related papers (2021-02-25T17:08:39Z)
- FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret [42.66567001275493]
It is now accepted that, because of biases in the datasets we present to models, fairness-oblivious training will lead to unfair models.
Here, we study mechanisms that impose fairness concurrently while training the model.
arXiv Detail & Related papers (2020-04-03T03:18:53Z)
- Exploiting the Matching Information in the Support Set for Few Shot Event Classification [66.31312496170139]
We investigate event classification under the few-shot learning setting.
We propose a novel training method for this problem that extensively exploits the support set during the training process.
Our experiments on two benchmark EC datasets show that the proposed method can improve the best reported few-shot learning models by up to 10% in accuracy for event classification.
arXiv Detail & Related papers (2020-02-13T00:40:36Z)