Evaluating Fairness in Self-supervised and Supervised Models for
Sequential Data
- URL: http://arxiv.org/abs/2401.01640v1
- Date: Wed, 3 Jan 2024 09:31:43 GMT
- Title: Evaluating Fairness in Self-supervised and Supervised Models for
Sequential Data
- Authors: Sofia Yfantidou, Dimitris Spathis, Marios Constantinides, Athena
Vakali, Daniele Quercia, Fahim Kawsar
- Abstract summary: Self-supervised learning (SSL) has become the de facto training paradigm of large models.
This study explores the impact of pre-training and fine-tuning strategies on fairness.
- Score: 10.626503137418636
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning (SSL) has become the de facto training paradigm of
large models where pre-training is followed by supervised fine-tuning using
domain-specific data and labels. Hypothesizing that SSL models would learn more
generic, hence less biased, representations, this study explores the impact of
pre-training and fine-tuning strategies on fairness (i.e., performing equally
on different demographic breakdowns). Motivated by human-centric applications
on real-world timeseries data, we interpret inductive biases on the model,
layer, and metric levels by systematically comparing SSL models to their
supervised counterparts. Our findings demonstrate that SSL has the capacity to
achieve performance on par with supervised methods while significantly
enhancing fairness--exhibiting up to a 27% increase in fairness with a mere 1%
loss in performance through self-supervision. Ultimately, this work underscores
SSL's potential in human-centric computing, particularly high-stakes,
data-scarce application domains like healthcare.
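The abstract describes fairness as performing equally on different demographic breakdowns. As a rough illustration only, the sketch below measures one simple reading of that idea: accuracy per group and the largest gap between groups. The function names, the choice of accuracy as the metric, and the toy data are all assumptions made for this example and are not taken from the paper, which may use different fairness metrics.

```python
from collections import defaultdict


def per_group_accuracy(y_true, y_pred, groups):
    """Return {group: accuracy} for each demographic group (hypothetical helper)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(y_true, y_pred, groups):
    """Largest accuracy difference between any two groups (0.0 means equal)."""
    acc = per_group_accuracy(y_true, y_pred, groups)
    return max(acc.values()) - min(acc.values())


# Toy, made-up labels and predictions for two illustrative groups "a" and "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(per_group_accuracy(y_true, y_pred, groups))
print(accuracy_gap(y_true, y_pred, groups))
```

Under this reading, a smaller gap at similar overall accuracy would count as a fairer model; comparing the gap for a self-supervised and a supervised model trained on the same data is one way to frame the comparison the abstract describes.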
Related papers
- A Survey of the Self Supervised Learning Mechanisms for Vision Transformers [5.152455218955949]
The application of self-supervised learning (SSL) in vision tasks has gained significant attention.
We develop a comprehensive taxonomy that systematically classifies SSL techniques.
We discuss the motivations behind SSL, review popular pre-training tasks, and highlight the challenges and advancements in this field.
arXiv Detail & Related papers (2024-08-30T07:38:28Z)
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- A Self-Supervised Learning Pipeline for Demographically Fair Facial Attribute Classification [3.5092955099876266]
This paper proposes a fully self-supervised pipeline for demographically fair facial attribute classification.
We leverage completely unlabeled data pseudolabeled via pre-trained encoders, diverse data curation techniques, and meta-learning-based weighted contrastive learning.
arXiv Detail & Related papers (2024-07-14T07:11:57Z)
- Using Self-supervised Learning Can Improve Model Fairness [10.028637666224093]
Self-supervised learning (SSL) has become the de facto training paradigm of large models.
This study explores the impact of pre-training and fine-tuning strategies on fairness.
We introduce a fairness assessment framework for SSL, comprising five stages: defining dataset requirements, pre-training, fine-tuning with gradual unfreezing, assessing representation similarity conditioned on demographics, and establishing domain-specific evaluation processes.
arXiv Detail & Related papers (2024-06-04T14:38:30Z)
- What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets.
We find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning.
The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z)
- Reinforcement Learning-Guided Semi-Supervised Learning [20.599506122857328]
We propose a novel Reinforcement Learning Guided SSL method, RLGSSL, that formulates SSL as a one-armed bandit problem.
RLGSSL incorporates a carefully designed reward function that balances the use of labeled and unlabeled data to enhance generalization performance.
We demonstrate the effectiveness of RLGSSL through extensive experiments on several benchmark datasets and show that our approach achieves consistently superior performance compared to state-of-the-art SSL methods.
arXiv Detail & Related papers (2024-05-02T21:52:24Z)
- Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities [50.231837687221685]
Self-supervised learning (SSL) has transformed machine learning and its many real world applications.
Unsupervised anomaly detection (AD) has also capitalized on SSL, by self-generating pseudo-anomalies.
arXiv Detail & Related papers (2023-08-28T07:55:01Z)
- On Higher Adversarial Susceptibility of Contrastive Self-Supervised Learning [104.00264962878956]
Contrastive self-supervised learning (CSL) has managed to match or surpass the performance of supervised learning in image and video classification.
It is still largely unknown if the nature of the representation induced by the two learning paradigms is similar.
We identify the uniform distribution of data representation over a unit hypersphere in the CSL representation space as the key contributor to this phenomenon.
We devise strategies that are simple, yet effective in improving model robustness with CSL training.
arXiv Detail & Related papers (2022-07-22T03:49:50Z)
- Self-Supervision Can Be a Good Few-Shot Learner [42.06243069679068]
We propose an effective unsupervised few-shot learning method, learning representations with self-supervision.
Specifically, we maximize the mutual information (MI) of instances and their representations with a low-bias MI estimator.
We show that self-supervised pre-training can outperform supervised pre-training under the appropriate conditions.
arXiv Detail & Related papers (2022-07-19T10:23:40Z)
- How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder based models (AE).
We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.