A review of clustering models in educational data science towards
fairness-aware learning
- URL: http://arxiv.org/abs/2301.03421v1
- Date: Mon, 9 Jan 2023 15:18:51 GMT
- Title: A review of clustering models in educational data science towards
fairness-aware learning
- Authors: Tai Le Quy, Gunnar Friege, Eirini Ntoutsi
- Abstract summary: This chapter comprehensively surveys clustering models and their fairness in educational activities.
We especially focus on investigating the fair clustering models applied in educational activities.
These models are believed to be practical tools for analyzing students' data and ensuring fairness in EDS.
- Score: 14.051419173519308
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensuring fairness is essential for every education system. Machine learning
is increasingly supporting the education system and educational data science
(EDS) domain, from decision support to educational activities and learning
analytics. However, machine learning-based decisions can be biased because
the algorithms may generate results based on students' protected attributes
such as race or gender. Clustering is an important machine learning technique
for exploring student data to support decision-makers and educational
activities such as group assignments. Therefore, ensuring high-quality
clustering models while satisfying fairness constraints is an important
requirement. This chapter comprehensively surveys clustering models
and their fairness in EDS. We especially focus on investigating the fair
clustering models applied in educational activities. These models are believed
to be practical tools for analyzing students' data and ensuring fairness in
EDS.
Related papers
- Federated Learning driven Large Language Models for Swarm Intelligence: A Survey [2.769238399659845]
Federated learning (FL) offers a compelling framework for training large language models (LLMs).
We focus on machine unlearning, a crucial aspect for complying with privacy regulations like the Right to be Forgotten.
We explore various strategies that enable effective unlearning, such as perturbation techniques, model decomposition, and incremental learning.
arXiv Detail & Related papers (2024-06-14T08:40:58Z)
- Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets.
We find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning.
The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z)
- Fair Mixed Effects Support Vector Machine [0.0]
Fairness in machine learning aims to mitigate biases present in the training data and model imperfections.
This is achieved by preventing the model from making decisions based on sensitive characteristics like ethnicity or sexual orientation.
We present a fair mixed effects support vector machine algorithm that can handle both problems simultaneously.
arXiv Detail & Related papers (2024-05-10T12:25:06Z)
- Enhancing Generative Class Incremental Learning Performance with Model Forgetting Approach [50.36650300087987]
This study presents a novel approach to Generative Class Incremental Learning (GCIL) by introducing the forgetting mechanism.
We have found that integrating the forgetting mechanisms significantly enhances the models' performance in acquiring new knowledge.
arXiv Detail & Related papers (2024-03-27T05:10:38Z)
- A Probabilistic Model behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- GEDI: GEnerative and DIscriminative Training for Self-Supervised Learning [3.6804038214708563]
We study state-of-the-art self-supervised learning objectives and propose a unified formulation based on likelihood learning.
We refer to this combined framework as GEDI, which stands for GEnerative and DIscriminative training.
We show that GEDI outperforms existing self-supervised learning strategies in terms of clustering performance by a wide margin.
arXiv Detail & Related papers (2022-12-27T09:33:50Z)
- FairFed: Enabling Group Fairness in Federated Learning [22.913999279079878]
Federated learning has been viewed as a promising solution for learning machine learning models among multiple parties.
We propose FairFed, a novel algorithm to enhance group fairness via a fairness-aware aggregation method.
Our proposed method outperforms state-of-the-art fair federated learning frameworks under a highly heterogeneous sensitive attribute distribution.
arXiv Detail & Related papers (2021-10-02T17:55:20Z)
- A survey on datasets for fairness-aware machine learning [6.962333053044713]
A large variety of fairness-aware machine learning solutions have been proposed.
In this paper, we overview real-world datasets used for fairness-aware machine learning.
For a deeper understanding of bias and fairness in the datasets, we investigate the interesting relationships using exploratory analysis.
arXiv Detail & Related papers (2021-10-01T16:54:04Z)
- Decentralized Federated Learning Preserves Model and Data Privacy [77.454688257702]
We propose a fully decentralized approach that allows knowledge to be shared between trained models.
Students are trained on the output of their teachers via synthetically generated input data.
The results show that an untrained student model, trained on the teacher's output, reaches F1-scores comparable to the teacher's.
arXiv Detail & Related papers (2021-02-01T14:38:54Z)
- Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination [53.3082498402884]
A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair.
We present a framework of fair semi-supervised learning in the pre-processing phase, including pseudo labeling to predict labels for unlabeled data.
A theoretical decomposition analysis of bias, variance and noise highlights the different sources of discrimination and the impact they have on fairness in semi-supervised learning.
arXiv Detail & Related papers (2020-09-25T05:48:56Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.