SHaSaM: Submodular Hard Sample Mining for Fair Facial Attribute Recognition
- URL: http://arxiv.org/abs/2602.05162v1
- Date: Thu, 05 Feb 2026 00:28:46 GMT
- Title: SHaSaM: Submodular Hard Sample Mining for Fair Facial Attribute Recognition
- Authors: Anay Majee, Rishabh Iyer
- Abstract summary: Deep neural networks often inherit social and demographic biases from annotated data during model training, leading to unfair predictions. We propose SHaSaM, a novel approach that models fairness-driven representation learning as a submodular hard-sample mining problem. SHaSaM achieves state-of-the-art results, with up to a 2.7-point improvement in model fairness (Equalized Odds) and a 3.5% gain in Accuracy, in fewer epochs than existing methods.
- Score: 4.483276453936335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks often inherit social and demographic biases from annotated data during model training, leading to unfair predictions, especially in the presence of sensitive attributes like race, age, and gender. Existing methods fall prey to the inherent data imbalance between attribute groups and inadvertently emphasize sensitive attributes, worsening both unfairness and performance. To surmount these challenges, we propose SHaSaM (Submodular Hard Sample Mining), a novel combinatorial approach that models fairness-driven representation learning as a submodular hard-sample mining problem. Our two-stage approach comprises SHaSaM-MINE, which introduces a submodular subset selection strategy to mine hard positives and negatives, effectively mitigating data imbalance, and SHaSaM-LEARN, which introduces a family of combinatorial loss functions based on Submodular Conditional Mutual Information to maximize the decision boundary between target classes while minimizing the influence of sensitive attributes. This unified formulation restricts the model from learning features tied to sensitive attributes, significantly enhancing fairness without sacrificing performance. Experiments on CelebA and UTKFace demonstrate that SHaSaM achieves state-of-the-art results, with up to a 2.7-point improvement in model fairness (Equalized Odds) and a 3.5% gain in Accuracy, in fewer epochs than existing methods.
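The submodular subset selection at the heart of the mining stage can be illustrated with a small sketch. This is not the authors' released code: it greedily maximizes a facility-location function over a pairwise similarity matrix, a standard monotone submodular objective for subset selection; the function names and the similarity-matrix setup are illustrative assumptions, not SHaSaM-MINE itself.

```python
import numpy as np

def facility_location_gain(sim, selected, candidate):
    # Marginal gain of adding `candidate` to `selected` under the
    # facility-location function F(S) = sum_i max_{j in S} sim[i, j].
    if not selected:
        return sim[:, candidate].sum()
    current = sim[:, selected].max(axis=1)
    return np.maximum(current, sim[:, candidate]).sum() - current.sum()

def greedy_submodular_select(sim, budget):
    # Standard greedy maximization: because facility location is
    # monotone submodular, the greedy solution carries the classic
    # (1 - 1/e) approximation guarantee.
    selected = []
    remaining = set(range(sim.shape[1]))
    for _ in range(budget):
        best = max(remaining,
                   key=lambda c: facility_location_gain(sim, selected, c))
        selected.append(best)
        remaining.remove(best)
    return selected
```

In a hard-mining setting, `sim` would typically be a cosine-similarity matrix over embeddings of one attribute group, and the selected subset serves as the representative (or hard) samples for that group.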
Related papers
- Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation [61.248535801314375]
We propose Subset-Selected Counterfactual Augmentation (SS-CA). We develop Counterfactual LIMA to identify minimal spatial region sets whose removal can selectively alter model predictions. Experiments show that SS-CA improves generalization on in-distribution (ID) test data and achieves superior performance on out-of-distribution (OOD) benchmarks.
arXiv Detail & Related papers (2025-11-15T08:39:22Z) - MAB Optimizer for Estimating Math Question Difficulty via Inverse CV without NLP [3.9566483499208633]
This study introduces the Approach of Passive Measures among Educands (APME), a reinforcement learning-based Multi-Armed Bandit (MAB) framework. By leveraging the inverse coefficient of variation as a risk-adjusted metric, the model provides an explainable and scalable mechanism for adaptive assessment.
arXiv Detail & Related papers (2025-08-26T13:23:31Z) - Conformalized Exceptional Model Mining: Telling Where Your Model Performs (Not) Well [31.013018198280506]
This paper introduces a novel framework, Conformalized Exceptional Model Mining. It combines the rigor of Conformal Prediction with the explanatory power of Exceptional Model Mining. We develop a new model class, mSMoPE, which quantifies uncertainty through conformal prediction's rigorous coverage guarantees.
arXiv Detail & Related papers (2025-08-21T13:43:14Z) - Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an average AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z) - Pedestrian Attribute Recognition as Label-balanced Multi-label Learning [12.605514698358165]
We propose a novel framework that successfully decouples label-balanced data re-sampling from the curse of attributes co-occurrence.
Our work achieves the best accuracy on various popular benchmarks and, importantly, does so with a minimal computational budget.
arXiv Detail & Related papers (2024-05-08T07:27:15Z) - The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
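The iterative minority-majority mixing described in this entry can be sketched, under assumptions, as a mixup-style convex combination biased toward the minority example. The function name, the Beta-distributed mixing weight, and the bias toward the minority point are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def mix_minority_majority(x_min, x_maj, alpha=0.5, rng=None):
    # Synthesize a sample as a convex combination of a minority and a
    # majority example. The mixing weight lam is drawn from Beta(alpha,
    # alpha) and clamped to >= 0.5 so the synthetic point stays closer
    # to the minority example.
    rng = rng if rng is not None else np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    return lam * np.asarray(x_min) + (1.0 - lam) * np.asarray(x_maj)
```

Repeating this over many minority-majority pairs yields synthetic samples that enlarge the effective minority region without straying far from the observed data manifold.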
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Toward Fair Facial Expression Recognition with Improved Distribution Alignment [19.442685015494316]
We present a novel approach to mitigate bias in facial expression recognition (FER) models.
Our method aims to reduce sensitive attribute information such as gender, age, or race, in the embeddings produced by FER models.
For the first time, we analyze the notion of attractiveness as an important sensitive attribute in FER models and demonstrate that FER models can indeed exhibit biases towards more attractive faces.
arXiv Detail & Related papers (2023-06-11T14:59:20Z) - Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z) - Mining Minority-class Examples With Uncertainty Estimates [102.814407678425]
In the real world, the frequency of occurrence of objects is naturally skewed, forming long-tailed class distributions.
We propose an effective, yet simple, approach to overcome these challenges.
Our framework enhances the subdued tail-class activations and, thereafter, uses a one-class data-centric approach to effectively identify tail-class examples.
arXiv Detail & Related papers (2021-12-15T02:05:02Z) - Generic Semi-Supervised Adversarial Subject Translation for Sensor-Based Human Activity Recognition [6.2997667081978825]
This paper presents a novel generic and robust approach for semi-supervised domain adaptation in Human Activity Recognition.
It capitalizes on the adversarial framework to tackle these shortcomings, leveraging knowledge from annotated samples of the source subject and unlabeled samples of the target subject.
The results demonstrate the effectiveness of our proposed algorithms over state-of-the-art methods, yielding improvements of up to 13%, 4%, and 13% in high-level activity recognition metrics on the Opportunity, LISSI, and PAMAP2 datasets, respectively.
arXiv Detail & Related papers (2020-11-11T12:16:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed above and is not responsible for any consequences arising from its use.