Domain Adaptive Decision Trees: Implications for Accuracy and Fairness
- URL: http://arxiv.org/abs/2302.13846v2
- Date: Wed, 31 May 2023 08:52:25 GMT
- Title: Domain Adaptive Decision Trees: Implications for Accuracy and Fairness
- Authors: Jose M. Alvarez, Kristen M. Scott, Salvatore Ruggieri, Bettina Berendt
- Abstract summary: This paper contributes to the field of domain adaptation by introducing domain-adaptive decision trees (DADT).
DADT adjusts the information gain split criterion with outside information corresponding to the distribution of the target population.
We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when testing in a shifted target population.
- Score: 28.37613618406726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In uses of pre-trained machine learning models, it is a known issue that the
target population in which the model is being deployed may not have been
reflected in the source population with which the model was trained. This can
result in a biased model when deployed, leading to a reduction in model
performance. One risk is that, as the population changes, certain demographic
groups will be under-served or otherwise disadvantaged by the model, even as
they become more represented in the target population. The field of domain
adaptation proposes techniques for a situation where label data for the target
population does not exist, but some information about the target distribution
does exist. In this paper we contribute to the domain adaptation literature by
introducing domain-adaptive decision trees (DADT). We focus on decision trees
given their growing popularity due to their interpretability and performance
relative to other more complex models. With DADT we aim to improve the accuracy
of models trained in a source domain (or training data) that differs from the
target domain (or test data). We propose an in-processing step that adjusts the
information gain split criterion with outside information corresponding to the
distribution of the target population. We demonstrate DADT on real data and
find that it improves accuracy over a standard decision tree when testing in a
shifted target population. We also study the change in fairness under
demographic parity and equal opportunity. Results show an improvement in
fairness with the use of DADT.
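To make the idea of adjusting the information gain split criterion concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a single categorical attribute, assumes the only outside information is the attribute's marginal distribution in the target population, and assumes the class distribution within each branch can still be estimated from source data. The names information_gain and target_weights are hypothetical and chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(xs, ys, target_weights=None):
    """Information gain of splitting labels ys on categorical attribute xs.

    target_weights is a hypothetical argument: a dict mapping each
    attribute value to its assumed proportion in the target population.
    If it is None, source frequencies are used (the standard criterion).
    """
    n = len(xs)
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)

    # Branch weights: source frequencies unless target proportions are given.
    if target_weights is None:
        weights = {v: len(g) / n for v, g in groups.items()}
    else:
        total = sum(target_weights.get(v, 0.0) for v in groups)
        if total <= 0:
            raise ValueError("target_weights covers no observed attribute value")
        weights = {v: target_weights.get(v, 0.0) / total for v in groups}

    labels = sorted(set(ys))
    # Class distribution inside each branch, estimated from source data.
    cond = {}
    for v, g in groups.items():
        counts = Counter(g)
        cond[v] = [counts[c] / len(g) for c in labels]

    # Parent class distribution as the weighted mixture of the branches.
    parent = [sum(weights[v] * cond[v][i] for v in groups)
              for i in range(len(labels))]
    children = sum(weights[v] * entropy(cond[v]) for v in groups)
    return entropy(parent) - children


# Toy source sample in which value "b" is rare; assume it is common in the target.
xs = ["a", "a", "a", "a", "b", "b"]
ys = [1, 1, 1, 1, 0, 1]
ig_source = information_gain(xs, ys)
ig_target = information_gain(xs, ys, target_weights={"a": 0.2, "b": 0.8})
print(ig_source, ig_target)
```

With source frequencies as weights, the function reduces to the usual information gain. Supplying target proportions changes how much each branch counts, so an attribute can rank differently as a split candidate when the target population is shifted.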
Related papers
- Mitigating the Bias in the Model for Continual Test-Time Adaptation [32.33057968481597]
Continual Test-Time Adaptation (CTA) is a challenging task that aims to adapt a source pre-trained model to continually changing target domains.
We find that a model shows highly biased predictions as it constantly adapts to the changing distribution of the target data.
This paper mitigates this issue to improve performance in the CTA scenario.
arXiv Detail & Related papers (2024-03-02T23:37:16Z)
- GAN-based Domain Inference Attack [3.731168012111833]
We propose a generative adversarial network (GAN) based method to explore likely or similar domains of a target model.
We find that the target model may distract the training procedure less if the domain is more similar to the target domain.
Our experiments show that the auxiliary dataset from an MDI top-ranked domain can effectively boost the result of model-inversion attacks.
arXiv Detail & Related papers (2022-12-22T15:40:53Z)
- Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distributions are different.
Recently, Source-Free Domain Adaptation (SFDA), which tries to tackle the domain adaptation problem without using source data, has drawn much attention.
In this work, we propose a novel framework called SFDA-DE to address the SFDA task via source Distribution Estimation.
arXiv Detail & Related papers (2022-04-24T12:22:19Z)
- How to Learn when Data Gradually Reacts to Your Model [10.074466859579571]
We propose a new algorithm, Stateful Performative Gradient Descent (Stateful PerfGD), for minimizing the performative loss even in the presence of these effects.
Our experiments confirm that Stateful PerfGD substantially outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2021-12-13T22:05:26Z)
- On-target Adaptation [82.77980951331854]
Domain adaptation seeks to mitigate the shift between training on the source domain and testing on the target domain.
Most adaptation methods rely on the source data by joint optimization over source data and target data.
We show significant improvement by on-target adaptation, which learns the representation purely from target data.
arXiv Detail & Related papers (2021-09-02T17:04:18Z)
- Model Transferability With Responsive Decision Subjects [11.07759054787023]
We formalize the discussions of the transferability of a model by studying how the performance of the model trained on the available source distribution would translate to the performance on its induced domain.
We provide both upper bounds for the performance gap due to the induced domain shift, as well as lower bounds for the trade-offs that a classifier has to suffer on either the source training distribution or the induced target distribution.
arXiv Detail & Related papers (2021-07-13T08:21:37Z)
- Distill and Fine-tune: Effective Adaptation from a Black-box Source Model [138.12678159620248]
Unsupervised domain adaptation (UDA) aims to transfer knowledge in previous related labeled datasets (source) to a new unlabeled dataset (target).
We propose a novel two-step adaptation framework called Distill and Fine-tune (Dis-tune).
arXiv Detail & Related papers (2021-04-04T05:29:05Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge [82.5462771088607]
We propose a novel model selection metric specifically designed for ITE methods under the unsupervised domain adaptation setting.
In particular, we propose selecting models whose predictions of interventions' effects satisfy known causal structures in the target domain.
arXiv Detail & Related papers (2021-02-11T21:03:14Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.