Factorizable Joint Shift in Multinomial Classification
- URL: http://arxiv.org/abs/2207.14514v1
- Date: Fri, 29 Jul 2022 07:21:44 GMT
- Title: Factorizable Joint Shift in Multinomial Classification
- Authors: Dirk Tasche
- Abstract summary: We derive a representation of factorizable joint shift in terms of the source (training) distribution, the target (test) prior class probabilities and the target marginal distribution of the features.
Other results of the paper include correction formulae for the posterior class probabilities both under general dataset shift and factorizable joint shift.
- Score: 3.3504365823045035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Factorizable joint shift was recently proposed as a type of dataset shift for
which the characteristics can be estimated from observed data. For the
multinomial (multi-class) classification setting, we derive a representation of
factorizable joint shift in terms of the source (training) distribution, the
target (test) prior class probabilities and the target marginal distribution of
the features. On the basis of this result, we propose alternatives to joint
importance aligning, at the same time pointing out the limitations encountered
when making an assumption of factorizable joint shift. Other results of the
paper include correction formulae for the posterior class probabilities both
under general dataset shift and factorizable joint shift. In addition, we
investigate the consequences of assuming factorizable joint shift for the bias
caused by sample selection.
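A short sketch of how such posterior corrections can be computed: with the joint importance ratio w(x, y) = p_target(x, y) / p_source(x, y), Bayes' rule gives p_target(y | x) ∝ w(x, y) p_source(y | x), and under factorizable joint shift w(x, y) = g(x) h(y) the feature factor g(x) cancels in the per-sample normalisation, leaving only the class factor h(y). This is a minimal illustration of that reweighting; the function and variable names are illustrative and not taken from the paper.

```python
import numpy as np

def correct_posterior_general(source_posterior, importance_ratio):
    """Posterior correction under general dataset shift.

    source_posterior : (n_samples, n_classes) array of p_source(y | x).
    importance_ratio : (n_samples, n_classes) array of w(x, y) = p_target(x, y) / p_source(x, y).
    Returns p_target(y | x), obtained by reweighting and renormalising each sample.
    """
    unnormalised = source_posterior * importance_ratio
    return unnormalised / unnormalised.sum(axis=1, keepdims=True)

def correct_posterior_factorizable(source_posterior, class_factor):
    """Posterior correction when w(x, y) factorizes as g(x) * h(y).

    The feature factor g(x) is constant across classes for a fixed x and cancels
    in the per-sample normalisation, so only the class factor h(y) is required.
    class_factor : (n_classes,) array of values h(y).
    """
    unnormalised = source_posterior * class_factor[np.newaxis, :]
    return unnormalised / unnormalised.sum(axis=1, keepdims=True)

# Example: three classes; the class factor up-weights the third class.
p_source = np.array([[0.7, 0.2, 0.1],
                     [0.3, 0.4, 0.3]])
h = np.array([0.8, 1.0, 1.5])
print(correct_posterior_factorizable(p_source, h))
```

In the special case of prior probability shift, h(y) reduces to the ratio of target to source class priors, which recovers the familiar prior-correction formula for classifier outputs.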
Related papers
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z) - Invariance assumptions for class distribution estimation [1.3053649021965603]
We study the problem of class distribution estimation under dataset shift.
On the training dataset, both features and class labels are observed, while on the test dataset only the features can be observed (a generic estimator for this setting is sketched after this list).
arXiv Detail & Related papers (2023-11-28T20:57:10Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Disentanglement of Correlated Factors via Hausdorff Factorized Support [53.23740352226391]
We propose a relaxed disentanglement criterion - the Hausdorff Factorized Support (HFS) criterion - that encourages a factorized support, rather than a factorial distribution.
We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks.
arXiv Detail & Related papers (2022-10-13T20:46:42Z) - Bounding Counterfactuals under Selection Bias [60.55840896782637]
We propose a first algorithm to address both identifiable and unidentifiable queries.
We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal.
arXiv Detail & Related papers (2022-07-26T10:33:10Z) - Domain Adaptation with Factorizable Joint Shift [18.95213249351176]
We propose a new assumption, Factorizable Joint Shift (FJS), to handle the co-existence of sampling bias in both covariates and labels.
FJS assumes that the biases affecting the covariates and the labels are independent of each other.
We also propose Joint Importance Aligning (JIA), a discriminative learning objective to obtain joint importance estimators.
arXiv Detail & Related papers (2022-03-06T07:58:51Z) - When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z) - Factoring Multidimensional Data to Create a Sophisticated Bayes Classifier [0.0]
We derive an explicit formula for calculating the marginal likelihood of a given factorization of a categorical dataset.
These likelihoods can be used to order all possible factorizations and select the "best" way to factor the overall distribution from which the dataset is drawn.
arXiv Detail & Related papers (2021-05-11T16:34:12Z) - Recovery of Joint Probability Distribution from one-way marginals: Low rank Tensors and Random Projections [2.9929093132587763]
Joint probability mass function (PMF) estimation is a fundamental machine learning problem.
In this work, we link random projections of data to the problem of PMF estimation using ideas from tomography.
We provide a novel algorithm for recovering factors of the tensor from one-way marginals, test it across a variety of synthetic and real-world datasets, and also perform MAP inference on the estimated model for classification.
arXiv Detail & Related papers (2021-03-22T14:00:57Z) - Information-theoretic Feature Selection via Tensor Decomposition and Submodularity [38.05393186002834]
We introduce a low-rank tensor model of the joint PMF of all variables and indirect targeting as a way of mitigating complexity and maximizing the classification performance for a given number of features.
By indirectly aiming to predict the latent variable of the naive Bayes model instead of the original target variable, it is possible to formulate the feature selection problem as the maximization of a monotone submodular function subject to a cardinality constraint.
arXiv Detail & Related papers (2020-10-30T10:36:46Z)
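As a companion to the class distribution estimation entry above, the sketch below shows a standard confusion-matrix estimator of the target class priors, valid under the prior probability shift (label shift) invariance assumption, i.e. when the class-conditional feature distributions are unchanged between source and target. It is a generic illustration, not the method of any of the listed papers; all names are illustrative.

```python
import numpy as np

def estimate_target_priors(source_labels, source_preds, target_preds, n_classes):
    """Estimate target class priors under a prior probability shift (label shift) assumption.

    Solves C @ q = mu, where C[i, j] = p_source(pred = i | true class = j) is the
    classifier's confusion matrix on labelled source data, q is the unknown vector
    of target class priors, and mu is the distribution of predicted labels on the
    unlabelled target data.
    """
    # Column-normalised confusion matrix on the source data: p(pred | true class).
    C = np.zeros((n_classes, n_classes))
    for pred, true in zip(source_preds, source_labels):
        C[pred, true] += 1.0
    C /= np.maximum(C.sum(axis=0, keepdims=True), 1.0)

    # Empirical distribution of predicted labels on the target data.
    mu = np.bincount(target_preds, minlength=n_classes).astype(float)
    mu /= mu.sum()

    # Solve the linear system and project back onto the probability simplex.
    q, *_ = np.linalg.lstsq(C, mu, rcond=None)
    q = np.clip(q, 0.0, None)
    return q / q.sum()

# Example with three classes: a perfect classifier, so the estimated priors
# approximate the predicted-label distribution on the target data.
rng = np.random.default_rng(0)
source_labels = rng.integers(0, 3, size=1000)
source_preds = source_labels.copy()
target_preds = rng.choice(3, size=1000, p=[0.6, 0.3, 0.1])
print(estimate_target_priors(source_labels, source_preds, target_preds, 3))
```

The linear system has a unique solution whenever the confusion matrix is invertible; clipping and renormalising the solution guards against small-sample noise.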
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.