Unsupervised Learning under Latent Label Shift
- URL: http://arxiv.org/abs/2207.13179v1
- Date: Tue, 26 Jul 2022 20:52:53 GMT
- Title: Unsupervised Learning under Latent Label Shift
- Authors: Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton
- Abstract summary: We introduce unsupervised learning under Latent Label Shift (LLS).
We show that our algorithm can leverage domain information to improve state-of-the-art unsupervised classification methods.
- Score: 21.508249151557244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: What sorts of structure might enable a learner to discover classes from
unlabeled data? Traditional approaches rely on feature-space similarity and
heroic assumptions on the data. In this paper, we introduce unsupervised
learning under Latent Label Shift (LLS), where we have access to unlabeled data
from multiple domains such that the label marginals $p_d(y)$ can shift across
domains but the class conditionals $p(\mathbf{x}|y)$ do not. This work
instantiates a new principle for identifying classes: elements that shift
together group together. For finite input spaces, we establish an isomorphism
between LLS and topic modeling: inputs correspond to words, domains to
documents, and labels to topics. Addressing continuous data, we prove that when
each label's support contains a separable region, analogous to an anchor word,
oracle access to $p(d|\mathbf{x})$ suffices to identify $p_d(y)$ and
$p_d(y|\mathbf{x})$ up to permutation. Thus motivated, we introduce a practical
algorithm that leverages domain-discriminative models as follows: (i) push
examples through domain discriminator $p(d|\mathbf{x})$; (ii) discretize the
data by clustering examples in $p(d|\mathbf{x})$ space; (iii) perform
non-negative matrix factorization on the discrete data; (iv) combine the
recovered $p(y|d)$ with the discriminator outputs $p(d|\mathbf{x})$ to compute
$p_d(y|\mathbf{x}) \; \forall d$. With semi-synthetic experiments, we show that our
algorithm can leverage domain information to improve state-of-the-art
unsupervised classification methods. We reveal a failure mode of standard
unsupervised classification methods when feature-space similarity does not
indicate true groupings, and show empirically that our method better handles
this case. Our results establish a deep connection between distribution shift
and topic modeling, opening promising lines for future work.
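To make steps (i)-(iv) above concrete, the following is a minimal sketch under stated assumptions: a domain discriminator has already been trained (step (i)), its posteriors $p(d|\mathbf{x})$ are available as an array, and the number of classes is known. The function name, hyperparameters, and the final combination rule are illustrative assumptions rather than the paper's exact estimator.

```python
# Illustrative sketch of steps (i)-(iv); names and hyperparameters are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF

def lls_estimate(disc_probs, domain_ids, n_classes, n_clusters=100, seed=0):
    """disc_probs: (n, n_domains) domain-discriminator posteriors p(d|x) (step i);
    domain_ids: (n,) integer domain index of each example."""
    n, n_domains = disc_probs.shape

    # (ii) Discretize: cluster examples in p(d|x) space.
    clusters = KMeans(n_clusters=n_clusters, random_state=seed).fit_predict(disc_probs)

    # Build the cluster-by-domain matrix p(cluster|d), the analogue of
    # word-document counts in topic modeling.
    counts = np.zeros((n_clusters, n_domains))
    np.add.at(counts, (clusters, domain_ids), 1.0)
    p_cluster_given_d = counts / (counts.sum(axis=0, keepdims=True) + 1e-12)

    # (iii) Non-negative matrix factorization: p(cluster|d) ~ sum_y p(cluster|y) p(y|d).
    nmf = NMF(n_components=n_classes, init="nndsvda", max_iter=500, random_state=seed)
    p_cluster_given_y = nmf.fit_transform(p_cluster_given_d)  # (n_clusters, n_classes)
    p_y_given_d = nmf.components_                              # (n_classes, n_domains)
    p_cluster_given_y /= p_cluster_given_y.sum(axis=0, keepdims=True) + 1e-12
    p_y_given_d /= p_y_given_d.sum(axis=0, keepdims=True) + 1e-12

    # (iv) Combine p(y|d) with the discretized representation of each example to
    # score p_d(y|x); this simple product rule is an assumption, not the paper's
    # exact combination of p(y|d) with p(d|x).
    scores = p_cluster_given_y[clusters] * p_y_given_d[:, domain_ids].T  # (n, n_classes)
    p_y_given_x_d = scores / (scores.sum(axis=1, keepdims=True) + 1e-12)
    return p_y_given_d, p_y_given_x_d
```

Note that both recovered factors are identified only up to a shared permutation of the label indices, consistent with the identifiability statement in the abstract.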
Related papers
- IT$^3$: Idempotent Test-Time Training [95.78053599609044]
This paper introduces Idempotent Test-Time Training (IT$^3$), a novel approach to addressing the challenge of distribution shift.
IT$^3$ is based on the universal property of idempotence.
We demonstrate the versatility of our approach across various tasks, including corrupted image classification.
arXiv Detail & Related papers (2024-10-05T15:39:51Z)
- Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization [65.8915778873691]
Learning conditional distributions is a central problem in machine learning.
We propose a new learning paradigm that integrates both paired and unpaired data.
Our approach also connects intriguingly with inverse entropic optimal transport (OT).
arXiv Detail & Related papers (2024-10-03T16:12:59Z)
- One-Bit Quantization and Sparsification for Multiclass Linear Classification with Strong Regularization [18.427215139020625]
We show that the best classification performance is achieved when $f(\cdot) = \|\cdot\|_2^2$ and $\lambda \to \infty$.
It is often possible to find sparse and one-bit solutions that perform almost as well as the one corresponding to $f(\cdot) = \|\cdot\|_\infty$ in the large-$\lambda$ regime.
arXiv Detail & Related papers (2024-02-16T06:39:40Z) - Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z) - Statistical learning on measures: an application to persistence diagrams [0.0]
We consider a binary supervised classification problem where, instead of observing data in a finite-dimensional Euclidean space, we observe measures on a compact space $\mathcal{X}$.
We show that this framework allows for greater flexibility and diversity in the input data we can handle.
While such a framework has many possible applications, this work places strong emphasis on classifying data via topological descriptors called persistence diagrams.
arXiv Detail & Related papers (2023-03-15T09:01:37Z) - HappyMap: A Generalized Multi-calibration Method [23.086009024383024]
Multi-calibration is a powerful and evolving concept originating in the field of algorithmic fairness.
In this work, we view the term $(f(x)-y)$ as just one specific mapping, and explore the power of an enriched class of mappings.
We propose HappyMap, a generalization of multi-calibration, which yields a wide range of new applications.
arXiv Detail & Related papers (2023-03-08T05:05:01Z)
- Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact Supervision [53.530957567507365]
In some real-world tasks, each training sample is associated with a candidate label set that contains one ground-truth label and some false positive labels.
In this paper, we formalize such problems as multi-instance partial-label learning (MIPL).
Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems.
arXiv Detail & Related papers (2022-12-18T03:28:51Z)
- Beyond Invariance: Test-Time Label-Shift Adaptation for Distributions with "Spurious" Correlations [44.99833362998488]
Changes in the data distribution at test time can have deleterious effects on the performance of predictive models.
We propose a test-time label-shift correction that adapts to changes in the joint distribution $p(y, z)$ using EM applied to unlabeled samples (a generic EM-style sketch of label-shift correction appears after this list).
arXiv Detail & Related papers (2022-11-28T18:52:33Z)
- Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making a few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z)
- Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z)
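As referenced in the test-time label-shift entry above, the following is a generic, hypothetical sketch of EM-style label-shift correction on unlabeled test data; it adapts only the label marginal $p(y)$, not the full joint $p(y, z)$ treated in that paper.

```python
# Hypothetical EM-style label-shift correction (Saerens-style prior adaptation);
# the cited paper's method over the joint p(y, z) is not reproduced here.
import numpy as np

def em_label_shift(source_posteriors, source_prior, n_iter=100, tol=1e-8):
    """source_posteriors: (n, k) source-classifier outputs p_s(y|x) on unlabeled test points;
    source_prior: (k,) label marginal of the source/training distribution."""
    target_prior = source_prior.copy()
    adapted = source_posteriors
    for _ in range(n_iter):
        # E-step: reweight posteriors by the current prior ratio and renormalize.
        adapted = source_posteriors * (target_prior / source_prior)
        adapted /= adapted.sum(axis=1, keepdims=True)
        # M-step: re-estimate the target label marginal from the soft assignments.
        new_prior = adapted.mean(axis=0)
        if np.max(np.abs(new_prior - target_prior)) < tol:
            target_prior = new_prior
            break
        target_prior = new_prior
    return target_prior, adapted
```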
This list is automatically generated from the titles and abstracts of the papers on this site.