Group-realizable multi-group learning by minimizing empirical risk
- URL: http://arxiv.org/abs/2601.16922v1
- Date: Fri, 23 Jan 2026 17:30:13 GMT
- Title: Group-realizable multi-group learning by minimizing empirical risk
- Authors: Navid Ardeshir, Samuel Deng, Daniel Hsu, Jingwen Liu
- Abstract summary: The sample complexity of multi-group learning is shown to improve in the group-realizable setting over the agnostic setting. The improved sample complexity is obtained by empirical risk minimization over the class of group-realizable concepts.
- Score: 10.563254213583622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The sample complexity of multi-group learning is shown to improve in the group-realizable setting over the agnostic setting, even when the family of groups is infinite so long as it has finite VC dimension. The improved sample complexity is obtained by empirical risk minimization over the class of group-realizable concepts, which itself could have infinite VC dimension. Implementing this approach is also shown to be computationally intractable, and an alternative approach is suggested based on improper learning.
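The group-realizable condition described in the abstract can be illustrated on a finite sample. The following is a minimal sketch, not the paper's algorithm: it assumes finite hypothesis and group families, and all names (`threshold`, `group_empirical_risk`, `is_group_realizable_on_sample`) are hypothetical. It only checks whether, for each group, some hypothesis has zero empirical error on that group's sample points.

```python
from typing import Callable, Optional, Sequence

Hypothesis = Callable[[float], int]
Group = Callable[[float], bool]


def threshold(t: float) -> Hypothesis:
    """Hypothetical hypothesis class: predict 1 iff x >= t."""
    return lambda x: int(x >= t)


def group_empirical_risk(h: Hypothesis, g: Group,
                         xs: Sequence[float], ys: Sequence[int]) -> Optional[float]:
    """Fraction of sample points inside group g that h mislabels.

    Returns None when no sample point falls in g.
    """
    members = [(x, y) for x, y in zip(xs, ys) if g(x)]
    if not members:
        return None
    return sum(h(x) != y for x, y in members) / len(members)


def is_group_realizable_on_sample(hypotheses: Sequence[Hypothesis],
                                  groups: Sequence[Group],
                                  xs: Sequence[float], ys: Sequence[int]) -> bool:
    """True if every non-empty group has a hypothesis with zero empirical risk on it."""
    for g in groups:
        risks = [group_empirical_risk(h, g, xs, ys) for h in hypotheses]
        # A group with no sample points imposes no constraint.
        if any(r is None for r in risks):
            continue
        if min(risks, default=1.0) > 0:
            return False
    return True


# Toy data (an assumption for illustration): no single threshold fits all
# four points, but each half of the domain is fit perfectly by some threshold.
xs = [0.1, 0.3, 0.6, 0.9]
ys = [0, 1, 0, 1]
hypotheses = [threshold(t) for t in (0.0, 0.2, 0.7, 1.0)]
left: Group = lambda x: x < 0.5
right: Group = lambda x: x >= 0.5
everything: Group = lambda x: True

realizable_on_halves = is_group_realizable_on_sample(hypotheses, [left, right], xs, ys)
realizable_with_whole = is_group_realizable_on_sample(
    hypotheses, [left, right, everything], xs, ys)
best_global_risk = min(group_empirical_risk(h, everything, xs, ys) for h in hypotheses)
```

In this toy sample the data is group-realizable for the two half-domain groups even though the best single threshold still errs on part of the full sample, which is the kind of gap between per-group and global fit that the multi-group setting is concerned with.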
Related papers
- GroupCoOp: Group-robust Fine-tuning via Group Prompt Learning [57.888537648437115]
Group Context Optimization (GroupCoOp) is a simple and effective debiased fine-tuning algorithm. It enhances the group robustness of fine-tuned vision-language models (VLMs). GroupCoOp achieved the best results on five benchmarks across five CLIP architectures.
arXiv Detail & Related papers (2025-09-28T09:54:30Z) - Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective [52.662463893268225]
Self-supervised heterogeneous graph learning (SHGL) has shown promising potential in diverse scenarios. Existing SHGL methods encounter two significant limitations. We introduce a novel framework enhanced by rank and dual consistency constraints.
arXiv Detail & Related papers (2024-12-01T09:33:20Z) - Group-wise oracle-efficient algorithms for online multi-group learning [12.664869982542895]
We study the problem of online multi-group learning, a learning model in which an online learner must simultaneously achieve small prediction regret on a large collection of subsequences corresponding to a family of groups. In this paper, we design such oracle-efficient algorithms with sublinear regret under a variety of settings, including adversarial and adversarial transductive settings.
arXiv Detail & Related papers (2024-06-07T23:00:02Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions. We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Self-Learning Symmetric Multi-view Probabilistic Clustering [35.96327818838784]
Multi-view Clustering (MVC) has achieved significant progress, with many efforts dedicated to learning knowledge from multiple views.
Most existing methods are either not applicable or require additional steps for incomplete MVC.
We propose a novel unified framework for incomplete and complete MVC named self-learning symmetric multi-view probabilistic clustering.
arXiv Detail & Related papers (2023-05-12T08:27:03Z) - Distributionally Robust Optimization with Probabilistic Group [24.22720998340643]
We propose a novel framework PG-DRO for distributionally robust optimization.
Key to our framework is soft group membership instead of hard group annotations.
Our framework accommodates samples with group membership ambiguity, offering stronger flexibility and generality than the prior art.
arXiv Detail & Related papers (2023-03-10T09:31:44Z) - Group conditional validity via multi-group learning [5.797821810358083]
We consider the problem of distribution-free conformal prediction and the criterion of group conditional validity.
Existing methods achieve such guarantees under either restrictive grouping structure or distributional assumptions.
We propose a simple reduction to the problem of achieving validity guarantees for individual populations by leveraging algorithms for a problem called multi-group learning.
arXiv Detail & Related papers (2023-03-07T15:51:03Z) - Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
arXiv Detail & Related papers (2022-10-13T06:04:43Z) - Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information [77.19830787312743]
In real-world reinforcement learning applications, the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity in the size of the endogenous component.
arXiv Detail & Related papers (2022-06-09T05:19:32Z) - Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z) - Simple and near-optimal algorithms for hidden stratification and multi-group learning [13.337579367787253]
This paper studies the structure of solutions to the multi-group learning problem.
It provides simple and near-optimal algorithms for the learning problem.
arXiv Detail & Related papers (2021-12-22T19:16:24Z) - Supervised Multivariate Learning with Simultaneous Feature Auto-grouping and Dimension Reduction [7.093830786026851]
This paper proposes a novel clustered reduced-rank learning framework.
It imposes two joint matrix regularizations to automatically group the features in constructing predictive factors.
It is more interpretable than low-rank modeling and relaxes the stringent sparsity assumption in variable selection.
arXiv Detail & Related papers (2021-12-17T20:11:20Z)