Parametric Information Maximization for Generalized Category Discovery
- URL: http://arxiv.org/abs/2212.00334v3
- Date: Fri, 14 Jul 2023 15:27:17 GMT
- Title: Parametric Information Maximization for Generalized Category Discovery
- Authors: Florent Chiaroni, Jose Dolz, Ziko Imtiaz Masud, Amar Mitiche, Ismail
Ben Ayed
- Abstract summary: We introduce a Parametric Information Maximization (PIM) model for the Generalized Category Discovery (GCD) problem.
We show that our PIM model consistently sets new state-of-the-art performances in GCD across six different datasets.
- Score: 20.373038652827788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a Parametric Information Maximization (PIM) model for the
Generalized Category Discovery (GCD) problem. Specifically, we propose a
bi-level optimization formulation, which explores a parameterized family of
objective functions, each evaluating a weighted mutual information between the
features and the latent labels, subject to supervision constraints from the
labeled samples. Our formulation mitigates the class-balance bias encoded in
standard information maximization approaches, thereby handling effectively both
short-tailed and long-tailed data sets. We report extensive experiments and
comparisons demonstrating that our PIM model consistently sets new
state-of-the-art performances in GCD across six different datasets, more so
when dealing with challenging fine-grained problems.
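The weighted mutual information at the heart of the formulation can be sketched in a few lines. The code below is a minimal illustration under assumed conventions (soft label assignments as a probability matrix, a scalar weight on the marginal-entropy term), not the authors' implementation; in PIM the weight is itself optimized at the outer level of the bi-level problem.

```python
import numpy as np

def weighted_mutual_information(probs, weight):
    """Weighted MI between inputs and latent labels.

    probs: (n, k) array of soft assignments p(y = j | x_i), rows sum to 1.
    weight: scalar trading the marginal-entropy (class-balance) term
            against the conditional-entropy term; weight = 1 recovers
            the standard mutual information I(X; Y) = H(Y) - H(Y|X).
    """
    eps = 1e-12
    marginal = probs.mean(axis=0)                                 # p(y)
    h_marginal = -np.sum(marginal * np.log(marginal + eps))       # H(Y)
    h_cond = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))  # H(Y|X)
    return weight * h_marginal - h_cond
```

With weight below 1, the class-balance pressure of the H(Y) term is relaxed, which is how a parameterized family of this kind can accommodate long-tailed label distributions.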
Related papers
- Curvature Enhanced Data Augmentation for Regression [4.910937238451485]
We introduce the Curvature-Enhanced Manifold Sampling (CEMS) method for regression tasks.
CEMS delivers superior performance in both in-distribution and out-of-distribution scenarios.
arXiv Detail & Related papers (2025-06-07T16:18:37Z)
- Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks [81.44256822500257]
RLHF has emerged as a predominant approach for aligning artificial intelligence systems with human preferences.
RLHF exhibits insufficient compliance capabilities when confronted with complex multi-instruction tasks.
We propose a novel Multi-level Aware Preference Learning (MAPL) framework, capable of enhancing multi-instruction capabilities.
arXiv Detail & Related papers (2025-05-19T08:33:11Z)
- Local vs. Global Models for Hierarchical Forecasting [0.0]
This study explores the influence of distinct information utilisation on the accuracy of hierarchical forecasts.
We develop Global Forecasting Models (GFMs) to exploit cross-series and cross-hierarchies information.
Two specific GFMs based on LightGBM are introduced, demonstrating superior accuracy and lower model complexity.
arXiv Detail & Related papers (2024-11-10T08:51:49Z)
- Fairness-Aware Estimation of Graphical Models [13.39268712338485]
This paper examines the issue of fairness in the estimation of graphical models (GMs).
Standard GMs can result in biased outcomes, especially when the underlying data involves sensitive characteristics or protected groups.
We introduce a comprehensive framework designed to reduce bias in the estimation of GMs related to protected attributes.
arXiv Detail & Related papers (2024-08-30T16:30:00Z)
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z)
- Semi-Supervised U-statistics [22.696630428733204]
We introduce semi-supervised U-statistics enhanced by the abundance of unlabeled data.
We show that the proposed approach exhibits notable efficiency gains over classical U-statistics.
We propose a refined approach that outperforms the classical U-statistic across all degeneracy regimes.
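As a refresher on the object being enhanced, a classical degree-2 U-statistic averages a symmetric kernel over all unordered pairs of the sample. The sketch below is generic; the variance kernel is an illustrative choice, not one taken from the paper.

```python
from itertools import combinations

def u_statistic(sample, kernel):
    """Degree-2 U-statistic: average of a symmetric kernel over all
    unordered pairs of the sample."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

# Example kernel: h(x, y) = (x - y)^2 / 2 gives an unbiased
# estimator of the population variance.
data = [1.0, 2.0, 3.0, 4.0]
var_estimate = u_statistic(data, lambda x, y: (x - y) ** 2 / 2)
```

The semi-supervised variants described above augment such estimators with unlabeled observations to reduce their variance.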
arXiv Detail & Related papers (2024-02-29T07:29:27Z)
- Toward the Identifiability of Comparative Deep Generative Models [7.5479347719819865]
We propose a theory of identifiability for comparative Deep Generative Models (DGMs).
We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piece-wise affine.
We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance.
arXiv Detail & Related papers (2024-01-29T06:10:54Z)
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server.
We propose a novel FedC algorithm with differential privacy, referred to as DP-Fed, in which partial client participation is also considered.
Various properties of the proposed DP-Fed are established through theoretical analyses of its privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly-available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox, where "scale" metrics perform well overall but poorly on sub-partitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
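The reversal behind a Simpson's paradox is easy to reproduce numerically: a quantity can correlate one way within every group and the opposite way once the groups are pooled. The numbers below are synthetic, chosen only to exhibit the reversal, and bear no relation to the contest data.

```python
# Toy Simpson's paradox: within each group the metric is negatively
# correlated with accuracy, yet pooled together the correlation is
# positive because the between-group shift dominates.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

group_a = [(1.0, 0.80), (2.0, 0.78), (3.0, 0.76)]   # (metric, accuracy)
group_b = [(6.0, 0.95), (7.0, 0.93), (8.0, 0.91)]
pooled = group_a + group_b
```

Here `pearson` is negative on each group but positive on `pooled`, which is the pattern the "scale" metrics exhibit across sub-partitions.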
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Training Generative Adversarial Networks (GANs) requires solving nonconcave min-max optimization problems.
Existing theory has highlighted the role of gradient-based methods in reaching globally optimal solutions.
We show that in an overparameterized GAN with a one-hidden-layer neural network generator and a linear discriminator, gradient descent-ascent (GDA) converges to a global saddle point of the underlying nonconcave min-max problem.
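The update rule analyzed in such results is simultaneous gradient descent-ascent. The sketch below runs GDA on a toy quadratic saddle rather than a GAN objective; the function and step size are illustrative assumptions, not the paper's setting.

```python
# Simultaneous gradient descent-ascent (GDA) on the toy saddle
# f(x, y) = x^2 + x*y - y^2, which is convex in x and concave in y,
# with its unique saddle point at the origin.
def gda(steps=1000, lr=0.05):
    x, y = 1.0, 1.0
    for _ in range(steps):
        gx = 2.0 * x + y          # df/dx
        gy = x - 2.0 * y          # df/dy
        x, y = x - lr * gx, y + lr * gy   # descent in x, ascent in y
    return x, y
```

For this well-conditioned saddle the iterates spiral into the origin; on bilinear or nonconcave objectives the same rule can cycle or diverge, which is why convergence guarantees like the one above are nontrivial.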
arXiv Detail & Related papers (2021-04-12T16:23:37Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.