Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
- URL: http://arxiv.org/abs/2508.10731v1
- Date: Thu, 14 Aug 2025 15:11:22 GMT
- Title: Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
- Authors: Luyao Tang, Kunze Huang, Chaoqi Chen, Yuxuan Yuan, Chenxin Li, Xiaotong Tu, Xinghao Ding, Yue Huang
- Abstract summary: We present a solution inspired by the human cognitive process for novel object understanding. We propose ConGCD, which establishes primitive-oriented representations through high-level semantic reconstruction. We implement dominant and contextual consensus units to capture class-discriminative patterns.
- Score: 36.73147151458588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human perceptual systems excel at inducing and recognizing objects across both known and novel categories, a capability far beyond current machine learning frameworks. While generalized category discovery (GCD) aims to bridge this gap, existing methods predominantly focus on optimizing objective functions. We present an orthogonal solution, inspired by the human cognitive process for novel object understanding: decomposing objects into visual primitives and establishing cross-knowledge comparisons. We propose ConGCD, which establishes primitive-oriented representations through high-level semantic reconstruction, binding intra-class shared attributes via deconstruction. Mirroring human preference diversity in visual processing, where distinct individuals leverage dominant or contextual cues, we implement dominant and contextual consensus units to capture class-discriminative patterns and inherent distributional invariants, respectively. A consensus scheduler dynamically optimizes activation pathways, with final predictions emerging through multiplex consensus integration. Extensive evaluations across coarse- and fine-grained benchmarks demonstrate ConGCD's effectiveness as a consensus-aware paradigm. Code is available at github.com/lytang63/ConGCD.
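The abstract describes final predictions "emerging through multiplex consensus integration" of a dominant unit and a contextual unit, arbitrated by a consensus scheduler. A minimal sketch of one plausible reading of that integration step, assuming a simple scheduler weight that interpolates between the two units' softmax outputs (the function name, signature, and mixing rule here are illustrative assumptions, not the paper's actual implementation):

```python
import numpy as np

def multiplex_consensus(dominant_logits, contextual_logits, schedule_weight):
    """Hypothetical reading of ConGCD's consensus integration: the
    scheduler weight interpolates between the dominant unit
    (class-discriminative cues) and the contextual unit
    (distributional cues). The published method's actual integration
    rule may differ."""
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p_dom = softmax(dominant_logits)
    p_ctx = softmax(contextual_logits)
    return schedule_weight * p_dom + (1.0 - schedule_weight) * p_ctx

# Toy example: the two units disagree and the scheduler arbitrates.
dom = np.array([[2.0, 0.5, 0.1]])   # dominant unit favors class 0
ctx = np.array([[0.2, 1.8, 0.4]])   # contextual unit favors class 1
pred = multiplex_consensus(dom, ctx, schedule_weight=0.6)
```

With the scheduler weighted toward the dominant unit (0.6), the mixed distribution follows its vote; a weight below 0.5 would instead side with the contextual unit.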
Related papers
- UniDGF: A Unified Detection-to-Generation Framework for Hierarchical Object Visual Recognition [14.256812146187565]
We introduce a detection-guided generative framework that predicts hierarchical category and attribute tokens. For each detected object, we extract refined ROI-level features and employ a BART-based generator to produce semantic tokens. Experiments on both large-scale proprietary e-commerce datasets and open-source datasets demonstrate that our approach significantly outperforms existing similarity-based pipelines.
arXiv Detail & Related papers (2025-11-20T02:37:43Z) - Learning Human-Object Interaction as Groups [52.28258599873394]
GroupHOI is a framework that propagates contextual information in terms of geometric proximity and semantic similarity. It exhibits leading performance on the more challenging Nonverbal Interaction Detection task.
arXiv Detail & Related papers (2025-10-21T07:25:10Z) - HAMLET-FFD: Hierarchical Adaptive Multi-modal Learning Embeddings Transformation for Face Forgery Detection [6.060036926093259]
HAMLET-FFD is a cross-domain generalization framework for face forgery detection. It integrates visual evidence with conceptual cues, emulating expert forensic analysis. By design, HAMLET-FFD freezes all pretrained parameters, serving as an external plugin.
arXiv Detail & Related papers (2025-07-28T15:09:52Z) - Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning [65.75756724642932]
In incomplete multi-view clustering, missing data induce prototype shifts within views and semantic inconsistencies across views. We propose an IMVC framework, imputation- and alignment-free for consensus semantics learning (FreeCSL). FreeCSL achieves more confident and robust assignments on the IMVC task, compared to state-of-the-art competitors.
arXiv Detail & Related papers (2025-05-16T12:37:10Z) - From Visual Explanations to Counterfactual Explanations with Latent Diffusion [11.433402357922414]
We propose a new approach to tackle two key challenges in recent prominent works. First, we determine which specific counterfactual features are crucial for distinguishing the "concept" of the target class from the original class. Second, we provide valuable explanations for the non-robust classifier without relying on the support of an adversarially robust model.
arXiv Detail & Related papers (2025-04-12T13:04:00Z) - Concept Guided Co-salient Object Detection [22.82243087156918]
ConceptCoSOD is a concept-guided framework that introduces high-level semantic knowledge to enhance co-saliency detection. By extracting shared text-based concepts from the input image group, ConceptCoSOD provides semantic guidance that anchors the detection process.
arXiv Detail & Related papers (2024-12-21T12:47:12Z) - Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning [5.920971285288677]
Human Activity Recognition (HAR) aims to recognize activities by training models on massive sensor data. One crucial aspect of HAR that has been largely overlooked is that the test sets may have different distributions from the training sets. We propose a Categorical Concept Invariant Learning framework for generalizable activity recognition.
arXiv Detail & Related papers (2024-12-18T08:18:03Z) - Enhancing Graph Contrastive Learning with Reliable and Informative Augmentation for Recommendation [84.45144851024257]
We propose a novel framework that aims to enhance graph contrastive learning by constructing contrastive views with stronger collaborative information via discrete codes. The core idea is to map users and items into discrete codes rich in collaborative information for reliable and informative contrastive view generation.
arXiv Detail & Related papers (2024-09-09T14:04:17Z) - Learning the Precise Feature for Cluster Assignment [39.320210567860485]
We propose a framework which integrates representation learning and clustering into a single pipeline for the first time.
The proposed framework exploits the powerful ability of recently developed generative models for learning intrinsic features.
Experimental results show that the performance of the proposed method is superior, or at least comparable to, the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-11T04:08:54Z) - Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
arXiv Detail & Related papers (2020-11-04T23:33:41Z) - Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
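The last entry above proposes discovering interpretable latent directions by directly decomposing a pretrained generator's weights, with no training or sampling. One common closed-form formulation of this idea takes the top eigenvectors of W^T W, where W is the weight that first projects the latent code; a minimal sketch under that assumption (the toy weight shapes and the function name are illustrative, and per-layer details from the paper are omitted):

```python
import numpy as np

def closed_form_directions(weight, num_directions=3):
    """Closed-form latent semantic discovery sketch: treat the top
    eigenvectors of W^T W as the latent directions along which the
    generator's first projection varies most. Assumes `weight` has
    shape (out_dim, latent_dim)."""
    eigvals, eigvecs = np.linalg.eigh(weight.T @ weight)
    order = np.argsort(eigvals)[::-1]            # largest variance first
    return eigvecs[:, order[:num_directions]].T  # (num_directions, latent_dim)

# Toy stand-in for a generator's first fully connected layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 64))  # out_dim=512, latent_dim=64
dirs = closed_form_directions(W, num_directions=3)
```

Each returned row is a unit-norm direction in latent space; editing a latent code as z + alpha * dirs[i] would then vary one discovered factor.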
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.