DNA: Denoised Neighborhood Aggregation for Fine-grained Category
Discovery
- URL: http://arxiv.org/abs/2310.10151v1
- Date: Mon, 16 Oct 2023 07:43:30 GMT
- Title: DNA: Denoised Neighborhood Aggregation for Fine-grained Category
Discovery
- Authors: Wenbin An, Feng Tian, Wenkai Shi, Yan Chen, Qinghua Zheng, QianYing
Wang, Ping Chen
- Abstract summary: We propose a self-supervised framework that encodes semantic structures of data into the embedding space.
We retrieve k-nearest neighbors of a query as its positive keys to capture semantic similarities between data and then aggregate information from the neighbors to learn compact cluster representations.
Our method can retrieve more accurate neighbors (21.31% accuracy improvement) and outperform state-of-the-art models by a large margin.
- Score: 25.836440772705505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discovering fine-grained categories from coarsely labeled data is a practical
and challenging task, which can bridge the gap between the demand for
fine-grained analysis and the high annotation cost. Previous works mainly focus
on instance-level discrimination to learn low-level features, but ignore
semantic similarities between data, which may prevent these models from learning
compact cluster representations. In this paper, we propose Denoised
Neighborhood Aggregation (DNA), a self-supervised framework that encodes
semantic structures of data into the embedding space. Specifically, we retrieve
k-nearest neighbors of a query as its positive keys to capture semantic
similarities between data and then aggregate information from the neighbors to
learn compact cluster representations, which can make fine-grained categories
more separable. However, the retrieved neighbors can be noisy and contain
many false-positive keys, which can degrade the quality of learned embeddings.
To cope with this challenge, we propose three principles to filter out these
false neighbors for better representation learning. Furthermore, we
theoretically justify that the learning objective of our framework is
equivalent to a clustering loss, which can capture semantic similarities
between data to form compact fine-grained clusters. Extensive experiments on
three benchmark datasets show that our method can retrieve more accurate
neighbors (21.31% accuracy improvement) and outperform state-of-the-art models
by a large margin (average 9.96% improvement on three metrics). Our code and
data are available at https://github.com/Lackel/DNA.
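The abstract above outlines the core mechanism: retrieve the k-nearest neighbors of each query in the embedding space, treat them as positive keys, and aggregate them through a contrastive objective so that fine-grained clusters become compact. The snippet below is a minimal sketch of that neighborhood-as-positives idea, not the authors' released implementation (see the GitHub link above); the cosine-similarity retrieval, the choice of k, the temperature, and the memory-bank shapes are illustrative assumptions, and DNA's three principles for filtering false-positive neighbors are not reproduced here.

```python
# Minimal sketch (not the DNA reference code): treat the k-nearest neighbors
# of each query as positive keys and pull the query towards them with a
# softmax-based contrastive loss. All hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def knn_positive_keys(queries: torch.Tensor, keys: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Return the indices of the k nearest keys (by cosine similarity) per query."""
    q = F.normalize(queries, dim=1)
    key_bank = F.normalize(keys, dim=1)
    sim = q @ key_bank.t()                     # (N_q, N_k) cosine similarities
    return sim.topk(k, dim=1).indices          # (N_q, k) neighbor indices

def neighborhood_aggregation_loss(queries: torch.Tensor,
                                  keys: torch.Tensor,
                                  neighbor_idx: torch.Tensor,
                                  tau: float = 0.1) -> torch.Tensor:
    """Maximize the likelihood of the retrieved neighbors relative to all keys."""
    q = F.normalize(queries, dim=1)
    key_bank = F.normalize(keys, dim=1)
    logits = q @ key_bank.t() / tau            # (N_q, N_k) scaled similarities
    log_prob = F.log_softmax(logits, dim=1)
    pos_log_prob = log_prob.gather(1, neighbor_idx)   # (N_q, k) positive keys
    return -pos_log_prob.mean()

# Toy usage with random embeddings standing in for encoder / memory-bank outputs.
queries = torch.randn(32, 128)
keys = torch.randn(256, 128)
idx = knn_positive_keys(queries, keys, k=5)
loss = neighborhood_aggregation_loss(queries, keys, idx)
```

In a full pipeline the key embeddings would typically come from a momentum encoder or memory bank, and a denoising step in the spirit of DNA's filtering principles would shrink `neighbor_idx` to the neighbors judged reliable before the loss is computed.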
Related papers
- A robust three-way classifier with shadowed granular-balls based on justifiable granularity [53.39844791923145]
We construct a robust three-way classifier with shadowed GBs for uncertain data.
Our model demonstrates robustness in managing uncertain data and effectively mitigates classification risks.
arXiv Detail & Related papers (2024-07-03T08:54:45Z)
- Influence of Swarm Intelligence in Data Clustering Mechanisms [0.0]
Nature-inspired swarm-based algorithms are used for data clustering to cope with larger datasets that suffer from missing and inconsistent data.
This paper reviews the performance of these new approaches and compares which is best suited to particular problem settings.
arXiv Detail & Related papers (2023-05-07T08:40:50Z)
- Learn to Cluster Faces with Better Subgraphs [13.511058277653122]
Face clustering can provide pseudo-labels for massive amounts of unlabeled face data.
Existing clustering methods aggregate features within subgraphs based on a uniform threshold or a learned cutoff position.
This work proposes an efficient neighborhood-aware subgraph adjustment method that can significantly reduce the noise.
arXiv Detail & Related papers (2023-04-21T09:18:55Z)
- Nearest Neighbor-Based Contrastive Learning for Hyperspectral and LiDAR Data Classification [45.026868970899514]
We propose a Nearest Neighbor-based Contrastive Learning Network (NNCNet) to learn discriminative feature representations.
Specifically, we propose a nearest neighbor-based data augmentation scheme to exploit the semantic relationships among nearby regions.
In addition, we design a bilinear attention module to exploit the second-order and even high-order feature interactions between the HSI and LiDAR data.
arXiv Detail & Related papers (2023-01-09T13:43:54Z)
- SSDBCODI: Semi-Supervised Density-Based Clustering with Outliers Detection Integrated [1.8444322599555096]
Cluster analysis is one of the critical tasks in machine learning.
Because clustering performance can be significantly eroded by outliers, many algorithms incorporate an outlier-detection step.
We propose SSDBCODI, a semi-supervised density-based clustering algorithm with integrated outlier detection.
arXiv Detail & Related papers (2022-08-10T21:06:38Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets exhibiting both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- CvS: Classification via Segmentation For Small Datasets [52.821178654631254]
This paper presents CvS, a cost-effective classifier for small datasets that derives the classification labels from predicting the segmentation maps.
We evaluate the effectiveness of our framework on diverse problems, showing that CvS achieves much higher classification accuracy than previous methods when given only a handful of examples.
arXiv Detail & Related papers (2021-10-29T18:41:15Z)
- Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z)
- Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval [51.823187647843945]
In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model.
Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones.
arXiv Detail & Related papers (2021-05-27T11:29:03Z)
- How to Design Robust Algorithms using Noisy Comparison Oracle [12.353002222958605]
Metric-based comparison operations are fundamental to studying various clustering techniques.
In this paper, we study various problems, including finding the maximum and nearest/farthest neighbor search.
We give robust algorithms for k-center clustering and agglomerative hierarchical clustering.
arXiv Detail & Related papers (2021-05-12T16:58:09Z)
- Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.