Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised
Person Re-Identification and Text Authorship Attribution
- URL: http://arxiv.org/abs/2202.03126v4
- Date: Fri, 30 Jun 2023 17:08:15 GMT
- Title: Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised
Person Re-Identification and Text Authorship Attribution
- Authors: Gabriel Bertocco, Ant\^onio Theophilo, Fernanda Andal\'o and Anderson
Rocha
- Abstract summary: Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
- Score: 77.85461690214551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning from fully-unlabeled data is challenging in Multimedia Forensics
problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have shown to be effective when dealing
with fully-unlabeled data in cases where the underlying classes have
significant semantic differences, as intra-class distances are substantially
lower than inter-class distances. However, this is not the case for forensic
applications in which classes have similar semantics and the training and test
sets have disjoint identities. General self-supervised learning methods might
fail to learn discriminative features in this scenario, thus requiring more
robust strategies. We propose a strategy to tackle Person Re-Identification and
Text Authorship Attribution by enabling learning from unlabeled data even when
samples from different classes are not prominently diverse. We propose a novel
ensemble-based clustering strategy whereby clusters derived from different
configurations are combined to generate a better grouping for the data samples
in a fully-unsupervised way. This strategy allows clusters with different
densities and higher variability to emerge, reducing intra-class discrepancies
without requiring the burden of finding an optimal configuration per dataset.
We also consider different Convolutional Neural Networks for feature extraction
and subsequent distance computations between samples. We refine these distances
by incorporating context and grouping them to capture complementary
information. Our method is robust across both tasks, with different data
modalities, and outperforms state-of-the-art methods with a fully-unsupervised
solution without any labeling or human intervention.
Related papers
- Annotation-Efficient Polyp Segmentation via Active Learning [45.59503015577479]
We propose a deep active learning framework for annotation-efficient polyp segmentation.
In practice, we measure the uncertainty of each sample by examining the similarity between features masked by the prediction map of the polyp and the background area.
We show that our proposed method achieved state-of-the-art performance compared to other competitors on both a public dataset and a large-scale in-house dataset.
arXiv Detail & Related papers (2024-03-21T12:25:17Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n2) to O(kn) with k n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Deep Metric Learning Assisted by Intra-variance in A Semi-supervised
View of Learning [0.0]
Deep metric learning aims to construct an embedding space where samples of the same class are close to each other, while samples of different classes are far away from each other.
This paper designs a self-supervised generative assisted ranking framework that provides a semi-supervised view of intra-class variance learning scheme for typical supervised deep metric learning.
arXiv Detail & Related papers (2023-04-21T13:30:32Z) - Voxel-wise Adversarial Semi-supervised Learning for Medical Image
Segmentation [4.489713477369384]
We introduce a novel adversarial learning-based semi-supervised segmentation method for medical image segmentation.
Our method embeds both local and global features from multiple hidden layers and learns context relations between multiple classes.
Our method outperforms current best-performing state-of-the-art semi-supervised learning approaches on the image segmentation of the left atrium (single class) and multiorgan datasets (multiclass)
arXiv Detail & Related papers (2022-05-14T06:57:19Z) - The Group Loss++: A deeper look into group loss for deep metric learning [65.19665861268574]
Group Loss is a loss function based on a differentiable label-propagation method that enforces embedding similarity across all samples of a group.
We show state-of-the-art results on clustering and image retrieval on four datasets, and present competitive results on two person re-identification datasets.
arXiv Detail & Related papers (2022-04-04T14:09:58Z) - Learning from Heterogeneous Data Based on Social Interactions over
Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the.
strategy enables the agents to learn consistently under this highly-heterogeneous setting.
We show that the.
strategy enables the agents to learn consistently under this highly-heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z) - Clustering augmented Self-Supervised Learning: Anapplication to Land
Cover Mapping [10.720852987343896]
We introduce a new method for land cover mapping by using a clustering based pretext task for self-supervised learning.
We demonstrate the effectiveness of the method on two societally relevant applications.
arXiv Detail & Related papers (2021-08-16T19:35:43Z) - Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.