Fuzzy Overclustering: Semi-Supervised Classification of Fuzzy Labels
with Overclustering and Inverse Cross-Entropy
- URL: http://arxiv.org/abs/2110.06630v1
- Date: Wed, 13 Oct 2021 10:50:50 GMT
- Authors: Lars Schmarje, Johannes Brünger, Monty Santarossa,
Simon-Martin Schröder, Rainer Kiko, and Reinhard Koch
- Abstract summary: We propose a novel framework for handling semi-supervised classifications of fuzzy labels.
It is based on the idea of overclustering to detect substructures in these fuzzy labels.
We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels.
- Score: 1.6392706389599345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has been successfully applied to many classification problems
including underwater challenges. However, a long-standing issue with deep
learning is the need for large and consistently labeled datasets. Although
current approaches in semi-supervised learning can decrease the required amount
of annotated data by a factor of 10 or even more, this line of research still
uses distinct classes. For underwater classification, and for uncurated
real-world datasets in general, clean class boundaries often cannot be drawn
due to the limited information content of the images and the transitional
stages of the depicted objects. This leads different experts to different
opinions and thus to fuzzy labels, which could also be considered ambiguous or
divergent. We propose a novel framework for handling semi-supervised
classifications of such fuzzy labels. It is based on the idea of overclustering
to detect substructures in these fuzzy labels. We propose a novel loss to
improve the overclustering capability of our framework and show the benefit of
overclustering for fuzzy labels. We show that our framework is superior to
previous state-of-the-art semi-supervised methods when applied to real-world
plankton data with fuzzy labels. Moreover, we obtain 5 to 10% more consistent
predictions of substructures.
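The core idea above, an output head with more clusters than ground-truth classes trained with a supervised term plus an inverted cross-entropy consistency term on unlabeled pairs, can be sketched in NumPy. This is an illustrative sketch: the cluster-to-class pooling and the particular inversion (1 - p)/(k - 1) are assumptions made here for illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def inverse_cross_entropy(p, q, eps=1e-8):
    """Illustrative 'inverted' cross-entropy between two predicted
    distributions p and q (e.g. from two augmentations of the same image).
    Probabilities are inverted to (1 - p) and renormalised over the k
    clusters before the usual cross-entropy is applied."""
    k = p.shape[-1]
    p_inv = (1.0 - p) / (k - 1)   # inverted, renormalised distribution
    q_inv = (1.0 - q) / (k - 1)
    return -(p_inv * np.log(q_inv + eps)).sum(axis=-1).mean()

def overclustering_loss(logits_labeled, labels, cluster_to_class,
                        logits_u1, logits_u2, lam=1.0):
    """Total loss: supervised cross-entropy on labeled data, with the
    k cluster probabilities pooled into class probabilities, plus an
    inverse-CE consistency term on two views of unlabeled data."""
    p = softmax(logits_labeled)                  # (n, k) over k clusters
    n_classes = cluster_to_class.max() + 1
    class_p = np.zeros((p.shape[0], n_classes))
    for c in range(p.shape[1]):                  # pool clusters -> classes
        class_p[:, cluster_to_class[c]] += p[:, c]
    sup = -np.log(class_p[np.arange(len(labels)), labels] + 1e-8).mean()
    unsup = inverse_cross_entropy(softmax(logits_u1), softmax(logits_u2))
    return sup + lam * unsup
```

With all-zero logits, six clusters pooled two per class, the supervised term reduces to log 3 and the consistency term to log 6; in actual training the cluster-to-class assignment would be learned rather than fixed.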
Related papers
- A data-centric approach for assessing progress of Graph Neural Networks [7.2249434861826325]
Graph Neural Networks (GNNs) have achieved state-of-the-art results in node classification tasks.
Most improvements are in multi-class classification, with less focus on the cases where each node could have multiple labels.
First challenge in studying multi-label node classification is the scarcity of publicly available datasets.
arXiv Detail & Related papers (2024-06-18T09:41:40Z)

Active Generalized Category Discovery [60.69060965936214]
Generalized Category Discovery (GCD) endeavors to cluster unlabeled samples from both novel and old classes.
We take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD)
Our method achieves state-of-the-art performance on both generic and fine-grained datasets.
arXiv Detail & Related papers (2024-03-07T07:12:24Z)

Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)

Making Binary Classification from Multiple Unlabeled Datasets Almost Free of Supervision [128.6645627461981]
We propose a new problem setting, i.e., binary classification from multiple unlabeled datasets with only one pairwise numerical relationship of class priors.
In MU-OPPO, we do not need the class priors for all unlabeled datasets.
We show that our framework brings smaller estimation errors of class priors and better performance of binary classification.
arXiv Detail & Related papers (2023-06-12T11:33:46Z)

Spatiotemporal Classification with limited labels using Constrained Clustering for large datasets [22.117238467818623]
Separable representations can lead to supervised models with better classification capabilities.
We show how we can learn even better representation using a constrained loss with few labels.
We conclude by showing how our method, using few labels, can pick out new labeled samples from the unlabeled data, which can be used to augment supervised methods leading to better classification.
arXiv Detail & Related papers (2022-10-14T05:05:22Z)

Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes.
We introduce novel hierarchy preserving losses, which jointly apply a hierarchical penalty to the contrastive loss, and enforce the hierarchy constraint.
arXiv Detail & Related papers (2022-04-27T21:41:44Z)

Mixed Supervision Learning for Whole Slide Image Classification [88.31842052998319]
We propose a mixed supervision learning framework for super high-resolution images.
During the patch training stage, this framework can make use of coarse image-level labels to refine self-supervised learning.
A comprehensive strategy is proposed to suppress pixel-level false positives and false negatives.
arXiv Detail & Related papers (2021-07-02T09:46:06Z)

Beyond Cats and Dogs: Semi-supervised Classification of fuzzy labels with overclustering [1.6392706389599345]
We propose a novel framework for handling semi-supervised classifications of fuzzy labels.
Our framework is based on the idea of overclustering to detect substructures in these fuzzy labels.
arXiv Detail & Related papers (2020-12-03T08:54:25Z)

The GraphNet Zoo: An All-in-One Graph Based Deep Semi-Supervised Framework for Medical Image Classification [0.0]
We consider the problem of classifying a medical image dataset when we have a limited amount of labels.
Using semi-supervised learning, one can produce accurate classifications using a significantly reduced amount of labelled data.
We propose an all-in-one framework for deep semi-supervised classification focusing on graph based approaches.
arXiv Detail & Related papers (2020-03-13T19:18:21Z)

Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
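The infimum-loss idea for partial labelling can be sketched as follows. `infimum_loss` is a hypothetical helper written for illustration; using softmax cross-entropy as the per-label base loss is this sketch's choice, not necessarily the paper's structured-prediction setup.

```python
import numpy as np

def infimum_loss(candidate_sets, scores):
    """Partial-label (infimum) loss: each example comes with a *set* of
    candidate labels containing the true one, and the loss charged is the
    minimum per-candidate loss over that set."""
    losses = []
    for cands, s in zip(candidate_sets, scores):
        p = np.exp(s - s.max())
        p /= p.sum()                    # softmax over all labels
        per_label = -np.log(p)          # CE if that label were the true one
        losses.append(min(per_label[y] for y in cands))
    return float(np.mean(losses))
```

A confident prediction on any label inside the candidate set already drives the loss toward zero, which is exactly what makes the infimum loss usable with incomplete annotation.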
arXiv Detail & Related papers (2020-03-02T13:59:41Z)

An interpretable semi-supervised classifier using two different strategies for amended self-labeling [0.0]
Semi-supervised classification techniques combine labeled and unlabeled data during the learning phase.
We present an interpretable self-labeling grey-box classifier that uses a black box to estimate the missing class labels and a white box to explain the final predictions.
arXiv Detail & Related papers (2020-01-26T19:37:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.