Spatiotemporal Classification with limited labels using Constrained
Clustering for large datasets
- URL: http://arxiv.org/abs/2210.07522v1
- Date: Fri, 14 Oct 2022 05:05:22 GMT
- Title: Spatiotemporal Classification with limited labels using Constrained
Clustering for large datasets
- Authors: Praveen Ravirathinam, Rahul Ghosh, Ke Wang, Keyang Xuan, Ankush
Khandelwal, Hilary Dugan, Paul Hanson, Vipin Kumar
- Abstract summary: Separable representations can lead to supervised models with better classification capabilities.
We show how we can learn an even better representation using a constrained loss with few labels.
We conclude by showing how our method, using few labels, can pick out new labeled samples from the unlabeled data, which can be used to augment supervised methods leading to better classification.
- Score: 22.117238467818623
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creating separable representations via representation learning and clustering
is critical in analyzing large unstructured datasets with only a few labels.
Separable representations can lead to supervised models with better
classification capabilities and additionally aid in generating new labeled
samples. Most unsupervised and semisupervised methods to analyze large datasets
do not leverage the existing small amounts of labels to get better
representations. In this paper, we propose a spatiotemporal clustering paradigm
that uses spatial and temporal features combined with a constrained loss to
produce separable representations. We demonstrate this method on the
newly published ReaLSAT dataset, which records surface water dynamics for over
680,000 lakes across the world, making it an essential dataset for
ecology and sustainability. Using this large unlabeled dataset, we first show
that a spatiotemporal representation is better than a purely spatial or purely
temporal representation. We then show how we can learn an even better
representation using a constrained loss with few labels. We conclude by showing
how our method, using few labels, can pick out new labeled samples from the
unlabeled data, which can be used to augment supervised methods leading to
better classification.
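The abstract does not specify the form of the constrained loss. As a rough illustration only, the sketch below uses a generic contrastive-style pairwise penalty: labeled samples of the same class are pulled together and labeled samples of different classes are pushed at least a margin apart, while unlabeled samples contribute nothing. The function name `pairwise_constraint_loss`, the `margin` parameter, and the loss form itself are assumptions made for this example and are not taken from the paper.

```python
import math
from itertools import combinations


def pairwise_constraint_loss(embeddings, labels, margin=1.0):
    """Contrastive-style penalty over all pairs of labeled samples.

    Hypothetical illustration, not the paper's loss.
    embeddings: list of equal-length float vectors, one per sample.
    labels: list of class labels, with None marking unlabeled samples.
    """
    # Only the few labeled samples take part in the constraint term.
    labeled = [i for i, y in enumerate(labels) if y is not None]
    total, n_pairs = 0.0, 0
    for i, j in combinations(labeled, 2):
        dist = math.dist(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            # Same class: penalize any separation.
            total += dist ** 2
        else:
            # Different classes: penalize only pairs closer than the margin.
            total += max(0.0, margin - dist) ** 2
        n_pairs += 1
    # Average over pairs; zero when fewer than two samples are labeled.
    return total / n_pairs if n_pairs else 0.0


# Three labeled samples and one unlabeled sample in a 2-D embedding space.
emb = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [5.0, 5.0]]
lab = ["a", "a", "b", None]
loss = pairwise_constraint_loss(emb, lab, margin=2.0)
```

In a semi-supervised setup, a term like this would typically be added to an unsupervised objective (for example, a reconstruction or clustering loss) computed over all samples, so that the few labels shape the representation without replacing the unsupervised signal.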
Related papers
- Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets [51.74296438621836]
We introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels.
The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation.
Scribbles for All provides scribble labels for several popular segmentation datasets and provides an algorithm to automatically generate scribble labels for any dataset with dense annotations.
arXiv Detail & Related papers (2024-08-22T15:29:08Z)
- Label Learning Method Based on Tensor Projection [82.51786483693206]
We propose a label learning method based on tensor projection (LLMTP).
We extend the matrix projection transformation to tensor projection, so that the spatial structure information between views can be fully utilized.
In addition, we introduce the tensor Schatten $p$-norm regularization to make the clustering label matrices of different views as consistent as possible.
arXiv Detail & Related papers (2024-02-26T13:03:26Z)
- A Data-efficient Framework for Robotics Large-scale LiDAR Scene Parsing [10.497309421830671]
Existing state-of-the-art 3D point cloud understanding methods only perform well in a fully supervised manner.
This work presents a general and simple framework to tackle point cloud understanding when labels are limited.
arXiv Detail & Related papers (2023-12-03T02:38:51Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Learned Label Aggregation for Weak Supervision [8.819582879892762]
We propose a data programming approach that aggregates weak supervision signals to generate labeled data easily.
The quality of the generated labels depends on a label aggregation model that aggregates all noisy labels from all LFs to infer the ground-truth labels.
We show the model can be trained using synthetically generated data and design an effective architecture for the model.
arXiv Detail & Related papers (2022-07-27T14:36:35Z)
- Label-Free Model Evaluation with Semi-Structured Dataset Representations [78.54590197704088]
Label-free model evaluation, or AutoEval, estimates model accuracy on unlabeled test sets.
In the absence of image labels, based on dataset representations, we estimate model performance for AutoEval with regression.
We propose a new semi-structured dataset representation that is manageable for regression learning while containing rich information for AutoEval.
arXiv Detail & Related papers (2021-12-01T18:15:58Z)
- Fuzzy Overclustering: Semi-Supervised Classification of Fuzzy Labels with Overclustering and Inverse Cross-Entropy [1.6392706389599345]
We propose a novel framework for handling semi-supervised classifications of fuzzy labels.
It is based on the idea of overclustering to detect substructures in these fuzzy labels.
We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels.
arXiv Detail & Related papers (2021-10-13T10:50:50Z)
- Towards Clustering-friendly Representations: Subspace Clustering via Graph Filtering [16.60975509085194]
We propose a graph filtering approach by which a smooth representation is achieved.
Experiments on image and document clustering datasets demonstrate that our method improves upon state-of-the-art subspace clustering techniques.
An ablation study shows that graph filtering can remove noise, preserve structure in the image, and increase the separability of classes.
arXiv Detail & Related papers (2021-06-18T02:21:36Z)
- Predictive K-means with local models [0.028675177318965035]
Predictive clustering seeks to obtain the best of both worlds: the predictive power of supervised classification and the interpretability of clustering.
We present two new algorithms using this technique and show on a variety of data sets that they are competitive for prediction performance.
arXiv Detail & Related papers (2020-12-16T10:49:36Z)
- Weakly-Supervised Salient Object Detection via Scribble Annotations [54.40518383782725]
We propose a weakly-supervised salient object detection model to learn saliency from scribble labels.
We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps.
Our method not only outperforms existing weakly-supervised/unsupervised methods, but also is on par with several fully-supervised state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T12:59:50Z)
- Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)