A Topological Approach for Semi-Supervised Learning
- URL: http://arxiv.org/abs/2205.09617v1
- Date: Thu, 19 May 2022 15:23:39 GMT
- Title: A Topological Approach for Semi-Supervised Learning
- Authors: Adri\'an In\'es, C\'esar Dom\'inguez, J\'onathan Heras, Gadea Mata and
Julio Rubio
- Abstract summary: We present new semi-supervised learning methods based on techniques from Topological Data Analysis (TDA)
In particular, we have created two semi-supervised learning methods following two different topological approaches.
The results show that the methods developed in this work outperform both the results obtained with models trained with only manually labelled data, and those obtained with classical semi-supervised learning methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, Machine Learning and Deep Learning methods have become the
state-of-the-art approach to solve data classification tasks. In order to use
those methods, it is necessary to acquire and label a considerable amount of
data; however, this is not straightforward in some fields, since data
annotation is time consuming and might require expert knowledge. This challenge
can be tackled by means of semi-supervised learning methods that take advantage
of both labelled and unlabelled data. In this work, we present new
semi-supervised learning methods based on techniques from Topological Data
Analysis (TDA), a field that is gaining importance for analysing large amounts
of data with high variety and dimensionality. In particular, we have created
two semi-supervised learning methods following two different topological
approaches. In the former, we have used a homological approach that consists in
studying the persistence diagrams associated with the data using the Bottleneck
and Wasserstein distances. In the latter, we have taken into account the
connectivity of the data. In addition, we have carried out a thorough analysis
of the developed methods using 3 synthetic datasets, 5 structured datasets, and
2 datasets of images. The results show that the semi-supervised methods
developed in this work outperform both the results obtained with models trained
with only manually labelled data, and those obtained with classical
semi-supervised learning methods, reaching improvements of up to a 16%.
Related papers
- Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution [62.71425232332837]
We show that training amortized models with noisy labels is inexpensive and surprisingly effective.
This approach significantly accelerates several feature attribution and data valuation methods, often yielding an order of magnitude speedup over existing approaches.
arXiv Detail & Related papers (2024-01-29T03:42:37Z) - Latent Graphs for Semi-Supervised Learning on Biomedical Tabular Data [4.498659756007485]
In this work, we provide an approach for inferring latent graphs that capture the intrinsic data relationships.
By leveraging graph-based representations, our approach facilitates the seamless propagation of information throughout the graph.
Our work demonstrates the significance of inter-instance relationship discovery as practical means for constructing robust latent graphs.
arXiv Detail & Related papers (2023-09-27T16:13:36Z) - Reinforcement Learning Based Multi-modal Feature Fusion Network for
Novel Class Discovery [47.28191501836041]
In this paper, we employ a Reinforcement Learning framework to simulate the cognitive processes of humans.
We also deploy a Member-to-Leader Multi-Agent framework to extract and fuse features from multi-modal information.
We demonstrate the performance of our approach in both the 3D and 2D domains by employing the OS-MN40, OS-MN40-Miss, and Cifar10 datasets.
arXiv Detail & Related papers (2023-08-26T07:55:32Z) - Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary
Data [100.33096338195723]
We focus on Few-shot Learning with Auxiliary Data (FLAD)
FLAD assumes access to auxiliary data during few-shot learning in hopes of improving generalization.
We propose two algorithms -- EXP3-FLAD and UCB1-FLAD -- and compare them with prior FLAD methods that either explore or exploit.
arXiv Detail & Related papers (2023-02-01T18:59:36Z) - Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial learning based method, which is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z) - Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning.
I will discuss how such a procedure is more closely aligned with potential downstream tasks.
arXiv Detail & Related papers (2021-10-24T22:33:52Z) - Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z) - Enhancing ensemble learning and transfer learning in multimodal data
analysis by adaptive dimensionality reduction [10.646114896709717]
In multimodal data analysis, not all observations would show the same level of reliability or information quality.
We propose an adaptive approach for dimensionality reduction to overcome this issue.
We test our approach on multimodal datasets acquired in diverse research fields.
arXiv Detail & Related papers (2021-05-08T11:53:12Z) - Joint Geometric and Topological Analysis of Hierarchical Datasets [7.098759778181621]
In this paper, we focus on high-dimensional data that are organized into several hierarchical datasets.
The main novelty in this work lies in the combination of two powerful data-analytic approaches: topological data analysis and geometric manifold learning.
We show that our new method gives rise to superior classification results compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-04-03T13:02:00Z) - Siloed Federated Learning for Multi-Centric Histopathology Datasets [0.17842332554022694]
This paper proposes a novel federated learning approach for deep learning architectures in the medical domain.
Local-statistic batch normalization (BN) layers are introduced, resulting in collaboratively-trained, yet center-specific models.
We benchmark the proposed method on the classification of tumorous histopathology image patches extracted from the Camelyon16 and Camelyon17 datasets.
arXiv Detail & Related papers (2020-08-17T15:49:30Z) - Dual-Teacher: Integrating Intra-domain and Inter-domain Teachers for
Annotation-efficient Cardiac Segmentation [65.81546955181781]
We propose a novel semi-supervised domain adaptation approach, namely Dual-Teacher.
The student model learns the knowledge of unlabeled target data and labeled source data by two teacher models.
We demonstrate that our approach is able to concurrently utilize unlabeled data and cross-modality data with superior performance.
arXiv Detail & Related papers (2020-07-13T10:00:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.