Data driven algorithms for limited labeled data learning
- URL: http://arxiv.org/abs/2103.10547v1
- Date: Thu, 18 Mar 2021 22:19:19 GMT
- Title: Data driven algorithms for limited labeled data learning
- Authors: Maria-Florina Balcan, Dravyansh Sharma
- Abstract summary: We focus on graph-based techniques, where the unlabeled examples are connected in a graph under the implicit assumption that similar nodes likely have similar labels.
We present a novel data driven approach for learning the graph and provide strong formal guarantees in both the distributional and online learning formalizations.
- Score: 35.193000364580975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a novel data driven approach for designing learning algorithms
that can effectively learn with only a small number of labeled examples. This
is crucial for modern machine learning applications where labels are scarce or
expensive to obtain. We focus on graph-based techniques, where the unlabeled
examples are connected in a graph under the implicit assumption that similar
nodes likely have similar labels. Over the past decades, several elegant
graph-based semi-supervised and active learning algorithms have been proposed
for inferring the labels of the unlabeled examples given the graph and a few
labeled examples. However, the problem of how to create the graph (which
significantly impacts the practical usefulness of these methods) has been
relegated to domain-specific art and heuristics, and no general principles have
been proposed. In this work, we present a novel data driven approach for
learning the graph and provide strong formal guarantees in both the
distributional and online learning formalizations.
We show how to leverage problem instances coming from an underlying problem
domain to learn, over commonly used parametric families of graphs, graph
hyperparameters that perform well on new instances from the same domain. We
obtain efficient, low-regret algorithms in the online setting,
and generalization guarantees in the distributional setting. We also show how
to combine several very different similarity metrics and learn multiple
hyperparameters, providing general techniques to apply to large classes of
problems. We expect some of the tools and techniques we develop along the way
to be of interest beyond semi-supervised and active learning, for data driven
algorithms for combinatorial problems more generally.
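To make the setting concrete, here is a minimal sketch, in Python, of data-driven selection of a single graph hyperparameter: build a Gaussian similarity graph with bandwidth sigma, run standard label propagation, and keep the sigma that performs best across problem instances drawn from the same domain. The `instances` input and the candidate grid are hypothetical placeholders, and the sketch illustrates the setting rather than the paper's algorithm.

```python
# Minimal sketch of data-driven graph hyperparameter selection.
# Assumptions (not from the paper): `instances` is a list of
# (X, y, labeled_idx) tuples with integer class labels in y.
import numpy as np

def rbf_graph(X, sigma):
    """Dense Gaussian similarity graph W_ij = exp(-||x_i - x_j||^2 / sigma^2)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / sigma ** 2)
    np.fill_diagonal(W, 0.0)
    return W

def label_propagation(W, y, labeled_idx, n_classes, alpha=0.99, iters=50):
    """Iterate F <- alpha * S F + (1 - alpha) * Y with S = D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d) + 1e-12)
    Y = np.zeros((len(y), n_classes))
    Y[labeled_idx, y[labeled_idx]] = 1.0
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F.argmax(axis=1)

def tune_bandwidth(instances, candidate_sigmas, n_classes):
    """Data-driven step: keep the bandwidth with the best average accuracy
    on the unlabeled nodes across instances from the same domain."""
    best_sigma, best_acc = None, -1.0
    for sigma in candidate_sigmas:
        accs = []
        for X, y, labeled_idx in instances:
            pred = label_propagation(rbf_graph(X, sigma), y, labeled_idx, n_classes)
            unlabeled = np.setdiff1d(np.arange(len(y)), labeled_idx)
            accs.append((pred[unlabeled] == y[unlabeled]).mean())
        if np.mean(accs) > best_acc:
            best_sigma, best_acc = sigma, float(np.mean(accs))
    return best_sigma, best_acc
```

In the distributional setting, the instances play the role of samples from the underlying problem domain; in the online setting, the choice of sigma would instead be updated as instances arrive.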
Related papers
- Efficiently Learning the Graph for Semi-supervised Learning [4.518012967046983]
We show how to learn the best graphs from the sparse families efficiently using the conjugate gradient method.
Our approach can also be used to learn the graph efficiently online with sub-linear regret, under mild smoothness assumptions.
We implement our approach and demonstrate significant (~10-100x) speedups over prior work on semi-supervised learning with learned graphs on benchmark datasets.
arXiv Detail & Related papers (2023-06-12T13:22:06Z)
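As a rough illustration of the inner solver step that this line of work accelerates, the sketch below solves the harmonic-function (label propagation) linear system on a fixed sparse graph with conjugate gradient. The graph, the +/-1 labels, and all names are illustrative assumptions; learning the graph itself, which is the paper's contribution, is not shown.

```python
# Sketch of the harmonic-function solve with conjugate gradient on a fixed
# sparse graph (illustrative names; the paper also learns the graph itself).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def harmonic_predictions(W, y_labeled, labeled_idx, unlabeled_idx):
    """Solve L_UU f_U = W_UL y_L for soft labels on the unlabeled nodes.

    W: sparse symmetric similarity matrix; y_labeled: +/-1 labels.
    """
    W = sp.csr_matrix(W)
    degrees = np.asarray(W.sum(axis=1)).ravel()
    L = (sp.diags(degrees) - W).tocsr()      # combinatorial Laplacian D - W
    L_UU = L[unlabeled_idx][:, unlabeled_idx]
    W_UL = W[unlabeled_idx][:, labeled_idx]
    rhs = W_UL @ np.asarray(y_labeled, dtype=float)
    # L_UU is symmetric positive definite when every component has a labeled node.
    f_U, info = cg(L_UU, rhs, maxiter=500)
    if info != 0:
        raise RuntimeError("conjugate gradient did not converge")
    return f_U                               # sign(f_U) gives binary predictions
```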
- State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more.
Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures.
Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z)
- A simple way to learn metrics between attributed graphs [11.207372645301094]
We propose a new Simple Graph Metric Learning (SGML) model with few trainable parameters.
This model allows us to build an appropriate distance from a database of labeled (attributed) graphs to improve the performance of simple classification algorithms.
arXiv Detail & Related papers (2022-09-26T14:32:38Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
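For context on the PAC-style analysis mentioned in this entry, the block below shows a textbook generalization bound for a finite hypothesis class; it illustrates the kind of statement label-complexity analyses build on and is not a result taken from the survey itself.

```latex
% Agnostic PAC bound (Hoeffding + union bound) for a finite class \mathcal{H}
% estimated from m labeled examples; holds with probability at least 1 - \delta.
\[
\forall h \in \mathcal{H}:\quad
\mathrm{err}(h) \;\le\; \widehat{\mathrm{err}}(h)
\;+\; \sqrt{\frac{\ln\lvert\mathcal{H}\rvert + \ln(2/\delta)}{2m}} .
\]
```

Driving the excess error below a target $\varepsilon$ therefore needs on the order of $\varepsilon^{-2}\,(\ln\lvert\mathcal{H}\rvert + \ln(1/\delta))$ labeled examples.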
- Budget-aware Few-shot Learning via Graph Convolutional Network [56.41899553037247]
This paper tackles the problem of few-shot learning, which aims to learn new visual concepts from a few examples.
A common problem setting in few-shot classification assumes a random sampling strategy for acquiring data labels.
We introduce a new budget-aware few-shot learning problem that aims to learn novel object categories.
arXiv Detail & Related papers (2022-01-07T02:46:35Z)
- Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of the Multiple Object Tracking (MOT) task.
Existing methods mostly ignore the context information among tracklets and intra-frame detections.
We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z)
- A Unifying Generative Model for Graph Learning Algorithms: Label Propagation, Graph Convolutions, and Combinations [39.8498896531672]
Semi-supervised learning on graphs is a widely applicable problem in network science and machine learning.
We develop a Markov random field model for the data generation process of node attributes.
We show that label propagation, a linearized graph convolutional network, and their combination can all be derived as conditional expectations.
arXiv Detail & Related papers (2021-01-19T17:07:08Z)
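A standard Gaussian Markov random field identity illustrates the "conditional expectation" view described in this entry; the paper's exact generative model may differ. Conditioning a zero-mean Gaussian with precision matrix $\Lambda$ on the labeled coordinates gives:

```latex
% Block conditioning of a zero-mean Gaussian with precision \Lambda:
% the conditional mean of the unlabeled block given the labeled block.
\[
y \sim \mathcal{N}\!\left(0,\ \Lambda^{-1}\right)
\quad\Longrightarrow\quad
\mathbb{E}\left[\, y_U \mid y_L \,\right] \;=\; -\,\Lambda_{UU}^{-1}\,\Lambda_{UL}\, y_L .
\]
```

Taking the precision to be a (regularized) graph Laplacian $L = D - W$ turns this into $L_{UU}^{-1} W_{UL}\, y_L$, which is exactly the harmonic label-propagation solution on the unlabeled nodes.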
- Distributed Training of Graph Convolutional Networks using Subgraph Approximation [72.89940126490715]
We propose a training strategy that mitigates the information lost across multiple partitions of a graph through a subgraph approximation scheme.
The subgraph approximation approach helps the distributed training system converge at single-machine accuracy.
arXiv Detail & Related papers (2020-12-09T09:23:49Z)
- Graph topology inference benchmarks for machine learning [16.857405938139525]
We introduce several benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods.
We also contrast some of the most prominent techniques in the literature.
arXiv Detail & Related papers (2020-07-16T09:40:32Z)
- Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation [37.742218733235084]
We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs.
Our algorithm uses graph cognizant logistic regression, which is equivalent to a linearized graph convolutional neural network (GCN), for the prediction phase, and maximizes the expected error reduction in the query phase.
We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-09T18:00:53Z)
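A simplified sketch of this style of graph-aware active learning follows: linearized-GCN feature propagation plus logistic regression, with an entropy-based query rule standing in for the paper's expected-error-reduction criterion. All names and the query rule are illustrative simplifications, not the authors' implementation.

```python
# Simplified sketch of graph-aware active learning: propagate node features
# over the normalized adjacency, fit logistic regression on the labeled set,
# and query the most uncertain unlabeled node (entropy stands in for the
# expected-error-reduction criterion used in the paper).
import numpy as np
from sklearn.linear_model import LogisticRegression

def propagate_features(A, X, k=2):
    """Linearized-GCN features: X' = S^k X with S = D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    S = A_hat / np.sqrt(np.outer(d, d))
    for _ in range(k):
        X = S @ X
    return X

def select_query(A, X, y, labeled_idx):
    """Fit on the current labeled set and return the most uncertain unlabeled node."""
    Xp = propagate_features(A, X)
    clf = LogisticRegression(max_iter=1000).fit(Xp[labeled_idx], y[labeled_idx])
    proba = clf.predict_proba(Xp)
    entropy = -(proba * np.log(proba + 1e-12)).sum(axis=1)
    entropy[labeled_idx] = -np.inf          # never re-query a labeled node
    return int(entropy.argmax())            # index of the next node to label
```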
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences of its use.