Fine-grained Generalization Analysis of Structured Output Prediction
- URL: http://arxiv.org/abs/2106.00115v1
- Date: Mon, 31 May 2021 21:44:14 GMT
- Title: Fine-grained Generalization Analysis of Structured Output Prediction
- Authors: Waleed Mustafa, Yunwen Lei, Antoine Ledent, Marius Kloft
- Abstract summary: In machine learning we often encounter structured output prediction problems (SOPPs).
In this paper, we significantly improve the state of the art by developing novel high-probability bounds with a logarithmic dependency on $d$.
Our results therefore build a solid theoretical foundation for learning in large-scale SOPPs.
- Score: 25.842980122940784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In machine learning we often encounter structured output prediction problems
(SOPPs), i.e. problems where the output space admits a rich internal structure.
Application domains where SOPPs naturally occur include natural language
processing, speech recognition, and computer vision. Typical SOPPs have an
extremely large label set, which grows exponentially as a function of the size
of the output. Existing generalization analysis implies generalization bounds
with at least a square-root dependency on the cardinality $d$ of the label set,
which can be vacuous in practice. In this paper, we significantly improve the
state of the art by developing novel high-probability bounds with a logarithmic
dependency on $d$. Moreover, we leverage the lens of algorithmic stability to
develop generalization bounds in expectation without any dependency on $d$. Our
results therefore build a solid theoretical foundation for learning in
large-scale SOPPs. Furthermore, we extend our results to learning with weakly
dependent data.
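For intuition, here is a schematic comparison of the bound shapes described above. This is a sketch only; the paper's actual statements carry constants, margin parameters, and model-complexity factors that are suppressed here:

```latex
% Schematic bound shapes only; constants, margin parameters, and
% model-complexity factors from the paper are suppressed.
% Prior analyses: at least a square-root dependency on d
R(h) \;\le\; \widehat{R}_n(h) + O\!\left(\sqrt{d}\,/\sqrt{n}\right)
% This paper, with high probability: logarithmic dependency on d
R(h) \;\le\; \widehat{R}_n(h) + O\!\left(\log(d)/\sqrt{n}\right)
% This paper, via algorithmic stability, in expectation: no d at all
\mathbb{E}\!\left[R(A(S)) - \widehat{R}_n(A(S))\right] \;\le\; O(\epsilon_n),
\qquad \epsilon_n \text{ independent of } d.
```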
Related papers
- Learning Efficient Positional Encodings with Graph Neural Networks [109.8653020407373]
We introduce PEARL, a novel framework of learnable PEs for graphs.
PEARL approximates equivariant functions of eigenvectors with linear complexity, while rigorously establishing its stability and high expressive power.
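As a rough illustration of the general idea behind learnable eigenvector-based positional encodings, consider the following minimal PyTorch sketch. The `EigenvectorPE` module here is hypothetical and uses an explicit eigendecomposition for simplicity, whereas PEARL's actual architecture is designed precisely to avoid that costly step and achieve linear complexity:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a learnable positional encoding built from
# graph Laplacian eigenvectors; PEARL's actual architecture avoids the
# explicit eigendecomposition used here for simplicity.
class EigenvectorPE(nn.Module):
    def __init__(self, k: int, dim: int):
        super().__init__()
        self.k = k  # number of eigenvectors to use
        # A shared per-node MLP keeps the map permutation-equivariant.
        self.mlp = nn.Sequential(nn.Linear(k, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, adj: torch.Tensor) -> torch.Tensor:
        # Normalized Laplacian L = I - D^{-1/2} A D^{-1/2}
        deg = adj.sum(dim=1).clamp(min=1.0)
        d_inv_sqrt = deg.pow(-0.5)
        lap = torch.eye(adj.size(0)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
        # Eigenvectors of the symmetric Laplacian; keep the k lowest frequencies.
        _, vecs = torch.linalg.eigh(lap)
        v = vecs[:, : self.k]
        # Summing over +/- v makes the encoding invariant to eigenvector sign flips.
        return self.mlp(v) + self.mlp(-v)

pe = EigenvectorPE(k=4, dim=16)
adj = torch.tensor([[0., 1., 1., 0., 0.],
                    [1., 0., 1., 0., 0.],
                    [1., 1., 0., 1., 0.],
                    [0., 0., 1., 0., 1.],
                    [0., 0., 0., 1., 0.]])
print(pe(adj).shape)  # torch.Size([5, 16])
```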
arXiv Detail & Related papers (2025-02-03T07:28:53Z) - RDGCN: Reinforced Dependency Graph Convolutional Network for Aspect-based Sentiment Analysis [43.715099882489376]
We propose a new reinforced dependency graph convolutional network (RDGCN) that improves the importance calculation of dependencies in both distance and type views.
Under the criterion, we design a distance-importance function that leverages reinforcement learning for weight distribution search and dissimilarity control.
Comprehensive experiments on three popular datasets demonstrate the effectiveness of the criterion and importance functions.
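A minimal sketch of what a distance-importance function might look like, assuming a simple exponential decay with hop distance from the aspect term; RDGCN's actual weighting and dissimilarity control are searched with reinforcement learning rather than fixed:

```python
import numpy as np

# Hypothetical sketch of a distance-importance function: dependency
# edges closer (in tree hops) to the aspect term get larger weights.
# RDGCN searches the weighting with reinforcement learning; here the
# decay rate is just a fixed scalar.
def distance_importance(hops: np.ndarray, decay: float = 0.5) -> np.ndarray:
    """Map hop distances from the aspect term to edge weights in (0, 1]."""
    return np.power(decay, hops.astype(float))

hops = np.array([0, 1, 2, 3])     # token distances to the aspect term
print(distance_importance(hops))  # [1.0, 0.5, 0.25, 0.125]
```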
arXiv Detail & Related papers (2023-11-08T05:37:49Z) - On Certified Generalization in Structured Prediction [1.0152838128195467]
In structured prediction, target objects have rich internal structure which does not factorize into independent components.
We present a novel PAC-Bayesian risk bound for structured prediction wherein the rate of generalization scales not only with the number of structured examples but also with their size.
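For context, the generic McAllester-style PAC-Bayes template below shows the usual dependence on the number of examples $n$ alone; the paper's contribution is a refined bound whose rate additionally scales with the internal size of each structured example:

```latex
% Generic PAC-Bayes template for intuition only, with prior \pi,
% posterior \rho, and confidence 1 - \delta; the paper refines the
% rate so it also scales with the size of each structured example.
\mathbb{E}_{h \sim \rho}\!\left[R(h)\right]
  \;\le\; \mathbb{E}_{h \sim \rho}\!\left[\widehat{R}_n(h)\right]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\!\left(2\sqrt{n}/\delta\right)}{2n}}
```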
arXiv Detail & Related papers (2023-06-15T13:15:26Z) - Generalization Analysis for Contrastive Representation Learning [80.89690821916653]
Existing generalization error bounds depend linearly on the number $k$ of negative examples.
We establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms.
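To see where the dependency on $k$ enters, here is a minimal sketch of the $k$-negatives contrastive objective such bounds analyze; the dot-product similarity and logistic loss are simplified stand-ins:

```python
import torch

# Minimal sketch of the k-negatives contrastive objective that the
# bounds are about: one anchor, one positive, k negatives. Earlier
# analyses paid a factor linear in k; the paper removes it up to logs.
def contrastive_loss(anchor, positive, negatives):
    """Logistic contrastive loss with k negative examples."""
    pos = anchor @ positive   # scalar similarity to the positive
    neg = negatives @ anchor  # (k,) similarities to the negatives
    return torch.log1p(torch.exp(neg - pos).sum())

d, k = 8, 32
anchor, positive = torch.randn(d), torch.randn(d)
negatives = torch.randn(k, d)
print(contrastive_loss(anchor, positive, negatives))
```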
arXiv Detail & Related papers (2023-02-24T01:03:56Z) - A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
We provide a general sample complexity analysis of the "vanilla" PG method, improving its theoretical understanding.
arXiv Detail & Related papers (2021-07-23T19:38:17Z) - Measuring Generalization with Optimal Transport [111.29415509046886]
We develop margin-based generalization bounds, where the margins are normalized with optimal transport costs.
Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets.
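A simplified one-dimensional illustration of the idea, using SciPy's 1-D Wasserstein distance as a stand-in for the paper's optimal transport costs over learned representations:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Simplified 1-D illustration of an optimal-transport-normalized margin.
# The paper's bounds use OT costs over learned representations; here we
# just divide a raw margin by the 1-D Wasserstein distance between two
# synthetic class-conditional feature distributions.
rng = np.random.default_rng(0)
feats_a = rng.normal(0.0, 1.0, size=1000)  # features of class A
feats_b = rng.normal(3.0, 1.0, size=1000)  # features of class B

raw_margin = np.abs(feats_a.mean() - feats_b.mean())
ot_cost = wasserstein_distance(feats_a, feats_b)
print(raw_margin / ot_cost)  # OT-normalized margin, insensitive to feature scale
```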
arXiv Detail & Related papers (2021-06-07T03:04:59Z) - Learning Output Embeddings in Structured Prediction [73.99064151691597]
A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension.
A prediction in the original space is computed by solving a pre-image problem.
In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space.
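A minimal sketch of the embed-regress-decode pipeline described above, with a hypothetical fixed character-histogram embedding `psi` standing in for the learned finite approximation the paper proposes:

```python
import numpy as np

# Minimal sketch of prediction via output embeddings: regress from the
# input into an output feature space, then decode by solving a pre-image
# problem over a candidate set. The embedding psi here is a hypothetical
# fixed map; the paper's point is to *learn* a finite approximation of
# it jointly with the regressor g.
def psi(y: str) -> np.ndarray:
    """Toy output embedding: character histogram of a string label."""
    vec = np.zeros(26)
    for ch in y:
        vec[ord(ch) - ord("a")] += 1
    return vec

candidates = ["cat", "act", "dog", "god", "tac"]

def decode(g_x: np.ndarray) -> str:
    """Pre-image step: nearest candidate in the embedding space."""
    return min(candidates, key=lambda y: np.linalg.norm(psi(y) - g_x))

# Simulate a regressor output g(x): a noisy embedding of "cat".
g_x = psi("cat") + 0.1 * np.random.default_rng(0).normal(size=26)
print(decode(g_x))  # "cat" (ties with "act"/"tac", which share its histogram)
```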
arXiv Detail & Related papers (2020-07-29T09:32:53Z) - A General Framework for Consistent Structured Prediction with Implicit Loss Embeddings [113.15416137912399]
We propose and analyze a novel theoretical and algorithmic framework for structured prediction.
We study a large class of loss functions that implicitly defines a suitable geometry on the problem.
When dealing with output spaces with infinite cardinality, a suitable implicit formulation of the estimator is shown to be crucial.
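For reference, a sketch of the implicit loss embedding (ILE) condition and the resulting estimator, as commonly stated for this framework (notation simplified):

```latex
% A loss \Delta admits an implicit loss embedding if, for a Hilbert
% space H and bounded feature maps \psi, \varphi : Y -> H,
\Delta(y, y') = \langle \psi(y), \varphi(y') \rangle_{\mathcal{H}}
    \quad \forall\, y, y' \in \mathcal{Y}.
% Prediction then reduces to a weighted loss minimization over Y with
% data-dependent weights \alpha_i(x); this implicit formulation is
% what remains usable when Y has infinite cardinality:
\widehat{f}(x) = \operatorname*{arg\,min}_{y \in \mathcal{Y}}
    \sum_{i=1}^{n} \alpha_i(x)\, \Delta(y, y_i).
```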
arXiv Detail & Related papers (2020-02-13T10:30:04Z) - Structural Decompositions of Epistemic Logic Programs [29.23726484912091]
Epistemic logic programs (ELPs) are a popular generalization of standard Answer Set Programming (ASP).
We show that central problems can be solved in linear time for ELPs exhibiting structural properties in terms of bounded treewidth.
We also provide a full dynamic programming algorithm that adheres to these bounds.
arXiv Detail & Related papers (2020-01-13T13:16:13Z)