Predicting Generalization in Deep Learning via Metric Learning -- PGDL Shared task
- URL: http://arxiv.org/abs/2012.09117v1
- Date: Wed, 16 Dec 2020 17:59:13 GMT
- Title: Predicting Generalization in Deep Learning via Metric Learning -- PGDL Shared task
- Authors: Sebastian Mežnar and Blaž Škrlj
- Abstract summary: The report presents the solution that was submitted by the user smeznar, which achieved eighth place in the competition.
In the proposed approach, we create simple metrics and find their best combination with automatic testing on the provided dataset, exploring how combinations of various properties of the input neural network architectures can be used for the prediction of their generalization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The competition "Predicting Generalization in Deep Learning (PGDL)" aims to
provide a platform for rigorous study of generalization of deep learning models
and offer insight into the progress of understanding and explaining these
models. This report presents the solution that was submitted by the user
\emph{smeznar} which achieved the eighth place in the competition. In the
proposed approach, we create simple metrics and find their best combination
with automatic testing on the provided dataset, exploring how combinations of
various properties of the input neural network architectures can be used for
the prediction of their generalization.
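As an illustration of the kind of pipeline the abstract describes, here is a minimal NumPy sketch of combining simple per-network metrics and automatically testing which combination best predicts the generalization gap. The metric values, the product combination rule, and the rank-correlation score are illustrative placeholders, not the submission's exact procedure.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
metrics = rng.random((100, 4))  # placeholder: 4 candidate metrics for 100 trained networks
gen_gap = rng.random(100)       # placeholder: measured generalization gaps

def rank_corr(pred: np.ndarray, target: np.ndarray) -> float:
    """Spearman-style rank correlation between a combined metric and the gap."""
    pr = np.argsort(np.argsort(pred)).astype(float)
    tr = np.argsort(np.argsort(target)).astype(float)
    pr = (pr - pr.mean()) / pr.std()
    tr = (tr - tr.mean()) / tr.std()
    return float(np.mean(pr * tr))

# Automatically test every subset of metrics, combining them by product,
# and keep the combination that correlates best with the generalization gap.
subsets = (c for r in range(1, metrics.shape[1] + 1)
           for c in itertools.combinations(range(metrics.shape[1]), r))
best = max(subsets,
           key=lambda c: abs(rank_corr(np.prod(metrics[:, list(c)], axis=1), gen_gap)))
print("best metric combination:", best)
```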
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization.
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
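The key mechanism in the entry above is randomly dropping base-model predictions while training a learned ensembler. A hedged PyTorch sketch of that mechanism; the linear combiner, layer sizes, and drop probability are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

class DropoutEnsembler(nn.Module):
    """Learned ensembler that zeroes whole base-model predictions at random
    during training, discouraging reliance on any single base model."""
    def __init__(self, n_models: int, n_classes: int, p_drop: float = 0.3):
        super().__init__()
        self.p_drop = p_drop
        self.combine = nn.Linear(n_models * n_classes, n_classes)

    def forward(self, base_preds: torch.Tensor) -> torch.Tensor:
        # base_preds: (batch, n_models, n_classes)
        if self.training:
            keep = (torch.rand(base_preds.shape[:2], device=base_preds.device)
                    > self.p_drop).float().unsqueeze(-1)
            base_preds = base_preds * keep / (1.0 - self.p_drop)  # inverted dropout scaling
        return self.combine(base_preds.flatten(1))

ens = DropoutEnsembler(n_models=5, n_classes=10)
out = ens(torch.randn(8, 5, 10))  # logits combined across 5 base models
```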
- Deep Companion Learning: Enhancing Generalization Through Historical Consistency [35.5237083057451]
We propose a novel training method for Deep Neural Networks (DNNs) that enhances generalization by penalizing inconsistent model predictions.
We train a deep-companion model (DCM) by using previous versions of the model to provide forecasts on new inputs.
This companion model deciphers a meaningful latent semantic structure within the data, thereby providing targeted supervision.
arXiv Detail & Related papers (2024-07-26T15:31:13Z)
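A minimal sketch of the historical-consistency idea in the entry above, assuming the companion is an exponential moving average of previous model weights and the inconsistency penalty is a KL term; both are assumptions for illustration, not necessarily the paper's exact construction:

```python
import copy
import torch
import torch.nn.functional as F

def update_companion(companion, model, momentum: float = 0.99):
    """Companion weights track a running average of past model versions."""
    with torch.no_grad():
        for pc, pm in zip(companion.parameters(), model.parameters()):
            pc.mul_(momentum).add_(pm, alpha=1.0 - momentum)

def companion_loss(model, companion, x, y, beta: float = 0.5):
    """Task loss plus a penalty for disagreeing with the companion's forecast."""
    logits = model(x)
    with torch.no_grad():
        targets = F.softmax(companion(x), dim=-1)  # companion's forecast on new inputs
    consistency = F.kl_div(F.log_softmax(logits, dim=-1), targets, reduction="batchmean")
    return F.cross_entropy(logits, y) + beta * consistency

# companion = copy.deepcopy(model)  # initialize once, then call update_companion each step
```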
- Self-Supervised Representation Learning with Meta Comprehensive Regularization [11.387994024747842]
We introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks.
We update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features.
We provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives.
arXiv Detail & Related papers (2024-03-03T15:53:48Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) to tackle this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
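To make token-to-AST alignment concrete, here is a toy Python sketch that maps each NAME token to its smallest enclosing AST node using the standard ast and tokenize modules; it illustrates the general idea only and is not ASTxplainer's implementation:

```python
import ast
import io
import tokenize

def align_tokens_to_nodes(source: str):
    """Map each NAME token to the smallest AST node whose span contains it."""
    nodes = [n for n in ast.walk(ast.parse(source)) if hasattr(n, "end_lineno")]

    def contains(node, line, col):
        return ((node.lineno, node.col_offset) <= (line, col)
                < (node.end_lineno, node.end_col_offset))

    alignment = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue
        enclosing = [n for n in nodes if contains(n, *tok.start)]
        if enclosing:  # pick the node with the tightest span
            best = min(enclosing, key=lambda n: (n.end_lineno - n.lineno,
                                                 n.end_col_offset - n.col_offset))
            alignment.append((tok.string, type(best).__name__))
    return alignment

print(align_tokens_to_nodes("def f(x):\n    return x + 1\n"))
```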
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Sparsity-aware generalization theory for deep neural networks [12.525959293825318]
We present a new approach to analyzing generalization for deep feed-forward ReLU networks.
We show fundamental trade-offs between sparsity and generalization.
arXiv Detail & Related papers (2023-07-01T20:59:05Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
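A minimal NumPy sketch of loss-aware aggregation in the spirit of the entry above: client models are averaged layer-wise with weights derived from their reported losses. The softmax-over-negative-losses weighting is an assumption for illustration, not the paper's algorithm:

```python
import numpy as np

def aggregate_by_loss(client_weights, client_losses, temperature: float = 1.0):
    """Layer-wise weighted average; lower-loss clients contribute more."""
    losses = np.asarray(client_losses, dtype=float)
    w = np.exp(-losses / temperature)
    w /= w.sum()                           # normalized contribution per client
    return [sum(wi * layer for wi, layer in zip(w, layers))
            for layers in zip(*client_weights)]  # iterate over layers

# client_weights: one list of per-layer arrays per client (placeholders here)
clients = [[np.full((2, 2), float(i))] for i in range(3)]
print(aggregate_by_loss(clients, client_losses=[0.2, 0.5, 1.0]))
```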
- Predicting Generalization in Deep Learning via Local Measures of Distortion [7.806155368334511]
We study generalization in deep learning by appealing to complexity measures originally developed in approximation and information theory.
We show that simple vector quantization approaches such as PCA, GMMs, and SVMs capture the spirit of these measures when applied layer-wise to deep extracted features.
arXiv Detail & Related papers (2020-12-13T05:46:46Z)
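As an illustration of the layer-wise vector-quantization idea, a short sketch that scores each layer's extracted features by PCA reconstruction error, a stand-in distortion measure; the feature matrices here are random placeholders:

```python
import numpy as np
from sklearn.decomposition import PCA

def layer_distortion(features: np.ndarray, n_components: int = 8) -> float:
    """PCA reconstruction error of one layer's features: a simple
    vector-quantization-style measure of distortion."""
    pca = PCA(n_components=n_components).fit(features)
    recon = pca.inverse_transform(pca.transform(features))
    return float(np.mean((features - recon) ** 2))

# Placeholder feature matrices for two layers of a trained network.
layers = [np.random.randn(512, 64), np.random.randn(512, 128)]
print([layer_distortion(f) for f in layers])
```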
This list is automatically generated from the titles and abstracts of the papers on this site.