Synergy and Symmetry in Deep Learning: Interactions between the Data,
Model, and Inference Algorithm
- URL: http://arxiv.org/abs/2207.04612v1
- Date: Mon, 11 Jul 2022 04:08:21 GMT
- Title: Synergy and Symmetry in Deep Learning: Interactions between the Data,
Model, and Inference Algorithm
- Authors: Lechao Xiao, Jeffrey Pennington
- Abstract summary: We study the triplet (D, M, I) as an integrated system and identify important synergies that help mitigate the curse of dimensionality.
We find that learning is most efficient when these symmetries are compatible with those of the data distribution.
- Score: 33.59320315666675
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although learning in high dimensions is commonly believed to suffer from the
curse of dimensionality, modern machine learning methods often exhibit an
astonishing power to tackle a wide range of challenging real-world learning
problems without using abundant amounts of data. How exactly these methods
break this curse remains a fundamental open question in the theory of deep
learning. While previous efforts have investigated this question by studying
the data (D), model (M), and inference algorithm (I) as independent modules, in
this paper, we analyze the triplet (D, M, I) as an integrated system and
identify important synergies that help mitigate the curse of dimensionality. We
first study the basic symmetries associated with various learning algorithms
(M, I), focusing on four prototypical architectures in deep learning:
fully-connected networks (FCN), locally-connected networks (LCN), and
convolutional networks with and without pooling (GAP/VEC). We find that
learning is most efficient when these symmetries are compatible with those of
the data distribution and that performance significantly deteriorates when any
member of the (D, M, I) triplet is inconsistent or suboptimal.
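To make the four (M, I) families concrete, here is a minimal sketch in JAX/Flax. This is not the authors' code; the widths, 3x3 kernels, ten-class readout, and circular padding are illustrative assumptions.

```python
# Minimal sketches of the four prototypical architectures (assumed
# hyperparameters: widths 256/32, 3x3 kernels, 10 output classes).
import jax
import jax.numpy as jnp
import flax.linen as nn

class FCN(nn.Module):
    """Fully connected: ignores the spatial structure of the input."""
    @nn.compact
    def __call__(self, x):                        # x: (batch, H, W, C)
        x = nn.relu(nn.Dense(256)(x.reshape((x.shape[0], -1))))
        return nn.Dense(10)(x)

class LCN(nn.Module):
    """Locally connected: local receptive fields, unshared weights."""
    @nn.compact
    def __call__(self, x):
        x = nn.relu(nn.ConvLocal(32, kernel_size=(3, 3))(x))
        return nn.Dense(10)(x.reshape((x.shape[0], -1)))

class CNN(nn.Module):
    """Convolutional: local receptive fields with weight sharing.
    pool=True is the GAP readout, pool=False the VEC (flatten) readout."""
    pool: bool
    @nn.compact
    def __call__(self, x):
        # circular padding makes GAP exactly invariant to cyclic shifts
        x = nn.relu(nn.Conv(32, kernel_size=(3, 3), padding="CIRCULAR")(x))
        x = x.mean(axis=(1, 2)) if self.pool else x.reshape((x.shape[0], -1))
        return nn.Dense(10)(x)

x = jnp.ones((2, 8, 8, 3))
for name, model in [("FCN", FCN()), ("LCN", LCN()),
                    ("CNN+GAP", CNN(pool=True)), ("CNN+VEC", CNN(pool=False))]:
    params = model.init(jax.random.PRNGKey(0), x)
    print(name, model.apply(params, x).shape)     # each prints (2, 10)
```

Under a cyclic translation of the input, the CNN features translate with it and the GAP readout is unchanged, the VEC readout is merely equivariant, and the LCN and FCN have no built-in translation symmetry; the paper's claim is that each choice pays off only when the data distribution shares the corresponding symmetry.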
Related papers
- Simple Ingredients for Offline Reinforcement Learning [86.1988266277766]
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
We show that existing methods struggle with diverse data: their performance considerably deteriorates as data collected for related but different tasks is simply added to the offline buffer.
We show that scale, more than algorithmic considerations, is the key factor influencing performance.
arXiv Detail & Related papers (2024-03-19T18:57:53Z)
- Homological Convolutional Neural Networks [4.615338063719135]
We propose a novel deep learning architecture that exploits the structural organization of the data through topologically constrained network representations.
We test our model on 18 benchmark datasets against 5 classic machine learning and 3 deep learning models.
arXiv Detail & Related papers (2023-08-26T08:48:51Z)
- Matching DNN Compression and Cooperative Training with Resources and Data Availability [20.329698347331075]
How much and when an ML model should be compressed, and where its training should be executed, are hard decisions to make.
We model the network system focusing on the training of DNNs, formalize the multi-dimensional problem, and formulate an approximate dynamic programming problem.
We prove that the solutions of our strategy, PACT, can get as close to the optimum as desired, at the cost of increased time complexity.
arXiv Detail & Related papers (2022-12-02T09:52:18Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures, such as symmetries, are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data; a minimal sketch follows this entry.
Rather than fixing the transformations a priori, this work automatically adapts the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
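As a minimal sketch of the mechanism summarized above, assuming random cyclic shifts as the transformation family and a generic loss_fn placeholder (both my own choices, not the paper's invariance-constrained procedure):

```python
# Hypothetical sketch: inducing approximate translation symmetry by
# averaging a loss over randomly shifted copies of each input batch.
import jax
import jax.numpy as jnp

def random_shift(key, x, max_shift=2):
    # cyclic shift along the two spatial axes of x: (batch, H, W, C)
    dh, dw = jax.random.randint(key, (2,), -max_shift, max_shift + 1)
    x = jnp.roll(x, dh, axis=1)
    return jnp.roll(x, dw, axis=2)

def augmented_loss(params, key, x, y, loss_fn, n_copies=4):
    # averaging over transformed inputs biases the learned function
    # toward invariance under the transformation group
    losses = [loss_fn(params, random_shift(k, x), y)
              for k in jax.random.split(key, n_copies)]
    return jnp.mean(jnp.stack(losses))
```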
- Deep Efficient Continuous Manifold Learning for Time Series Modeling [11.876985348588477]
Symmetric positive definite (SPD) matrices are widely studied in computer vision, signal processing, and medical image analysis.
In this paper, we propose a framework that exploits a diffeomorphism between the Riemannian manifold of SPD matrices and a Cholesky space.
For dynamic modeling of time-series data, we devise a continuous manifold learning method by systematically integrating a manifold ordinary differential equation and a gated recurrent neural network.
arXiv Detail & Related papers (2021-12-03T01:38:38Z)
- Enhancing ensemble learning and transfer learning in multimodal data analysis by adaptive dimensionality reduction [10.646114896709717]
In multimodal data analysis, not all observations show the same level of reliability or information quality.
We propose an adaptive approach for dimensionality reduction to overcome this issue.
We test our approach on multimodal datasets acquired in diverse research fields.
arXiv Detail & Related papers (2021-05-08T11:53:12Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- ATOM3D: Tasks On Molecules in Three Dimensions [91.72138447636769]
Deep neural networks have recently gained significant attention for learning on molecular data.
In this work we present ATOM3D, a collection of both novel and existing datasets spanning several key classes of biomolecules.
We develop three-dimensional molecular learning networks for each of these tasks, finding that they consistently improve performance.
arXiv Detail & Related papers (2020-12-07T20:18:23Z)
- Eigendecomposition-Free Training of Deep Networks for Linear Least-Square Problems [107.3868459697569]
We introduce an eigendecomposition-free approach to training a deep network.
We show that our approach is much more robust than explicit differentiation of the eigendecomposition.
Our method has better convergence properties and yields state-of-the-art results.
arXiv Detail & Related papers (2020-04-15T04:29:34Z)
- Learning Similarity Metrics for Numerical Simulations [29.39625644221578]
We propose a neural network-based approach that computes a stable and generalizing metric (LSiM) to compare data from a variety of numerical simulation sources.
Our method employs a Siamese network architecture that is motivated by the mathematical properties of a metric; a minimal sketch follows this entry.
arXiv Detail & Related papers (2020-02-18T20:11:15Z)
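To illustrate the Siamese construction, here is a minimal sketch of my own, not the LSiM implementation: one shared embedding network processes both inputs, and the Euclidean distance between embeddings is the learned (pseudo)metric, so symmetry and d(x, x) = 0 hold by construction.

```python
# Hypothetical sketch of a Siamese metric: one shared embedding network,
# with a Euclidean distance computed between the two embeddings.
import jax
import jax.numpy as jnp
import flax.linen as nn

class Embed(nn.Module):
    @nn.compact
    def __call__(self, x):
        return nn.Dense(16)(nn.relu(nn.Dense(64)(x)))

class SiameseMetric(nn.Module):
    @nn.compact
    def __call__(self, a, b):
        embed = Embed()  # one instance -> weights shared across both inputs
        return jnp.linalg.norm(embed(a) - embed(b), axis=-1)

model = SiameseMetric()
a, b = jnp.ones((4, 32)), jnp.zeros((4, 32))
params = model.init(jax.random.PRNGKey(0), a, b)
d = model.apply(params, a, b)  # d(a, b) == d(b, a); d(a, a) == 0
```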
This list is automatically generated from the titles and abstracts of the papers on this site.