Data-induced multiscale losses and efficient multirate gradient descent schemes
- URL: http://arxiv.org/abs/2402.03021v2
- Date: Tue, 6 Feb 2024 12:50:12 GMT
- Title: Data-induced multiscale losses and efficient multirate gradient descent schemes
- Authors: Juncai He, Liangchen Liu, and Yen-Hsi Richard Tsai
- Abstract summary: This paper reveals multiscale structures in the loss landscape, including its gradients and Hessians inherited from the data.
It introduces a novel gradient descent approach, drawing inspiration from multiscale algorithms used in scientific computing.
- Score: 6.299435779277399
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the impact of multiscale data on machine learning
algorithms, particularly in the context of deep learning. A dataset is
multiscale if its distribution shows large variations in scale across different
directions. This paper reveals multiscale structures in the loss landscape,
including its gradients and Hessians inherited from the data. Correspondingly,
it introduces a novel gradient descent approach, drawing inspiration from
multiscale algorithms used in scientific computing. This approach seeks to
transcend empirical learning rate selection, offering a more systematic,
data-informed strategy to enhance training efficiency, especially in the later
stages.
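To make the multirate idea concrete, below is a minimal sketch of a gradient step that uses two learning rates, one for a stiff ("fast") subspace and one for the remaining directions, on a toy quadratic loss whose Hessian has widely separated eigenvalues. The subspace V_fast, the learning rates, and the splitting are illustrative assumptions, not the authors' exact scheme.

```python
import numpy as np

# Hedged sketch: split the gradient into a stiff subspace (columns of
# V_fast) and its complement, and step with different learning rates.
def multirate_step(theta, grad, V_fast, lr_fast=1e-3, lr_slow=1e-1):
    g_fast = V_fast @ (V_fast.T @ grad)  # component in the stiff subspace
    g_slow = grad - g_fast               # remaining (slow) component
    return theta - lr_fast * g_fast - lr_slow * g_slow

# Toy quadratic loss with curvature 100 along x and 1 along y.
H = np.diag([100.0, 1.0])
theta = np.array([1.0, 1.0])
V_fast = np.array([[1.0], [0.0]])        # the stiff direction
for _ in range(200):
    theta = multirate_step(theta, H @ theta, V_fast)
print(theta)  # approaches the minimizer at the origin
```

A single learning rate would be forced below 2/100 by the stiff direction and would then crawl along the slow one; the two-rate step sidesteps that trade-off.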
Related papers
- MixUp-MIL: A Study on Linear & Multilinear Interpolation-Based Data Augmentation for Whole Slide Image Classification [1.5810132476010594]
We investigate an interpolation-based data augmentation technique for classifying digital whole slide images.
The results show extraordinarily high variability in the effect of the method.
We identify several aspects that shed light on this variability and point to promising directions for further research.
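As a rough illustration of the linear-interpolation augmentation studied here, the sketch below mixes two slide-level feature vectors; the Beta-distributed coefficient and the feature-space setting are assumptions, not the paper's exact pipeline.

```python
import numpy as np

# Hedged mixup-style interpolation of two feature vectors.
def interpolate_features(feat_a, feat_b, alpha=0.2, rng=np.random):
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    return lam * feat_a + (1.0 - lam) * feat_b, lam
```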
arXiv Detail & Related papers (2023-11-06T12:00:53Z)
- Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework [0.0]
This paper surveys the steps involved in working with multidimensional data sources, the various multiway analysis methods employed, and the benefits of these approaches.
A small example of Blind Source Separation (BSS) is presented comparing 2-dimensional algorithms and a multiway algorithm in Python.
Results indicate that multiway analysis is more expressive.
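For context, the basic primitive behind the multiway methods the survey covers is the mode-n unfolding of a tensor; the sketch below is a generic illustration, not code from the paper.

```python
import numpy as np

# Mode-n unfolding: move the requested mode to the front, flatten the rest.
def unfold(tensor, mode):
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

X = np.random.rand(4, 5, 6)   # e.g. channels x time x trials
print(unfold(X, 1).shape)      # (5, 24)
```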
arXiv Detail & Related papers (2023-09-05T17:56:22Z)
- Domain Generalization for Mammographic Image Analysis with Contrastive Learning [62.25104935889111]
Training an effective deep learning model requires a large amount of data with diverse styles and qualities.
A novel contrastive learning scheme is developed to equip deep learning models with better style-generalization capability.
The proposed method has been evaluated extensively and rigorously with mammograms from various vendor style domains and several public datasets.
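A generic InfoNCE-style loss between two style-augmented views of the same image conveys the contrastive idea; the temperature and pairing scheme are assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

# Hedged sketch: embeddings z1[i] and z2[i] are two views of image i.
def info_nce(z1, z2, temperature=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature      # pairwise cosine similarities
    targets = torch.arange(z1.size(0))    # positives sit on the diagonal
    return F.cross_entropy(logits, targets)
```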
arXiv Detail & Related papers (2023-04-20T11:40:21Z)
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that weights trained on synthetic data are robust against accumulated-error perturbations when the optimization is regularized towards a flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
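As a rough sketch of the trajectory-matching distance such distillation methods minimize, the snippet below compares student parameters (trained on synthetic data) against a later expert checkpoint (trained on real data); the normalization and checkpoint choice are assumptions, and FTD's flat-trajectory regularization of the expert is not shown.

```python
import torch

# Hedged sketch of a normalized trajectory-matching loss.
def trajectory_loss(student_params, expert_start, expert_target, eps=1e-8):
    num = sum((ps - pe).pow(2).sum()
              for ps, pe in zip(student_params, expert_target))
    den = sum((p0 - pe).pow(2).sum()
              for p0, pe in zip(expert_start, expert_target))
    return num / (den + eps)
```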
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
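The core estimator is easy to state: probe the loss with a random tangent and scale the tangent by the resulting directional derivative. The sketch below shows the basic weight-space version; perturbing activations instead, as the paper proposes, reduces the variance of this estimate.

```python
import torch
from torch.func import jvp

# Hedged sketch of a forward-gradient estimate (weight-space version).
def forward_gradient(loss_fn, w):
    v = torch.randn_like(w)                # random probe direction
    _, dderiv = jvp(loss_fn, (w,), (v,))   # directional derivative (forward mode)
    return dderiv * v                      # unbiased estimate of the true gradient
```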
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Convolutional Learning on Multigraphs [153.20329791008095]
We develop convolutional information processing on multigraphs and introduce convolutional multigraph neural networks (MGNNs).
To capture the complex dynamics of information diffusion within and across each of the multigraph's classes of edges, we formalize a convolutional signal processing model.
We develop a multigraph learning architecture, including a sampling procedure to reduce computational complexity.
The introduced architecture is applied towards optimal wireless resource allocation and a hate speech localization task, offering improved performance over traditional graph neural networks.
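Loosely, a multigraph convolution filters a signal with a polynomial of each edge class's adjacency matrix and combines the results; the sketch below is an illustrative simplification, with the coefficients and nonlinearity assumed.

```python
import numpy as np

# Hedged sketch: one multigraph convolution over several edge classes.
def multigraph_conv(adjacencies, x, coeffs):
    # adjacencies: list of (n, n) matrices, one per edge class
    # coeffs[c][k]: weight on the k-th power of adjacency c
    out = np.zeros_like(x)
    for A, c in zip(adjacencies, coeffs):
        xk = x
        for w in c:
            out = out + w * xk
            xk = A @ xk
    return np.tanh(out)
```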
arXiv Detail & Related papers (2022-09-23T00:33:04Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that make it possible to leverage weakly annotated datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
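One step of a guided anisotropic diffusion looks roughly as follows: diffusion of the segmentation map is damped wherever the guide image has strong edges, so smoothing does not cross object boundaries. This is a generic Perona-Malik-style variant; the GAD paper's exact edge-stopping function is assumed.

```python
import numpy as np

# Hedged sketch of one guided anisotropic diffusion step on a 2-D map u.
def gad_step(u, guide, dt=0.1, kappa=0.1):
    gy, gx = np.gradient(guide)
    conductance = np.exp(-(gx**2 + gy**2) / kappa**2)  # small across guide edges
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
    return u + dt * conductance * lap
```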
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- Multiscale Laplacian Learning [3.24029503704305]
This paper presents two innovative multiscale Laplacian learning approaches for machine learning tasks.
The first approach, called multikernel manifold learning (MML), integrates manifold learning with multikernel information.
The second approach, called the multiscale MBO (MMBO) method, introduces multiscale Laplacians into a modification of the classical Merriman-Bence-Osher scheme.
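The shared ingredient of both approaches is a family of graph Laplacians built at several kernel bandwidths; the sketch below constructs such a family, while the way MML and MMBO combine the scales is method-specific and not shown.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hedged sketch: unnormalized graph Laplacians at several kernel scales.
def multiscale_laplacians(X, sigmas=(0.5, 1.0, 2.0)):
    D2 = cdist(X, X, "sqeuclidean")
    laplacians = []
    for s in sigmas:
        W = np.exp(-D2 / (2 * s**2))                   # Gaussian affinities
        laplacians.append(np.diag(W.sum(axis=1)) - W)  # L = D - W
    return laplacians
```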
arXiv Detail & Related papers (2021-09-08T15:25:32Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
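In spirit, this resembles unfolding ISTA with a learned stopping rule; the sketch below uses a fixed tolerance as a stand-in for the learned halting decision, and the step size and threshold are assumptions.

```python
import torch
import torch.nn.functional as F

# Hedged sketch: ISTA-style iterations with an early-halting criterion.
def adaptive_ista(A, y, lam=0.1, step=0.1, max_layers=50, tol=1e-4):
    x = torch.zeros(A.shape[1])
    for _ in range(max_layers):
        r = A.T @ (A @ x - y)                # gradient of the data term
        x_new = F.softshrink(x - step * r, lam * step)
        if (x_new - x).norm() < tol:         # halt once the update is small
            return x_new
        x = x_new
    return x
```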
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
- Siloed Federated Learning for Multi-Centric Histopathology Datasets [0.17842332554022694]
This paper proposes a novel federated learning approach for deep learning architectures in the medical domain.
Local-statistic batch normalization (BN) layers are introduced, resulting in collaboratively-trained, yet center-specific models.
We benchmark the proposed method on the classification of tumorous histopathology image patches extracted from the Camelyon16 and Camelyon17 datasets.
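The local-statistic BN idea amounts to averaging all parameters across centers except the normalization entries, which stay local; the key-name matching below is an assumption for illustration, not the paper's implementation.

```python
import torch

# Hedged sketch of federated averaging that skips batch-norm state.
def federated_average(state_dicts):
    avg = {}
    for key in state_dicts[0]:
        if "bn" in key or "running" in key or "num_batches" in key:
            continue  # keep normalization state center-specific
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(0)
    return avg  # each center loads avg and retains its own BN entries
```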
arXiv Detail & Related papers (2020-08-17T15:49:30Z)
- Dataset Condensation with Gradient Matching [36.14340188365505]
We propose a training-set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch.
We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods.
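The core objective can be sketched as matching, layer by layer, the network gradient on the synthetic set to the gradient on a real batch; the cosine-style distance below is one common choice and the details are assumptions.

```python
import torch

# Hedged sketch of a layer-wise gradient-matching loss.
def gradient_matching_loss(real_grads, syn_grads, eps=1e-8):
    loss = 0.0
    for gr, gs in zip(real_grads, syn_grads):
        gr, gs = gr.flatten(), gs.flatten()
        loss = loss + 1.0 - torch.dot(gr, gs) / (gr.norm() * gs.norm() + eps)
    return loss
```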
arXiv Detail & Related papers (2020-06-10T16:30:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.