Multi-Resolution Online Deterministic Annealing: A Hierarchical and
Progressive Learning Architecture
- URL: http://arxiv.org/abs/2212.08189v3
- Date: Tue, 21 Mar 2023 04:13:48 GMT
- Title: Multi-Resolution Online Deterministic Annealing: A Hierarchical and
Progressive Learning Architecture
- Authors: Christos Mavridis and John Baras
- Abstract summary: We introduce a general-purpose hierarchical learning architecture that is based on the progressive partitioning of a possibly multi-resolution data space.
We show that the solution of each optimization problem can be estimated online using gradient-free approximation updates.
Asymptotic convergence analysis and experimental results are provided for supervised and unsupervised learning problems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hierarchical learning algorithms that gradually approximate a solution to a
data-driven optimization problem are essential to decision-making systems,
especially under limitations on time and computational resources. In this
study, we introduce a general-purpose hierarchical learning architecture that
is based on the progressive partitioning of a possibly multi-resolution data
space. The optimal partition is gradually approximated by solving a sequence of
optimization sub-problems that yield a sequence of partitions with increasing
number of subsets. We show that the solution of each optimization problem can
be estimated online using gradient-free stochastic approximation updates. As a
consequence, a function approximation problem can be defined within each subset
of the partition and solved using the theory of two-timescale stochastic
approximation algorithms. This simulates an annealing process and defines a
robust and interpretable heuristic method to gradually increase the complexity
of the learning architecture in a task-agnostic manner, giving emphasis to
regions of the data space that are considered more important according to a
predefined criterion. Finally, by imposing a tree structure in the progression
of the partitions, we provide a means to incorporate potential multi-resolution
structure of the data space into this approach, significantly reducing its
complexity, while introducing hierarchical variable-rate feature extraction
properties similar to certain classes of deep learning architectures.
Asymptotic convergence analysis and experimental results are provided for
supervised and unsupervised learning problems.
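To make the pipeline concrete, below is a minimal sketch, in plain NumPy, of the kind of procedure the abstract describes: codevectors are updated by gradient-free stochastic-approximation recursions at a fixed temperature, and lowering the temperature progressively grows the partition. The toy data, the temperature schedule, the perturb-and-merge splitting heuristic, and the step sizes are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data stream: three Gaussian blobs in 2-D (illustrative only).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
def sample():
    return centers[rng.integers(3)] + 0.3 * rng.standard_normal(2)

mu = np.array([[2.0, 1.0]])   # start from a single codevector
s = np.ones(1)                # running association-mass estimates

for T in [8.0, 4.0, 2.0, 1.0, 0.5]:          # decreasing temperature schedule
    # Perturb-and-duplicate: candidate codevectors that can split as T drops.
    mu = np.vstack([mu, mu + 1e-2 * rng.standard_normal(mu.shape)])
    s = np.concatenate([s, s]) / 2.0
    sigma = mu * s[:, None]                   # weighted-mean numerators
    for n in range(1, 2001):
        x = sample()
        # Gibbs association probabilities at temperature T.
        logits = -np.sum((mu - x) ** 2, axis=1) / T
        p = s * np.exp(logits - logits.max())
        p /= p.sum()
        a = 1.0 / (n + 100)                   # stochastic-approximation step
        s = s + a * (p - s)                   # gradient-free updates
        sigma = sigma + a * (p[:, None] * x - sigma)
        mu = sigma / s[:, None]
    # Merge duplicates that never separated at this temperature.
    keep = [0]
    for i in range(1, len(mu)):
        if all(np.linalg.norm(mu[i] - mu[j]) > 0.1 for j in keep):
            keep.append(i)
    mu, s = mu[keep], s[keep]
    s = s / s.sum()
print("codevectors:\n", np.round(mu, 2))
```

Each temperature level plays the role of one optimization sub-problem in the sequence; the duplicated codevectors only separate once the temperature is low enough, which is how the number of subsets in the partition gradually increases.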
Related papers
- Quantized Hierarchical Federated Learning: A Robust Approach to Statistical Heterogeneity [3.8798345704175534]
We present a novel hierarchical federated learning algorithm that incorporates quantization for communication-efficiency.
We offer a comprehensive analytical framework to evaluate its optimality gap and convergence rate.
Our findings reveal that our algorithm consistently achieves high learning accuracy over a range of parameters.
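As a rough, hedged illustration of the ingredients named above (hierarchy plus quantized communication), here is a toy sketch with an unbiased stochastic uniform quantizer; the quantizer, the two-level client/edge/cloud topology, and the quadratic local objectives are assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(v, levels=16):
    # Unbiased stochastic uniform quantizer: a communication-efficient
    # stand-in; the exact scheme here is an assumption, not the paper's.
    lo, hi = v.min(), v.max()
    scale = (hi - lo) / (levels - 1) or 1.0
    return lo + np.floor((v - lo) / scale + rng.random(v.shape)) * scale

# Assumed two-level hierarchy: clients -> edge servers -> cloud.
dim, clients_per_edge, edges = 10, 4, 3
w = np.zeros(dim)                                # global model
optima = [[rng.standard_normal(dim) for _ in range(clients_per_edge)]
          for _ in range(edges)]                 # each client's local optimum

for t in range(100):
    edge_updates = []
    for e in range(edges):
        # Each client takes a local step toward its optimum and sends
        # a quantized update to its edge server, which averages them.
        updates = [quantize(0.5 * (opt - w)) for opt in optima[e]]
        edge_updates.append(np.mean(updates, axis=0))
    w = w + np.mean(edge_updates, axis=0)        # cloud-level aggregation
mean_opt = np.mean([o for grp in optima for o in grp], axis=0)
print("distance to mean optimum:", np.linalg.norm(w - mean_opt))
```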
arXiv Detail & Related papers (2024-03-03T15:40:24Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable components separately, linearizing only the smooth parts.
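A minimal sketch of that linearization idea, under assumed choices: a generalized conditional-gradient step over a compact box, where the smooth least-squares part is linearized while an l1 term is kept exact in the per-step subproblem.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
lam = 0.1

def grad_f(x):
    # Gradient of the smooth part f(x) = 0.5 * ||Ax - b||^2.
    return A.T @ (A @ x - b)

def subproblem(g):
    # Per-step subproblem over the compact box [-1, 1]^10:
    #   s_i = argmin_{s in [-1, 1]} g_i * s + lam * |s|
    # The smooth part enters only through its linearization g.
    s = np.zeros_like(g)
    s[g > lam] = -1.0
    s[g < -lam] = 1.0
    return s

x = np.zeros(10)
for t in range(500):
    s = subproblem(grad_f(x))
    gamma = 2.0 / (t + 2)                  # standard conditional-gradient step
    x = (1 - gamma) * x + gamma * s
print("objective:", 0.5 * np.sum((A @ x - b) ** 2) + lam * np.abs(x).sum())
```

The subproblem separates per coordinate here, which is why the oracle reduces to a three-case threshold; this is a generic instance of the idea, not the paper's specific algorithm.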
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms.
With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem.
We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
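One concrete member of this general family, sketched under assumptions, is a proximal AdaGrad-style composite mirror-descent step: per-coordinate adaptive step sizes on the smooth part, an exact soft-threshold step on the nonsmooth l1 part. The problem instance and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((50, 20))
x_true = rng.random(20) * (rng.random(20) > 0.7)       # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(50)
lam = 0.05

x = np.zeros(20)
G = np.zeros(20)                       # accumulated squared gradients
for t in range(300):
    i = rng.integers(50)               # one stochastic sample per step
    g = (A[i] @ x - b[i]) * A[i]       # stochastic gradient of the smooth part
    G = G + g * g
    eta = 1.0 / (1e-8 + np.sqrt(G))    # per-coordinate adaptive step sizes
    # Composite step: gradient step on the smooth part, exact proximal
    # (soft-threshold) step on the nonsmooth lam * ||x||_1 part.
    z = x - eta * g
    x = np.sign(z) * np.maximum(np.abs(z) - eta * lam, 0.0)
print("nonzeros:", int(np.sum(x != 0)),
      "objective:", 0.5 * np.mean((A @ x - b) ** 2) + lam * np.abs(x).sum())
```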
arXiv Detail & Related papers (2022-11-21T18:31:43Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
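For orientation, the snippet below sketches plain Q-learning with linear function approximation on a toy chain MDP. It shows only the linear-architecture TD update the summary refers to, not the paper's stabilized exploration variant or its error analysis; the environment and the one-hot features are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed toy chain MDP: states 0..4, actions 0 (left) / 1 (right),
# reward 1 for reaching the rightmost state.
n_states, n_actions = 5, 2
def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == n_states - 1)

def phi(s, a):
    # One-hot features, so linear Q-learning coincides with the tabular
    # case and the sketch is easy to verify by hand.
    f = np.zeros(n_states * n_actions)
    f[s * n_actions + a] = 1.0
    return f

w = np.zeros(n_states * n_actions)     # Q(s, a) = phi(s, a) . w
gamma, eps = 0.9, 0.2
s = 0
for t in range(5000):
    greedy = int(np.argmax([phi(s, b2) @ w for b2 in range(n_actions)]))
    a = rng.integers(n_actions) if rng.random() < eps else greedy
    s2, r = step(s, a)
    target = r + gamma * max(phi(s2, b2) @ w for b2 in range(n_actions))
    w = w + 0.1 * (target - phi(s, a) @ w) * phi(s, a)   # TD update
    s = 0 if r > 0 else s2                               # restart at the goal
print("greedy policy:", [int(np.argmax([phi(q, b2) @ w
                                        for b2 in range(n_actions)]))
                         for q in range(n_states)])
```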
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as a structure prior and to reveal the underlying signal interdependencies.
Algorithms based on deep unrolling and deep equilibrium models are developed, yielding highly interpretable and concise deep-learning-based architectures.
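A hedged sketch of the deep-unrolling ingredient alone: K ISTA iterations written as K fixed layers of a feed-forward computation over a dictionary. The paper's deep-equilibrium structure prior and the block-sparse, inter-dependent-signal coupling are omitted, and the dictionary and parameters below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, K = 15, 30, 10             # measurements, atoms, unrolled layers
D = rng.standard_normal((m, n))
D = D / np.linalg.norm(D, axis=0)          # unit-norm dictionary atoms
x_true = np.zeros(n)
x_true[rng.choice(n, 3, replace=False)] = 1.0
y = D @ x_true                             # observed signal

L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of the data term
lam = 0.05

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# "Deep unrolling": K ISTA iterations written as K layers of a fixed
# feed-forward computation (here the weights are fixed, not learned).
x = np.zeros(n)
for k in range(K):
    x = soft(x - (D.T @ (D @ x - y)) / L, lam / L)
print("recovered support:", np.flatnonzero(x),
      "true support:", np.flatnonzero(x_true))
```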
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
- A Differentiable Approach to Combinatorial Optimization using Dataless Neural Networks [20.170140039052455]
We propose a radically different approach in that no data is required for training the neural networks that produce the solution.
In particular, we reduce the optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest.
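A toy version of the dataless scheme, under illustrative assumptions: a maximum-independent-set objective is relaxed into a differentiable penalty over sigmoid node activations, and the parameters are refined by gradient descent with no training data at all. The relaxation and the penalty weight are not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(6)

# Assumed toy graph (a square with one diagonal); we look for a large
# independent set without any training data.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4

def sig(t):
    return 1.0 / (1.0 + np.exp(-t))

theta = 0.1 * rng.standard_normal(n)   # the "network parameters"
beta = 2.0                             # penalty weight on chosen edges
for step in range(2000):
    p = sig(theta)
    # Dataless objective: f = -sum(p) + beta * sum over edges of p_i * p_j.
    grad = -p * (1 - p)
    for i, j in edges:
        grad[i] += beta * p[j] * p[i] * (1 - p[i])
        grad[j] += beta * p[i] * p[j] * (1 - p[j])
    theta = theta - 0.1 * grad
print("independent-set candidate:", np.flatnonzero(sig(theta) > 0.5))
```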
arXiv Detail & Related papers (2022-03-15T19:21:31Z)
- Towards the One Learning Algorithm Hypothesis: A System-theoretic Approach [0.0]
The existence of a universal learning architecture in human cognition is a widespread conjecture supported by experimental findings from neuroscience.
We develop a closed-loop system with three main components: (i) a multi-resolution analysis pre-processor, (ii) a group-invariant feature extractor, and (iii) a progressive knowledge-based learning module.
We introduce a novel learning algorithm that constructs progressively growing knowledge representations in multiple resolutions.
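Component (i) can be pictured with a plain Haar wavelet decomposition; the sketch below shows that pre-processing piece alone for an assumed 1-D signal and omits the group-invariant feature extractor and the progressive knowledge-based learner.

```python
import numpy as np

def haar_level(x):
    # One Haar analysis step: pairwise averages (coarse approximation)
    # and pairwise differences (detail coefficients).
    x = x[: len(x) // 2 * 2]
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def multiresolution(x, levels=3):
    # Represent the signal at several resolutions: one coarse
    # approximation plus a detail band per level.
    details = []
    for _ in range(levels):
        x, d = haar_level(x)
        details.append(d)
    return x, details

rng = np.random.default_rng(7)
signal = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * rng.standard_normal(64)
coarse, details = multiresolution(signal)
print("coarse length:", len(coarse),
      "detail lengths:", [len(d) for d in details])
```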
arXiv Detail & Related papers (2021-12-04T05:54:33Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
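A minimal warm-start sketch on an assumed quadratic bilevel toy problem: the inner variable is carried over between outer iterations, so a handful of inner gradient steps keep the implicit-differentiation hypergradient accurate. The closed-form formula used below is exact only for this quadratic lower level.

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)

def inner_grad(x, y):
    # Lower level: g(x, y) = 0.5 * ||y - Ax||^2, strongly convex in y,
    # so y*(x) = Ax in closed form (this is what makes the toy exact).
    return y - A @ x

def outer_grad(x, y):
    # Upper level: f(x, y) = 0.5 * ||y - b||^2 + 0.5 * ||x||^2.
    # Implicit differentiation with dy*/dx = A gives
    # dF/dx = x + A^T (y*(x) - b); y approximates y*(x).
    return x + A.T @ (y - b)

x = np.zeros(3)
y = np.zeros(5)            # warm-started inner variable, reused across steps
for t in range(400):
    for _ in range(5):     # a few inner steps suffice thanks to the warm start
        y = y - 0.5 * inner_grad(x, y)
    x = x - 0.05 * outer_grad(x, y)
print("outer objective:", 0.5 * np.sum((y - b) ** 2) + 0.5 * np.sum(x ** 2))
```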
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- A Forward Backward Greedy approach for Sparse Multiscale Learning [0.0]
We propose a feature-driven Reproducing Kernel Hilbert space (RKHS) for which the associated kernel has a weighted multiscale structure.
For generating approximations in this space, we provide a practical forward-backward algorithm that is shown to greedily construct a set of basis functions having a multiscale structure.
We analyze the performance of the approach on a variety of simulation and real data sets.
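The flavor of the procedure can be sketched with an OMP-style forward step and a simple backward pruning pass over a multiscale Gaussian dictionary; the paper's RKHS construction and its exact greedy criteria are not reproduced, and the dictionary, tolerances, and data below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)
t = np.linspace(0, 1, 200)
# Signal with structure at two scales: a slow wave plus a narrow bump.
y = np.sin(2 * np.pi * t) + 0.5 * np.exp(-((t - 0.7) / 0.02) ** 2)

# Multiscale dictionary: normalized Gaussian bumps at several widths.
atoms, scales = [], []
for scale in [0.5, 0.1, 0.02]:
    for c in np.linspace(0, 1, 20):
        a = np.exp(-((t - c) / scale) ** 2)
        atoms.append(a / np.linalg.norm(a))
        scales.append(scale)
A = np.array(atoms).T                      # 200 x 60 dictionary

S = []                                     # indices of selected atoms
for _ in range(10):
    # Forward: add the atom most correlated with the current residual.
    if S:
        coef = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
        r = y - A[:, S] @ coef
    else:
        r = y
    S.append(int(np.argmax(np.abs(A.T @ r))))
    # Backward: drop any atom whose removal barely increases the error.
    coef = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
    err = np.linalg.norm(y - A[:, S] @ coef)
    for j in list(S):
        S2 = [i for i in S if i != j]
        if S2:
            c2 = np.linalg.lstsq(A[:, S2], y, rcond=None)[0]
            if np.linalg.norm(y - A[:, S2] @ c2) < 1.01 * err:
                S = S2
print("scales used by the selected basis:", sorted({scales[i] for i in S}))
```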
arXiv Detail & Related papers (2021-02-14T04:22:52Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
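The three-factor trade-off can be seen on an assumed toy quadratic with a random-walk minimizer: per-agent gradient noise, random agent participation, and a fixed learning rate that balances tracking speed against noise amplification. Everything below is an illustrative assumption, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(10)
dim, n_agents, K = 5, 20, 5     # K agents participate per iteration

w = np.zeros(dim)               # aggregate model
w_star = np.zeros(dim)          # non-stationary true minimizer (random walk)
mu = 0.2                        # learning rate

for t in range(500):
    w_star = w_star + 0.01 * rng.standard_normal(dim)   # minimizer drifts
    subset = rng.choice(n_agents, K, replace=False)     # random participants
    updates = []
    for _ in subset:
        # Each agent sees a noisy gradient of 0.5 * ||w - w_star||^2
        # (the noise plays the role of per-agent data variability).
        g = (w - w_star) + 0.5 * rng.standard_normal(dim)
        updates.append(w - mu * g)                      # local update
    w = np.mean(updates, axis=0)                        # aggregation
print("tracking error:", np.linalg.norm(w - w_star))
```

Raising mu shrinks the tracking term (which is inversely proportional to the learning rate, as noted above) but amplifies the terms driven by data and model variability.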
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.