A Unified Framework for Lifted Training and Inversion Approaches
- URL: http://arxiv.org/abs/2510.09796v1
- Date: Fri, 10 Oct 2025 19:00:34 GMT
- Title: A Unified Framework for Lifted Training and Inversion Approaches
- Authors: Xiaoyu Wang, Alexandra Valavanis, Azhir Mahmood, Andreas Mang, Martin Benning, Audrey Repetti
- Abstract summary: This chapter introduces a unified framework that encapsulates various lifted training strategies. We discuss the implementation of these methods using block-coordinate descent strategies. Numerical results on standard imaging tasks validate the effectiveness and stability of the lifted Bregman approach.
- Score: 42.951318906669506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The training of deep neural networks predominantly relies on a combination of gradient-based optimisation and back-propagation for the computation of the gradient. While incredibly successful, this approach faces challenges such as vanishing or exploding gradients, difficulties with non-smooth activations, and an inherently sequential structure that limits parallelisation. Lifted training methods offer an alternative by reformulating the nested optimisation problem into a higher-dimensional, constrained optimisation problem where the constraints are no longer enforced directly but penalised with penalty terms. This chapter introduces a unified framework that encapsulates various lifted training strategies, including the Method of Auxiliary Coordinates, Fenchel Lifted Networks, and Lifted Bregman Training, and demonstrates how diverse architectures, such as Multi-Layer Perceptrons, Residual Neural Networks, and Proximal Neural Networks fit within this structure. By leveraging tools from convex optimisation, particularly Bregman distances, the framework facilitates distributed optimisation, accommodates non-differentiable proximal activations, and can improve the conditioning of the training landscape. We discuss the implementation of these methods using block-coordinate descent strategies, including deterministic implementations enhanced by accelerated and adaptive optimisation techniques, as well as implicit stochastic gradient methods. Furthermore, we explore the application of this framework to inverse problems, detailing methodologies for both the training of specialised networks (e.g., unrolled architectures) and the stable inversion of pre-trained networks. Numerical results on standard imaging tasks validate the effectiveness and stability of the lifted Bregman approach compared to conventional training, particularly for architectures employing proximal activations.
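To make the lifted reformulation concrete, the sketch below illustrates the general idea on a two-layer ReLU network: the nested training objective is replaced by a penalised one, min over (W1, W2, Z) of ||Z W2 - y||^2 + lam ||Z - relu(X W1)||^2, and each block is updated in turn. This is a minimal sketch using plain quadratic penalties in the spirit of the Method of Auxiliary Coordinates, not the lifted Bregman implementation from the chapter; the data, penalty weight lam, step size, and iteration count are illustrative assumptions.

```python
# Minimal sketch (not the chapter's lifted Bregman implementation) of lifted
# training for a two-layer ReLU network with quadratic penalties, in the spirit
# of the Method of Auxiliary Coordinates. The auxiliary variable Z replaces the
# hidden activations, and training alternates block-coordinate updates over Z,
# the output weights W2, and the input weights W1. Data, penalty weight, step
# size, and iteration counts below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (assumed): 200 samples, 5 features, scalar targets.
X = rng.normal(size=(200, 5))
y = np.sin(X @ rng.normal(size=(5, 1)))
n, d_in, d_hidden = X.shape[0], X.shape[1], 16

W1 = rng.normal(scale=0.1, size=(d_in, d_hidden))
W2 = rng.normal(scale=0.1, size=(d_hidden, 1))

def relu(a):
    return np.maximum(a, 0.0)

Z = relu(X @ W1)                  # auxiliary variable for the hidden layer
lam, lr_w1, n_iters = 1.0, 0.1, 500

for _ in range(n_iters):
    # Block 1: exact update of Z, minimising ||Z W2 - y||^2 + lam ||Z - relu(X W1)||^2.
    A = relu(X @ W1)
    Z = (y @ W2.T + lam * A) @ np.linalg.inv(W2 @ W2.T + lam * np.eye(d_hidden))

    # Block 2: exact update of the output weights (a linear least-squares problem).
    W2 = np.linalg.lstsq(Z, y, rcond=None)[0]

    # Block 3: one (averaged) gradient step on W1 for the penalty lam * ||relu(X W1) - Z||^2.
    pre = X @ W1
    grad_W1 = (2.0 * lam / n) * X.T @ ((relu(pre) - Z) * (pre > 0))
    W1 -= lr_w1 * grad_W1

# The trained network is evaluated in the usual nested (feed-forward) form.
pred = relu(X @ W1) @ W2
print("final fit MSE:", float(np.mean((pred - y) ** 2)))
```

Note that two of the three block updates admit closed-form solutions here, which is one of the practical attractions of lifted formulations; only the W1 block remains non-convex because of the ReLU, and the lifted Bregman approach discussed in the chapter is designed to handle such proximal activations more gracefully.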
Related papers
- Multilevel Training for Kolmogorov Arnold Networks [1.3299507495084417]
Kolmogorov-Arnold networks (KANs) provide more structure by expanding learned activations in a specified basis. This paper exploits this structure to develop practical algorithms and theoretical insights, yielding training speedup via multilevel training for KANs.
arXiv Detail & Related papers (2026-03-05T05:20:03Z) - Neural Network Training via Stochastic Alternating Minimization with Trainable Step Sizes [3.246129789918632]
The training of deep neural networks is inherently a non-convex optimization problem. Standard approaches such as stochastic gradient descent (SGD) require simultaneous updates to all parameters. We propose a novel method, Stochastic Alternating Minimization with Trainable step sizes (SAMT). SAMT achieves better performance with fewer parameter updates compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-08-06T08:23:38Z) - Hierarchical Feature-level Reverse Propagation for Post-Training Neural Networks [24.442592456755698]
End-to-end autonomous driving has emerged as a dominant paradigm, yet its highly entangled black-box models pose challenges in terms of interpretability and safety assurance. This paper proposes a hierarchical and decoupled post-training framework tailored for pretrained neural networks.
arXiv Detail & Related papers (2025-06-08T15:19:03Z) - Contextually Entangled Gradient Mapping for Optimized LLM Comprehension [0.0]
Contextually Entangled Gradient Mapping (CEGM) introduces a new approach to gradient optimization. It treats gradients as dynamic carriers of contextual dependencies rather than isolated numerical entities. The proposed methodology bridges critical gaps in existing optimization strategies.
arXiv Detail & Related papers (2025-01-28T11:50:35Z) - Component-based Sketching for Deep ReLU Nets [55.404661149594375]
We develop a sketching scheme based on deep net components for various tasks.
We transform deep net training into a linear empirical risk minimization problem.
We show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions.
arXiv Detail & Related papers (2024-09-21T15:30:43Z) - Robust Stochastically-Descending Unrolled Networks [85.6993263983062]
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network. We show that convergence guarantees and generalizability of the unrolled networks are still open theoretical problems. We numerically assess unrolled architectures trained under the proposed constraints in two different applications.
arXiv Detail & Related papers (2023-12-25T18:51:23Z) - Multiplicative update rules for accelerating deep learning training and increasing robustness [69.90473612073767]
We propose an optimization framework that fits a wide range of machine learning algorithms and enables one to apply alternative update rules.
We claim that the proposed framework accelerates training while leading to more robust models, in contrast to the traditionally used additive update rule.
arXiv Detail & Related papers (2023-07-14T06:44:43Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Better Training using Weight-Constrained Stochastic Dynamics [0.0]
We employ constraints to control the parameter space of deep neural networks throughout training.
The use of customized, appropriately designed constraints can reduce the vanishing/exploding gradient problem.
We provide a general approach to efficiently incorporate constraints into a stochastic gradient Langevin framework.
arXiv Detail & Related papers (2021-06-20T14:41:06Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)