Learning to Evolve with Convergence Guarantee via Neural Unrolling
- URL: http://arxiv.org/abs/2512.11453v1
- Date: Fri, 12 Dec 2025 10:46:25 GMT
- Title: Learning to Evolve with Convergence Guarantee via Neural Unrolling
- Authors: Jiaxin Gao, Yaohua Liu, Ran Cheng, Kay Chen Tan
- Abstract summary: We introduce Learning to Evolve (L2E), a unified bilevel meta-optimization framework. L2E reformulates evolutionary search as a Neural Unrolling process grounded in Krasnosel'skii-Mann (KM) fixed-point theory. Experiments demonstrate the scalability of L2E in high-dimensional spaces and its robust zero-shot generalization across synthetic and real-world control tasks.
- Score: 37.99564850768798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The transition from hand-crafted heuristics to data-driven evolutionary algorithms faces a fundamental dilemma: achieving neural plasticity without sacrificing mathematical stability. Emerging learned optimizers demonstrate high adaptability but often lack rigorous convergence guarantees, which results in unpredictable behavior on unseen landscapes. To address this challenge, we introduce Learning to Evolve (L2E), a unified bilevel meta-optimization framework. This method reformulates evolutionary search as a Neural Unrolling process grounded in Krasnosel'skii-Mann (KM) fixed-point theory. First, L2E models a coupled dynamic system in which the inner loop enforces a strict contractive trajectory via a structured Mamba-based neural operator. Second, the outer loop optimizes meta-parameters to align the fixed point of the operator with the target objective minimizers. Third, we design a gradient-derived composite solver that adaptively fuses learned evolutionary proposals with proxy gradient steps, thereby harmonizing global exploration with local refinement. Crucially, this formulation provides the learned optimizer with provable convergence guarantees. Extensive experiments demonstrate the scalability of L2E in high-dimensional spaces and its robust zero-shot generalization across synthetic and real-world control tasks. These results confirm that the framework learns a generic optimization manifold that extends beyond specific training distributions.
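To make the KM-based unrolling concrete, below is a minimal, illustrative sketch of a Krasnosel'skii-Mann iteration whose operator blends a learned-style proposal with a proxy gradient step, loosely mirroring the composite solver described in the abstract. All names (`km_unroll`, `propose`) and constants are hypothetical placeholders; the actual L2E operator is a trained Mamba-based network, and this toy map only illustrates the averaging structure on which the fixed-point guarantee rests.

```python
import numpy as np

# Illustrative sketch only: a generic Krasnosel'skii-Mann (KM) unrolling with a
# composite update that blends a "learned" proposal with a proxy gradient step.
# This is NOT the paper's implementation; the proposal map here is a toy
# contraction standing in for the trained neural operator.

def km_unroll(x0, grad_f, propose, steps=50, alpha=0.5, beta=0.3, lr=0.1):
    """Run a KM-style unrolled trajectory.

    x_{k+1} = (1 - alpha) * x_k + alpha * T(x_k), where T fuses a learned
    proposal with a gradient step; for a nonexpansive T, KM averaging
    converges to a fixed point of T.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        t_x = (1 - beta) * propose(x) + beta * (x - lr * grad_f(x))  # composite operator T
        x = (1 - alpha) * x + alpha * t_x                            # KM averaging step
    return x

if __name__ == "__main__":
    # Toy usage on f(x) = 0.5 * ||x||^2, whose unique minimizer is x = 0.
    grad_f = lambda x: x
    propose = lambda x: 0.8 * x          # hypothetical stand-in for the neural operator
    x_star = km_unroll(np.ones(5), grad_f, propose)
    print(np.linalg.norm(x_star))        # close to 0, the shared fixed point / minimizer
```

The averaging step is what carries the guarantee: as long as the composite operator remains nonexpansive, the KM iterates converge to its fixed point, which the outer meta-optimization is meant to align with the objective's minimizers.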
Related papers
- Towards a Unified Analysis of Neural Networks in Nonparametric Instrumental Variable Regression: Optimization and Generalization [66.08522228989634]
We establish the first global convergence result of neural networks for the two-stage least squares (2SLS) approach in nonparametric instrumental variable regression (NPIV). This is achieved by adopting a lifted perspective through mean-field Langevin dynamics (MFLD).
arXiv Detail & Related papers (2025-11-18T17:51:17Z) - Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics [13.621560002904873]
Learnable SMPLify is a neural framework that replaces the iterative fitting process in SMPLify with a single-pass regression model. It achieves nearly 200x faster runtime compared to SMPLify, generalizes well to unseen 3DPW and RICH, and operates in a model-agnostic manner when used as a plug-in tool on LucidAction.
arXiv Detail & Related papers (2025-08-19T06:53:57Z) - On the Convergence of Adam-Type Algorithm for Bilevel Optimization under Unbounded Smoothness [15.656614304616006]
We introduce AdamBO, a single-loop Adam-type method with a convergence guarantee under unbounded smoothness. We conduct experiments on various machine learning tasks involving bilevel formulations with recurrent neural networks (RNNs) and transformers.
arXiv Detail & Related papers (2025-03-05T21:16:59Z) - Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [55.95767828747407]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator incorporating the loss and sampling procedure through a method we call Orbit Diffusion.
arXiv Detail & Related papers (2025-02-14T03:26:57Z) - Learning Provably Improves the Convergence of Gradient Descent [6.777975824808536]
Learning to Optimize (L2O) trains deep network-based solvers for optimization problems. We show that L2O often lacks theoretical backing for its own convergence. We propose a deterministic strategy to support our theoretical results.
arXiv Detail & Related papers (2025-01-30T02:03:30Z) - A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level Optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization problem, where the inner loss function becomes a smooth probability distribution and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.