Primitive Agentic First-Order Optimization
- URL: http://arxiv.org/abs/2406.04841v1
- Date: Fri, 7 Jun 2024 11:13:38 GMT
- Title: Primitive Agentic First-Order Optimization
- Authors: R. Sala,
- Abstract summary: This work presents a proof-of-concept study combining primitive state representations and agent-environment interactions as first-order reinforcement learning.
The results show that elementary RL methods combined with succinct partial state representations can be used as optimizeds manage complexity in RL-based optimization.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient numerical optimization methods can improve performance and reduce the environmental impact of computing in many applications. This work presents a proof-of-concept study combining primitive state representations and agent-environment interactions as first-order optimizers in the setting of budget-limited optimization. Through reinforcement learning (RL) over a set of training instances of an optimization problem class, optimal policies for sequential update selection of algorithmic iteration steps are approximated in generally formulated low-dimensional partial state representations that consider aspects of progress and resource use. For the investigated case studies, deployment of the trained agents to unseen instances of the quadratic optimization problem classes outperformed conventional optimal algorithms with optimized hyperparameters. The results show that elementary RL methods combined with succinct partial state representations can be used as heuristics to manage complexity in RL-based optimization, paving the way for agentic optimization approaches.
Related papers
- Learning Joint Models of Prediction and Optimization [56.04498536842065]
Predict-Then-Then framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by joint predictive models.
arXiv Detail & Related papers (2024-09-07T19:52:14Z) - Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z) - Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning [5.325297567945828]
We propose a new method for two-time-scale optimization that achieves significantly faster convergence than the prior arts.
We characterize the proposed algorithm under various conditions and show how it specializes on online sample-based methods.
arXiv Detail & Related papers (2024-05-15T19:03:08Z) - Towards Efficient Exact Optimization of Language Model Alignment [93.39181634597877]
Direct preference optimization (DPO) was proposed to directly optimize the policy from preference data.
We show that DPO derived based on the optimal solution of problem leads to a compromised mean-seeking approximation of the optimal solution in practice.
We propose efficient exact optimization (EXO) of the alignment objective.
arXiv Detail & Related papers (2024-02-01T18:51:54Z) - Analyzing and Enhancing the Backward-Pass Convergence of Unrolled
Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
arXiv Detail & Related papers (2023-12-28T23:15:18Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Teaching Networks to Solve Optimization Problems [13.803078209630444]
We propose to replace the iterative solvers altogether with a trainable parametric set function.
We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems.
arXiv Detail & Related papers (2022-02-08T19:13:13Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stoc-efficientgradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Tiering as a Stochastic Submodular Optimization Problem [5.659969270836789]
Tiering is an essential technique for building large-scale information retrieval systems.
We show that the optimal tiering as an optimization problem can be cast as a submodular minimization problem with a submodular knapsack constraint.
arXiv Detail & Related papers (2020-05-16T07:39:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.