FORGE: Foundational Optimization Representations from Graph Embeddings
- URL: http://arxiv.org/abs/2508.20330v4
- Date: Wed, 24 Sep 2025 15:30:04 GMT
- Title: FORGE: Foundational Optimization Representations from Graph Embeddings
- Authors: Zohair Shafi, Serdar Kadioglu
- Abstract summary: Existing learning-based methods require training dedicated models for each problem distribution. We introduce Forge: Foundational Optimization Representations from Graph Embeddings, a framework that pre-trains a vector-quantized graph autoencoder. In both tasks, we improve the performance of a commercial optimization solver and outperform state-of-the-art learning-based methods.
- Score: 3.9124823111588163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Combinatorial optimization problems are ubiquitous in science and engineering. Still, learning-based approaches to accelerate combinatorial optimization often require solving a large number of difficult instances to collect training data, incurring significant computational cost. Existing learning-based methods require training dedicated models for each problem distribution and for each downstream task, severely limiting their scalability and generalization. We introduce Forge: Foundational Optimization Representations from Graph Embeddings, a framework that pre-trains a vector-quantized graph autoencoder on a large, diverse collection of mixed-integer programming (MIP) instances in an unsupervised manner, without relying on optimization solvers or optimal solutions. Vector quantization produces discrete code assignments that serve as a vocabulary for representing optimization instances. We evaluate Forge in both unsupervised and supervised settings. In the unsupervised setting, Forge embeddings effectively cluster unseen instances across problem domains and sizes. In the supervised setting, we fine-tune Forge embeddings and show that a single pre-trained model helps predict both the integrality gap for cut-generation and variable hints for search guidance across multiple problem and size distributions. In both tasks, we improve the performance of a commercial optimization solver and outperform state-of-the-art learning-based methods. Finally, we open-source our training code, pre-trained Forge weights, and embeddings for multiple MIP distributions to foster further research in representation learning for optimization problems.
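To make the architecture concrete, below is a minimal sketch of a vector-quantized graph autoencoder in the spirit the abstract describes. It is an illustration under assumptions (a one-layer mean-aggregation encoder, dense adjacency, toy dimensions, and standard VQ-VAE losses), not the released Forge code.

```python
# Minimal sketch (NOT the released Forge code): a vector-quantized graph
# autoencoder in the spirit of the abstract. A simple mean-aggregation GNN
# encodes node features, embeddings are snapped to a learned codebook (the
# discrete "vocabulary"), and a decoder reconstructs the input features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VQGraphAutoencoder(nn.Module):
    def __init__(self, in_dim=8, hid=32, codes=64):
        super().__init__()
        self.enc = nn.Linear(in_dim, hid)
        self.codebook = nn.Embedding(codes, hid)   # discrete code vocabulary
        self.dec = nn.Linear(hid, in_dim)

    def forward(self, x, adj):
        # One message-passing step: average neighbor features, then project.
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        h = torch.relu(self.enc(adj @ x / deg))
        # Vector quantization: assign each node to its nearest code.
        dist = torch.cdist(h, self.codebook.weight)        # (nodes, codes)
        idx = dist.argmin(-1)
        q = self.codebook(idx)
        # Straight-through estimator so gradients reach the encoder.
        q_st = h + (q - h).detach()
        recon = self.dec(q_st)
        # Reconstruction loss plus the usual VQ codebook/commitment terms.
        loss = (F.mse_loss(recon, x)
                + F.mse_loss(q, h.detach())
                + 0.25 * F.mse_loss(h, q.detach()))
        return loss, idx

# Toy usage: a random 10-node graph with 8 features per node.
x = torch.randn(10, 8)
adj = (torch.rand(10, 10) < 0.3).float()
model = VQGraphAutoencoder()
loss, codes = model(x, adj)
loss.backward()
print(loss.item(), codes.tolist())
```

The straight-through trick is what lets the discrete code assignments (`codes` above) act as a learnable vocabulary while keeping the whole pipeline trainable end to end.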
Related papers
- A Distributed Training Architecture For Combinatorial Optimization [0.0]
We propose a distributed graph neural network (GNN)-based training framework for optimization. Experiments are conducted on both real large-scale social network datasets and synthetically generated high-complexity graphs. Our framework outperforms state-of-the-art approaches in both solution quality and computational efficiency.
arXiv Detail & Related papers (2025-11-12T12:22:10Z) - Self-Supervised Pre-Training with Equilibrium Constraints [64.15709757611369]
We propose a new self-supervised pre-training approach to dealing with heterogeneous data. We impose additional equilibrium constraints to ensure that the models optimize each source of heterogeneous data to its local optimum. Experiments are carried out on self-supervised pre-training using multi-domain and multilingual datasets.
arXiv Detail & Related papers (2025-08-27T15:48:50Z) - Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data [23.661713049508375]
We propose an algorithm for federated learning over compact smooth submanifolds with heterogeneous client data.
We show that the proposed algorithm converges sub-linearly to a neighborhood of a first-order optimal solution by using a novel analysis.
arXiv Detail & Related papers (2024-06-12T17:53:28Z) - Model Ensembling for Constrained Optimization [7.4351710906830375]
We consider a setting in which we wish to ensemble models for multidimensional output predictions that are in turn used for downstream optimization.
More precisely, we imagine we are given a number of models mapping a state space to multidimensional real-valued predictions.
These predictions form the coefficients of a linear objective that we would like to optimize under specified constraints.
We apply multicalibration techniques that lead to two provably efficient and convergent algorithms.
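As a concrete illustration of this predict-then-optimize setup, the sketch below feeds hypothetical model predictions into a linear program. The plain averaging stands in for the paper's multicalibration-based ensembling, which is considerably more involved.

```python
# Sketch of the pipeline the abstract describes: several models map a state
# to the coefficients of a linear objective, an ensemble combines them, and
# the result is optimized under constraints. The mean is a placeholder for
# the paper's multicalibrated ensemble.
import numpy as np
from scipy.optimize import linprog

state = np.array([0.2, 0.7, 0.1])

# Two hypothetical predictors mapping state -> objective coefficients.
models = [lambda s: np.array([1.0, 2.0, 0.5]) + s,
          lambda s: np.array([0.8, 2.2, 0.6]) + s]
c = np.mean([m(state) for m in models], axis=0)   # ensembled coefficients

# Maximize c @ x subject to sum(x) <= 1, 0 <= x <= 1
# (linprog minimizes, so negate c).
res = linprog(-c, A_ub=np.ones((1, 3)), b_ub=[1.0], bounds=[(0, 1)] * 3)
print(res.x, -res.fun)
```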
arXiv Detail & Related papers (2024-05-27T01:48:07Z) - Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
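The sketch below illustrates the setting under discussion, not the Folded Optimization system itself: a few gradient steps of an inner problem are unrolled, and automatic differentiation backpropagates through the iterates, which is the backward pass the paper analyzes as an implicit linear solve.

```python
# Illustration of unrolled optimization (our toy inner problem, not the
# paper's): unroll gradient descent on a smooth inner objective and let
# autograd differentiate the downstream loss through the iterates.
import torch

def inner_solve(w, steps=50, lr=0.1):
    y = torch.zeros_like(w)
    for _ in range(steps):                 # unrolled iterations
        grad = 2 * (y - w) + 0.1 * torch.tanh(5 * y)
        y = y - lr * grad                  # stays in the autograd graph
    return y

w = torch.tensor([1.0, -2.0, 0.5], requires_grad=True)
y_star = inner_solve(w)
outer_loss = (y_star ** 2).sum()           # downstream objective
outer_loss.backward()                      # backprop through the unroll
print(w.grad)
```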
arXiv Detail & Related papers (2023-12-28T23:15:18Z) - Learning to Branch in Combinatorial Optimization with Graph Pointer Networks [17.729352126574902]
This paper proposes a graph pointer network model for learning the variable selection policy in branch-and-bound.
The proposed model, which combines the graph neural network and the pointer mechanism, can effectively map from the solver state to the branching variable decisions.
arXiv Detail & Related papers (2023-07-04T01:56:07Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Unsupervised Learning for Combinatorial Optimization Needs Meta-Learning [14.86600327306136]
A general framework of unsupervised learning for combinatorial optimization (CO) is to train a neural network (NN) whose output gives a problem solution by directly optimizing the CO objective.
We propose a new objective of unsupervised learning for CO where the goal of learning is to search for good initialization for future problem instances rather than give direct solutions.
We observe that even just the initial solution given by our model before fine-tuning can significantly outperform the baselines under various evaluation settings.
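A minimal instance of this framework (illustrative, and omitting the paper's meta-learning component) is to learn soft assignments that directly optimize a relaxed CO objective, here max-cut on a toy graph:

```python
# Relax binary node assignments to probabilities p in [0, 1] and directly
# maximize the expected cut; rounding p afterwards gives a discrete solution.
import torch

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
logits = torch.zeros(4, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

for _ in range(200):
    p = torch.sigmoid(logits)
    # Edge (i, j) is cut with probability p_i(1-p_j) + p_j(1-p_i).
    cut = sum(p[i] * (1 - p[j]) + p[j] * (1 - p[i]) for i, j in edges)
    loss = -cut                       # maximize the expected cut
    opt.zero_grad(); loss.backward(); opt.step()

print((torch.sigmoid(logits) > 0.5).int().tolist())   # rounded assignment
```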
arXiv Detail & Related papers (2023-01-08T22:14:59Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the average gap between our model's solutions and the expert solutions decreases from 6.8% to 1.3%.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - A Novel Plug-and-Play Approach for Adversarially Robust Generalization [38.72514422694518]
We propose a robust framework that employs adversarially robust training to safeguard the ML models against perturbed testing data.
Our contributions can be seen from both computational and statistical perspectives.
arXiv Detail & Related papers (2022-08-19T17:02:55Z) - On a class of geodesically convex optimization problems solved via Euclidean MM methods [50.428784381385164]
We show how a class of geodesically convex problems arising in statistics and machine learning can be rewritten as differences of Euclidean convex functions and solved via Euclidean MM methods.
Ultimately, we hope this perspective proves useful to the broader community.
arXiv Detail & Related papers (2022-06-22T23:57:40Z) - A Differentiable Approach to Combinatorial Optimization using Dataless Neural Networks [20.170140039052455]
We propose a radically different approach in that no data is required for training the neural networks that produce the solution.
In particular, we reduce the optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest.
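A toy sketch of the dataless idea, under our own assumptions rather than the paper's exact construction: the problem instance itself defines the loss, and the trainable vector is the candidate solution, so no training data is needed. Here, maximum independent set on a small graph via a penalized relaxation:

```python
# Dataless training sketch: reward selected nodes and heavily penalize
# selecting both endpoints of any edge; the learned parameters ARE the
# (relaxed) solution.
import torch

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # 5-cycle
logits = torch.zeros(5, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

for _ in range(300):
    p = torch.sigmoid(logits)
    loss = -p.sum() + 5.0 * sum(p[i] * p[j] for i, j in edges)
    opt.zero_grad(); loss.backward(); opt.step()

print((torch.sigmoid(logits) > 0.5).int().tolist())   # independent set mask
```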
arXiv Detail & Related papers (2022-03-15T19:21:31Z) - Tutorial on amortized optimization [13.812741777376711]
This tutorial presents an introduction to the amortized optimization foundations behind these advancements. It overviews their applications in variational inference, sparse coding, gradient-based meta-learning, control, reinforcement learning, convex optimization, optimal transport, and deep equilibrium networks.
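A minimal illustration of amortization (our example, not taken from the tutorial): train a network to map problem instances directly to approximate minimizers, so that solving a new instance becomes a single forward pass.

```python
# Train y_hat(x) ~ argmin_y f(y; x) by minimizing f(y_hat(x); x) over
# sampled instances x. Here f(y; x) = ||y - x||^2 has known solution
# y* = x, which makes the fit easy to check.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 3))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(500):
    x = torch.randn(64, 3)                 # a batch of problem instances
    y = net(x)                             # amortized (predicted) solutions
    loss = ((y - x) ** 2).sum(-1).mean()   # objective value f(y; x)
    opt.zero_grad(); loss.backward(); opt.step()

x_new = torch.tensor([[0.3, -1.2, 0.8]])
print(net(x_new))                          # should be close to x_new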
arXiv Detail & Related papers (2022-02-01T18:58:33Z) - Pretrained Cost Model for Distributed Constraint Optimization Problems [37.79733538931925]
Distributed Constraint Optimization Problems (DCOPs) are an important subclass of optimization problems.
We propose a novel directed acyclic graph schema representation for DCOPs and leverage the Graph Attention Networks (GATs) to embed graph representations.
Our model, GAT-PCM, is then pretrained with optimally labelled data in an offline manner, so as to boost a broad range of DCOP algorithms.
arXiv Detail & Related papers (2021-12-08T09:24:10Z) - Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
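For intuition, here is a sketch of one online gradient step per arriving pair, using an AUC-style pairwise hinge loss on a toy stream (our illustration, not the paper's algorithm):

```python
# Pairwise learning: the loss depends on PAIRS of examples. A linear scorer
# is asked to rank a positive above a negative; each arriving pair triggers
# one online gradient descent step.
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)
eta = 0.1

for t in range(1, 1001):
    x_pos = rng.normal([1, 1], 1.0)    # stream: one positive ...
    x_neg = rng.normal([-1, -1], 1.0)  # ... paired with one negative
    margin = w @ (x_pos - x_neg)
    if margin < 1.0:                   # pairwise hinge is active
        w += (eta / np.sqrt(t)) * (x_pos - x_neg)

print(w)  # should point roughly toward the positive class mean
```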
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - Memory Clustering using Persistent Homology for Multimodality- and Discontinuity-Sensitive Learning of Optimal Control Warm-starts [24.576214898129823]
Shooting methods are an efficient approach to solving nonlinear optimal control problems.
Recent work has focused on providing an initial guess from a learned model trained on samples generated during an offline exploration of the problem space.
In this work, we apply tools from algebraic topology to extract information on the underlying structure of the solution space.
arXiv Detail & Related papers (2020-10-02T14:24:59Z) - Automatically Learning Compact Quality-aware Surrogates for Optimization Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.
Recent work has shown that including the optimization problem as a layer in a complex model training pipeline results in predictions that lead to better decisions.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation for Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces convergence times by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z) - Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale-variation challenge in object detection.
Experimental results demonstrate the efficacy of the proposed DST in handling scale variation.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)