A General Neural Backbone for Mixed-Integer Linear Optimization via Dual Attention
- URL: http://arxiv.org/abs/2601.04509v1
- Date: Thu, 08 Jan 2026 02:23:47 GMT
- Title: A General Neural Backbone for Mixed-Integer Linear Optimization via Dual Attention
- Authors: Peixin Huang, Yaoxin Wu, Yining Ma, Cathy Wu, Wen Song, Wei Zhang
- Abstract summary: Mixed-integer linear programming (MILP) is a widely used modeling framework for optimization, yet it remains computationally challenging at scale. Recent advances in deep learning address this challenge by representing MILP instances as variable-constraint bipartite graphs. We present an attention-driven neural architecture that learns expressive representations beyond the pure graph view.
- Score: 33.27281529953169
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixed-integer linear programming (MILP), a widely used modeling framework for combinatorial optimization, is central to many scientific and engineering applications, yet remains computationally challenging at scale. Recent advances in deep learning address this challenge by representing MILP instances as variable-constraint bipartite graphs and applying graph neural networks (GNNs) to extract latent structural patterns and enhance solver efficiency. However, this architecture is inherently limited by its locally oriented message-passing mechanism, which restricts representational power and hinders neural approaches to MILP. Here we present an attention-driven neural architecture that learns expressive representations beyond the pure graph view. A dual-attention mechanism is designed to perform parallel self- and cross-attention over variables and constraints, enabling global information exchange and deeper representation learning. We apply this general backbone to various downstream tasks at the instance level, element level, and solving-state level. Extensive experiments across widely used benchmarks show consistent improvements of our approach over state-of-the-art baselines, highlighting attention-based neural architectures as a powerful foundation for learning-enhanced mixed-integer linear optimization.
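The dual-attention layer lends itself to a compact sketch. The PyTorch snippet below is a minimal, hypothetical rendering of the mechanism the abstract describes: parallel self-attention within the variable and constraint sets, plus cross-attention between them. The class name, dimensions, and normalization choices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualAttentionLayer(nn.Module):
    """One layer of parallel self- and cross-attention over
    variable and constraint embeddings (illustrative sketch)."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.var_self = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.con_self = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.var_cross = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.con_cross = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_v = nn.LayerNorm(d_model)
        self.norm_c = nn.LayerNorm(d_model)

    def forward(self, v, c):
        # v: (batch, n_vars, d), c: (batch, n_cons, d)
        v_self, _ = self.var_self(v, v, v)    # variables attend to variables
        c_self, _ = self.con_self(c, c, c)    # constraints attend to constraints
        v_cross, _ = self.var_cross(v, c, c)  # variables attend to constraints
        c_cross, _ = self.con_cross(c, v, v)  # constraints attend to variables
        # Residual combination of both information streams.
        v_out = self.norm_v(v + v_self + v_cross)
        c_out = self.norm_c(c + c_self + c_cross)
        return v_out, c_out

# Toy usage: 10 variables, 6 constraints, embedding size 64.
layer = DualAttentionLayer()
v, c = torch.randn(1, 10, 64), torch.randn(1, 6, 64)
v, c = layer(v, c)
```

Unlike bipartite message passing, which only exchanges information along variable-constraint edges, every variable here can attend to every other variable (and every constraint) within a single layer, which is the global information exchange the abstract emphasizes.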
Related papers
- Towards Multiscale Graph-based Protein Learning with Geometric Secondary Structural Motifs [10.433254687685038]
Graph neural networks (GNNs) have emerged as powerful tools for learning protein structures. We propose an efficient multiscale graph-based learning framework tailored to proteins.
arXiv Detail & Related papers (2026-01-31T18:50:24Z) - Multiscale Aggregated Hierarchical Attention (MAHA): A Game Theoretic and Optimization Driven Approach to Efficient Contextual Modeling in Large Language Models [0.0]
Multiscale Aggregated Hierarchical Attention (MAHA) is a novel architectural framework that reformulates the attention mechanism through hierarchical decomposition and mathematically rigorous aggregation. MAHA dynamically partitions the input sequence into hierarchical scales via learnable downsampling operators. Experimental evaluations demonstrate that MAHA achieves superior scalability; empirical FLOPs analysis confirms an 81% reduction in computational cost at a sequence length of 4096 compared to standard attention.
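As a rough illustration of the learnable-downsampling idea, the sketch below pools keys and values at several scales with strided convolutions and lets full-resolution queries attend to the concatenated result. MAHA's actual hierarchical aggregation and game-theoretic weighting are not reproduced here; all names and scale choices are hypothetical.

```python
import torch
import torch.nn as nn

class MultiscaleDownsampledAttention(nn.Module):
    """Attention over keys/values pooled at several scales by
    learnable strided convolutions (sketch of the general idea)."""

    def __init__(self, d_model=64, n_heads=4, scales=(2, 4, 8)):
        super().__init__()
        self.downsamplers = nn.ModuleList(
            nn.Conv1d(d_model, d_model, kernel_size=s, stride=s) for s in scales
        )
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, seq_len, d). Build a shorter key/value sequence by
        # concatenating learnably downsampled copies of the input.
        xt = x.transpose(1, 2)                        # (batch, d, seq_len)
        kv = torch.cat(
            [ds(xt).transpose(1, 2) for ds in self.downsamplers], dim=1
        )                                             # (batch, pooled_len, d)
        out, _ = self.attn(x, kv, kv)                 # queries stay full-length
        return out

x = torch.randn(2, 4096, 64)
y = MultiscaleDownsampledAttention()(x)  # cost scales with the pooled KV length
```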
arXiv Detail & Related papers (2025-12-16T21:27:21Z) - Efficient Attention Mechanisms for Large Language Models: A Survey [18.86171225316892]
Transformer-based architectures have become the prevailing computational backbone of large language models. Recent research has introduced two principal categories of efficient attention mechanisms; sparse attention techniques, for instance, limit attention to selected subsets of tokens based on fixed patterns, block-wise routing, or clustering strategies.
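A fixed-pattern sparse scheme is easy to sketch. The toy function below restricts attention to a sliding window, one of the fixed patterns the survey mentions (block-wise routing and clustering variants select the subsets differently). It materializes the dense score matrix for clarity, which real sparse kernels avoid.

```python
import torch

def sliding_window_attention(q, k, v, window: int = 4):
    """Fixed-pattern sparse attention: each token attends only to
    keys within a local window around its own position."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    n = q.shape[-2]
    idx = torch.arange(n)
    # Mask out pairs farther apart than the window size.
    mask = (idx[None, :] - idx[:, None]).abs() > window
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 16, 32)
out = sliding_window_attention(q, k, v)
```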
arXiv Detail & Related papers (2025-07-25T18:08:10Z) - Self-Supervised Graph Learning via Spectral Bootstrapping and Laplacian-Based Augmentations [1.0377683220196872]
We present LaplaceGNN, a novel self-supervised graph learning framework. Our method integrates Laplacian-based signals into the learning process. LaplaceGNN achieves superior performance compared to state-of-the-art self-supervised graph methods.
arXiv Detail & Related papers (2025-06-25T12:23:23Z) - ScaleGNN: Towards Scalable Graph Neural Networks via Adaptive High-order Neighboring Feature Fusion [73.85920403511706]
We propose ScaleGNN, a novel framework that adaptively fuses multi-hop node features for scalable and effective graph learning. We show that ScaleGNN consistently outperforms state-of-the-art GNNs in both predictive accuracy and computational efficiency.
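One plausible reading of adaptive multi-hop fusion is sketched below: features propagated over 1..K hops are combined with learnable weights before a final projection. This is a simplification, assuming a dense row-normalized adjacency and a single global weight per hop, whereas ScaleGNN's fusion is more refined.

```python
import torch
import torch.nn as nn

class MultiHopFusion(nn.Module):
    """Fuse features aggregated from 0..K hops with learnable
    softmax weights (illustrative sketch, not ScaleGNN itself)."""

    def __init__(self, d_in, d_out, k_hops=3):
        super().__init__()
        self.k = k_hops
        self.hop_logits = nn.Parameter(torch.zeros(k_hops + 1))
        self.proj = nn.Linear(d_in, d_out)

    def forward(self, x, adj):
        # x: (n, d_in); adj: (n, n) row-normalized adjacency.
        hops, h = [x], x
        for _ in range(self.k):
            h = adj @ h                  # one more hop of propagation
            hops.append(h)
        w = torch.softmax(self.hop_logits, dim=0)
        fused = sum(wi * hi for wi, hi in zip(w, hops))
        return self.proj(fused)

n, d = 5, 8
adj = torch.rand(n, n); adj = adj / adj.sum(1, keepdim=True)
out = MultiHopFusion(d, 16)(torch.randn(n, d), adj)
```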
arXiv Detail & Related papers (2025-04-22T14:05:11Z) - Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers [0.0]
We reformulate the Transformer's attention mechanism as a graph operation. We introduce Sparse GIN-Attention, a fine-tuning approach that employs sparse GINs.
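The graph view of attention can be caricatured in a few lines: sparsify the attention matrix into a top-k adjacency and apply a GIN-style update over it. The module below is a hypothetical sketch of that framing, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SparseGINAttention(nn.Module):
    """Treat a sparsified attention matrix as a graph adjacency and
    apply a GIN-style aggregation (rough sketch of the framing)."""

    def __init__(self, d, top_k=4, eps=0.1):
        super().__init__()
        self.top_k, self.eps = top_k, eps
        self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, x):
        # Attention logits double as (dense) edge scores.
        scores = x @ x.transpose(-2, -1) / x.shape[-1] ** 0.5
        # Keep only the top-k neighbors per token: the sparse graph.
        kth = scores.topk(self.top_k, dim=-1).values[..., -1:]
        adj = torch.softmax(
            scores.masked_fill(scores < kth, float("-inf")), dim=-1
        )
        # GIN update: MLP((1 + eps) * h_v + sum of neighbor features).
        return self.mlp((1 + self.eps) * x + adj @ x)

out = SparseGINAttention(32)(torch.randn(2, 10, 32))
```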
arXiv Detail & Related papers (2025-01-04T22:30:21Z) - Efficient High-Resolution Visual Representation Learning with State Space Model for Human Pose Estimation [60.80423207808076]
Capturing long-range dependencies while preserving high-resolution visual representations is crucial for dense prediction tasks such as human pose estimation. We propose the Dynamic Visual State Space (DVSS) block, which augments visual state space models with multi-scale convolutional operations. We build HRVMamba, a novel model for efficient high-resolution representation learning.
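To make the combination concrete, the sketch below pairs multi-scale dilated convolutions with a simple diagonal linear state-space scan on a 1D sequence. The actual DVSS block operates on 2D visual features with a selective scan, so treat this as a toy analogue with assumed names and parameters.

```python
import torch
import torch.nn as nn

class DVSSBlockSketch(nn.Module):
    """Multi-scale depthwise convolutions feeding a simple diagonal
    state-space recurrence (1D toy analogue, not HRVMamba's block)."""

    def __init__(self, d=32, scales=(1, 2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(d, d, 3, padding=s, dilation=s, groups=d) for s in scales
        )
        # Diagonal SSM: h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t.
        self.a = nn.Parameter(torch.full((d,), 0.9))
        self.b = nn.Parameter(torch.ones(d))
        self.c = nn.Parameter(torch.ones(d))

    def forward(self, x):                      # x: (batch, seq, d)
        xt = x.transpose(1, 2)
        feats = sum(conv(xt) for conv in self.convs).transpose(1, 2)
        h, ys = torch.zeros_like(x[:, 0]), []
        for t in range(feats.shape[1]):        # sequential linear scan
            h = self.a * h + self.b * feats[:, t]
            ys.append(self.c * h)
        return torch.stack(ys, dim=1) + x      # residual connection

y = DVSSBlockSketch()(torch.randn(2, 16, 32))
```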
arXiv Detail & Related papers (2024-10-04T06:19:29Z) - Spatiotemporal Graph Learning with Direct Volumetric Information Passing and Feature Enhancement [62.91536661584656]
We propose a dual-module framework, the Cell-embedded and Feature-enhanced Graph Neural Network (CeFeGNN), for spatiotemporal graph learning. We embed learnable cell attributions into the common node-edge message-passing process, which better captures the spatial dependency of regional features. Experiments on various PDE systems and one real-world dataset demonstrate that CeFeGNN achieves superior performance compared with other baselines.
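A guess at the general construction: give every node a learnable embedding for the cell (region) it belongs to and feed that embedding into the message MLP. The sketch below does exactly that with sum aggregation; CeFeGNN's exact formulation may differ.

```python
import torch
import torch.nn as nn

class CellEmbeddedMessagePassing(nn.Module):
    """Node-edge message passing in which each node carries a
    learnable embedding of its cell (illustrative assumption)."""

    def __init__(self, d=16, n_cells=4):
        super().__init__()
        self.cell_emb = nn.Embedding(n_cells, d)
        self.msg = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU())
        self.upd = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU())

    def forward(self, x, edge_index, cell_id):
        # x: (n, d); edge_index: (2, m) source/target; cell_id: (n,)
        src, dst = edge_index
        xc = torch.cat([x, self.cell_emb(cell_id)], dim=-1)   # (n, 2d)
        msgs = self.msg(torch.cat([xc[src], x[dst]], dim=-1))  # per-edge messages
        agg = torch.zeros_like(x).index_add_(0, dst, msgs)     # sum onto targets
        return self.upd(torch.cat([x, agg], dim=-1))

n = 6
x = torch.randn(n, 16)
edges = torch.randint(0, n, (2, 10))
cells = torch.randint(0, 4, (n,))
out = CellEmbeddedMessagePassing()(x, edges, cells)
```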
arXiv Detail & Related papers (2024-09-26T16:22:08Z) - Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances, utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z) - GraphLearner: Graph Node Clustering with Fully Learnable Augmentation [76.63963385662426]
Contrastive deep graph clustering (CDGC) leverages the power of contrastive learning to group nodes into different clusters.
We propose a Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner.
It introduces learnable augmentors to generate high-quality and task-specific augmented samples for CDGC.
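A minimal learnable augmentor might score each edge with a small MLP and use the score as a soft keep-probability for the contrastive view, as sketched below; GraphLearner's augmentors, which also cover node attributes, are richer than this toy version.

```python
import torch
import torch.nn as nn

class LearnableEdgeAugmentor(nn.Module):
    """Learnable structural augmentation: an MLP scores each edge and
    the score softly reweights it, producing one task-specific view
    for contrastive learning (illustrative sketch)."""

    def __init__(self, d=8):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, 1)
        )

    def forward(self, x, edge_index):
        src, dst = edge_index
        # Keep-probability for every edge, trained end to end with
        # the downstream contrastive objective.
        p_keep = torch.sigmoid(self.score(torch.cat([x[src], x[dst]], -1)))
        return p_keep.squeeze(-1)   # soft edge weights of the augmented view

x = torch.randn(6, 8)
edges = torch.randint(0, 6, (2, 12))
weights = LearnableEdgeAugmentor()(x, edges)
```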
arXiv Detail & Related papers (2022-12-07T10:19:39Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
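The core move, learning inducing points directly in a learned feature space, can be sketched with a Nystrom-style kernel approximation, as below. The class name and the RBF kernel choice are assumptions, and IGN's actual training objective and inference procedure are not shown.

```python
import torch
import torch.nn as nn

class InducingGP(nn.Module):
    """A learned feature map plus inducing points that live directly
    in feature space, used for a Nystrom-style kernel approximation
    (a simplification of the IGN construction)."""

    def __init__(self, d_in=5, d_feat=8, n_inducing=10):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(d_in, d_feat), nn.Tanh())
        self.z = nn.Parameter(torch.randn(n_inducing, d_feat))  # inducing points

    def rbf(self, a, b):
        return torch.exp(-torch.cdist(a, b) ** 2)

    def forward(self, x):
        f = self.phi(x)
        k_xz = self.rbf(f, self.z)        # (n, m) data-to-inducing kernel
        k_zz = self.rbf(self.z, self.z)   # (m, m) inducing-to-inducing kernel
        # Nystrom approximation of the full kernel matrix K(x, x),
        # with a small jitter term for numerical stability.
        jitter = 1e-4 * torch.eye(len(self.z))
        return k_xz @ torch.linalg.solve(k_zz + jitter, k_xz.T)

K = InducingGP()(torch.randn(20, 5))   # approximate 20x20 kernel matrix
```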
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as structure prior and reveal the underlying signal interdependencies.
Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
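Deep unrolling is the easier of the two to sketch: fix the iteration count of ISTA and make its step sizes and thresholds learnable per layer, as in the toy module below. The paper's block-sparse regularizer and its deep-equilibrium (fixed-point) variant build on, but go beyond, this pattern.

```python
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    """Deep unrolling sketch: T iterations of ISTA for sparse coding,
    with a learned step size and soft threshold at every layer."""

    def __init__(self, dictionary, n_layers=5):
        super().__init__()
        self.register_buffer("D", dictionary)       # (d, k) fixed dictionary
        self.steps = nn.Parameter(torch.full((n_layers,), 0.1))
        self.thresholds = nn.Parameter(torch.full((n_layers,), 0.05))

    def forward(self, y):                            # y: (batch, d)
        z = torch.zeros(y.shape[0], self.D.shape[1])
        for step, thr in zip(self.steps, self.thresholds):
            grad = (z @ self.D.T - y) @ self.D       # grad of 0.5*||Dz - y||^2
            z = z - step * grad                      # gradient step
            z = torch.sign(z) * torch.relu(z.abs() - thr)  # soft threshold
        return z

D = torch.randn(16, 32)
codes = UnrolledISTA(D)(torch.randn(4, 16))   # sparse codes for 4 signals
```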
arXiv Detail & Related papers (2022-03-29T21:00:39Z)