How hard is learning to cut? Trade-offs and sample complexity
- URL: http://arxiv.org/abs/2506.00252v1
- Date: Fri, 30 May 2025 21:41:01 GMT
- Title: How hard is learning to cut? Trade-offs and sample complexity
- Authors: Sammy Khalife, Andrea Lodi
- Abstract summary: We present new sample complexity lower bounds, valid for both scores. We show that for a wide family of classes $\mathcal{F}$ that map an instance to a cut, learning over an unknown distribution of instances to minimize those scores requires at least (up to multiplicative constants) as many samples as learning, from the same class $\mathcal{F}$, any generic target function (using the square loss). Our results also extend to the case of learning from a restricted set of cuts, namely those from the Simplex tableau.
- Score: 9.721177829831316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, branch-and-cut algorithms have been the target of data-driven approaches designed to enhance decision making in different phases of the algorithm, such as branching or the choice of cutting planes (cuts). In particular, for cutting plane selection, two score functions have been proposed in the literature to evaluate the quality of a cut: branch-and-cut tree size and gap closed. In this paper, we present new sample complexity lower bounds, valid for both scores. We show that for a wide family of classes $\mathcal{F}$ that map an instance to a cut, learning over an unknown distribution of instances to minimize those scores requires at least (up to multiplicative constants) as many samples as learning, from the same class $\mathcal{F}$, any generic target function (using the square loss). Our results also extend to the case of learning from a restricted set of cuts, namely those from the Simplex tableau. To the best of our knowledge, these constitute the first lower bounds for the learning-to-cut framework. We compare our bounds to known upper bounds in the case of neural networks and show they are nearly tight. We illustrate our results with a graph neural network for cut selection, evaluated on set covering and facility location integer programming models, and we empirically show that the gap closed score is an effective proxy for minimizing the branch-and-cut tree size. Although the gap closed score has been extensively used in the integer programming literature, this is the first principled analysis discussing both scores simultaneously, theoretically and computationally.
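Both scores are concrete quantities: tree size is measured by running branch-and-cut to completion, while the gap closed by a cut has the standard relative definition $\mathrm{gc} = (z_{\text{cut}} - z_{\text{LP}}) / (z_{\text{IP}} - z_{\text{LP}})$. The sketch below illustrates that computation; the function name and the use of SciPy's `linprog` for the LP relaxations are our assumptions for illustration, not the authors' code.

```python
import numpy as np
from scipy.optimize import linprog

def gap_closed(c, A_ub, b_ub, cut, z_ip):
    """Relative gap closed by a single cut a^T x <= b (minimization MILP).

    Standard definition from the cutting-plane literature:
        gc = (z_cut - z_lp) / (z_ip - z_lp)
    where z_lp is the LP relaxation bound, z_cut the bound after adding
    the cut, and z_ip the optimal integer value. Assumes both LPs solve
    to optimality and z_ip > z_lp.
    """
    # LP relaxation bound (linprog minimizes over x >= 0 by default).
    z_lp = linprog(c, A_ub=A_ub, b_ub=b_ub).fun
    # Bound after appending the cut row to the constraint matrix.
    a, b = cut
    z_cut = linprog(c, A_ub=np.vstack([A_ub, a]),
                    b_ub=np.append(b_ub, b)).fun
    return (z_cut - z_lp) / (z_ip - z_lp)  # in [0, 1] for a valid cut
```

Because the tree-size score requires a full branch-and-cut run per evaluation, a cheap quantity like gap closed is an attractive proxy, which is what the abstract's empirical claim speaks to.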
Related papers
- Learning Cut Generating Functions for Integer Programming [1.1510009152620668]
The branch-and-cut algorithm is the method of choice for solving large-scale integer programming problems.
Recent advances have employed a data-driven approach to select optimal cutting planes from a parameterized family.
We show that the selected CGF can outperform the GMI cuts for certain distributions.
arXiv Detail & Related papers (2024-05-22T20:56:34Z)
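Both this entry and the main abstract reference GMI cuts read off the Simplex tableau, the canonical parameterized cut family. As a reminder of the object being selected, here is a sketch of the textbook GMI cut for a tableau row whose nonbasic variables are all integer (continuous variables need extra terms); it is our illustration, not the paper's cut-generating-function machinery.

```python
import math

def gmi_cut_integer(row, rhs):
    """Gomory mixed-integer cut from one simplex tableau row.

    row: tableau coefficients of the (integer) nonbasic variables,
    rhs: the fractional right-hand side of the row.
    Returns (coeffs, 1.0) encoding the cut sum_j coeffs[j] * x_j >= 1.
    This sketch covers only the all-integer-nonbasic case.
    """
    f0 = rhs - math.floor(rhs)
    assert 0 < f0 < 1, "row must have a fractional right-hand side"
    coeffs = []
    for a in row:
        f = a - math.floor(a)  # fractional part of the coefficient
        if f <= f0:
            coeffs.append(f / f0)
        else:
            coeffs.append((1 - f) / (1 - f0))
    return coeffs, 1.0
```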
- Sample Complexity of Algorithm Selection Using Neural Networks and Its Applications to Branch-and-Cut [1.4624458429745086]
We build upon recent work in this line of research by considering the setup where, instead of selecting a single algorithm that has the best performance, we allow the possibility of selecting an algorithm based on the instance to be solved.
In particular, given a representative sample of instances, we learn a neural network that maps an instance of the problem to the most appropriate algorithm for that instance.
In other words, the neural network will take as input a mixed-integer optimization instance and output a decision that will result in a small branch-and-cut tree for that instance.
arXiv Detail & Related papers (2024-02-04T03:03:27Z)
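The instance-to-algorithm mapping described in this entry can be pictured as a small network over hand-crafted instance features. The sketch below is purely illustrative: the feature map, the two-layer architecture, and the candidate configurations are our assumptions, not the paper's setup.

```python
import numpy as np

# Hypothetical candidate configurations to select among (illustrative only).
CANDIDATES = ["most-fractional branching", "pseudocost branching",
              "strong branching"]

def instance_features(A, b, c):
    """Toy feature map for a MILP min c^T x s.t. Ax <= b: shape ratio,
    objective scale, and constraint-matrix density."""
    m, n = A.shape
    return np.array([m / n, float(np.mean(np.abs(c))),
                     np.count_nonzero(A) / A.size])

def select_algorithm(x, params):
    """Two-layer ReLU network scoring each candidate; returns the argmax."""
    W1, b1, W2, b2 = params
    h = np.maximum(W1 @ x + b1, 0.0)
    return CANDIDATES[int(np.argmax(W2 @ h + b2))]

# Untrained random weights, just to show the shapes involved.
rng = np.random.default_rng(0)
params = (rng.normal(size=(8, 3)), np.zeros(8),
          rng.normal(size=(len(CANDIDATES), 8)), np.zeros(len(CANDIDATES)))
A = rng.integers(0, 2, size=(20, 30)); b = np.ones(20); c = rng.normal(size=30)
print(select_algorithm(instance_features(A, b, c), params))
```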
- Lookback for Learning to Branch [77.32867454769936]
Bipartite Graph Neural Networks (GNNs) have been shown to be an important component of deep learning based Mixed-Integer Linear Program (MILP) solvers.
Recent works have demonstrated the effectiveness of such GNNs in replacing the branching (variable selection) in branch-and-bound (B&B) solvers.
arXiv Detail & Related papers (2022-06-30T02:33:32Z)
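The bipartite encoding these GNNs consume places variable nodes on one side and constraint nodes on the other, with an edge wherever a variable has a nonzero coefficient in a constraint. A minimal construction with NetworkX, with node features (objective coefficient, right-hand side) chosen by us for illustration:

```python
import networkx as nx
import numpy as np

def milp_to_bipartite(A, b, c):
    """Encode min c^T x s.t. Ax <= b as a variable/constraint bipartite graph."""
    m, n = A.shape
    G = nx.Graph()
    # One node per variable, carrying its objective coefficient.
    for j in range(n):
        G.add_node(("var", j), obj=float(c[j]))
    # One node per constraint, carrying its right-hand side.
    for i in range(m):
        G.add_node(("con", i), rhs=float(b[i]))
    # Edge wherever variable j has a nonzero coefficient in constraint i.
    for i, j in zip(*np.nonzero(A)):
        G.add_edge(("con", i), ("var", j), coef=float(A[i, j]))
    return G
```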
- A Unified Framework for Implicit Sinkhorn Differentiation [58.56866763433335]
We propose an algorithm that obtains analytical gradients of a Sinkhorn layer via implicit differentiation.
We show that it is computationally more efficient, particularly when resources like GPU memory are scarce.
arXiv Detail & Related papers (2022-05-13T14:45:31Z)
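For context, a Sinkhorn layer alternately rescales the rows and columns of a kernel matrix to produce an approximately doubly-stochastic transport plan; the paper's contribution is differentiating through the fixed point implicitly rather than unrolling these iterations. A minimal NumPy sketch of the forward iteration (not the paper's implementation):

```python
import numpy as np

def sinkhorn(C, reg=0.1, n_iters=100):
    """Entropy-regularized optimal transport plan between uniform marginals.

    C: (m, n) cost matrix; reg: entropic regularization strength.
    Alternately rescales rows and columns of K = exp(-C/reg) until the
    plan approximately matches both marginals.
    """
    m, n = C.shape
    a, b = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    K = np.exp(-C / reg)
    u, v = np.ones(m), np.ones(n)
    for _ in range(n_iters):
        u = a / (K @ v)        # match row marginals
        v = b / (K.T @ u)      # match column marginals
    return u[:, None] * K * v[None, :]  # transport plan
```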
- Structural Analysis of Branch-and-Cut and the Learnability of Gomory Mixed Integer Cuts [88.94020638263467]
The incorporation of cutting planes within the branch-and-bound algorithm, known as branch-and-cut, forms the backbone of modern integer programming solvers.
We conduct a novel structural analysis of branch-and-cut that pins down how every step of the algorithm is affected by changes in the parameters defining the cutting planes added to the input integer program.
Our main application of this analysis is to derive sample complexity guarantees for using machine learning to determine which cutting planes to apply during branch-and-cut.
arXiv Detail & Related papers (2022-04-15T03:32:40Z)
- Sample Complexity of Tree Search Configuration: Cutting Planes and Beyond [98.92725321081994]
State-of-the-art solvers integrate a myriad of cutting-plane techniques to speed up the underlying tree-search algorithm.
We prove the first guarantees for learning high-performing cut-selection policies tailored to the instance distribution.
arXiv Detail & Related papers (2021-06-08T00:57:59Z)
- Graph Partitioning and Sparse Matrix Ordering using Reinforcement Learning [0.13999481573773068]
We present a novel method for graph partitioning based on reinforcement learning and graph convolutional neural networks.
The proposed method achieves partitioning quality similar to that of METIS and Scotch.
The method generalizes from one class of graphs to another, and works well on a variety of graphs from the SuiteSparse sparse matrix collection.
arXiv Detail & Related papers (2021-04-08T06:54:24Z)
- Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives.
A ranked range is a consecutive sequence of sorted values of a set of real numbers.
We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
arXiv Detail & Related papers (2020-10-05T01:58:32Z)
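The definition above translates directly into code: sort the values, then sum those between two ranks. The parameter names in this sketch are ours; the AoRR aggregate loss mentioned in the entry divides this sum by the number of terms.

```python
import numpy as np

def sum_of_ranked_range(losses, k, m):
    """Sum of the k-th through m-th largest values (1-indexed ranks, k <= m).

    Special cases: k=1, m=len(losses) sums all losses; k=1, m=1 gives
    the maximum loss.
    """
    s = np.sort(np.asarray(losses))[::-1]   # descending order
    return float(np.sum(s[k - 1:m]))

# Example: the 2nd through 4th largest values sum to 0.9 + 0.7 + 0.5.
print(sum_of_ranked_range([0.1, 0.7, 0.5, 1.2, 0.9], k=2, m=4))  # 2.1
```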
- Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery.
We first propose an edge-time algorithm that obtains a nearly-optimal solution with high probability.
We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z)
- Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent [25.589302381660453]
We show that any classifier trained using gradient descent with respect to the square loss will fail to achieve small test error in polynomial time.
For classification, we give a stronger result, namely that any statistical query (SQ) algorithm will fail to achieve small test error in polynomial time.
arXiv Detail & Related papers (2020-06-22T05:15:06Z)