Partition-based formulations for mixed-integer optimization of trained ReLU neural networks
- URL: http://arxiv.org/abs/2102.04373v1
- Date: Mon, 8 Feb 2021 17:27:34 GMT
- Title: Partition-based formulations for mixed-integer optimization of trained ReLU neural networks
- Authors: Calvin Tsay and Jan Kronqvist and Alexander Thebelt and Ruth Misener
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a class of mixed-integer formulations for trained ReLU
neural networks. The approach balances model size and tightness by partitioning
node inputs into a number of groups and forming the convex hull over the
partitions via disjunctive programming. At one extreme, one partition per input
recovers the convex hull of a node, i.e., the tightest possible formulation for
each node. For fewer partitions, we develop smaller relaxations that
approximate the convex hull, and show that they outperform existing
formulations. Specifically, we propose strategies for partitioning variables
based on theoretical motivations and validate these strategies using extensive
computational experiments. Furthermore, the proposed scheme complements known
algorithmic approaches, e.g., optimization-based bound tightening, which
captures dependencies within a partition.
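The abstract's central mechanism is splitting a ReLU node's inputs into groups and bounding the pre-activation per group, so the disjunctive formulation over groups is tighter than a single big-M bound over the whole sum. The following is a minimal sketch of that bound computation only (not the paper's MIP formulation itself); it uses plain interval arithmetic, and all function and variable names are illustrative, not taken from the authors' code.

```python
# Illustrative sketch: per-partition pre-activation bounds for a trained
# ReLU node y = max(0, w . x + b), with box-bounded inputs x_i in [lo_i, hi_i].
# Partitioning the inputs lets a disjunctive formulation branch on the sign
# of the pre-activation while keeping one bound pair per group.

def interval_bounds(weights, lo, hi):
    """Interval-arithmetic bounds of sum_i w_i * x_i over the input box."""
    lb = sum(w * (l if w >= 0 else h) for w, l, h in zip(weights, lo, hi))
    ub = sum(w * (h if w >= 0 else l) for w, l, h in zip(weights, lo, hi))
    return lb, ub

def partition_bounds(weights, lo, hi, partition):
    """Bounds for each group of input indices; groups must partition the inputs.

    Summing the group bounds recovers the bounds on the full pre-activation,
    but the per-group bounds are what the tighter partitioned formulation uses.
    """
    return [
        interval_bounds([weights[i] for i in grp],
                        [lo[i] for i in grp],
                        [hi[i] for i in grp])
        for grp in partition
    ]

# Toy node with four inputs and two partitions (hypothetical numbers).
w = [1.0, -2.0, 0.5, 3.0]
lo = [0.0, 0.0, -1.0, -1.0]
hi = [1.0, 1.0, 1.0, 1.0]
groups = [[0, 3], [1, 2]]

per_group = partition_bounds(w, lo, hi, groups)
full_lb = sum(lb for lb, _ in per_group)
full_ub = sum(ub for _, ub in per_group)
```

With one group per input this recovers the per-input bounds behind the node-level convex hull; with a single group it collapses to the usual big-M interval, matching the spectrum between the two extremes that the abstract describes.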
Related papers
- Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping [7.691755449724638]
Reinforcement learning often needs to deal with the exponential growth of states and actions in high-dimensional spaces.
We learn the inherent structure of action-wise similar MDP to appropriately balance the performance degradation versus sample/computational complexity.
arXiv Detail & Related papers (2023-06-22T15:40:10Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth components.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Probabilistic partition of unity networks for high-dimensional regression problems [1.0227479910430863]
We explore the probabilistic partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems.
We propose a general framework focusing on adaptive dimensionality reduction.
The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments.
arXiv Detail & Related papers (2022-10-06T06:01:36Z) - P-split formulations: A class of intermediate formulations between big-M and convex hull for disjunctive constraints [6.258016078705562]
"P-split" formulations are derived for disjunctive constraints with convex constraints within each disjunct.
We show that the P-split formulations form a hierarchy starting from a big-M equivalent and converging to the convex hull.
We computationally compare the P-split formulations against big-M and convex hull formulations on 344 test instances.
arXiv Detail & Related papers (2022-02-10T18:02:16Z) - A Parallelizable Lattice Rescoring Strategy with Neural Language Models [62.20538383769179]
A posterior-based lattice expansion algorithm is proposed for efficient lattice rescoring with neural language models (LMs) for automatic speech recognition.
Experiments on the Switchboard dataset show that the proposed rescoring strategy obtains comparable recognition performance.
The parallel rescoring method offers more flexibility by simplifying the integration of PyTorch-trained neural LMs for lattice rescoring with Kaldi.
arXiv Detail & Related papers (2021-03-08T21:23:12Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z) - FedSplit: An algorithmic framework for fast federated optimization [40.42352500741025]
We introduce FedSplit, a class of algorithms for solving distributed convex minimization with additive structure.
Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
arXiv Detail & Related papers (2020-05-11T16:30:09Z) - Model Reduction and Neural Networks for Parametric PDEs [9.405458160620533]
We develop a framework for data-driven approximation of input-output maps between infinite-dimensional spaces.
The proposed approach is motivated by the recent successes of neural networks and deep learning.
For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology.
arXiv Detail & Related papers (2020-05-07T00:09:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.