Partition-based formulations for mixed-integer optimization of trained ReLU neural networks
- URL: http://arxiv.org/abs/2102.04373v1
- Date: Mon, 8 Feb 2021 17:27:34 GMT
- Title: Partition-based formulations for mixed-integer optimization of trained ReLU neural networks
- Authors: Calvin Tsay and Jan Kronqvist and Alexander Thebelt and Ruth Misener
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a class of mixed-integer formulations for trained ReLU
neural networks. The approach balances model size and tightness by partitioning
node inputs into a number of groups and forming the convex hull over the
partitions via disjunctive programming. At one extreme, one partition per input
recovers the convex hull of a node, i.e., the tightest possible formulation for
each node. For fewer partitions, we develop smaller relaxations that
approximate the convex hull, and show that they outperform existing
formulations. Specifically, we propose strategies for partitioning variables
based on theoretical motivations and validate these strategies using extensive
computational experiments. Furthermore, the proposed scheme complements known
algorithmic approaches, e.g., optimization-based bound tightening, which
captures dependencies within a partition.
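The abstract's central mechanism is splitting a ReLU node's inputs into groups and bounding the pre-activation per group, so the disjunctive formulation over groups is tighter than a single big-M bound over the whole sum. The following is a minimal sketch of that bound computation only (not the paper's MIP formulation itself); it uses plain interval arithmetic, and all function and variable names are illustrative, not taken from the authors' code.

```python
# Illustrative sketch: per-partition pre-activation bounds for a trained
# ReLU node y = max(0, w . x + b), with box-bounded inputs x_i in [lo_i, hi_i].
# Partitioning the inputs lets a disjunctive formulation branch on the sign
# of the pre-activation while keeping one bound pair per group.

def interval_bounds(weights, lo, hi):
    """Interval-arithmetic bounds of sum_i w_i * x_i over the input box."""
    lb = sum(w * (l if w >= 0 else h) for w, l, h in zip(weights, lo, hi))
    ub = sum(w * (h if w >= 0 else l) for w, l, h in zip(weights, lo, hi))
    return lb, ub

def partition_bounds(weights, lo, hi, partition):
    """Bounds for each group of input indices; groups must partition the inputs.

    Summing the group bounds recovers the bounds on the full pre-activation,
    but the per-group bounds are what the tighter partitioned formulation uses.
    """
    return [
        interval_bounds([weights[i] for i in grp],
                        [lo[i] for i in grp],
                        [hi[i] for i in grp])
        for grp in partition
    ]

# Toy node with four inputs and two partitions (hypothetical numbers).
w = [1.0, -2.0, 0.5, 3.0]
lo = [0.0, 0.0, -1.0, -1.0]
hi = [1.0, 1.0, 1.0, 1.0]
groups = [[0, 3], [1, 2]]

per_group = partition_bounds(w, lo, hi, groups)
full_lb = sum(lb for lb, _ in per_group)
full_ub = sum(ub for _, ub in per_group)
```

With one group per input this recovers the per-input bounds behind the node-level convex hull; with a single group it collapses to the usual big-M interval, matching the spectrum between the two extremes that the abstract describes.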
Related papers
- Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping [7.691755449724638]
Reinforcement learning often needs to deal with the exponential growth of states and actions in high-dimensional spaces.
We learn the inherent structure of action-wise similar MDP to appropriately balance the performance degradation versus sample/computational complexity.
arXiv Detail & Related papers (2023-06-22T15:40:10Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth components.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Probabilistic partition of unity networks for high-dimensional regression problems [1.0227479910430863]
We explore the probabilistic partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems.
We propose a general framework focusing on adaptive dimensionality reduction.
The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments.
arXiv Detail & Related papers (2022-10-06T06:01:36Z) - P-split formulations: A class of intermediate formulations between big-M and convex hull for disjunctive constraints [6.258016078705562]
"P-split" formulations are derived for disjunctive constraints with convex constraints within each disjunct.
We show that the P-split formulations form a hierarchy starting from a big-M equivalent and converging to the convex hull.
We computationally compare the P-split formulations against big-M and convex hull formulations on 344 test instances.
arXiv Detail & Related papers (2022-02-10T18:02:16Z) - A Parallelizable Lattice Rescoring Strategy with Neural Language Models [62.20538383769179]
A posterior-based lattice expansion algorithm is proposed for efficient lattice rescoring with neural language models (LMs) for automatic speech recognition.
Experiments on the Switchboard dataset show that the proposed rescoring strategy obtains comparable recognition performance.
The parallel rescoring method offers more flexibility by simplifying the integration of PyTorch-trained neural LMs for lattice rescoring with Kaldi.
arXiv Detail & Related papers (2021-03-08T21:23:12Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z) - FedSplit: An algorithmic framework for fast federated optimization [40.42352500741025]
We introduce FedSplit, a class of algorithms for solving distributed convex minimization with additive structure.
Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
arXiv Detail & Related papers (2020-05-11T16:30:09Z) - Model Reduction and Neural Networks for Parametric PDEs [9.405458160620533]
We develop a framework for data-driven approximation of input-output maps between infinite-dimensional spaces.
The proposed approach is motivated by the recent successes of neural networks and deep learning.
For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology.
arXiv Detail & Related papers (2020-05-07T00:09:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.