Related papers: CDiNN -Convex Difference Neural Networks

URL: http://arxiv.org/abs/2103.17231v1
Date: Wed, 31 Mar 2021 17:31:16 GMT
Title: CDiNN -Convex Difference Neural Networks
Authors: Parameswaran Sankaranarayanan and Raghunathan Rengaswamy
Abstract summary: Neural networks with ReLU activation function have been shown to be universal function approximators and learn function mappings as non-smooth functions. A new neural network architecture called ICNNs learns the output as a convex function of the inputs.
Score: 0.8122270502556374
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural networks with ReLU activation function have been shown to be universal function approximators and learn function mappings as non-smooth functions. Recently, there is considerable interest in the use of neural networks in applications such as optimal control. It is well-known that optimization involving non-convex, non-smooth functions is computationally intensive and has limited convergence guarantees. Moreover, the choice of optimization hyper-parameters used in gradient descent/ascent significantly affects the quality of the obtained solutions. A new neural network architecture called the Input Convex Neural Network (ICNN) learns the output as a convex function of inputs, thereby allowing the use of efficient convex optimization methods. Use of ICNNs for determining the input for minimizing output has two major problems: learning of a non-convex function as a convex mapping could result in significant function approximation error, and we also note that the existing representations cannot capture simple dynamic structures like linear time delay systems. We attempt to address the above problems by introduction of a new neural network architecture, which we call the CDiNN, which learns the function as a difference of polyhedral convex functions from data. We also discuss that, in some cases, the optimal input can be obtained from CDiNN through difference of convex optimization with convergence guarantees and that at each iteration, the problem is reduced to a linear programming problem.
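
The difference-of-convex idea in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the names `G_PIECES`, `H_PIECES`, `dc_value` and `dca` are hypothetical, the affine pieces are hand-picked rather than learned from data, and the convex subproblem is solved by enumerating breakpoints (which is exact in one dimension) instead of calling a linear programming solver as a real implementation would.

```python
from itertools import combinations

# Hypothetical example: f(x) = g(x) - h(x), with g and h polyhedral convex
# (each is a maximum of affine pieces given as (slope, intercept) pairs).
G_PIECES = [(-1.0, 0.0), (1.0, 0.0)]                # g(x) = |x|
H_PIECES = [(0.0, 0.0), (2.0, -2.0), (-2.0, -2.0)]  # h(x) = max(0, 2|x| - 2)


def max_affine(pieces, x):
    """Evaluate the polyhedral convex function max_i (a_i * x + b_i)."""
    return max(a * x + b for a, b in pieces)


def dc_value(x):
    """Evaluate the non-convex function f(x) = g(x) - h(x)."""
    return max_affine(G_PIECES, x) - max_affine(H_PIECES, x)


def active_slope(pieces, x):
    """Return a subgradient at x: the slope of one active affine piece."""
    return max(pieces, key=lambda p: p[0] * x + p[1])[0]


def breakpoints(pieces, lo, hi):
    """Candidate minimizers on [lo, hi]: endpoints and piece intersections."""
    points = {lo, hi}
    for (a1, b1), (a2, b2) in combinations(pieces, 2):
        if a1 != a2:
            x = (b2 - b1) / (a1 - a2)
            if lo <= x <= hi:
                points.add(x)
    return sorted(points)


def dca(g, h, x0, lo, hi, max_iter=50, tol=1e-9):
    """Difference-of-convex iteration: linearize h, then minimize the rest."""
    x = x0
    for _ in range(max_iter):
        s = active_slope(h, x)
        # Convex subproblem: minimize g(x) - s * x over [lo, hi].
        shifted = [(a - s, b) for a, b in g]
        x_new = min(breakpoints(shifted, lo, hi),
                    key=lambda t: max_affine(shifted, t))
        if abs(x_new - x) <= tol:
            break
        x = x_new
    return x
```

In this toy example the iteration started at `x0 = 2.0` on the interval `[-3.0, 3.0]` stops at the boundary point `3.0`, while a start at `x0 = 0.5` stops at the stationary point `0`, which has a higher function value. The sketch therefore also shows that such an iteration reaches a critical point that depends on the starting point.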

Related papers

Convexity in ReLU Neural Networks: beyond ICNNs? [17.01649106055384]
We show that every convex function implemented by a 1-hidden-layer ReLU network can be expressed by an ICNN with the same architecture. We also provide a numerical procedure that allows an exact check of convexity for ReLU neural networks with a large number of affine regions.
arXiv Detail & Related papers (2025-01-06T13:53:59Z)
Enhancing GNNs Performance on Combinatorial Optimization by Recurrent Feature Update [0.09986418756990156]
We introduce a novel algorithm, denoted hereafter as QRF-GNN, leveraging the power of GNNs to efficiently solve combinatorial optimization (CO) problems. It relies on unsupervised learning by minimizing the loss function derived from QUBO relaxation. Results of experiments show that QRF-GNN drastically surpasses existing learning-based approaches and is comparable to the state-of-the-art conventionals.
arXiv Detail & Related papers (2024-07-23T13:34:35Z)
Nonlinear functional regression by functional deep neural network with kernel embedding [20.306390874610635]
We propose a functional deep neural network with an efficient and fully data-dependent dimension reduction method. The architecture of our functional net consists of a kernel embedding step, a projection step, and a deep ReLU neural network for the prediction. The utilization of smooth kernel embedding enables our functional net to be discretization invariant, efficient, and robust to noisy observations.
arXiv Detail & Related papers (2024-01-05T16:43:39Z)
The limitation of neural nets for approximation and optimization [0.0]
We are interested in assessing the use of neural networks as surrogate models to approximate and minimize objective functions in optimization problems. Our study begins by determining the best activation function for approximating the objective functions of popular nonlinear optimization test problems.
arXiv Detail & Related papers (2023-11-21T00:21:15Z)
Linearization of ReLU Activation Function for Neural Network-Embedded Optimization: Optimal Day-Ahead Energy Scheduling [0.2900810893770134]
In some applications such as battery degradation neural network-based microgrid day-ahead energy scheduling, the input features of the trained learning model are variables to be solved in optimization models. The use of nonlinear activation functions in the neural network will make such problems extremely hard to solve if not unsolvable. This paper investigated different methods for linearizing the nonlinear activation functions with a particular focus on the widely used rectified linear unit (ReLU) function.
arXiv Detail & Related papers (2023-10-03T02:47:38Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task. Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks. We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which largely increases the capability of approximating functions. Specifically, we introduce a bottleneck encoder that produces fewer and informative context tokens, relieving the high computational cost. We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z)
Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data. We propose Graph-adaptive Rectified Linear Unit (GReLU) which is a new parametric activation function incorporating the neighborhood information in a novel and efficient way. We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
arXiv Detail & Related papers (2022-02-13T10:54:59Z)
Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent. We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network. We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions to achieve comparable error bounds both in theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks [23.038631072178735]
We consider a broad class of optimization algorithms that are commonly used in practice. As a consequence, we can leverage the convergence behavior of neural networks. We believe our approach can also be extended to other optimization algorithms and network theory.
arXiv Detail & Related papers (2020-10-25T17:10:22Z)
Channel Assignment in Uplink Wireless Communication using Machine Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems. Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints. Due to high computational complexity, machine learning approaches are employed to obtain computationally efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.