Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
- URL: http://arxiv.org/abs/2504.14356v1
- Date: Sat, 19 Apr 2025 16:50:17 GMT
- Title: Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
- Authors: Masoud Ataei, Edrin Hasaj, Jacob Gipp, Sepideh Forouzi
- Abstract summary: This paper presents a unified mixed-integer programming framework for training sparse and interpretable neural networks. By incorporating considerations of interpretability, sparsity, and verifiability directly into the training process, the proposed framework bridges a range of research areas.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a unified mixed-integer programming framework for training sparse and interpretable neural networks. We develop exact formulations for both fully connected and convolutional architectures by modeling nonlinearities such as ReLU activations through binary variables and encoding structural sparsity via filter- and layer-level pruning constraints. The resulting models integrate parameter learning, architecture selection, and structural regularization within a single optimization problem, yielding globally optimal solutions with respect to a composite objective that balances prediction accuracy, weight sparsity, and architectural compactness. The mixed-integer programming formulation accommodates piecewise-linear operations, including max pooling and activation gating, and permits precise enforcement of logic-based or domain-specific constraints. By incorporating considerations of interpretability, sparsity, and verifiability directly into the training process, the proposed framework bridges a range of research areas including explainable artificial intelligence, symbolic reasoning, and formal verification.
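To make the encoding concrete, the sketch below writes a single ReLU layer and its weight-sparsity indicators as a mixed-integer program in PuLP, in the spirit of the abstract. It is a minimal toy, not the authors' formulation: the big-M bound, the single training sample, the L1 tracking loss, and the sparsity weight `lam` are all assumptions made for illustration.
```python
# Hedged sketch: big-M mixed-integer encoding of one ReLU layer with
# weight-sparsity indicators (not the paper's exact model).
import pulp

n_in, n_out = 4, 3          # toy layer sizes (illustrative)
M = 10.0                    # assumed big-M bound on weights and pre-activations
X = [1.0, -0.5, 2.0, 0.3]   # one fixed input sample (illustrative)
y_target = [1.0, 0.0, 0.5]  # its target output (illustrative)

prob = pulp.LpProblem("relu_layer_mip", pulp.LpMinimize)

# Trainable weights and biases as continuous decision variables.
W = [[pulp.LpVariable(f"w_{i}_{j}", -M, M) for i in range(n_in)]
     for j in range(n_out)]
b = [pulp.LpVariable(f"b_{j}", -M, M) for j in range(n_out)]

# Binary gates z_j select the active side of each ReLU kink.
z = [pulp.LpVariable(f"z_{j}", cat=pulp.LpBinary) for j in range(n_out)]
a = [pulp.LpVariable(f"a_{j}", 0, M) for j in range(n_out)]  # post-activation

# Binary indicators s_ij encode weight-level sparsity (|w| <= M * s).
s = [[pulp.LpVariable(f"s_{i}_{j}", cat=pulp.LpBinary) for i in range(n_in)]
     for j in range(n_out)]

for j in range(n_out):
    pre = pulp.lpSum(W[j][i] * X[i] for i in range(n_in)) + b[j]
    # Exact ReLU a = max(0, pre): z=1 forces a = pre >= 0, z=0 forces a = 0 >= pre.
    prob += a[j] >= pre
    prob += a[j] <= pre + M * (1 - z[j])
    prob += a[j] <= M * z[j]
    for i in range(n_in):
        prob += W[j][i] <= M * s[j][i]
        prob += W[j][i] >= -M * s[j][i]

# L1 tracking loss via auxiliary variables, since abs() is not linear.
e = [pulp.LpVariable(f"e_{j}", 0) for j in range(n_out)]
for j in range(n_out):
    prob += e[j] >= a[j] - y_target[j]
    prob += e[j] >= y_target[j] - a[j]

lam = 0.1  # sparsity weight in the composite objective (assumed hyperparameter)
prob += pulp.lpSum(e) + lam * pulp.lpSum(s[j][i] for j in range(n_out)
                                         for i in range(n_in))
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("status:", pulp.LpStatus[prob.status])
```
Solving such a model with an off-the-shelf solver is only practical at toy scale, but it shows the key property the abstract claims: the returned solution is globally optimal with respect to the composite accuracy-plus-sparsity objective.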
Related papers
- Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach
We propose a bi-level solution framework balancing optimality with computational efficiency.
Our framework achieves comparable or superior performance, reducing computational bubbles by half under the same memory constraints.
Such capabilities position our solution as both a valuable research tool for exploring optimal parallelization strategies and a practical industrial solution for large-scale AI deployment.
arXiv Detail & Related papers (2025-03-12T13:00:29Z)
- Context-Aware Neural Gradient Mapping for Fine-Grained Instruction Processing
This paper introduces a dynamic gradient adjustment mechanism, incorporating contextual embeddings directly into the optimization process.
The proposed framework consistently outperforms baseline models across various metrics, including accuracy, robustness to noise, and computational efficiency.
The integration of context-specific embeddings allows for a more nuanced understanding of language, improving the model's ability to handle diverse linguistic phenomena.
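One plausible reading of "dynamic gradient adjustment" is sketched below: a gate derived from a context embedding rescales the raw gradient before the parameter update. The gating form, sizes, and names are entirely our assumptions, not the paper's mechanism.
```python
# Hedged toy sketch of context-modulated gradient updates (assumed form).
import numpy as np

rng = np.random.default_rng(0)
params = rng.normal(size=16)
grad = rng.normal(size=16)             # raw task gradient (stand-in)
context = rng.normal(size=8)           # contextual embedding of the input
G = rng.normal(scale=0.1, size=(8, 16))

gate = 1.0 / (1.0 + np.exp(-(context @ G)))  # per-parameter sigmoid gate
params -= 0.01 * gate * grad                 # context-modulated update
```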
arXiv Detail & Related papers (2025-01-24T21:49:24Z)
- A domain decomposition-based autoregressive deep learning model for unsteady and nonlinear partial differential equations
We propose a domain-decomposition-based deep learning (DL) framework, named CoMLSim, for accurately modeling unsteady and nonlinear partial differential equations (PDEs).
The framework consists of two key components: (a) a convolutional neural network (CNN)-based autoencoder architecture and (b) an autoregressive model composed of fully connected layers.
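A rough sketch of how these two components could interact is below: each subdomain's field is encoded to a latent code, latents are advanced in time autoregressively with neighbor coupling, then decoded. All shapes, the stand-in encoder/decoder, and the neighbor-averaging coupling are our assumptions, not CoMLSim's architecture.
```python
# Hedged sketch: per-subdomain latents advanced in time by a small network.
import numpy as np

rng = np.random.default_rng(0)

def encode(u):                 # stand-in for the CNN encoder
    return u.reshape(-1)[:8]   # toy "encoding": first 8 field values

def decode(z, shape):          # stand-in for the CNN decoder
    out = np.zeros(np.prod(shape)); out[:8] = z
    return out.reshape(shape)

W = rng.normal(scale=0.1, size=(16, 8))  # toy autoregressive layer weights

def step_latent(z, z_neighbors):
    # Advance one subdomain's latent, conditioned on neighboring latents
    # (coupling via a neighbor average is an assumption, not the paper's scheme).
    ctx = np.mean(z_neighbors, axis=0)
    return np.tanh(np.concatenate([z, ctx]) @ W)

# Two subdomains of a toy field, advanced for 3 time steps.
subdomains = [rng.normal(size=(4, 4)) for _ in range(2)]
latents = [encode(u) for u in subdomains]
for _ in range(3):
    latents = [step_latent(latents[i], [latents[1 - i]]) for i in range(2)]
fields = [decode(z, (4, 4)) for z in latents]
```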
arXiv Detail & Related papers (2024-08-26T17:50:47Z)
- A Neural Rewriting System to Solve Algorithmic Problems
We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas.
Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, we include in the architecture three specialized and interacting modules.
We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies.
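The classic rewriting-system control loop the summary alludes to can be illustrated symbolically: repeatedly select an innermost sub-formula, solve it, and substitute the result back. The three functions below mirror the paper's specialized modules only loosely; in the actual system each step is learned, whereas here they are hand-coded stand-ins.
```python
# Hedged sketch of a leftmost-innermost rewriting loop on nested arithmetic.
import re

def select(expr):
    # Find an innermost parenthesized sub-formula (no nested parentheses).
    m = re.search(r"\(([^()]+)\)", expr)
    return m.span() if m else None

def solve(sub):
    return str(eval(sub))  # stand-in for the learned solver module

def rewrite(expr):
    while (span := select(expr)) is not None:
        i, j = span
        expr = expr[:i] + solve(expr[i + 1:j - 1]) + expr[j:]
    return str(eval(expr))

print(rewrite("(1+2)*((3+4)-5)"))  # -> "6"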
arXiv Detail & Related papers (2024-02-27T10:57:07Z)
- SymbolicAI: A framework for logic-based approaches combining generative models and solvers
We introduce SymbolicAI, a versatile and modular framework employing a logic-based approach to concept learning and flow management in generative processes.
We treat large language models (LLMs) as semantic solvers that execute tasks based on both natural and formal language instructions.
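The "LLM as semantic solver" pattern can be sketched as a symbolic value that defers operations it cannot evaluate to a model call. The `llm` stub and the prompt format below are stand-ins for illustration, not SymbolicAI's actual API.
```python
# Hedged sketch: a symbolic wrapper delegating instructions to an LLM callable.
def llm(prompt: str) -> str:
    return "ripe"  # stand-in for a real model call

class Symbol:
    def __init__(self, value):
        self.value = value

    def query(self, instruction: str) -> "Symbol":
        # Delegate the instruction plus context to the semantic solver.
        return Symbol(llm(f"{instruction}\nContext: {self.value}"))

print(Symbol("The banana is yellow-brown.").query("Is the fruit ripe?").value)
```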
arXiv Detail & Related papers (2024-02-01T18:50:50Z)
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
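A toy version of the reward-decomposition idea is sketched below: a per-output feedback signal is shared among connections in proportion to their contribution, and each connection is then reinforced or weakened accordingly. The propagation rule shown is our simplification, not LFP's exact rule.
```python
# Hedged toy sketch of contribution-proportional reward decomposition.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))           # one linear layer, 4 inputs -> 3 outputs
x = rng.normal(size=4)
y = W @ x                             # forward pass (no nonlinearity, for brevity)
reward = np.array([1.0, -0.5, 0.2])   # per-output feedback signal (assumed given)

contrib = W * x                       # contribution of each connection to y
share = contrib / (np.abs(contrib).sum(axis=1, keepdims=True) + 1e-8)
W += 0.1 * reward[:, None] * share    # reinforce helpful parts, weaken harmful ones
```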
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
- Annealing Optimization for Progressive Learning with Stochastic Approximation
We introduce a learning model designed to meet the needs of applications in which computational resources are limited.
We develop an online prototype-based learning algorithm that is formulated as an online gradient-free stochastic approximation algorithm.
The learning model can be viewed as an interpretable and progressively growing competitive neural network model to be used for supervised, unsupervised, and reinforcement learning.
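A minimal sketch of online prototype learning with an annealed temperature is shown below; the soft Gibbs assignment and geometric cooling schedule are standard choices made for illustration, not the paper's exact algorithm.
```python
# Hedged sketch: annealed, prototype-based competitive learning on a stream.
import numpy as np

rng = np.random.default_rng(0)
prototypes = rng.normal(size=(4, 2))   # 4 prototypes in 2-D
T, cooling = 1.0, 0.99                 # temperature and annealing factor

for t in range(1, 501):
    x = rng.normal(size=2)             # one streaming observation
    d2 = ((prototypes - x) ** 2).sum(axis=1)
    p = np.exp(-d2 / T); p /= p.sum()  # soft (Gibbs) assignment
    lr = 1.0 / t                       # stochastic-approximation step size
    prototypes += lr * p[:, None] * (x - prototypes)
    T *= cooling                       # progressively sharpen assignments
```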
arXiv Detail & Related papers (2022-09-06T21:31:01Z)
- Neural Combinatorial Optimization: a New Player in the Field
This paper presents a critical analysis on the incorporation of algorithms based on neural networks into the classical optimization framework.
A comprehensive study is carried out to analyse the fundamental aspects of such algorithms, including performance, transferability, computational cost, and generalization to larger-sized instances.
arXiv Detail & Related papers (2022-05-03T07:54:56Z)
- Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as structure prior and reveal the underlying signal interdependencies.
Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
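Deep unrolling is easy to illustrate on plain sparse coding: a fixed number of ISTA iterations are written out as network "layers" whose matrices could then be learned. The sketch below shows vanilla ISTA with a known dictionary; the paper's block-sparse and deep-equilibrium variants go well beyond this.
```python
# Hedged sketch: ISTA iterations as unrolled layers for sparse coding.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
D = rng.normal(size=(16, 32)); D /= np.linalg.norm(D, axis=0)  # dictionary
y = D @ (rng.random(32) * (rng.random(32) < 0.1))              # sparse signal

L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the data-fit gradient
tau = 0.05                         # sparsity penalty (assumed)
x = np.zeros(32)
for _ in range(10):                # 10 unrolled "layers"
    x = soft_threshold(x + D.T @ (y - D @ x) / L, tau / L)
```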
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
- Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods rarely translate into real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
- Edge-assisted Democratized Learning Towards Federated Analytics
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism that builds a distributed control and aggregation methodology across regions.
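The hierarchical aggregation pattern can be sketched as two-level federated averaging: clients average within a region at the edge, and regional models average globally. Plain FedAvg is shown; Edge-DemLearn's self-organizing hierarchy is richer than this.
```python
# Hedged sketch: two-level (edge, then cloud) federated averaging.
import numpy as np

def fedavg(models):
    return np.mean(models, axis=0)

rng = np.random.default_rng(0)
regions = [[rng.normal(size=5) for _ in range(3)]   # 2 regions x 3 client models
           for _ in range(2)]
region_models = [fedavg(clients) for clients in regions]  # edge aggregation
global_model = fedavg(region_models)                      # cloud aggregation
```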
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
- Local Propagation in Constraint-based Neural Network
We study a constraint-based representation of neural network architectures.
We investigate a simple optimization procedure that is well suited to fulfil the so-called architectural constraints.
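In the constraint-based view, per-layer activations become free variables tied to the architecture by constraints of the form x_l - f(W_l x_{l-1}) = 0, optimized by local updates. The sketch below relaxes the constraints with a quadratic penalty; the paper works with Lagrange multipliers, so the penalty form is our simplification.
```python
# Hedged sketch: penalty-based local propagation for a two-layer network.
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(size=4)                            # input
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(1, 3))
x1, x2 = rng.normal(size=3), rng.normal(size=1)    # free "activation" variables
target, rho, lr = np.array([1.0]), 10.0, 0.01

for _ in range(200):
    r1 = x1 - np.tanh(W1 @ x0)                     # constraint residuals
    r2 = x2 - np.tanh(W2 @ x1)
    # Local updates: each variable only sees its neighboring constraints.
    x1 -= lr * (rho * r1 - rho * (W2.T @ (r2 * (1 - np.tanh(W2 @ x1) ** 2))))
    x2 -= lr * (rho * r2 + (x2 - target))
    W1 -= lr * (-rho * np.outer(r1 * (1 - np.tanh(W1 @ x0) ** 2), x0))
    W2 -= lr * (-rho * np.outer(r2 * (1 - np.tanh(W2 @ x1) ** 2), x1))
```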
arXiv Detail & Related papers (2020-02-18T16:47:38Z)
- Semi-Structured Distributional Regression -- Extending Structured Additive Models by Arbitrary Deep Neural Networks and Data Modalities
We propose a general framework to combine structured regression models and deep neural networks into a unifying network architecture.
We demonstrate the framework's efficacy in numerical experiments and illustrate its special merits in benchmarks and real-world applications.
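The semi-structured idea can be sketched in a few lines: each distribution parameter is predicted by the sum of an interpretable structured (linear/additive) part and an unconstrained deep part. The toy Gaussian head and single hidden layer below are our assumptions, not the paper's full architecture.
```python
# Hedged sketch: structured linear predictor + deep predictor, summed per
# distribution parameter of a Gaussian response.
import numpy as np

rng = np.random.default_rng(0)
X_struct = rng.normal(size=(32, 2))    # tabular features for the linear part
X_deep = rng.normal(size=(32, 8))      # arbitrary modality for the deep part

beta = rng.normal(size=2)              # interpretable coefficients
H = np.tanh(X_deep @ rng.normal(size=(8, 4)))   # one hidden layer (toy)
gamma = rng.normal(size=4)

eta_mu = X_struct @ beta + H @ gamma   # semi-structured predictor for the mean
eta_sigma = np.full(32, -1.0)          # constant log-scale (simplification)
mu, sigma = eta_mu, np.exp(eta_sigma)  # Gaussian parameters per observation
```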
arXiv Detail & Related papers (2020-02-13T21:01:26Z)