Related papers: Efficient and Interpretable Neural Networks Using Complex Lehmer Transform

Efficient and Interpretable Neural Networks Using Complex Lehmer Transform

URL: http://arxiv.org/abs/2501.15223v1
Date: Sat, 25 Jan 2025 14:08:30 GMT
Title: Efficient and Interpretable Neural Networks Using Complex Lehmer Transform
Authors: Masoud Ataei, Xiaogang Wang,
Abstract summary: We propose an efficient and interpretable neural network with a novel activation function called the weighted Lehmer transform.<n>We analyze the mathematical properties of both real-valued and complex-valued Lehmer activation units.<n> Empirical evaluations demonstrate that our proposed neural network achieves competitive accuracy on benchmark datasets.
Score: 11.095723123836965
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose an efficient and interpretable neural network with a novel activation function called the weighted Lehmer transform. This new activation function enables adaptive feature selection and extends to the complex domain, capturing phase-sensitive and hierarchical relationships within data. Notably, it provides greater interpretability and transparency compared to existing machine learning models, facilitating a deeper understanding of its functionality and decision-making processes. We analyze the mathematical properties of both real-valued and complex-valued Lehmer activation units and demonstrate their applications in modeling nonlinear interactions. Empirical evaluations demonstrate that our proposed neural network achieves competitive accuracy on benchmark datasets with significantly improved computational efficiency. A single layer of real-valued or complex-valued Lehmer activation units is shown to deliver state-of-the-art performance, balancing efficiency with interpretability.

Related papers

Kolmogorov Arnold Networks and Multi-Layer Perceptrons: A Paradigm Shift in Neural Modelling [1.6998720690708842]
The research undertakes a comprehensive comparative analysis of Kolmogorov-Arnold Networks (KAN) and Multi-Layer Perceptrons (MLP)<n>KANs utilize spline-based activation functions and grid-based structures, providing a transformative approach compared to traditional neural network frameworks.<n>The proposed study highlights the transformative capabilities of KANs in progressing intelligent systems.
arXiv Detail & Related papers (2026-01-15T16:26:49Z)
High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [51.90920900332569]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data.<n>Recent approaches address this by introducing additional features along rigid geometric structures.<n>We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR)
arXiv Detail & Related papers (2025-06-07T16:45:17Z)
LeMoRe: Learn More Details for Lightweight Semantic Segmentation [48.81126061219231]
We introduce an efficient paradigm by synergizing explicit and implicit modeling to balance computational efficiency with representational fidelity.<n>Our method combines well-defined Cartesian directions with explicitly modeled views and implicitly inferred intermediate representations, efficiently capturing global dependencies.
arXiv Detail & Related papers (2025-05-29T04:55:10Z)
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions [0.0]
We propose a novel extension to symmetrized neural network operators by incorporating fractional and mixed activation functions.<n>Our framework introduces a fractional exponent in the activation functions, allowing adaptive non-linear approximations with improved accuracy.
arXiv Detail & Related papers (2025-01-17T14:24:25Z)
A Survey on Kolmogorov-Arnold Network [0.0]
Review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KAN) KANs distinguish themselves from traditional neural networks by using learnable, spline- parameterized functions instead of fixed activation functions. This paper highlights KAN's role in modern neural architectures and outlines future directions to improve its computational efficiency, interpretability, and scalability in data-intensive applications.
arXiv Detail & Related papers (2024-11-09T05:54:17Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks [0.0]
Local Interaction Basis aims to identify computational features by removing irrelevant activations and interactions. We evaluate the effectiveness of LIB on modular addition and CIFAR-10 models. We conclude that LIB is a promising theory-driven approach for analyzing neural networks, but in its current form is not applicable to large language models.
arXiv Detail & Related papers (2024-05-17T17:27:19Z)
Peridynamic Neural Operators: A Data-Driven Nonlocal Constitutive Model for Complex Material Responses [12.454290779121383]
We introduce a novel integral neural operator architecture called the Peridynamic Neural Operator (PNO) that learns a nonlocal law from data. This neural operator provides a forward model in the form of state-based peridynamics, with objectivity and momentum balance laws automatically guaranteed. We show that, owing to its ability to capture complex responses, our learned neural operator achieves improved accuracy and efficiency compared to baseline models.
arXiv Detail & Related papers (2024-01-11T17:37:20Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task. Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
Self-attention mechanism (SAM) is widely used in various fields of artificial intelligence. We show that intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NN) We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z)
Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning. We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle. In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models. We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
Efficient Non-Local Contrastive Attention for Image Super-Resolution [48.093500219958834]
Non-Local Attention (NLA) brings significant improvement for Single Image Super-Resolution (SISR) by leveraging intrinsic feature correlation in natural images. We propose a novel Efficient Non-Local Contrastive Attention (ENLCA) to perform long-range visual modeling and leverage more relevant non-local features.
arXiv Detail & Related papers (2022-01-11T05:59:09Z)
Structure-Regularized Attention for Deformable Object Representation [17.120035855774344]
Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks. Recent approaches that focus on modeling global context, such as self-attention and non-local operation, achieve this goal by enabling unconstrained pairwise interactions between elements. We consider learning representations for deformable objects which can benefit from context exploitation by modeling the structural dependencies that the data intrinsically possesses.
arXiv Detail & Related papers (2021-06-12T03:10:17Z)
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference. Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.