Related papers: CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions

CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions

URL: http://arxiv.org/abs/2506.05586v1
Date: Thu, 05 Jun 2025 21:01:06 GMT
Title: CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions
Authors: Isha Puri, Amit Dhurandhar, Tejaswini Pedapati, Kartikeyan Shanmugam, Dennis Wei, Kush R. Varshney,
Abstract summary: We present a novel neural architecture, CoFrNet, inspired by the form of continued fractions.<n>We show that CoFrNets can be efficiently trained as well as interpreted leveraging their particular functional form.
Score: 33.582840818840594
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years there has been a considerable amount of research on local post hoc explanations for neural networks. However, work on building interpretable neural architectures has been relatively sparse. In this paper, we present a novel neural architecture, CoFrNet, inspired by the form of continued fractions which are known to have many attractive properties in number theory, such as fast convergence of approximations to real numbers. We show that CoFrNets can be efficiently trained as well as interpreted leveraging their particular functional form. Moreover, we prove that such architectures are universal approximators based on a proof strategy that is different than the typical strategy used to prove universal approximation results for neural networks based on infinite width (or depth), which is likely to be of independent interest. We experiment on nonlinear synthetic functions and are able to accurately model as well as estimate feature attributions and even higher order terms in some cases, which is a testament to the representational power as well as interpretability of such architectures. To further showcase the power of CoFrNets, we experiment on seven real datasets spanning tabular, text and image modalities, and show that they are either comparable or significantly better than other interpretable models and multilayer perceptrons, sometimes approaching the accuracies of state-of-the-art models.

Related papers

When Representations Align: Universality in Representation Learning Dynamics [8.188549368578704]
We derive an effective theory of representation learning under the assumption that the encoding map from input to hidden representation and the decoding map from representation to output are arbitrary smooth functions. We show through experiments that the effective theory describes aspects of representation learning dynamics across a range of deep networks with different activation functions and architectures.
arXiv Detail & Related papers (2024-02-14T12:48:17Z)
Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at Scale [54.15522908057831]
We propose an adapted version of the computationally-Mixer for STTD forecast at scale. Our results surprisingly show that this simple-yeteffective solution can rival SOTA baselines when tested on several traffic benchmarks. Our findings contribute to the exploration of simple-yet-effective models for real-world STTD forecasting.
arXiv Detail & Related papers (2023-07-04T05:19:19Z)
Weisfeiler and Leman Go Relational [4.29881872550313]
We investigate the limitations in the expressive power of the well-known GCN and Composition GCN architectures. We introduce the $k$-RN architecture that provably overcomes the limitations of the above two architectures.
arXiv Detail & Related papers (2022-11-30T15:56:46Z)
On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores which we name Compositional Network generalization (CoRelNet) We find that simple architectural choices can outperform existing models in out-of-distribution generalizations.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training. We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design. Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules. inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
Creating Powerful and Interpretable Models withRegression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis. We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z)
Polynomial Networks in Deep Classifiers [55.90321402256631]
We cast the study of deep neural networks under a unifying framework. Our framework provides insights on the inductive biases of each model. The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks.
arXiv Detail & Related papers (2021-04-16T06:41:20Z)
Reframing Neural Networks: Deep Structure in Overcomplete Representations [41.84502123663809]
We introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames. We quantify structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design.
arXiv Detail & Related papers (2021-03-10T01:15:14Z)
A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures. A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.