Adaptive wavelet distillation from neural networks through
interpretations
- URL: http://arxiv.org/abs/2107.09145v1
- Date: Mon, 19 Jul 2021 20:40:35 GMT
- Title: Adaptive wavelet distillation from neural networks through
interpretations
- Authors: Wooseok Ha, Chandan Singh, Francois Lanusse, Eli Song, Song Dang,
Kangmin He, Srigokul Upadhyayula, Bin Yu
- Abstract summary: Interpretability is crucial in many disciplines, such as science and medicine, where models must be carefully vetted.
We propose adaptive wavelet distillation (AWD), a method which aims to distill information from a trained neural network into a wavelet transform.
We showcase how AWD addresses challenges in two real-world settings: cosmological parameter inference and molecular-partner prediction.
- Score: 10.923598153317567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep-learning models have achieved impressive prediction performance,
but often sacrifice interpretability and computational efficiency.
Interpretability is crucial in many disciplines, such as science and medicine,
where models must be carefully vetted or where interpretation is the goal
itself. Moreover, interpretable models are concise and often yield
computational efficiency. Here, we propose adaptive wavelet distillation (AWD),
a method which aims to distill information from a trained neural network into a
wavelet transform. Specifically, AWD penalizes feature attributions of a neural
network in the wavelet domain to learn an effective multi-resolution wavelet
transform. The resulting model is highly predictive, concise, computationally
efficient, and has properties (such as a multi-scale structure) which make it
easy to interpret. In close collaboration with domain experts, we showcase how
AWD addresses challenges in two real-world settings: cosmological parameter
inference and molecular-partner prediction. In both cases, AWD yields a
scientifically interpretable and concise model which gives predictive
performance better than state-of-the-art neural networks. Moreover, AWD
identifies predictive features that are scientifically meaningful in the
context of respective domains. All code and models are released in a
full-fledged package available on Github
(https://github.com/Yu-Group/adaptive-wavelets).
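To make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of the general idea: treat the wavelet filter as learnable and fit it with a loss that combines reconstruction error, sparsity of the wavelet coefficients, and a penalty on the trained network's feature attributions in the wavelet domain. This is a sketch under assumptions, not the released adaptive-wavelets API: the single-level transform, the gradient-times-input attribution, the loss weights, and all names below are illustrative.

    import torch
    import torch.nn.functional as F

    class LearnableWavelet1d(torch.nn.Module):
        """Single-level analysis/synthesis transform with a learnable low-pass filter."""

        def __init__(self, filter_len=8):
            super().__init__()
            h0 = torch.zeros(filter_len)
            h0[:2] = 2 ** -0.5            # start near a Haar-like low-pass filter (assumption)
            self.h = torch.nn.Parameter(h0)

        def filters(self):
            signs = torch.ones_like(self.h)
            signs[1::2] = -1.0
            g = self.h.flip(0) * signs    # quadrature-mirror high-pass filter
            return self.h.view(1, 1, -1), g.view(1, 1, -1)

        def forward(self, x):             # x: (batch, 1, length), length even
            h, g = self.filters()
            pad = h.shape[-1] // 2
            lo = F.conv1d(x, h, stride=2, padding=pad)
            hi = F.conv1d(x, g, stride=2, padding=pad)
            return lo, hi

        def inverse(self, lo, hi, length):
            h, g = self.filters()
            pad = h.shape[-1] // 2
            rec = (F.conv_transpose1d(lo, h, stride=2, padding=pad)
                   + F.conv_transpose1d(hi, g, stride=2, padding=pad))
            return rec[..., :length]

    def awd_loss(wavelet, frozen_net, x, lam_attr=1.0, lam_sparse=0.1):
        """Reconstruction + coefficient sparsity + wavelet-domain attribution penalty."""
        lo, hi = wavelet(x)
        x_rec = wavelet.inverse(lo, hi, x.shape[-1])
        recon = F.mse_loss(x_rec, x)
        sparsity = lo.abs().mean() + hi.abs().mean()
        # Attribution penalty: gradient-times-input of the frozen network's output
        # with respect to the wavelet coefficients (a saliency score in the wavelet domain).
        pred = frozen_net(x_rec).sum()
        grads = torch.autograd.grad(pred, [lo, hi], create_graph=True)
        attr = sum((g * c).abs().mean() for g, c in zip(grads, (lo, hi)))
        return recon + lam_sparse * sparsity + lam_attr * attr

    if __name__ == "__main__":
        # Stand-in for a trained network; in practice this would be the model to distill.
        net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(256, 1))
        for p in net.parameters():
            p.requires_grad_(False)       # the distilled network stays fixed
        wavelet = LearnableWavelet1d()
        opt = torch.optim.Adam(wavelet.parameters(), lr=1e-3)
        x = torch.randn(16, 1, 256)
        for _ in range(5):
            opt.zero_grad()
            awd_loss(wavelet, net, x).backward()
            opt.step()

The intent of the attribution term, as the abstract describes it, is to concentrate the trained network's predictive information in a small number of wavelet coefficients, yielding a concise and interpretable transform.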
Related papers
- WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference.
Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner.
We introduce WiNet, a model-driven network that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z)
- An Innovative Networks in Federated Learning [3.38220960870904]
This paper presents the development and application of Wavelet Kolmogorov-Arnold Networks (Wav-KAN) in federated learning.
We have considered both the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT) to enable multiresolution capability.
Extensive experiments were conducted on different datasets, demonstrating Wav-KAN's superior performance in terms of interpretability, computational speed, and training and test accuracy.
arXiv Detail & Related papers (2024-05-28T05:20:01Z)
- Wav-KAN: Wavelet Kolmogorov-Arnold Networks [3.38220960870904]
Wav-KAN is an innovative neural network architecture that builds wavelet functions into the Kolmogorov-Arnold Network (KAN) framework to enhance interpretability and performance.
Our results highlight the potential of Wav-KAN as a powerful tool for developing interpretable and high-performance neural networks.
(A minimal, illustrative sketch of a wavelet-parameterized KAN-style layer appears after this list.)
arXiv Detail & Related papers (2024-05-21T14:36:16Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions are often assumed to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Improved Protein-ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference [3.761791311908692]
Predicting accurate protein-ligand binding affinity is important in drug discovery.
Despite recent advances in deep convolutional and graph neural network based approaches, model performance still depends on the input data representation.
We present fusion models to benefit from different feature representations of two neural network models to improve the binding affinity prediction.
arXiv Detail & Related papers (2020-05-17T22:26:27Z)
- Adaptive Explainable Neural Networks (AxNNs) [8.949704905866888]
We develop a new framework called Adaptive Explainable Neural Networks (AxNN) for achieving the dual goals of good predictive performance and model interpretability.
For predictive performance, we build a structured neural network made up of ensembles of generalized additive model networks and additive index models.
For interpretability, we show how to decompose the results of AxNN into main effects and higher-order interaction effects.
arXiv Detail & Related papers (2020-04-05T23:40:57Z)
- GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions [5.8010446129208155]
An explainable neural network based on generalized additive models with structured interactions (GAMI-Net) is proposed to pursue a good balance between prediction accuracy and model interpretability.
GAMI-Net is a disentangled feedforward network with multiple additive subnetworks.
Numerical experiments on both synthetic functions and real-world datasets show that the proposed model enjoys superior interpretability.
arXiv Detail & Related papers (2020-03-16T11:51:38Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
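As referenced in the Wav-KAN entry above, here is a minimal illustrative sketch of a wavelet-parameterized, KAN-style layer in PyTorch: each input-output edge applies a weighted Mexican-hat wavelet with a learnable scale and shift. The mother-wavelet choice, the per-edge parameterization, and the initialization are assumptions for illustration, not the papers' exact formulation or released code.

    import torch

    class WaveletKANLayer(torch.nn.Module):
        """KAN-style layer: each input-output edge is a weighted Mexican-hat wavelet."""

        def __init__(self, in_features, out_features):
            super().__init__()
            self.scale = torch.nn.Parameter(torch.ones(out_features, in_features))
            self.shift = torch.nn.Parameter(torch.zeros(out_features, in_features))
            self.weight = torch.nn.Parameter(
                torch.randn(out_features, in_features) / in_features ** 0.5)

        @staticmethod
        def mexican_hat(z):
            # psi(z) = (1 - z^2) exp(-z^2 / 2), up to a normalizing constant
            return (1.0 - z ** 2) * torch.exp(-0.5 * z ** 2)

        def forward(self, x):
            # x: (batch, in_features); broadcast to (batch, out_features, in_features).
            # The small epsilon keeps the learnable scale away from zero.
            z = (x.unsqueeze(1) - self.shift) / (self.scale.abs() + 1e-3)
            return (self.weight * self.mexican_hat(z)).sum(dim=-1)

    # Purely illustrative usage: a two-layer stack for a toy regression.
    model = torch.nn.Sequential(WaveletKANLayer(4, 16), WaveletKANLayer(16, 1))
    out = model(torch.randn(8, 4))    # -> shape (8, 1)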