Adaptive wavelet distillation from neural networks through
interpretations
- URL: http://arxiv.org/abs/2107.09145v1
- Date: Mon, 19 Jul 2021 20:40:35 GMT
- Title: Adaptive wavelet distillation from neural networks through
interpretations
- Authors: Wooseok Ha, Chandan Singh, Francois Lanusse, Eli Song, Song Dang,
Kangmin He, Srigokul Upadhyayula, Bin Yu
- Abstract summary: Interpretability is crucial in many disciplines, such as science and medicine, where models must be carefully vetted.
We propose adaptive wavelet distillation (AWD), a method which aims to distill information from a trained neural network into a wavelet transform.
We showcase how AWD addresses challenges in two real-world settings: cosmological parameter inference and molecular-partner prediction.
- Score: 10.923598153317567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep-learning models have achieved impressive prediction performance,
but often sacrifice interpretability and computational efficiency.
Interpretability is crucial in many disciplines, such as science and medicine,
where models must be carefully vetted or where interpretation is the goal
itself. Moreover, interpretable models are concise and often yield
computational efficiency. Here, we propose adaptive wavelet distillation (AWD),
a method which aims to distill information from a trained neural network into a
wavelet transform. Specifically, AWD penalizes feature attributions of a neural
network in the wavelet domain to learn an effective multi-resolution wavelet
transform. The resulting model is highly predictive, concise, computationally
efficient, and has properties (such as a multi-scale structure) which make it
easy to interpret. In close collaboration with domain experts, we showcase how
AWD addresses challenges in two real-world settings: cosmological parameter
inference and molecular-partner prediction. In both cases, AWD yields a
scientifically interpretable and concise model which gives predictive
performance better than state-of-the-art neural networks. Moreover, AWD
identifies predictive features that are scientifically meaningful in the
context of respective domains. All code and models are released in a
full-fledged package available on Github
(https://github.com/Yu-Group/adaptive-wavelets).
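To make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of the general idea: treat the wavelet filter as learnable and fit it with a loss that combines reconstruction error, sparsity of the wavelet coefficients, and a penalty on the trained network's feature attributions in the wavelet domain. This is a sketch under assumptions, not the released adaptive-wavelets API: the single-level transform, the gradient-times-input attribution, the loss weights, and all names below are illustrative.

    import torch
    import torch.nn.functional as F

    class LearnableWavelet1d(torch.nn.Module):
        """Single-level analysis/synthesis transform with a learnable low-pass filter."""

        def __init__(self, filter_len=8):
            super().__init__()
            h0 = torch.zeros(filter_len)
            h0[:2] = 2 ** -0.5            # start near a Haar-like low-pass filter (assumption)
            self.h = torch.nn.Parameter(h0)

        def filters(self):
            signs = torch.ones_like(self.h)
            signs[1::2] = -1.0
            g = self.h.flip(0) * signs    # quadrature-mirror high-pass filter
            return self.h.view(1, 1, -1), g.view(1, 1, -1)

        def forward(self, x):             # x: (batch, 1, length), length even
            h, g = self.filters()
            pad = h.shape[-1] // 2
            lo = F.conv1d(x, h, stride=2, padding=pad)
            hi = F.conv1d(x, g, stride=2, padding=pad)
            return lo, hi

        def inverse(self, lo, hi, length):
            h, g = self.filters()
            pad = h.shape[-1] // 2
            rec = (F.conv_transpose1d(lo, h, stride=2, padding=pad)
                   + F.conv_transpose1d(hi, g, stride=2, padding=pad))
            return rec[..., :length]

    def awd_loss(wavelet, frozen_net, x, lam_attr=1.0, lam_sparse=0.1):
        """Reconstruction + coefficient sparsity + wavelet-domain attribution penalty."""
        lo, hi = wavelet(x)
        x_rec = wavelet.inverse(lo, hi, x.shape[-1])
        recon = F.mse_loss(x_rec, x)
        sparsity = lo.abs().mean() + hi.abs().mean()
        # Attribution penalty: gradient-times-input of the frozen network's output
        # with respect to the wavelet coefficients (a saliency score in the wavelet domain).
        pred = frozen_net(x_rec).sum()
        grads = torch.autograd.grad(pred, [lo, hi], create_graph=True)
        attr = sum((g * c).abs().mean() for g, c in zip(grads, (lo, hi)))
        return recon + lam_sparse * sparsity + lam_attr * attr

    if __name__ == "__main__":
        # Stand-in for a trained network; in practice this would be the model to distill.
        net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(256, 1))
        for p in net.parameters():
            p.requires_grad_(False)       # the distilled network stays fixed
        wavelet = LearnableWavelet1d()
        opt = torch.optim.Adam(wavelet.parameters(), lr=1e-3)
        x = torch.randn(16, 1, 256)
        for _ in range(5):
            opt.zero_grad()
            awd_loss(wavelet, net, x).backward()
            opt.step()

The intent of the attribution term, as the abstract describes it, is to concentrate the trained network's predictive information in a small number of wavelet coefficients, yielding a concise and interpretable transform.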
Related papers
- WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference.
Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner.
We introduce WiNet, a model-driven network that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z)
- An Innovative Networks in Federated Learning [3.38220960870904]
This paper presents the development and application of Wavelet Kolmogorov-Arnold Networks (Wav-KAN) in federated learning.
We have considered both the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT) to enable multiresolution capability.
Extensive experiments were conducted on different datasets, demonstrating Wav-KAN's superior performance in terms of interpretability, computational speed, and training and test accuracy.
arXiv Detail & Related papers (2024-05-28T05:20:01Z)
- Wav-KAN: Wavelet Kolmogorov-Arnold Networks [3.38220960870904]
Wav-KAN is an innovative neural network architecture that builds wavelet functions into the Kolmogorov-Arnold Network (KAN) framework to enhance interpretability and performance.
Our results highlight the potential of Wav-KAN as a powerful tool for developing interpretable and high-performance neural networks.
(A minimal, illustrative sketch of a wavelet-parameterized KAN-style layer appears after this list.)
arXiv Detail & Related papers (2024-05-21T14:36:16Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions are often assumed to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Improved Protein-ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference [3.761791311908692]
Predicting accurate protein-ligand binding affinity is important in drug discovery.
Despite recent advances in deep convolutional and graph neural network based approaches, model performance still depends on the input data representation.
We present fusion models to benefit from different feature representations of two neural network models to improve the binding affinity prediction.
arXiv Detail & Related papers (2020-05-17T22:26:27Z)
- Adaptive Explainable Neural Networks (AxNNs) [8.949704905866888]
We develop a new framework called Adaptive Explainable Neural Networks (AxNN) for achieving the dual goals of good predictive performance and model interpretability.
For predictive performance, we build a structured neural network made up of ensembles of generalized additive model networks and additive index models.
For interpretability, we show how to decompose the results of AxNN into main effects and higher-order interaction effects.
arXiv Detail & Related papers (2020-04-05T23:40:57Z)
- GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions [5.8010446129208155]
An explainable neural network based on generalized additive models with structured interactions (GAMI-Net) is proposed to pursue a good balance between prediction accuracy and model interpretability.
GAMI-Net is a disentangled feedforward network with multiple additive subnetworks.
Numerical experiments on both synthetic functions and real-world datasets show that the proposed model enjoys superior interpretability.
arXiv Detail & Related papers (2020-03-16T11:51:38Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
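As referenced in the Wav-KAN entry above, here is a minimal illustrative sketch of a wavelet-parameterized, KAN-style layer in PyTorch: each input-output edge applies a weighted Mexican-hat wavelet with a learnable scale and shift. The mother-wavelet choice, the per-edge parameterization, and the initialization are assumptions for illustration, not the papers' exact formulation or released code.

    import torch

    class WaveletKANLayer(torch.nn.Module):
        """KAN-style layer: each input-output edge is a weighted Mexican-hat wavelet."""

        def __init__(self, in_features, out_features):
            super().__init__()
            self.scale = torch.nn.Parameter(torch.ones(out_features, in_features))
            self.shift = torch.nn.Parameter(torch.zeros(out_features, in_features))
            self.weight = torch.nn.Parameter(
                torch.randn(out_features, in_features) / in_features ** 0.5)

        @staticmethod
        def mexican_hat(z):
            # psi(z) = (1 - z^2) exp(-z^2 / 2), up to a normalizing constant
            return (1.0 - z ** 2) * torch.exp(-0.5 * z ** 2)

        def forward(self, x):
            # x: (batch, in_features); broadcast to (batch, out_features, in_features).
            # The small epsilon keeps the learnable scale away from zero.
            z = (x.unsqueeze(1) - self.shift) / (self.scale.abs() + 1e-3)
            return (self.weight * self.mexican_hat(z)).sum(dim=-1)

    # Purely illustrative usage: a two-layer stack for a toy regression.
    model = torch.nn.Sequential(WaveletKANLayer(4, 16), WaveletKANLayer(16, 1))
    out = model(torch.randn(8, 4))    # -> shape (8, 1)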