On the Post-hoc Explainability of Deep Echo State Networks for Time
Series Forecasting, Image and Video Classification
- URL: http://arxiv.org/abs/2102.08634v1
- Date: Wed, 17 Feb 2021 08:56:33 GMT
- Title: On the Post-hoc Explainability of Deep Echo State Networks for Time
Series Forecasting, Image and Video Classification
- Authors: Alejandro Barredo Arrieta, Sergio Gil-Lopez, Ibai Laña, Miren Nekane
Bilbao, Javier Del Ser
- Abstract summary: Echo state networks have attracted considerable attention over time, mainly due to the simplicity and computational efficiency of their learning algorithm.
This work addresses this issue by conducting an explainability study of Echo State Networks when applied to learning tasks with time series, image and video data.
Specifically, the study proposes three different techniques capable of eliciting understandable information about the knowledge grasped by these recurrent models.
- Score: 63.716247731036745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since their inception, learning techniques under the Reservoir Computing
paradigm have shown a great modeling capability for recurrent systems without
the computing overheads required for other approaches. Among them, different
flavors of echo state networks have attracted considerable attention over time, mainly
due to the simplicity and computational efficiency of their learning algorithm.
However, these advantages do not compensate for the fact that echo state
networks remain black-box models whose decisions cannot be easily explained
to the general audience. This work addresses this issue by conducting an
explainability study of Echo State Networks when applied to learning tasks with
time series, image and video data. Specifically, the study proposes three
different techniques capable of eliciting understandable information about the
knowledge grasped by these recurrent models, namely, potential memory, temporal
patterns and pixel absence effect. Potential memory addresses questions related
to the effect of the reservoir size in the capability of the model to store
temporal information, whereas temporal patterns unveil the recurrent
relationships captured by the model over time. Finally, pixel absence effect
evaluates the effect of the absence of a given pixel when the echo
state network model is used for image and video classification. We showcase the
benefits of our proposed suite of techniques over three different domains of
applicability: time series modeling, image classification and, for the first time in the
related literature, video classification. Our results reveal that the proposed
techniques not only allow for an informed understanding of the way these models
work, but also serve as diagnostic tools capable of detecting issues inherited
from data (e.g., the presence of hidden bias).
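As an illustration of the pixel absence effect described above, the following is a minimal sketch of an occlusion-style analysis: it "removes" one pixel at a time and records the resulting drop in the predicted class score. Here `esn_predict_proba` is a hypothetical stand-in for a trained echo state network image classifier; it is an assumed interface, not the authors' implementation.

```python
# Minimal sketch of a pixel-absence-style occlusion analysis, assuming a
# trained echo state network classifier is exposed through a hypothetical
# esn_predict_proba(image) callable that returns class probabilities.
import numpy as np

def pixel_absence_map(image, esn_predict_proba, target_class, absent_value=0.0):
    """Drop in target-class probability when each single pixel is 'removed'."""
    baseline = esn_predict_proba(image)[target_class]
    saliency = np.zeros(image.shape[:2])
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            occluded = image.copy()
            occluded[i, j] = absent_value        # simulate the absent pixel
            score = esn_predict_proba(occluded)[target_class]
            saliency[i, j] = baseline - score    # positive -> pixel mattered
    return saliency
```

A large positive entry marks a pixel whose removal noticeably lowers the score of the predicted class; maps of this kind are how such a technique can highlight relevant regions and surface biases inherited from the data.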
Related papers
- PointMoment:Mixed-Moment-based Self-Supervised Representation Learning
for 3D Point Clouds [11.980787751027872]
We propose PointMoment, a novel framework for point cloud self-supervised representation learning.
Our framework does not require any special techniques such as asymmetric network architectures, gradient stopping, etc.
arXiv Detail & Related papers (2023-12-06T08:49:55Z)
- Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis [20.316056261749946]
We propose an end-to-end vision and language model incorporating explicit knowledge graphs.
We also introduce an interactive out-of-distribution layer using an implicit network operator.
In practice, we apply our model on several vision and language downstream tasks including visual question answering, visual reasoning, and image-text retrieval.
arXiv Detail & Related papers (2023-02-11T05:46:21Z)
- From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks [1.0554048699217669]
This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model.
Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process.
arXiv Detail & Related papers (2022-11-30T14:47:10Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems.
The proposed model aims to preserve the spatiotemporal information of videos during the feature extraction and the state transitions.
Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z)
- Learning Discriminative Shrinkage Deep Networks for Image Deconvolution [122.79108159874426]
We propose an effective non-blind deconvolution approach by learning discriminative shrinkage functions to implicitly model these terms.
Experimental results show that the proposed method performs favorably against the state-of-the-art ones in terms of efficiency and accuracy.
arXiv Detail & Related papers (2021-11-27T12:12:57Z)
- Efficient Modelling Across Time of Human Actions and Interactions [92.39082696657874]
We argue that current fixed-sized spatio-temporal kernels in 3D convolutional neural networks (CNNs) can be improved to better deal with temporal variations in the input.
We study how we can better discriminate between classes of actions, by enhancing their feature differences over different layers of the architecture.
The proposed approaches are evaluated on several benchmark action recognition datasets and show competitive results.
arXiv Detail & Related papers (2021-10-05T15:39:11Z)
- Understanding invariance via feedforward inversion of discriminatively trained classifiers [30.23199531528357]
Past research has discovered that some extraneous visual detail remains in the output logits.
We develop a feedforward inversion model that produces remarkably high fidelity reconstructions.
Our approach is based on BigGAN, with conditioning on logits instead of one-hot class labels.
arXiv Detail & Related papers (2021-03-15T17:56:06Z)
- Collaborative Distillation in the Parameter and Spectrum Domains for Video Action Recognition [79.60708268515293]
This paper explores how to train small and efficient networks for action recognition.
We propose two distillation strategies in the frequency domain, namely the feature spectrum and parameter distribution distillations respectively.
Our method can achieve higher performance than state-of-the-art methods with the same backbone.
arXiv Detail & Related papers (2020-09-15T07:29:57Z)