Demystifying Deep Neural Networks Through Interpretation: A Survey
- URL: http://arxiv.org/abs/2012.07119v2
- Date: Tue, 5 Jan 2021 20:41:36 GMT
- Title: Demystifying Deep Neural Networks Through Interpretation: A Survey
- Authors: Giang Dao and Minwoo Lee
- Abstract summary: Modern deep learning algorithms tend to learn by optimizing a single objective metric, such as minimizing a cross-entropy loss on a training dataset.
The problem is that this single metric is an incomplete description of real-world tasks.
A growing body of work tackles the problem of interpretability to provide insights into the behavior and thought process of neural networks.
- Score: 3.566184392528658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep learning algorithms tend to learn by optimizing a single
objective metric, such as minimizing a cross-entropy loss on a training dataset. The
problem is that this single metric is an incomplete description of real-world tasks,
and on its own it cannot explain why the algorithm learns what it learns. When an
error occurs, the lack of interpretability makes it hard to understand and fix.
Recently, a body of work has tackled the problem of interpretability, providing
insights into the behavior and thought process of neural networks. This work is
important for identifying potential bias and for ensuring algorithmic fairness as
well as expected performance.
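To make the "single metric" point concrete, the following is a minimal sketch (NumPy only, synthetic data, all names illustrative): a softmax classifier trained by minimizing nothing but a cross-entropy loss. The scalar loss tracks fit to the training set but says nothing about why any individual prediction is made, which is the gap interpretability methods aim to fill.

```python
# Minimal sketch: a classifier optimized against a single metric (cross-entropy).
# Synthetic data and hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # synthetic inputs
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)       # synthetic binary labels
W = np.zeros((5, 2))                                # softmax weights

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)            # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for step in range(500):
    p = softmax(X @ W)                              # predicted class probabilities
    loss = -np.log(p[np.arange(len(y)), y]).mean()  # cross-entropy: the lone objective
    grad = X.T @ (p - np.eye(2)[y]) / len(y)        # gradient of the loss w.r.t. W
    W -= 0.5 * grad                                 # plain gradient descent

print(f"final cross-entropy: {loss:.3f}")           # one number, no explanation of behavior
```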
Related papers
- SGD method for entropy error function with smoothing l0 regularization for neural networks [3.108634881604788]
The entropy error function has been widely used in neural networks.
We propose a novel entropy function with smoothing l0 regularization for feed-forward neural networks.
Our work is novel as it enables neural networks to learn effectively, producing more accurate predictions.
arXiv Detail & Related papers (2024-05-28T19:54:26Z)
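The summary above does not spell out the entropy error function or the smoothing used in the paper, so the following is only a hedged sketch of the general recipe: a data loss plus a smoothed l0-style sparsity penalty, with x^2/(x^2 + beta) assumed as the smooth surrogate.

```python
# Hedged sketch of a smoothed l0 penalty added to a training loss.
# The surrogate x**2 / (x**2 + beta) is an assumption, not the paper's exact form.
import numpy as np

def smoothed_l0(weights, beta=1e-2):
    """Smooth approximation to the count of nonzero weights."""
    w = np.concatenate([p.ravel() for p in weights])
    return np.sum(w**2 / (w**2 + beta))

def total_loss(data_loss, weights, lam=1e-3):
    # total objective = data term (e.g., an entropy error) + sparsity penalty
    return data_loss + lam * smoothed_l0(weights)

weights = [np.random.default_rng(1).normal(size=(5, 3)), np.zeros(3)]
print(total_loss(data_loss=0.42, weights=weights))
```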
- The Lattice Overparametrization Paradigm for the Machine Learning of Lattice Operators [0.0]
We discuss a learning paradigm in which a class of operators is overparametrized via elements of a lattice, so that an algorithm for minimizing functions over a lattice can be applied to learn.
This learning paradigm has three properties that modern methods based on neural networks lack: control, transparency and interpretability.
arXiv Detail & Related papers (2023-10-10T14:00:03Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis [93.37576644429578]
This work establishes the first theoretical analysis for the known iterative self-training paradigm.
We prove the benefits of unlabeled data in both training convergence and generalization ability.
Experiments from shallow neural networks to deep neural networks are also provided to justify the correctness of our established theoretical insights on self-training.
arXiv Detail & Related papers (2022-01-21T02:16:52Z)
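The iterative self-training paradigm analyzed in the entry above follows a simple pseudo-labeling loop. The sketch below is a generic version of that loop (scikit-learn logistic regression, synthetic data, and the 0.9 confidence threshold are illustrative assumptions, not the paper's setting).

```python
# Generic iterative self-training: fit on labeled data, pseudo-label confident
# unlabeled points, refit on the union, and repeat. Model and threshold are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(50, 4))
y_lab = (X_lab[:, 0] > 0).astype(int)               # small labeled set
X_unl = rng.normal(size=(500, 4))                   # large unlabeled pool

model = LogisticRegression().fit(X_lab, y_lab)
for round_ in range(3):
    proba = model.predict_proba(X_unl)
    confident = proba.max(axis=1) > 0.9             # keep only confident predictions
    X_pseudo = X_unl[confident]
    y_pseudo = proba[confident].argmax(axis=1)      # pseudo-labels
    model = LogisticRegression().fit(               # retrain on labeled + pseudo-labeled
        np.vstack([X_lab, X_pseudo]),
        np.concatenate([y_lab, y_pseudo]),
    )
```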
- Knowledge accumulating: The general pattern of learning [5.174379158867218]
In solving real-world tasks, we still need to adjust algorithms to fit the unique features of each task.
A single algorithm, no matter how much we improve it, can only solve dense-feedback tasks or specific sparse-feedback tasks.
This paper first analyses how sparse feedback affects algorithm performance, and then proposes a pattern that explains how to accumulate knowledge to solve sparse-feedback problems.
arXiv Detail & Related papers (2021-08-09T12:41:28Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Low-Regret Active learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
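The low-regret sleeping-experts algorithm from the entry above is not reproduced here; the sketch below is only a generic uncertainty-sampling illustration of the underlying idea of querying the unlabeled points whose labels the current model is least sure about (model, data, and query budget are assumptions).

```python
# Generic active-learning query step by predictive entropy (an illustration only,
# not the sleeping-experts method from the paper above).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 4))
y_lab = (X_lab[:, 0] > 0).astype(int)                   # small labeled seed set
X_pool = rng.normal(size=(300, 4))                      # unlabeled pool

model = LogisticRegression().fit(X_lab, y_lab)
proba = model.predict_proba(X_pool)
entropy = -(proba * np.log(proba + 1e-12)).sum(axis=1)  # prediction uncertainty
query_idx = np.argsort(entropy)[-10:]                   # 10 most informative points to label next
print(query_idx)
```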
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Improving Bayesian Network Structure Learning in the Presence of Measurement Error [11.103936437655575]
This paper describes an algorithm that can be added as an additional learning phase at the end of any structure learning algorithm.
The proposed correction algorithm successfully improves the graphical score of four well-established structure learning algorithms.
arXiv Detail & Related papers (2020-11-19T11:27:47Z)
- An Empirical Study of Incremental Learning in Neural Network with Noisy Training Set [0.0]
We numerically show that the accuracy of the algorithm depends more on the location of the errors than on the percentage of errors in the training set.
Results show that this dependence of accuracy on error location is independent of the algorithm.
arXiv Detail & Related papers (2020-05-07T06:09:31Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss and, even worse, its discontinuity makes the deep network difficult to optimize.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
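As a concrete illustration of the binarization the survey above covers, here is a hedged NumPy sketch of the generic scheme: keep latent full-precision weights, binarize them with sign() in the forward pass, and route the gradient straight through (with hard-tanh clipping) in the backward pass. This is the standard straight-through estimator, not any single paper's method.

```python
# Generic weight binarization with a straight-through estimator (illustrative only).
import numpy as np

def binarize(w_real):
    """Forward pass uses only the signs (+1 / -1) of the latent weights."""
    return np.sign(w_real)

def ste_backward(grad_wrt_binary, w_real):
    """Backward pass copies the gradient to the latent weights, clipped to |w| <= 1."""
    return grad_wrt_binary * (np.abs(w_real) <= 1.0)

rng = np.random.default_rng(0)
w_real = rng.normal(scale=0.5, size=(4, 3))         # latent full-precision weights
x = rng.normal(size=(8, 4))                         # a batch of inputs
out = x @ binarize(w_real)                          # binary forward pass
grad_out = np.ones_like(out)                        # stand-in upstream gradient
grad_w = ste_backward(x.T @ grad_out, w_real)       # gradient routed to latent weights
w_real -= 0.1 * grad_w                              # SGD step on the latent weights
```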
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.