Building and Interpreting Deep Similarity Models
- URL: http://arxiv.org/abs/2003.05431v1
- Date: Wed, 11 Mar 2020 17:46:55 GMT
- Title: Building and Interpreting Deep Similarity Models
- Authors: Oliver Eberle, Jochen B\"uttner, Florian Kr\"autli, Klaus-Robert
M\"uller, Matteo Valleriani, Gr\'egoire Montavon
- Abstract summary: We propose to make similarities interpretable by augmenting them with an explanation in terms of input features.
We develop BiLRP, a scalable and theoretically founded method to systematically decompose similarity scores on pairs of input features.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many learning algorithms such as kernel machines, nearest neighbors,
clustering, or anomaly detection, are based on the concept of 'distance' or
'similarity'. Before similarities are used for training an actual machine
learning model, we would like to verify that they are bound to meaningful
patterns in the data. In this paper, we propose to make similarities
interpretable by augmenting them with an explanation in terms of input
features. We develop BiLRP, a scalable and theoretically founded method to
systematically decompose similarity scores on pairs of input features. Our
method can be expressed as a composition of LRP explanations, which were shown
in previous works to scale to highly nonlinear functions. Through an extensive
set of experiments, we demonstrate that BiLRP robustly explains complex
similarity models, e.g. built on VGG-16 deep neural network features.
Additionally, we apply our method to an open problem in digital humanities:
detailed assessment of similarity between historical documents such as
astronomical tables. Here again, BiLRP provides insight and brings
verifiability into a highly engineered and problem-specific similarity model.
Related papers
- Measuring similarity between embedding spaces using induced neighborhood graphs [10.056989400384772]
We propose a metric to evaluate the similarity between paired item representations.
Our results show that accuracy in both analogy and zero-shot classification tasks correlates with the embedding similarity.
arXiv Detail & Related papers (2024-11-13T15:22:33Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric
Learning [1.4293924404819704]
We shed new light on the traditional nearest neighbors algorithm from the perspective of information theory.
We propose a robust and interpretable framework for tasks such as classification, regression, density estimation, and anomaly detection using a single model.
Our work showcases the architecture's versatility by achieving state-of-the-art results in classification and anomaly detection.
arXiv Detail & Related papers (2023-11-17T00:35:38Z) - A Recursive Bateson-Inspired Model for the Generation of Semantic Formal
Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data.
The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept.
The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - Do Neural Networks Trained with Topological Features Learn Different
Internal Representations? [1.418465438044804]
We investigate whether a model trained with topological features learns internal representations of data that are fundamentally different than those learned by a model trained with the original raw data.
We find that structurally, the hidden representations of models trained and evaluated on topological features differ substantially compared to those trained and evaluated on the corresponding raw data.
We conjecture that this means that neural networks trained on raw data may extract some limited topological features in the process of making predictions.
arXiv Detail & Related papers (2022-11-14T19:19:04Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Towards Model Agnostic Federated Learning Using Knowledge Distillation [9.947968358822951]
In this work, we initiate a theoretical study of model agnostic communication protocols.
We focus on the setting where the two agents are attempting to perform kernel regression using different kernels.
Our study yields a surprising result -- the most natural algorithm of using alternating knowledge distillation (AKD) imposes overly strong regularization.
arXiv Detail & Related papers (2021-10-28T15:27:51Z) - Few-shot Visual Reasoning with Meta-analogical Contrastive Learning [141.2562447971]
We propose to solve a few-shot (or low-shot) visual reasoning problem, by resorting to analogical reasoning.
We extract structural relationships between elements in both domains, and enforce them to be as similar as possible with analogical learning.
We validate our method on RAVEN dataset, on which it outperforms state-of-the-art method, with larger gains when the training data is scarce.
arXiv Detail & Related papers (2020-07-23T14:00:34Z) - The data-driven physical-based equations discovery using evolutionary
approach [77.34726150561087]
We describe the algorithm for the mathematical equations discovery from the given observations data.
The algorithm combines genetic programming with the sparse regression.
It could be used for governing analytical equation discovery as well as for partial differential equations (PDE) discovery.
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.