Learning similarity measures from data
- URL: http://arxiv.org/abs/2001.05312v1
- Date: Wed, 15 Jan 2020 13:29:48 GMT
- Title: Learning similarity measures from data
- Authors: Bjørn Magnus Mathisen, Agnar Aamodt, Kerstin Bach, Helge Langseth
- Abstract summary: Defining similarity measures is a requirement for some machine learning methods.
Data sets are typically gathered as part of constructing a CBR or machine learning system.
Our objective is to investigate how to apply machine learning to effectively learn a similarity measure.
- Score: 1.4766350834632755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Defining similarity measures is a requirement for some machine learning
methods. One such method is case-based reasoning (CBR) where the similarity
measure is used to retrieve the stored case or set of cases most similar to the
query case. Describing a similarity measure analytically is challenging, even
for domain experts working with CBR experts. However, data sets are typically
gathered as part of constructing a CBR or machine learning system. These
datasets are assumed to contain the features that correctly identify the
solution from the problem features, thus they may also contain the knowledge to
construct or learn such a similarity measure. The main motivation for this work
is to automate the construction of similarity measures using machine learning,
while keeping training time as low as possible. Our objective is to investigate
how to apply machine learning to effectively learn a similarity measure. Such a
learned similarity measure could be used for CBR systems, but also for
clustering data in semi-supervised learning, or one-shot learning tasks. Recent
work has advanced towards this goal, but relies on either very long training times
or manually modeling parts of the similarity measure. We created a framework to
help us analyze current methods for learning similarity measures. This analysis
resulted in two novel similarity measure designs. One design uses a
pre-trained classifier as the basis for a similarity measure. The second design
uses as little modeling as possible while learning the similarity measure from
data and keeping training time low. Both similarity measures were evaluated on
14 different datasets. The evaluation shows that using a classifier as the basis
for a similarity measure gives state-of-the-art performance. Finally, the
evaluation shows that our fully data-driven similarity measure design
outperforms state-of-the-art methods while keeping training time low.
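To make the idea of a classifier-based similarity measure concrete, here is a minimal sketch. It is not the authors' design: it assumes a toy nearest-centroid classifier as a stand-in for the pre-trained classifier, and it assumes similarity is defined as one minus the total variation distance between the predicted class distributions of two cases. All function names and the toy case base are hypothetical.

```python
import math


def fit_centroids(cases, labels):
    """Stand-in for pre-training a classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(cases, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def class_probabilities(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    classes = sorted(centroids)
    scores = [-sum((a - b) ** 2 for a, b in zip(x, centroids[c])) for c in classes]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def similarity(centroids, a, b):
    """Assumed measure: 1 - total variation distance of class distributions."""
    pa = class_probabilities(centroids, a)
    pb = class_probabilities(centroids, b)
    return 1.0 - 0.5 * sum(abs(u - v) for u, v in zip(pa, pb))


def retrieve(centroids, case_base, query):
    """CBR-style retrieval: index of the stored case most similar to the query."""
    return max(
        range(len(case_base)),
        key=lambda i: similarity(centroids, case_base[i], query),
    )


# Hypothetical toy case base with two classes.
case_base = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
labels = ["a", "a", "b", "b"]
centroids = fit_centroids(case_base, labels)
best = retrieve(centroids, case_base, [0.95, 1.0])
print(labels[best])
```

Under these assumptions, two cases count as similar when the classifier predicts nearly the same class distribution for both, so no similarity function has to be modeled by hand.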
Related papers
- Measuring similarity between embedding spaces using induced neighborhood graphs [10.056989400384772]
We propose a metric to evaluate the similarity between paired item representations.
Our results show that accuracy in both analogy and zero-shot classification tasks correlates with the embedding similarity.
arXiv Detail & Related papers (2024-11-13T15:22:33Z)
- Differentiable Optimization of Similarity Scores Between Models and Brains [1.5391321019692434]
Similarity measures such as linear regression, Centered Kernel Alignment (CKA), Normalized Bures Similarity (NBS), and angular Procrustes distance are often used to quantify this similarity.
Here, we introduce a novel tool to investigate what drives high similarity scores and what constitutes a "good" score.
Surprisingly, we find that high similarity scores do not guarantee encoding task-relevant information in a manner consistent with neural data.
arXiv Detail & Related papers (2024-07-09T17:31:47Z)
- Matched Machine Learning: A Generalized Framework for Treatment Effect Inference With Learned Metrics [87.05961347040237]
We introduce Matched Machine Learning, a framework that combines the flexibility of machine learning black boxes with the interpretability of matching.
Our framework uses machine learning to learn an optimal metric for matching units and estimating outcomes.
We show empirically that instances of Matched Machine Learning perform on par with black-box machine learning methods and better than existing matching methods for similar problems.
arXiv Detail & Related papers (2023-04-03T19:32:30Z)
- Similarity between Units of Natural Language: The Transition from Coarse to Fine Estimation [0.0]
Capturing the similarities between human language units is crucial for explaining how humans associate different objects.
My research goal in this thesis is to develop regression models that account for similarities between language units in a more refined way.
arXiv Detail & Related papers (2022-10-25T18:54:32Z)
- Attributable Visual Similarity Learning [90.69718495533144]
This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images.
Motivated by the human semantic similarity cognition, we propose a generalized similarity learning paradigm to represent the similarity between two images with a graph.
Experiments on the CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate significant improvements over existing deep similarity learning methods.
arXiv Detail & Related papers (2022-03-28T17:35:31Z)
- Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining.
We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs).
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
- A Taxonomy of Similarity Metrics for Markov Decision Processes [62.997667081978825]
In recent years, transfer learning has succeeded in making Reinforcement Learning (RL) algorithms more efficient.
In this paper, we propose a categorization of these metrics and analyze the definitions of similarity proposed so far.
arXiv Detail & Related papers (2021-03-08T12:36:42Z)
- Few-shot Visual Reasoning with Meta-analogical Contrastive Learning [141.2562447971]
We propose to solve a few-shot (or low-shot) visual reasoning problem, by resorting to analogical reasoning.
We extract structural relationships between elements in both domains, and enforce them to be as similar as possible with analogical learning.
We validate our method on the RAVEN dataset, on which it outperforms state-of-the-art methods, with larger gains when the training data is scarce.
arXiv Detail & Related papers (2020-07-23T14:00:34Z)
- CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements.
In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data.
For self-supervised learning of the search, we evaluate the proposed algorithm on multi-homography estimation and demonstrate an accuracy that is superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.