Related papers: Learning similarity measures from data

Learning similarity measures from data

URL: http://arxiv.org/abs/2001.05312v1
Date: Wed, 15 Jan 2020 13:29:48 GMT
Title: Learning similarity measures from data
Authors: Bj{\o}rn Magnus Mathisen, Agnar Aamodt, Kerstin Bach, Helge Langseth
Abstract summary: Defining similarity measures is a requirement for some machine learning methods. Data sets are typically gathered as part of constructing a CBR or machine learning system. Our objective is to investigate how to apply machine learning to effectively learn a similarity measure.
Score: 1.4766350834632755
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Defining similarity measures is a requirement for some machine learning methods. One such method is case-based reasoning (CBR) where the similarity measure is used to retrieve the stored case or set of cases most similar to the query case. Describing a similarity measure analytically is challenging, even for domain experts working with CBR experts. However, data sets are typically gathered as part of constructing a CBR or machine learning system. These datasets are assumed to contain the features that correctly identify the solution from the problem features, thus they may also contain the knowledge to construct or learn such a similarity measure. The main motivation for this work is to automate the construction of similarity measures using machine learning, while keeping training time as low as possible. Our objective is to investigate how to apply machine learning to effectively learn a similarity measure. Such a learned similarity measure could be used for CBR systems, but also for clustering data in semi-supervised learning, or one-shot learning tasks. Recent work has advanced towards this goal, relies on either very long training times or manually modeling parts of the similarity measure. We created a framework to help us analyze current methods for learning similarity measures. This analysis resulted in two novel similarity measure designs. One design using a pre-trained classifier as basis for a similarity measure. The second design uses as little modeling as possible while learning the similarity measure from data and keeping training time low. Both similarity measures were evaluated on 14 different datasets. The evaluation shows that using a classifier as basis for a similarity measure gives state of the art performance. Finally the evaluation shows that our fully data-driven similarity measure design outperforms state of the art methods while keeping training time low.

Related papers

A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs [50.982315553104975]
Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms. Most SMMs are manually built by human experts using bottom-up procedures. We propose a novel graph-based algorithm that automatically generates conceptual spaces and SMMs in a top-down manner.
arXiv Detail & Related papers (2024-12-02T12:06:41Z)
Measuring similarity between embedding spaces using induced neighborhood graphs [10.056989400384772]
We propose a metric to evaluate the similarity between paired item representations. Our results show that accuracy in both analogy and zero-shot classification tasks correlates with the embedding similarity.
arXiv Detail & Related papers (2024-11-13T15:22:33Z)
Differentiable Optimization of Similarity Scores Between Models and Brains [1.5391321019692434]
Similarity measures such as linear regression, Centered Kernel Alignment (CKA), Normalized Bures Similarity (NBS), and angular Procrustes distance are often used to quantify this similarity. Here, we introduce a novel tool to investigate what drives high similarity scores and what constitutes a "good" score. Surprisingly, we find that high similarity scores do not guarantee encoding task-relevant information in a manner consistent with neural data.
arXiv Detail & Related papers (2024-07-09T17:31:47Z)
Matched Machine Learning: A Generalized Framework for Treatment Effect Inference With Learned Metrics [87.05961347040237]
We introduce Matched Machine Learning, a framework that combines the flexibility of machine learning black boxes with the interpretability of matching. Our framework uses machine learning to learn an optimal metric for matching units and estimating outcomes. We show empirically that instances of Matched Machine Learning perform on par with black-box machine learning methods and better than existing matching methods for similar problems.
arXiv Detail & Related papers (2023-04-03T19:32:30Z)
Similarity between Units of Natural Language: The Transition from Coarse to Fine Estimation [0.0]
Capturing the similarities between human language units is crucial for explaining how humans associate different objects. My research goal in this thesis is to develop regression models that account for similarities between language units in a more refined way.
arXiv Detail & Related papers (2022-10-25T18:54:32Z)
Attributable Visual Similarity Learning [90.69718495533144]
This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images. Motivated by the human semantic similarity cognition, we propose a generalized similarity learning paradigm to represent the similarity between two images with a graph. Experiments on the CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate significant improvements over existing deep similarity learning methods.
arXiv Detail & Related papers (2022-03-28T17:35:31Z)
Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining. We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs)
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
A Taxonomy of Similarity Metrics for Markov Decision Processes [62.997667081978825]
In recent years, transfer learning has succeeded in making Reinforcement Learning (RL) algorithms more efficient. In this paper, we propose a categorization of these metrics and analyze the definitions of similarity proposed so far.
arXiv Detail & Related papers (2021-03-08T12:36:42Z)
Few-shot Visual Reasoning with Meta-analogical Contrastive Learning [141.2562447971]
We propose to solve a few-shot (or low-shot) visual reasoning problem, by resorting to analogical reasoning. We extract structural relationships between elements in both domains, and enforce them to be as similar as possible with analogical learning. We validate our method on RAVEN dataset, on which it outperforms state-of-the-art method, with larger gains when the training data is scarce.
arXiv Detail & Related papers (2020-07-23T14:00:34Z)
CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements. In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data. For self-supervised learning of the search, we evaluate the proposed algorithm on multi-homography estimation and demonstrate an accuracy that is superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.