Knowledge Elicitation using Deep Metric Learning and Psychometric
Testing
- URL: http://arxiv.org/abs/2004.06353v1
- Date: Tue, 14 Apr 2020 08:33:42 GMT
- Title: Knowledge Elicitation using Deep Metric Learning and Psychometric
Testing
- Authors: Lu Yin, Vlado Menkovski, Mykola Pechenizkiy
- Abstract summary: We provide a method for efficient hierarchical knowledge elicitation from experts working with high-dimensional data such as images or videos.
The developed models embed the high-dimensional data in a metric space where distances are semantically meaningful, and the data can be organized in a hierarchical structure.
- Score: 15.989397781243225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge present in a domain is well expressed as relationships between
corresponding concepts. For example, in zoology, animal species form complex
hierarchies; in genomics, the different (parts of) molecules are organized in
groups and subgroups based on their functions; plants, molecules, and
astronomical objects all form complex taxonomies. Nevertheless, when applying
supervised machine learning (ML) in such domains, we commonly reduce the
complex and rich knowledge to a fixed set of labels, and induce a model that shows
good generalization performance with respect to these labels. The main reason
for such a reductionist approach is the difficulty in eliciting the domain
knowledge from the experts. Developing a label structure with sufficient
fidelity and providing comprehensive multi-label annotation can be exceedingly
labor-intensive in many real-world applications. In this paper, we provide a
method for efficient hierarchical knowledge elicitation (HKE) from experts
working with high-dimensional data such as images or videos. Our method is
based on psychometric testing and active deep metric learning. The developed
models embed the high-dimensional data in a metric space where distances are
semantically meaningful, and the data can be organized in a hierarchical
structure. We provide empirical evidence, through a series of experiments on a
synthetically generated dataset of simple shapes and on the CIFAR-10 and
Fashion-MNIST benchmarks, that our method is indeed successful in uncovering
hierarchical structures.
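As a rough illustration of the two ideas the abstract combines (a metric space trained from relative similarity judgments, and a hierarchy read off from distances in that space), the following is a minimal sketch and not the authors' implementation. It assumes a standard triplet hinge loss as the metric-learning objective and plain single-linkage agglomerative clustering as the way to recover a hierarchy; the function names, the margin value, and the toy 2-D embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet hinge loss (assumed objective, not confirmed by the source).

    The loss is zero once the item judged dissimilar (negative) is at least
    `margin` farther from the anchor than the item judged similar (positive).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def single_linkage_merges(points):
    """Single-linkage agglomerative clustering over embedded points.

    Returns the merge history as (members_a, members_b, distance) tuples;
    read in order, it describes a hierarchy from leaves to root.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    euclidean(points[p], points[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical embeddings: two tight pairs that are far from each other.
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.2, 0.0)]

# A judgment "item 1 is more similar to item 0 than item 2 is" is already
# satisfied by these embeddings, so the loss for that triplet is zero.
loss = triplet_loss(embeddings[0], embeddings[1], embeddings[2])

merges = single_linkage_merges(embeddings)
for members_a, members_b, distance in merges:
    print(members_a, members_b, round(distance, 3))
```

In this toy case the two tight pairs merge first and are joined only at the last step, which is the kind of two-level structure a semantically meaningful metric space would make visible.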
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- End-to-End Ontology Learning with Large Language Models [11.755755139228219]
Large language models (LLMs) have been applied to solve various subtasks of ontology learning.
We address this gap by OLLM, a general and scalable method for building the taxonomic backbone of an ontology from scratch.
In contrast to standard metrics, our metrics use deep learning techniques to define more robust structural distance measures between graphs.
Our model can be effectively adapted to new domains, like arXiv, needing only a small number of training examples.
arXiv Detail & Related papers (2024-10-31T02:52:39Z)
- Tree-based variational inference for Poisson log-normal models [47.82745603191512]
Hierarchical trees are often used to organize entities based on proximity criteria.
Current count-data models do not leverage this structured information.
We introduce the PLN-Tree model as an extension of the PLN model for modeling hierarchical count data.
arXiv Detail & Related papers (2024-06-25T08:24:35Z)
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- Unsupervised hierarchical clustering using the learning dynamics of RBMs [0.0]
We present a new and general method for building relational data trees by exploiting the learning dynamics of the Restricted Boltzmann Machine (RBM).
Our method is based on the mean-field approach, derived from the Plefka expansion, and developed in the context of disordered systems.
We tested our method on an artificially hierarchical dataset and on three different real-world datasets (images of digits, mutations in the human genome, and a family of proteins).
arXiv Detail & Related papers (2023-02-03T16:53:32Z)
- Classification of Consumer Belief Statements From Social Media [0.0]
We study how complex expert annotations can be leveraged successfully for classification.
We find that automated class abstraction approaches perform remarkably well against a domain expert baseline on text classification tasks.
arXiv Detail & Related papers (2021-06-29T15:25:33Z)
- Joint Geometric and Topological Analysis of Hierarchical Datasets [7.098759778181621]
In this paper, we focus on high-dimensional data that are organized into several hierarchical datasets.
The main novelty in this work lies in the combination of two powerful data-analytic approaches: topological data analysis and geometric manifold learning.
We show that our new method gives rise to superior classification results compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-04-03T13:02:00Z)
- Latent Feature Representation via Unsupervised Learning for Pattern Discovery in Massive Electron Microscopy Image Volumes [4.278591555984395]
In particular, we give an unsupervised deep learning approach to learning a latent representation that captures semantic similarity in the data set.
We demonstrate the utility of our method applied to nano-scale electron microscopy data, where even relatively small portions of animal brains can require terabytes of image data.
arXiv Detail & Related papers (2020-12-22T17:14:19Z)
- Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports [66.39150945184683]
We focus on the problem of automatically identifying the main themes in a safeguarding report using supervised classification approaches.
Our results show the potential of deep learning models to simulate subject-expert behaviour even for complex tasks with limited labelled data.
arXiv Detail & Related papers (2020-10-27T19:48:23Z)
- Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)