To Classify is to Interpret: Building Taxonomies from Heterogeneous Data
through Human-AI Collaboration
- URL: http://arxiv.org/abs/2307.16481v1
- Date: Mon, 31 Jul 2023 08:24:29 GMT
- Title: To Classify is to Interpret: Building Taxonomies from Heterogeneous Data
through Human-AI Collaboration
- Authors: Sebastian Meier and Katrin Glinka
- Abstract summary: We explore how taxonomy building can be supported with systems that integrate machine learning (ML).
We propose an approach that allows the user to iteratively take into account multiple models' outputs as part of their sensemaking process.
- Score: 0.39160947065896795
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Taxonomy building is a task that requires interpreting and classifying data
within a given frame of reference, which comes into play in many areas of
application that deal with knowledge and information organization. In this
paper, we explore how taxonomy building can be supported with systems that
integrate machine learning (ML). However, relying only on black-boxed ML-based
systems to automate taxonomy building would sideline the users' expertise. We
propose an approach that allows the user to iteratively take into account
multiple models' outputs as part of their sensemaking process. We implemented
our approach in two real-world use cases. The work is positioned in the context
of HCI research that investigates the design of ML-based systems with an
emphasis on enabling human-AI collaboration.
Related papers
- LLM-assisted Explicit and Implicit Multi-interest Learning Framework for Sequential Recommendation [50.98046887582194]
We propose an explicit and implicit multi-interest learning framework to model user interests on two levels: behavior and semantics.
The proposed EIMF framework effectively and efficiently combines small models with LLM to improve the accuracy of multi-interest modeling.
arXiv Detail & Related papers (2024-11-14T13:00:23Z)
- Masked Image Modeling: A Survey [73.21154550957898]
Masked image modeling emerged as a powerful self-supervised learning technique in computer vision.
We construct a taxonomy and review the most prominent papers in recent years.
We aggregate the performance results of various masked image modeling methods on the most popular datasets.
arXiv Detail & Related papers (2024-08-13T07:27:02Z)
- Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph [1.7418328181959968]
The proposed research aims to develop an innovative semantic query processing system.
It enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University.
arXiv Detail & Related papers (2024-05-24T09:19:45Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- TaBIIC: Taxonomy Building through Iterative and Interactive Clustering [2.817412580574242]
In this paper, we explore a method that takes inspiration from both approaches in an iterative and interactive process.
We show that this method is applicable to a variety of data sources and leads to taxonomies that can be more directly integrated into an ontology.
arXiv Detail & Related papers (2023-12-10T12:17:43Z)
- Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
- Model-Driven Engineering Method to Support the Formalization of Machine Learning using SysML [0.0]
This work introduces a method supporting the collaborative definition of machine learning tasks by leveraging model-based engineering.
The method supports the identification and integration of various data sources, the required definition of semantic connections between data attributes, and the definition of data processing steps.
arXiv Detail & Related papers (2023-07-10T11:33:46Z)
- Large-scale Taxonomy Induction Using Entity and Word Embeddings [13.30719395448771]
We propose TIEmb, an approach for automatic subsumption extraction from knowledge bases using entity and text embeddings.
We apply the approach to the WebIsA database, a database of class subsumption relations extracted from a large portion of the World Wide Web, to extract hierarchies in the Person and Place domains.
arXiv Detail & Related papers (2021-05-04T05:53:12Z)
- Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Knowledge Elicitation using Deep Metric Learning and Psychometric Testing [15.989397781243225]
We provide a method for efficient hierarchical knowledge elicitation from experts working with high-dimensional data such as images or videos.
The developed models embed the high-dimensional data in a metric space where distances are semantically meaningful, and the data can be organized in a hierarchical structure.
arXiv Detail & Related papers (2020-04-14T08:33:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.