Learn to Explore: on Bootstrapping Interactive Data Exploration with
Meta-learning
- URL: http://arxiv.org/abs/2212.03423v1
- Date: Wed, 7 Dec 2022 03:12:41 GMT
- Title: Learn to Explore: on Bootstrapping Interactive Data Exploration with
Meta-learning
- Authors: Yukun Cao, Xike Xie, and Kexin Huang
- Abstract summary: We propose a learning-to-explore framework, based on meta-learning, which learns how to learn a classifier with automatically generated meta-tasks.
Our proposal outperforms existing explore-by-example solutions in terms of accuracy and efficiency.
- Score: 8.92180350317399
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interactive data exploration (IDE) is an effective way of comprehending big
data, whose volume and complexity are beyond human abilities. The main goal of
IDE is to discover user interest regions from a database through multiple rounds
of user labelling. Existing IDE systems adopt an active-learning framework, where users
iteratively discriminate or label the interestingness of selected tuples. The
process of data exploration can be viewed as the process of training a
classifier, which determines whether a database tuple is interesting to a user.
Efficient exploration thus takes very few iterations of user labelling to
reach the data region of interest. In this work, we treat data
exploration as a few-shot learning process, in which the classifier is
learned from only a few training examples, that is, exploration iterations. To this
end, we propose a learning-to-explore framework, based on meta-learning, which
learns how to learn a classifier with automatically generated meta-tasks, so
that the exploration process can be substantially shortened. Extensive experiments on
real datasets show that our proposal outperforms existing explore-by-example
solutions in terms of accuracy and efficiency.
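The explore-by-example loop that the abstract describes as the existing active-learning baseline can be sketched roughly as follows. This is only an illustrative sketch, not the paper's meta-learning method: the rectangular interest region, the 1-nearest-neighbour classifier, the margin-based query rule, and all function names are assumptions introduced here for illustration.

```python
import random

def in_region(t, region):
    """Hypothetical hidden user interest: an axis-aligned rectangle."""
    (x0, x1), (y0, y1) = region
    return x0 <= t[0] <= x1 and y0 <= t[1] <= y1

def dist2(a, b):
    """Squared Euclidean distance between two 2-D tuples."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def predict(t, labelled):
    """1-NN classifier: return the label of the closest labelled tuple."""
    return min(labelled, key=lambda p: dist2(t, p[0]))[1]

def margin(t, labelled):
    """Small margin means t is about equally close to both classes."""
    d_pos = min(dist2(t, p) for p, y in labelled if y)
    d_neg = min(dist2(t, p) for p, y in labelled if not y)
    return abs(d_pos - d_neg)

def explore(tuples, oracle, seed_pos, seed_neg, rounds):
    """Each round, ask the user (oracle) to label the most uncertain tuple."""
    labelled = [(seed_pos, True), (seed_neg, False)]
    pool = [t for t in tuples if t not in (seed_pos, seed_neg)]
    for _ in range(rounds):
        query = min(pool, key=lambda t: margin(t, labelled))
        pool.remove(query)
        labelled.append((query, oracle(query)))
    return labelled

def accuracy(tuples, labelled, oracle):
    """Fraction of tuples whose predicted interestingness matches the oracle."""
    return sum(predict(t, labelled) == oracle(t) for t in tuples) / len(tuples)

random.seed(0)
tuples = [(random.random(), random.random()) for _ in range(500)]
region = ((0.3, 0.7), (0.3, 0.7))          # assumed interest region
oracle = lambda t: in_region(t, region)    # stands in for the human user

seed_pos = next(t for t in tuples if oracle(t))
seed_neg = next(t for t in tuples if not oracle(t))
before = accuracy(tuples, [(seed_pos, True), (seed_neg, False)], oracle)
labelled = explore(tuples, oracle, seed_pos, seed_neg, rounds=30)
after = accuracy(tuples, labelled, oracle)
print(f"labels used: {len(labelled)}, accuracy {before:.2f} -> {after:.2f}")
```

Every round costs one user label, so the number of rounds is the quantity that a few-shot formulation of exploration aims to keep small.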
Related papers
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Actively Discovering New Slots for Task-oriented Conversation [19.815466126158785]
We propose a general new slot task in an information extraction fashion to realize human-in-the-loop learning.
We leverage existing language tools to extract value candidates where the corresponding labels are leveraged as weak supervision signals.
We conduct extensive experiments on several public datasets and compare against several competitive baselines to demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-05-06T13:33:33Z)
- Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System [48.62158108517576]
We introduce InsightPilot, an automated data exploration system designed to simplify the data exploration process.
InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining.
In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts.
arXiv Detail & Related papers (2023-04-02T07:27:49Z)
- ALBench: A Framework for Evaluating Active Learning in Object Detection [102.81795062493536]
This paper contributes an active learning benchmark framework named ALBench for evaluating active learning in object detection.
Developed on an automatic deep model training system, the ALBench framework is easy to use, compatible with different active learning algorithms, and ensures the same training and testing protocols.
arXiv Detail & Related papers (2022-07-27T07:46:23Z)
- Active metric learning and classification using similarity queries [21.589707834542338]
We show that a novel unified query framework can be applied to any problem in which a key component is learning a representation of the data that reflects similarity.
We demonstrate the effectiveness of the proposed strategy on two tasks -- active metric learning and active classification.
arXiv Detail & Related papers (2022-02-04T03:34:29Z)
- AstronomicAL: An interactive dashboard for visualisation, integration and classification of data using Active Learning [0.0]
AstronomicAL is a human-in-the-loop interactive labelling and training dashboard.
It allows users to create reliable datasets and robust classifiers using active learning.
The system allows users to visualise and integrate data from different sources.
arXiv Detail & Related papers (2021-09-11T07:32:26Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Mining Implicit Entity Preference from User-Item Interaction Data for Knowledge Graph Completion via Adversarial Learning [82.46332224556257]
We propose a novel adversarial learning approach by leveraging user interaction data for the Knowledge Graph Completion task.
Our generator is isolated from user interaction data, and serves to improve the performance of the discriminator.
To discover the implicit entity preferences of users, we design an elaborate collaborative learning algorithm based on graph neural networks.
arXiv Detail & Related papers (2020-03-28T05:47:33Z)