INODE: Building an End-to-End Data Exploration System in Practice
[Extended Vision]
- URL: http://arxiv.org/abs/2104.04194v1
- Date: Fri, 9 Apr 2021 05:04:04 GMT
- Title: INODE: Building an End-to-End Data Exploration System in Practice
[Extended Vision]
- Authors: Sihem Amer-Yahia (2), Georgia Koutrika (1), Frederic Bastian (7),
Theofilos Belmpas (1), Martin Braschler (9), Ursin Brunner (9), Diego
Calvanese (8), Maximilian Fabricius (5), Orest Gkini (1), Catherine Kosten
(9), Davide Lanti (8), Antonis Litke (6), Hendrik Lücke-Tieke (3),
Francesco Alessandro Massucci (6), Tarcisio Mendes de Farias (7), Alessandro
Mosca (8), Francesco Multari (6), Nikolaos Papadakis (4), Dimitris
Papadopoulos (4), Yogendra Patil (2), Aurélien Personnaz (2), Guillem Rull
(6), Ana Sima (7), Ellery Smith (9), Dimitrios Skoutas (1), Srividya
Subramanian (5), Guohui Xiao (8), Kurt Stockinger (9) ((1) Athena Research
Center, Greece, (2) CNRS, University Grenoble Alpes, France, (3) Fraunhofer
IGD, Germany, (4) Infili, Greece, (5) Max Planck Institute, Germany, (6)
SIRIS Academic, Spain, (7) SIB Swiss Institute of Bioinformatics,
Switzerland, (8) Free University of Bozen-Bolzano, Italy, (9) ZHAW Zurich
University of Applied Sciences, Switzerland)
- Abstract summary: INODE is an end-to-end data exploration system.
We demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics.
- Score: 30.411996388471817
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A full-fledged data exploration system must combine different access
modalities with a powerful concept of guiding the user in the exploration
process, by being reactive and anticipative both for data discovery and for
data linking. Such systems are a real opportunity for our community to cater to
users with different domain and data science expertise. We introduce INODE --
an end-to-end data exploration system -- that leverages, on the one hand,
Machine Learning and, on the other hand, semantics for the purpose of Data
Management (DM). Our vision is to develop a classic unified, comprehensive
platform that provides extensive access to open datasets, and we demonstrate it
in three significant use cases in the fields of Cancer Biomarker Research,
Research and Innovation Policy Making, and Astrophysics. INODE offers
sustainable services in (a) data modeling and linking, (b) integrated query
processing using natural language, (c) guidance, and (d) data exploration
through visualization, thus facilitating the user in discovering new insights.
We demonstrate that our system is uniquely accessible to a wide range of users
from larger scientific communities to the public. Finally, we briefly
illustrate how this work paves the way for new research opportunities in DM.
Related papers
- DISCOVER: A Data-driven Interactive System for Comprehensive Observation, Visualization, and ExploRation of Human Behaviour [6.716560115378451]
We introduce a modular, flexible, yet user-friendly software framework specifically developed to streamline computational-driven data exploration for human behavior analysis.
Our primary objective is to democratize access to advanced computational methodologies, thereby enabling researchers across disciplines to engage in detailed behavioral analysis without the need for extensive technical proficiency.
arXiv Detail & Related papers (2024-07-18T11:28:52Z)
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- Data-driven Discovery with Large Generative Models [47.324203863823335]
This position paper urges the Machine Learning (ML) community to exploit the capabilities of large generative models (LGMs).
We demonstrate how LGMs fulfill several desiderata for an ideal data-driven discovery system.
We advocate for fail-proof tool integration, along with active user moderation through feedback mechanisms.
arXiv Detail & Related papers (2024-02-21T08:26:43Z)
- Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data.
We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
- Conversational Data Exploration: A Game-Changer for Designing Data Science Pipelines [3.63971675629768]
This paper proposes a conversational approach implemented by the system Chatin for driving an intuitive data exploration experience.
Chatin is a cutting-edge tool that democratises access to AI-driven solutions, empowering non-technical users from various disciplines to explore data and extract knowledge from it.
arXiv Detail & Related papers (2023-11-12T00:22:09Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- SGED: A Benchmark dataset for Performance Evaluation of Spiking Gesture Emotion Recognition [12.396844568607522]
We label a new homogeneous multimodal gesture emotion recognition dataset based on the analysis of the existing data sets.
We propose a pseudo dual-flow network based on this dataset, and verify the application potential of this dataset in the affective computing community.
arXiv Detail & Related papers (2023-04-28T09:32:09Z)
- Learn to Explore: on Bootstrapping Interactive Data Exploration with Meta-learning [8.92180350317399]
We propose a learning-to-explore framework, based on meta-learning, which learns how to learn a classifier with automatically generated meta-tasks.
Our proposal outperforms existing explore-by-example solutions in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-12-07T03:12:41Z)
- Vision+X: A Survey on Multimodal Learning in the Light of Data [64.03266872103835]
Multimodal machine learning that incorporates data from various sources has become an increasingly popular research area.
We analyze the commonness and uniqueness of each data format mainly ranging from vision, audio, text, and motions.
We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels.
arXiv Detail & Related papers (2022-10-05T13:14:57Z)
- Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal and large-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Network (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.