A user-centered approach to designing an experimental laboratory data
platform
- URL: http://arxiv.org/abs/2007.14443v1
- Date: Tue, 28 Jul 2020 19:26:28 GMT
- Title: A user-centered approach to designing an experimental laboratory data
platform
- Authors: Ha-Kyung Kwon, Chirranjeevi Balaji Gopal, Jared Kirschner, Santiago
Caicedo, and Brian D. Storey
- Abstract summary: We take a user-centered approach to understand what essential elements of design and functionality researchers want in an experimental data platform.
We find that having the capability to contextualize rich, complex experimental datasets is the primary user requirement.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While automated experiments and high-throughput methods are becoming more
mainstream in the age of data, empowering individual researchers to capture,
collate, and contextualize their data faster and more reproducibly still
remains a challenge in science. Despite the abundance of software products to
help digitize and organize scientific information, their broader adoption in
the scientific community has been hindered by the lack of a holistic
understanding of the diverse needs of researchers and their experimental
processes. In this work, we take a user-centered approach to understand what
essential elements of design and functionality researchers (in chemical and
materials science) want in an experimental data platform to address the problem
of data capture in their experimental processes. We found that having the
capability to contextualize rich, complex experimental datasets is the primary
user requirement. We synthesize this and other key findings into design
criteria for a potential solution.
Related papers
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
Date: 2024-06-30T09:13:10Z
- DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents [49.74065769505137]
We introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's ability to perform complete cycles of novel scientific discovery.
It includes 120 different challenge tasks spanning eight topics each with three levels of difficulty and several parametric variations.
We find that strong baseline agents that perform well in previously published environments struggle on most DISCOVERYWORLD tasks.
Date: 2024-06-10T20:08:44Z
- To democratize research with sensitive data, we should make synthetic data more accessible [0.7770029179741429]
Erik-Jan van Kesteren argues that, to progress towards widespread adoption of synthetic data as a privacy-enhancing technology, the data science research community should shift focus away from developing better methods.
Date: 2024-04-26T09:18:54Z
- Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data [9.767546641019862]
We introduce a novel framework, a Team of AI-made Scientists (TAIS), designed to streamline the scientific discovery pipeline.
TAIS comprises simulated roles, including a project manager, data engineer, and domain expert, each represented by a Large Language Model (LLM).
These roles collaborate to replicate the tasks typically performed by data scientists, with a specific focus on identifying disease-predictive genes.
Date: 2024-02-15T06:30:12Z
- Towards Controlled Table-to-Text Generation with Scientific Reasoning [46.87189607486007]
We present a new task for generating fluent and logical descriptions that match user preferences over scientific data, aiming to automate scientific document analysis.
We construct a new challenging dataset, SciTab, consisting of table-description pairs extracted from the scientific literature, with highlighted cells and a corresponding domain-specific knowledge base.
The results showed that large models struggle to produce accurate content that aligns with user preferences. As the first of its kind, our work should motivate further research in scientific domains.
Date: 2023-12-08T22:57:35Z
- A Backend Platform for Supporting the Reproducibility of Computational Experiments [2.1485350418225244]
It is challenging to recreate the same environment using the same frameworks, code, data sources, programming languages, dependencies, and so on.
In this work, we propose an Integrated Development Environment that allows the sharing, configuration, packaging, and execution of an experiment.
We have been able to successfully reproduce 20 (80%) of these experiments achieving the results reported in such works with minimum effort.
Date: 2023-06-29T10:29:11Z
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from the Alzheimer's Disease Neuroimaging Initiative cohort.
Date: 2023-05-25T16:29:16Z
- GFlowNets for AI-Driven Scientific Discovery [74.27219800878304]
We present a new probabilistic machine learning framework called GFlowNets.
GFlowNets can be applied in the modeling, hypothesis generation, and experimental design stages of the experimental science loop.
We argue that GFlowNets can become a valuable tool for AI-driven scientific discovery.
Date: 2023-02-01T17:29:43Z
- Scientific Machine Learning Benchmarks [0.17205106391379021]
The breakthrough in Deep Learning neural networks has transformed the use of AI and machine learning technologies for the analysis of very large experimental datasets.
Identifying the most appropriate machine learning algorithm for the analysis of any given scientific dataset is still a challenge for scientists.
We describe our approach to the development of scientific machine learning benchmarks and review other approaches to benchmarking scientific machine learning.
Date: 2021-10-25T10:05:11Z
- Federated Learning of Molecular Properties in a Heterogeneous Setting [79.00211946597845]
We introduce federated heterogeneous molecular learning to address these challenges.
Federated learning allows end-users to build a global model collaboratively while preserving the training data distributed over isolated clients.
FedChem should enable a new type of collaboration for improving AI in chemistry that mitigates concerns about valuable chemical data.
Date: 2021-09-15T12:49:13Z
This list is automatically generated from the titles and abstracts of the papers on this site.