ChatCell: Facilitating Single-Cell Analysis with Natural Language
- URL: http://arxiv.org/abs/2402.08303v4
- Date: Tue, 20 Feb 2024 02:26:39 GMT
- Title: ChatCell: Facilitating Single-Cell Analysis with Natural Language
- Authors: Yin Fang, Kangwei Liu, Ningyu Zhang, Xinle Deng, Penghui Yang, Zhuo
Chen, Xiangru Tang, Mark Gerstein, Xiaohui Fan, Huajun Chen
- Abstract summary: ChatCell is a tool for facilitating single-cell analysis with natural language.
ChatCell has acquired profound expertise in single-cell biology.
Our project homepage is available at https://zjunlp.github.io/project/ChatCell.
- Score: 40.4429032376233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As Large Language Models (LLMs) rapidly evolve, their influence in science is
becoming increasingly prominent. The emerging capabilities of LLMs in task
generalization and free-form dialogue can significantly advance fields like
chemistry and biology. However, the field of single-cell biology, which forms
the foundational building blocks of living organisms, still faces several
challenges. High knowledge barriers and limited scalability in current methods
restrict the full exploitation of LLMs in mastering single-cell data, impeding
direct accessibility and rapid iteration. To this end, we introduce ChatCell,
which signifies a paradigm shift by facilitating single-cell analysis with
natural language. Leveraging vocabulary adaptation and unified sequence
generation, ChatCell has acquired profound expertise in single-cell biology and
the capability to accommodate a diverse range of analysis tasks. Extensive
experiments further demonstrate ChatCell's robust performance and potential to
deepen single-cell insights, paving the way for more accessible and intuitive
exploration in this pivotal field. Our project homepage is available at
https://zjunlp.github.io/project/ChatCell.
Related papers
- How to Build the Virtual Cell with Artificial Intelligence: Priorities and Opportunities [46.671834972945874]
We propose a vision of leveraging advances in AI to construct virtual cells.
We discuss desired capabilities of such AI Virtual Cells, including generating universal representations of biological entities.
We envision a future where AI Virtual Cells help identify new drug targets, predict cellular responses to perturbations, as well as scale hypothesis exploration.
arXiv Detail & Related papers (2024-09-18T02:41:50Z)
- LangCell: Language-Cell Pre-training for Cell Identity Understanding [3.6518971609937068]
We introduce LangCell, a unified representation of single-cell data and natural language during the pre-training phase.
Results show that LangCell is the only single-cell PLM that can work effectively in zero-shot cell identity understanding scenarios.
arXiv Detail & Related papers (2024-05-09T10:04:05Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Prediction of Cellular Identities from Trajectory and Cell Fate Information [0.40964539027092917]
We propose an innovative approach to cell identification during early C. elegans embryogenesis using machine learning.
We employ random forest, embryo, and LSTM models, and test cell classification accuracy on 3D time-lapse datasets spanning the first 4 hours of embryogenesis.
Our research demonstrates the success of predicting cell identities in time-lapse imaging sequences directly from simple spatial-temporal features.
arXiv Detail & Related papers (2024-01-11T03:28:13Z)
- RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning [75.61681328968714]
We propose recurrent independent Grid LSTM (RigLSTM) to exploit the underlying modular structure of the target task.
Our model adopts cell selection, input feature selection, hidden state selection, and soft state updating to achieve a better generalization ability.
arXiv Detail & Related papers (2023-11-03T07:40:06Z)
- Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
- Revolutionizing Single Cell Analysis: The Power of Large Language Models for Cell Type Annotation [0.0]
Large language models such as ChatGPT and New Bing provide accurate annotations of cell types.
By using ChatGPT to annotate single-cell data, we can relate rare cell types to their functions.
This can have important applications in understanding cancer progression, mammalian development, and stem cell differentiation.
arXiv Detail & Related papers (2023-04-05T18:45:54Z)
- Deep Learning in Single-Cell Analysis [34.08722045363822]
Single-cell technologies are revolutionizing the entire field of biology.
Deep learning often demonstrates superior performance compared to traditional machine learning methods.
This survey will serve as a reference for biologists and computer scientists, encouraging collaborations.
arXiv Detail & Related papers (2022-10-22T08:26:41Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that lets us infuse a deep neural network with human-powered abstraction at the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.