Machine Learning Approach for Cancer Entities Association and
Classification
- URL: http://arxiv.org/abs/2306.00013v2
- Date: Sat, 24 Jun 2023 07:12:20 GMT
- Title: Machine Learning Approach for Cancer Entities Association and
Classification
- Authors: G. Jeyakodi, Arkadeep Pal, Debapratim Gupta, K. Sarukeswari, V. Amouda
- Abstract summary: The study uses two non-trivial Natural Language Processing (NLP) functions, entity recognition and text classification, to discover knowledge from biomedical literature.
Named Entity Recognition (NER) recognizes and extracts the predefined entities related to cancer from unstructured text with the support of a user-friendly interface and built-in dictionaries.
Text classification helps to explore the insights into the text and simplifies data categorization, querying, and article screening.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: According to the World Health Organization (WHO), cancer is the second
leading cause of death globally. Scientific research on different types of
cancer grows at an ever-increasing rate, producing large volumes of research
articles every year. Insight into and knowledge of the drugs, diagnostics,
risks, symptoms, treatments, etc., related to genes are significant factors
that help explore and advance cancer research. Manually screening such a
large volume of articles to formulate a hypothesis is laborious and
time-consuming. The study uses two non-trivial Natural Language Processing
(NLP) functions, entity recognition and text classification, to discover
knowledge from biomedical literature. Named Entity Recognition (NER)
recognizes and extracts predefined entities related to cancer from
unstructured text with the support of a user-friendly interface and built-in
dictionaries. Text classification helps to explore insights in the text and
simplifies data categorization, querying, and article screening. Machine
learning classifiers are used to build the classification model, and
Structured Query Language (SQL) is used to identify hidden relations that may
lead to significant predictions.
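The abstract's pipeline of dictionary-backed entity recognition followed by SQL queries over the extracted entities can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionaries, the `mention` table, the function names, and the sample sentences are all hypothetical, and the text classification step is left out.

```python
import re
import sqlite3

# Hypothetical toy dictionaries; the paper's built-in dictionaries are not reproduced here.
DICTIONARIES = {
    "gene": ["BRCA1", "TP53"],
    "drug": ["tamoxifen", "cisplatin"],
    "symptom": ["fatigue"],
}


def find_entities(text):
    """Return sorted (entity_type, term) pairs whose dictionary term occurs in text."""
    found = set()
    for entity_type, terms in DICTIONARIES.items():
        for term in terms:
            # Whole-word, case-insensitive dictionary lookup.
            if re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE):
                found.add((entity_type, term))
    return sorted(found)


def build_index(abstracts):
    """Store one row per (document, entity) in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mention (doc_id INTEGER, entity_type TEXT, term TEXT)")
    for doc_id, text in enumerate(abstracts):
        conn.executemany(
            "INSERT INTO mention VALUES (?, ?, ?)",
            [(doc_id, etype, term) for etype, term in find_entities(text)],
        )
    return conn


def gene_drug_pairs(conn):
    """Count the documents in which a gene and a drug are mentioned together."""
    return conn.execute(
        """
        SELECT g.term, d.term, COUNT(*) AS n_docs
        FROM mention AS g
        JOIN mention AS d ON g.doc_id = d.doc_id
        WHERE g.entity_type = 'gene' AND d.entity_type = 'drug'
        GROUP BY g.term, d.term
        ORDER BY n_docs DESC, g.term, d.term
        """
    ).fetchall()


# Made-up sentences for illustration only; they are not taken from the paper.
ABSTRACTS = [
    "BRCA1 carriers treated with tamoxifen reported fatigue.",
    "Tamoxifen response was studied in BRCA1 mutated tumours.",
    "TP53 status predicted cisplatin sensitivity.",
]

conn = build_index(ABSTRACTS)
pairs = gene_drug_pairs(conn)
print(pairs)
```

Co-occurrence counts of this kind only suggest candidate associations. Whether a gene-drug pair reflects a real relation would still need to be checked against the articles themselves.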
Related papers
- Diagnostic Reasoning in Natural Language: Computational Model and Application [68.47402386668846]
We investigate diagnostic abductive reasoning (DAR) in the context of language-grounded tasks (NL-DAR).
We propose a novel modeling framework for NL-DAR based on Pearl's structural causal models.
We use the resulting dataset to investigate the human decision-making process in NL-DAR.
arXiv Detail & Related papers (2024-09-09T06:55:37Z)
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge [2.2814097119704058]
Large language models (LLMs) are transforming the way information is retrieved with vast amounts of knowledge being summarized and presented.
LLMs are prone to highlight the most frequently seen pieces of information from the training set and to neglect the rare ones.
We introduce a novel information-retrieval method that leverages a knowledge graph to downsample these clusters and mitigate the information overload problem.
arXiv Detail & Related papers (2024-02-19T18:31:11Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- From Large Language Models to Knowledge Graphs for Biomarker Discovery in Cancer [0.9437165725355702]
A challenging scenario for artificial intelligence (AI) is using biomedical data to provide diagnosis and treatment recommendations for cancerous conditions.
A large-scale knowledge graph (KG) can be constructed by integrating and extracting facts about semantically interrelated entities and relations.
In this paper, we develop a domain KG to leverage cancer-specific biomarker discovery and interactive QA.
arXiv Detail & Related papers (2023-10-12T14:36:13Z)
- Detecting Throat Cancer from Speech Signals using Machine Learning: A Scoping Literature Review [0.30723404270319693]
Artificial intelligence (AI) and machine learning (ML) have the potential to detect throat cancer from patient speech.
Cases of throat cancer are rising worldwide.
No comprehensive review has explored the use of AI and ML for detecting throat cancer from speech.
arXiv Detail & Related papers (2023-07-18T13:06:17Z)
- Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines [1.1999555634662633]
This work presents the design, implementation and application of a novel data extraction and exploration system.
We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variants plots with ranked, related entities.
Our system is publicly available on the web at https://cancercelllines.org.
arXiv Detail & Related papers (2023-07-03T11:15:42Z)
- Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data [23.850817918011863]
We exploit causal discovery algorithms to investigate how perturbations in the genome can affect the survival of patients diagnosed with breast cancer.
Our findings reveal important factors related to the vital status of patients using causal discovery algorithms.
Results are validated through language models trained on biomedical literature.
arXiv Detail & Related papers (2023-05-28T17:07:46Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts has been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification.
We report empirical results for 11 current pre-trained Chinese models, and the experiments show that state-of-the-art neural models perform far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)