Hierarchy exploitation to detect missing annotations on hierarchical
multi-label classification
- URL: http://arxiv.org/abs/2207.06237v1
- Date: Wed, 13 Jul 2022 14:32:50 GMT
- Title: Hierarchy exploitation to detect missing annotations on hierarchical
multi-label classification
- Authors: Miguel Romero, Felipe Kenji Nakano, Jorge Finke, Camilo Rocha, Celine
Vens
- Abstract summary: We present a method to detect missing annotations in hierarchical multi-label classification datasets.
We propose a method that exploits the class hierarchy by computing aggregated probabilities to the paths of classes from the leaves to the root for each instance.
The experiments on Oryza sativa Japonica, a variety of rice, show that incorporating the hierarchy of classes into the method often improves the predictive performance.
- Score: 0.1749935196721634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The availability of genomic data has grown exponentially in the last decade,
mainly due to the development of new sequencing technologies. Based on the
interactions between genes (and gene products) extracted from the increasing
genomic data, numerous studies have focused on the identification of
associations between genes and functions. While these studies have shown great
promise, the problem of annotating genes with functions remains an open
challenge. In this work, we present a method to detect missing annotations in
hierarchical multi-label classification datasets. We propose a method that
exploits the class hierarchy by computing aggregated probabilities to the paths
of classes from the leaves to the root for each instance. The proposed method
is presented in the context of predicting missing gene function annotations,
where these aggregated probabilities are further used to select a set of
annotations to be verified through in vivo experiments. The experiments on
Oryza sativa Japonica, a variety of rice, showcase that incorporating the
hierarchy of classes into the method often improves the predictive performance
and our proposed method yields superior results when compared to competitor
methods from the literature.
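The idea of aggregating per-class probabilities along each leaf-to-root path can be illustrated with a minimal sketch. The abstract does not state which aggregation function the authors use, so the mean used here is an assumption, and the toy hierarchy, the class names, and the helper names (`PARENT`, `path_to_root`, `aggregate_paths`, `rank_candidates`) are all hypothetical.

```python
from statistics import mean

# Hypothetical toy hierarchy given as child -> parent (None marks the root).
PARENT = {"root": None, "A": "root", "B": "root",
          "A1": "A", "A2": "A", "B1": "B"}


def path_to_root(leaf, parent=PARENT):
    """Return the classes visited when walking from `leaf` up to the root."""
    path = []
    node = leaf
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def leaves(parent=PARENT):
    """Return the classes that are not the parent of any other class."""
    internal = {p for p in parent.values() if p is not None}
    return sorted(c for c in parent if c not in internal)


def aggregate_paths(probs, parent=PARENT, agg=mean):
    """Aggregate one instance's per-class probabilities over each path.

    `probs` maps every non-root class to a predicted probability.
    The root is skipped because every instance belongs to it.
    `agg` is an assumption: the source does not name the function.
    """
    scores = {}
    for leaf in leaves(parent):
        path = [c for c in path_to_root(leaf, parent) if parent[c] is not None]
        scores[leaf] = agg([probs[c] for c in path])
    return scores


def rank_candidates(scores, annotated):
    """Rank leaf paths not yet annotated, highest aggregated score first."""
    candidates = [(leaf, s) for leaf, s in scores.items() if leaf not in annotated]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Made-up probabilities for a single gene (instance).
probs = {"A": 0.75, "B": 0.25, "A1": 0.5, "A2": 0.25, "B1": 0.75}
scores = aggregate_paths(probs)
ranking = rank_candidates(scores, annotated={"A2"})
```

In this sketch the top-ranked unannotated paths would be the ones proposed for experimental verification; swapping `agg` for `min` or a product is a one-argument change if a stricter aggregation is preferred.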
Related papers
- Robust Multi-view Co-expression Network Inference [8.697303234009528]
Inferring gene co-expression networks from transcriptome data presents many challenges.
We introduce a robust method for high-dimensional graph inference from multiple independent studies.
arXiv Detail & Related papers (2024-09-30T06:30:09Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- A Comparative Analysis of Gene Expression Profiling by Statistical and Machine Learning Approaches [1.8954222800767324]
We discuss the biological and the methodological limitations of machine learning models to classify cancer samples.
Gene rankings are obtained from explainability methods adapted to these models.
We observe that the information learned by black-box neural networks is related to the notion of differential expression.
arXiv Detail & Related papers (2024-02-01T18:17:36Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Feature extraction using Spectral Clustering for Gene Function Prediction [0.4492444446637856]
This paper presents a novel in silico approach to the annotation problem that combines cluster analysis and hierarchical multi-label classification.
The proposed approach is applied to a case study on Zea mays, one of the most dominant and productive crops in the world.
arXiv Detail & Related papers (2022-03-25T10:17:36Z)
- Self-Supervised Graph Representation Learning for Neuronal Morphologies [75.38832711445421]
We present GraphDINO, a data-driven approach to learn low-dimensional representations of 3D neuronal morphologies from unlabeled datasets.
We show, in two different species and across multiple brain areas, that this method yields morphological cell type clusterings on par with manual feature-based classification by experts.
Our method could potentially enable data-driven discovery of novel morphological features and cell types in large-scale datasets.
arXiv Detail & Related papers (2021-12-23T12:17:47Z)
- Multivariate feature ranking of gene expression data [62.997667081978825]
We propose two new multivariate feature ranking methods based on pairwise correlation and pairwise consistency.
We statistically prove that the proposed methods outperform the state-of-the-art feature ranking methods Clustering Variation, Chi Squared, Correlation, Information Gain, ReliefF and Significance.
arXiv Detail & Related papers (2021-11-03T17:19:53Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification [0.0]
We present a new technique for gene selection using a discernibility matrix of fuzzy-rough sets.
The proposed technique takes into account the similarity of those instances that have the same and different class labels to improve the gene selection results.
Experimental results demonstrate that this technique provides better efficiency compared to the state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-26T13:43:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.