EBIC.JL -- an Efficient Implementation of Evolutionary Biclustering
Algorithm in Julia
- URL: http://arxiv.org/abs/2105.01196v1
- Date: Mon, 3 May 2021 22:30:38 GMT
- Title: EBIC.JL -- an Efficient Implementation of Evolutionary Biclustering
Algorithm in Julia
- Authors: Paweł Renc, Patryk Orzechowski, Aleksander Byrski, Jarosław Wąs, and Jason H. Moore
- Abstract summary: We introduce EBIC.JL - an implementation of one of the most accurate biclustering algorithms in Julia.
We show that the new version maintains comparable accuracy to its predecessor EBIC while converging faster for the majority of the problems.
- Score: 59.422301529692454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biclustering is a data mining technique which searches for local patterns in
numeric tabular data with main application in bioinformatics. This technique
has shown promise in multiple areas, including development of biomarkers for
cancer, disease subtype identification, or gene-drug interactions among others.
In this paper we introduce EBIC.JL - an implementation of one of the most
accurate biclustering algorithms in Julia, a modern highly parallelizable
programming language for data science. We show that the new version maintains
comparable accuracy to its predecessor EBIC while converging faster for the
majority of the problems. We hope that this open source software in a
high-level programming language will foster research in this promising field of
bioinformatics and expedite development of new biclustering methods for big
data.
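To make "local patterns in numeric tabular data" concrete, here is a minimal sketch of what a bicluster is: a subset of rows and a subset of columns whose submatrix follows a coherent pattern that the full table does not. It scores coherence with the mean squared residue of Cheng and Church, a classic biclustering measure chosen here only for illustration. It is not taken from the paper and is not claimed to be the criterion EBIC or EBIC.JL uses. The function name, the toy matrix, and the planted row and column indices are all hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix rows x cols.

    A value near 0 means the selected rows differ from each other only by
    an additive shift on the selected columns (a coherent pattern).
    Illustrative measure only; not the scoring used by EBIC.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows
    squared = sum(
        (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return squared / (n_rows * n_cols)


# Toy table (hypothetical data): rows 0, 2, 3 follow the same shifted
# pattern on columns 0, 2, 3, while row 1 and column 1 do not fit it.
data = [
    [1.0, 9.0, 2.0, 4.0],
    [7.0, 3.0, 8.0, 1.0],
    [3.0, 5.0, 4.0, 6.0],
    [5.0, 0.0, 6.0, 8.0],
]

planted = mean_squared_residue(data, rows=[0, 2, 3], cols=[0, 2, 3])
whole = mean_squared_residue(data, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])

print(f"planted bicluster residue: {planted:.6f}")
print(f"whole matrix residue:      {whole:.6f}")
```

The planted submatrix scores essentially zero while the whole table does not, which is the sense in which the pattern is local. A biclustering algorithm has to search for such row and column subsets without knowing them in advance.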
Related papers
- From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models [63.188607839223046]
This survey focuses on the benefits of scaling compute during inference.
We explore three areas under a unified mathematical formalism: token-level generation algorithms, meta-generation algorithms, and efficient generation.
arXiv Detail & Related papers (2024-06-24T17:45:59Z)
- Hyperdimensional computing: a fast, robust and interpretable paradigm for biological data [9.094234519404907]
New algorithms for processing diverse biological data sources have revolutionized bioinformatics.
Deep learning has substantially transformed bioinformatics, addressing sequence, structure, and functional analyses.
Hyperdimensional computing has emerged as an intriguing alternative.
arXiv Detail & Related papers (2024-02-27T15:09:20Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms [5.478671305092084]
We introduce ParlayANN, a library of deterministic and parallel graph-based approximate nearest neighbor search algorithms.
We develop novel parallel implementations for four state-of-the-art graph-based ANNS algorithms that scale to billion-scale datasets.
arXiv Detail & Related papers (2023-05-07T19:28:23Z)
- Clustering with minimum spanning trees: How good can it be? [1.9999259391104391]
We quantify the extent to which minimum spanning trees are meaningful in low-dimensional partitional data clustering tasks.
We review, study, extend, and generalise a few existing, state-of-the-art MST-based partitioning schemes.
Overall, the Genie and the information-theoretic methods often outperform the non-MST algorithms.
arXiv Detail & Related papers (2023-03-10T03:18:03Z)
- 2021 BEETL Competition: Advancing Transfer Learning for Subject Independence & Heterogenous EEG Data Sets [89.84774119537087]
We design two transfer learning challenges around diagnostics and Brain-Computer Interfacing (BCI).
Task 1 is centred on medical diagnostics, addressing automatic sleep stage annotation across subjects.
Task 2 is centred on Brain-Computer Interfacing (BCI), addressing motor imagery decoding across both subjects and data sets.
arXiv Detail & Related papers (2022-02-14T12:12:20Z)
- Bioinspired Cortex-based Fast Codebook Generation [0.09449650062296822]
We introduce a feature extraction method inspired by sensory cortical networks in the brain.
Dubbed as bioinspired cortex, the algorithm provides convergence to features from streaming signals with superior computational efficiency.
We show herein the superior performance of the cortex model in clustering and vector quantization.
arXiv Detail & Related papers (2022-01-28T18:37:43Z)
- Improving EEG Decoding via Clustering-based Multi-task Feature Learning [27.318646122939537]
Machine learning provides a promising technique to optimize EEG patterns toward better decoding accuracy.
Existing algorithms do not effectively explore the underlying data structure capturing the true EEG sample distribution.
We propose a clustering-based multi-task feature learning algorithm for improved EEG pattern decoding.
arXiv Detail & Related papers (2020-12-12T13:31:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.