Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models
- URL: http://arxiv.org/abs/2405.06724v4
- Date: Fri, 06 Jun 2025 07:46:24 GMT
- Title: Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models
- Authors: Lun Ai, Stephen H. Muggleton, Shi-Shun Liang, Geoff S. Baldwin,
- Abstract summary: Genome-scale metabolic network models (GEMs) are widely used to evaluate genetic engineering of biological systems.<n>GEMs often fail to accurately predict the behaviour of genetically engineered cells, primarily due to incomplete annotations of gene interactions.<n>We develop a new system, $BMLP_active$, which guides cost-effective experimentation and uses interpretable logic programs to encode a state-of-the-art GEM of a model bacterial organism.
- Score: 4.762323642506732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reasoning about hypotheses and updating knowledge through empirical observations are central to scientific discovery. In this work, we applied logic-based machine learning methods to drive biological discovery by guiding experimentation. Genome-scale metabolic network models (GEMs) - comprehensive representations of metabolic genes and reactions - are widely used to evaluate genetic engineering of biological systems. However, GEMs often fail to accurately predict the behaviour of genetically engineered cells, primarily due to incomplete annotations of gene interactions. The task of learning the intricate genetic interactions within GEMs presents computational and empirical challenges. To efficiently predict using GEM, we describe a novel approach called Boolean Matrix Logic Programming (BMLP) by leveraging Boolean matrices to evaluate large logic programs. We developed a new system, $BMLP_{active}$, which guides cost-effective experimentation and uses interpretable logic programs to encode a state-of-the-art GEM of a model bacterial organism. Notably, $BMLP_{active}$ successfully learned the interaction between a gene pair with fewer training examples than random experimentation, overcoming the increase in experimental design space. $BMLP_{active}$ enables rapid optimisation of metabolic models to reliably engineer biological systems for producing useful compounds. It offers a realistic approach to creating a self-driving lab for biological discovery, which would then facilitate microbial engineering for practical applications.
Related papers
- BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning [49.487327661584686]
We introduce BioMaze, a dataset with 5.1K complex pathway problems from real research.
Our evaluation of methods such as CoT and graph-augmented reasoning, shows that LLMs struggle with pathway reasoning.
To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation.
arXiv Detail & Related papers (2025-02-23T17:38:10Z) - GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters.<n>Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks.<n>It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z) - Active learning of digenic functions with boolean matrix logic programming [4.762323642506732]
We apply logic-based machine learning techniques to facilitate cellular engineering and drive biological discovery.
Learn the intricate genetic interactions within genome-scale metabolic network models.
A new system, $BMLP_active$, efficiently explores the genomic hypothesis space by guiding informative experimentation.
arXiv Detail & Related papers (2024-08-19T18:47:07Z) - BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z) - A Review of Neuroscience-Inspired Machine Learning [58.72729525961739]
Bio-plausible credit assignment is compatible with practically any learning condition and is energy-efficient.
In this paper, we survey several vital algorithms that model bio-plausible rules of credit assignment in artificial neural networks.
We conclude by discussing the future challenges that will need to be addressed in order to make such algorithms more useful in practical applications.
arXiv Detail & Related papers (2024-02-16T18:05:09Z) - Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most of current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z) - Human Comprehensible Active Learning of Genome-Scale Metabolic Networks [7.838090421892651]
A comprehensible machine learning approach that efficiently explores the hypothesis space and guides experimental design is urgently needed.
We introduce a novel machine learning framework ILP-iML1515 based on Inductive Logic Programming (ILP)
ILP-iML1515 is built on comprehensible logical representations of a genome-scale metabolic model and can update the model by learning new logical structures from auxotrophic mutant trials.
arXiv Detail & Related papers (2023-08-24T12:42:00Z) - An Optimal Likelihood Free Method for Biological Model Selection [0.0]
Systems biology seeks to create math models of biological systems to reduce inherent biological complexity.
We present an algorithm for automated biological model selection using mathematical models of systems biology and likelihood free inference methods.
arXiv Detail & Related papers (2022-08-03T21:05:20Z) - Deep metric learning improves lab of origin prediction of genetically
engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z) - Evolving spiking neuron cellular automata and networks to emulate in
vitro neuronal activity [0.0]
We produce spiking neural systems that emulate the patterns of behavior of biological neurons in vitro.
Our models were able to produce a level of network-wide synchrony.
The genomes of the top-performing models indicate the excitability and density of connections in the model play an important role in determining the complexity of the produced activity.
arXiv Detail & Related papers (2021-10-15T17:55:04Z) - Mapping and Validating a Point Neuron Model on Intel's Neuromorphic
Hardware Loihi [77.34726150561087]
We investigate the potential of Intel's fifth generation neuromorphic chip - Loihi'
Loihi is based on the novel idea of Spiking Neural Networks (SNNs) emulating the neurons in the brain.
We find that Loihi replicates classical simulations very efficiently and scales notably well in terms of both time and energy performance as the networks get larger.
arXiv Detail & Related papers (2021-09-22T16:52:51Z) - Automated Biodesign Engineering by Abductive Meta-Interpretive Learning [8.788941848262786]
We propose an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning ($Meta_Abd$)
In this work, we propose an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning ($Meta_Abd$)
arXiv Detail & Related papers (2021-05-17T12:10:26Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into the three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.