Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models
- URL: http://arxiv.org/abs/2405.06724v3
- Date: Sun, 11 Aug 2024 17:54:22 GMT
- Title: Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models
- Authors: Lun Ai, Stephen H. Muggleton, Shi-Shun Liang, Geoff S. Baldwin,
- Abstract summary: We seek to apply logic-based machine learning techniques to facilitate cellular engineering and drive biological discovery.
We introduce a new system, $BMLP_active$, which efficiently explores the genomic hypothesis space by guiding informative experimentation.
$BMLP_active$ can successfully learn the interaction between a gene pair with fewer training examples than random experimentation.
- Score: 4.762323642506732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Techniques to autonomously drive research have been prominent in Computational Scientific Discovery, while Synthetic Biology is a field of science that focuses on designing and constructing new biological systems for useful purposes. Here we seek to apply logic-based machine learning techniques to facilitate cellular engineering and drive biological discovery. Comprehensive databases of metabolic processes called genome-scale metabolic network models (GEMs) are often used to evaluate cellular engineering strategies to optimise target compound production. However, predicted host behaviours are not always correctly described by GEMs, often due to errors in the models. The task of learning the intricate genetic interactions within GEMs presents computational and empirical challenges. To address these, we describe a novel approach called Boolean Matrix Logic Programming (BMLP) by leveraging boolean matrices to evaluate large logic programs. We introduce a new system, $BMLP_{active}$, which efficiently explores the genomic hypothesis space by guiding informative experimentation through active learning. In contrast to sub-symbolic methods, $BMLP_{active}$ encodes a state-of-the-art GEM of a widely accepted bacterial host in an interpretable and logical representation using datalog logic programs. Notably, $BMLP_{active}$ can successfully learn the interaction between a gene pair with fewer training examples than random experimentation, overcoming the increase in experimental design space. $BMLP_{active}$ enables rapid optimisation of metabolic models to reliably engineer biological systems for producing useful compounds. It offers a realistic approach to creating a self-driving lab for microbial engineering.
Related papers
- Active learning of digenic functions with boolean matrix logic programming [4.762323642506732]
We apply logic-based machine learning techniques to facilitate cellular engineering and drive biological discovery.
Learn the intricate genetic interactions within genome-scale metabolic network models.
A new system, $BMLP_active$, efficiently explores the genomic hypothesis space by guiding informative experimentation.
arXiv Detail & Related papers (2024-08-19T18:47:07Z) - BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z) - A Review of Neuroscience-Inspired Machine Learning [58.72729525961739]
Bio-plausible credit assignment is compatible with practically any learning condition and is energy-efficient.
In this paper, we survey several vital algorithms that model bio-plausible rules of credit assignment in artificial neural networks.
We conclude by discussing the future challenges that will need to be addressed in order to make such algorithms more useful in practical applications.
arXiv Detail & Related papers (2024-02-16T18:05:09Z) - Human Comprehensible Active Learning of Genome-Scale Metabolic Networks [7.838090421892651]
A comprehensible machine learning approach that efficiently explores the hypothesis space and guides experimental design is urgently needed.
We introduce a novel machine learning framework ILP-iML1515 based on Inductive Logic Programming (ILP)
ILP-iML1515 is built on comprehensible logical representations of a genome-scale metabolic model and can update the model by learning new logical structures from auxotrophic mutant trials.
arXiv Detail & Related papers (2023-08-24T12:42:00Z) - An Optimal Likelihood Free Method for Biological Model Selection [0.0]
Systems biology seeks to create math models of biological systems to reduce inherent biological complexity.
We present an algorithm for automated biological model selection using mathematical models of systems biology and likelihood free inference methods.
arXiv Detail & Related papers (2022-08-03T21:05:20Z) - Deep metric learning improves lab of origin prediction of genetically
engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z) - Evolving spiking neuron cellular automata and networks to emulate in
vitro neuronal activity [0.0]
We produce spiking neural systems that emulate the patterns of behavior of biological neurons in vitro.
Our models were able to produce a level of network-wide synchrony.
The genomes of the top-performing models indicate the excitability and density of connections in the model play an important role in determining the complexity of the produced activity.
arXiv Detail & Related papers (2021-10-15T17:55:04Z) - Mapping and Validating a Point Neuron Model on Intel's Neuromorphic
Hardware Loihi [77.34726150561087]
We investigate the potential of Intel's fifth generation neuromorphic chip - Loihi'
Loihi is based on the novel idea of Spiking Neural Networks (SNNs) emulating the neurons in the brain.
We find that Loihi replicates classical simulations very efficiently and scales notably well in terms of both time and energy performance as the networks get larger.
arXiv Detail & Related papers (2021-09-22T16:52:51Z) - Automated Biodesign Engineering by Abductive Meta-Interpretive Learning [8.788941848262786]
We propose an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning ($Meta_Abd$)
In this work, we propose an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning ($Meta_Abd$)
arXiv Detail & Related papers (2021-05-17T12:10:26Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into the three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.