Prediction of Antibiotic Activity against Burkholderia cenocepacia Using a Machine Learning Model
1 , 2 , 3 , 1 , 4 , 2, 5 , * 1, 6
1  Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada
2  Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB, Canada
3  Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada
4  Department of Chemistry, University of Manitoba, Winnipeg, MB, Canada
5  Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada
6  Department of Medical Microbiology & Infectious Diseases, University of Manitoba, Winnipeg, MB, Canada

https://doi.org/10.3390/ECA2021-09644 (registering DOI)
Abstract:

A fundamental challenge in antibiotic discovery is finding new classes of bioactive compounds. Because conventional approaches are slow and costly, alternative antibiotic discovery paradigms have become imperative. Advances in computational processing capacity have made it possible to explore a greatly expanded chemical space and to generate vast, chemically diverse virtual libraries containing billions of compounds. In this study, we exploited the predictive power of machine learning (ML) to predict growth-inhibitory activity in chemical scaffolds outside the training dataset. We used a Directed Message Passing Neural Network (D-MPNN) to train binary classification and regression models on a dataset from a high-throughput screen previously performed in our laboratory against Burkholderia cenocepacia. The D-MPNN is a spatial-based convolutional graph neural network (ConvGNN): an end-to-end neural network that builds a graph representation of a molecule by iteratively passing messages along its bonds. To avoid overfitting and improve prediction accuracy, we additionally supplied the model with 200 global molecular descriptors. The model was further optimized using Bayesian hyperparameter optimization and ensembling. The trained model attained an area under the receiver operating characteristic curve (ROC-AUC) of 0.823. As a proof of principle, we used the trained model to predict the bioactivity of 1,615 FDA-approved compounds and tested the top 100 ranked compounds in vitro. We found 17 growth-inhibitory compounds, with a linear correlation between predicted rank and activity.
This work highlights how ML approaches can rapidly explore chemically diverse, ultra-large compound libraries and identify candidate compounds inexpensively, thus increasing the chance of discovering early lead compounds.
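
The two evaluation steps mentioned in the abstract, scoring a classifier by ROC-AUC and then ranking a library to pick the top candidates for testing, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `roc_auc` and `top_k`, the compound names, and all scores below are hypothetical toy values chosen only for illustration. ROC-AUC is computed here in its pairwise form, as the fraction of (active, inactive) pairs in which the active compound receives the higher score.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a random active outscores a random inactive.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k(names: Sequence[str], scores: Sequence[float], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring compounds, best first."""
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 1 = growth inhibitory, 0 = inactive.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.91, 0.40, 0.35, 0.10, 0.62, 0.77]
toy_names = ["cpd_A", "cpd_B", "cpd_C", "cpd_D", "cpd_E", "cpd_F"]

print(roc_auc(toy_labels, toy_scores))
print(top_k(toy_names, toy_scores, 3))
```

The pairwise loop is quadratic in the number of compounds, which is fine for a toy example; for a screening-scale dataset a rank-based implementation from an established library would be the more practical choice.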

Keywords: Machine learning, Burkholderia, High throughput screening, D-MPNN
