Beyond the pharmacodynamics of drugs and the resistance of the Plasmodium falciparum parasite to existing antimalarial drugs, the pharmacokinetic properties of drug candidates also hamper their translation. The need for novel drugs with optimal solubility profiles motivated the training of a machine learning regression model to predict the solubility of a series of compounds. Four descriptors (octanol-water partition coefficient, molecular weight, number of rotatable bonds and aromatic proportion) were derived from the simplified molecular-input line-entry system (SMILES) strings of 11,478 antiplasmodial molecules. These descriptors were used to train five regression models (multiple linear regression, k-nearest neighbors, LASSO regression, support vector regressor and random forest regressor (RFR)) to predict the solubility of the molecules. Model performance was assessed using the coefficient of determination (R2), mean squared error (MSE), mean absolute error (MAE) and root mean squared error (RMSE). Of the algorithms evaluated, the RFR produced a robust model with MSE 0.54, R2 0.85, MAE 0.41 and RMSE 0.73. The F-statistic for the model was 7214, indicating a strong correlation between the descriptors and the solubility of the molecules. The model could efficiently predict the solubility of untested molecules, helping to select promising ligands as leads for further optimization.
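The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration; they are not the authors' code or data, and the abstract does not state which software was used to compute the reported statistics.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)       # residual sum of squares
    mse = ss_res / n                             # mean squared error
    mae = sum(abs(r) for r in residuals) / n     # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical solubility values, made up purely for illustration.
observed = [-2.0, -3.5, -1.0, -4.2]
predicted = [-2.3, -3.0, -1.4, -4.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because RMSE is the square root of MSE, the reported values are mutually consistent: the square root of 0.54 is approximately 0.73.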
A Robust SMILES-Based Prediction of Aqueous Solubility of Diverse Antiplasmodial Compounds using Machine Learning Algorithms
Published: 01 November 2023
by MDPI in 9th International Electronic Conference on Medicinal Chemistry, session Emerging technologies in drug discovery
https://doi.org/10.3390/ECMC2023-15697
(registering DOI)
Keywords: Antimalarial; machine learning; molecule descriptors; regression models; solubility