Application of Molecular Similarity and Artificial Neural Networks for PD-L1 inhibitors Virtual Screening
1  Faculty of Pharmacy, University of Medicine and Pharmacy at Ho Chi Minh City, 41-43 Dinh Tien Hoang Street, Ho Chi Minh City, Vietnam
Academic Editor: Osvaldo Santos-Filho


Purpose This study aimed to develop an artificial neural network (ANN) model and a molecular similarity (MS) model to screen for PD-L1 inhibitors, which help the immune system regain its ability to destroy tumors.

Methods A total of 2,044 compounds were collected from Google Patents and split into training, validation, and test sets, which were used to build the MS and ANN models. The MS model used five fingerprints (AVALON, MACCS, ECFP4, RDK5, and MAP4), with BMS-1166 as the query molecule; decoys were generated with the DeepCoy library. The ANN model was built on the SECFP fingerprint. A support vector classifier (SVC) and a random forest (RF) were implemented as benchmarks, and their performance was compared with that of the ANN model using the Wilcoxon signed-rank test. Because the dataset was imbalanced, the F1 score and average precision were used as evaluation metrics. Subsequently, 15,235 compounds from the DrugBank database were screened through medicinal chemistry filters, the MS model, and the ANN model.
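
The similarity-searching step can be illustrated with a minimal sketch of the Tanimoto coefficient. This is not the authors' code: the fingerprints below are toy sets of on-bit indices rather than real AVALON or ECFP4 fingerprints, and the names `tanimoto`, `similarity_screen`, and the compound labels are hypothetical. The 0.32 threshold is the AVALON similarity threshold reported in the Results.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def similarity_screen(query_fp, library, threshold):
    """Return (name, score) pairs at or above the threshold, best first."""
    hits = []
    for name, fp in library.items():
        score = tanimoto(query_fp, fp)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data (hypothetical): on-bit indices standing in for real fingerprints.
query = {1, 4, 7, 9, 12}
library = {
    "cpd_A": {1, 4, 7, 9, 15},         # shares 4 of 6 distinct bits
    "cpd_B": {2, 3, 5},                # shares no bits
    "cpd_C": {1, 4, 20, 21, 22, 23},   # shares 2 of 9 distinct bits
}

hits = similarity_screen(query, library, threshold=0.32)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

In practice the on-bit sets would come from a cheminformatics toolkit; only the comparison and thresholding logic is shown here.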

Results Decoy generation gave promising results, with a 1NN AUC-ROC of 0.52, an RF AUC-ROC of 0.65, a mean Doppelganger score of 0.24, and a maximum Doppelganger score of 0.346, indicating that the decoys closely resemble the active set. For the MS model, the AVALON fingerprint performed best in similarity searching, with an EF1% of 10.99%, an AUC-ROC of 0.963, and a similarity threshold of 0.32. In cross-validation, the ANN model attained an average precision of 0.863 ± 0.032 and an F1 score of 0.745 ± 0.039, higher than those of the SVC and RF models, although the differences were not statistically significant. In external evaluation, the ANN model achieved an average precision of 0.854 and an F1 score of 0.799, again higher than those of the SVC and RF models. Finally, only 7 molecules from the DrugBank database passed all three filters; CHEM431 emerged as the most promising candidate, with a predicted probability of activity of 75% and a Tanimoto coefficient of 0.34 to the query molecule.
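
For readers unfamiliar with the enrichment factor, the sketch below uses one common definition: the fraction of actives among the top-ranked x% of compounds, divided by the fraction of actives in the whole set. The abstract does not state the exact formula the authors applied, so this definition, the function name `enrichment_factor`, and the toy data are assumptions for illustration only.

```python
import math


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor for the top `fraction` of a score-ranked list.

    scores: similarity scores, higher meaning more similar to the query.
    labels: 1 for an active compound, 0 for a decoy.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, math.ceil(fraction * n_total))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_in_top = sum(label for _, label in ranked[:n_top])
    return (actives_in_top / n_top) / (n_actives / n_total)


# Toy data (hypothetical): 10 actives ranked ahead of 190 decoys.
toy_labels = [1] * 10 + [0] * 190
toy_scores = [1.0 - i / 200 for i in range(200)]

ef_1pct = enrichment_factor(toy_scores, toy_labels, fraction=0.01)
print(f"EF1% = {ef_1pct:.2f}")
```

With a perfect ranking, this definition gives a maximum value equal to the inverse of the active fraction, so reported values depend strongly on the active-to-decoy ratio of the dataset.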

Conclusions Virtual screening identified CHEM431 as the most promising candidate, which warrants follow-up work including molecular docking, molecular dynamics simulation, synthesis, and bioactivity testing.

Keywords: PD-L1 inhibitors, molecular similarity, machine learning, artificial neural network, virtual screening